Monday, August 27, 2012

Visual Studio blocks compilation

It happens to me a lot. After compiling and realizing that I was not done coding, I hit the compile button again, only to find that the build fails. The cause is that Windows still has the executable marked as in use, so the linker cannot overwrite it. Windows sometimes keeps the file locked for quite a while.

Watch carefully for the message:
1>LINK : fatal error LNK1168: Could not open xxx.exe for writing
The solution, if you haven't figured it out, is to manually open the folder (directory) where xxx.exe is located and delete the file. By default this is the solution's Debug folder. If Windows doesn't let you delete the file, turn off the Windows Application Experience service.

Sunday, August 26, 2012

L. Wasserman's & O. Bousquet's blogs

I was reading the UCL CSML news and an entry from Larry Wasserman's blog popped up. Yes, this is the Wasserman responsible for the statistics bible that you can use to exercise your mind by learning statistics and to exercise your muscles by lifting the book onto the desk.

Also, I stumbled upon Olivier Bousquet's blog. It contains some thoughts about Machine Learning that will be of interest to practitioners.

I added links to Wasserman's and Bousquet's blogs on the right.

Oh hell: I just found out that the last entry on Bousquet's blog is 5 years old. Shame.

Saturday, August 25, 2012

SVM plot decision function

In the paper I submitted, I have to deal with SVMs, and I wanted to plot the decision function that my kernel produces on a 2D dataset.

I stumbled upon a 2D problem that I am interested in visualizing (I already posted about it). The point is that I now know how to extract the decision function values on a grid so that they can be plotted. Again, I am surprised at how difficult it is to find this on the Internet.

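With e1071, you need the following code:
# Predict on the grid of points and ask for the decision values
im <- predict(K.svm,  ... , scale=FALSE, decision.values=TRUE)
# The decision values come back as an attribute; reshape them into a 100 x 100 matrix
im <- matrix(attr(im, "decision.values"), nrow=100, byrow=FALSE)
# Draw the decision function as an image and overlay the data points
image(seq(0, 20, length.out=100), seq(0, 20, length.out=100), im, xlab="", ylab="")
points(y)
pdf.options(reset=TRUE)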
Notice that we are extracting the values of the decision function from the decision.values attribute of the object returned by predict. That prediction was, of course, computed on a grid of points covering a range larger than the input data.
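
For completeness, here is a minimal self-contained sketch of the same idea. The toy data, the RBF kernel and every variable name (`x`, `labels`, `model`, `grid`) are assumptions made up for illustration; they are not the dataset or kernel from the paper. Only `svm`, `predict` and the `decision.values` attribute come from e1071.

```r
library(e1071)

set.seed(1)
# Toy 2D data: two Gaussian blobs (illustrative only, not the paper's data)
x <- rbind(matrix(rnorm(100, mean = 7), ncol = 2),
           matrix(rnorm(100, mean = 13), ncol = 2))
colnames(x) <- c("x1", "x2")
labels <- factor(rep(c(-1, 1), each = 50))

# Assumed model: C-classification with the default RBF kernel, no scaling
model <- svm(x, labels, kernel = "radial", scale = FALSE)

# Grid of points covering a range larger than the data
gx <- seq(0, 20, length.out = 100)
gy <- seq(0, 20, length.out = 100)
grid <- as.matrix(expand.grid(x1 = gx, x2 = gy))

# Ask predict() for the decision values and pull them out of the attribute
pred <- predict(model, grid, decision.values = TRUE)
dv <- attr(pred, "decision.values")

# expand.grid varies x1 fastest, so filling by column gives
# im[i, j] = decision value at (gx[i], gy[j]), which is what image() expects
im <- matrix(dv, nrow = length(gx), byrow = FALSE)

image(gx, gy, im, xlab = "", ylab = "")
contour(gx, gy, im, levels = 0, add = TRUE)   # the decision boundary
points(x, pch = ifelse(labels == 1, 1, 3))
```

The `contour` call at level 0 is an extra that marks the decision boundary on top of the image; drop it if you only want the raw decision values.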

Wednesday, August 22, 2012

Paper submitted!

At last!!!

It is very hard to make up a novel theory and write the most you can about it, the best you can, and build a manuscript that conforms to the journal's standards.

I have survived!

43 pages in total. I am happy with the result, though I reckon I could have told things differently. In any case, this allows me to move on. I am back writing on the blogs!

The paper contains a machine learning technique that exploits statistical properties of the data. I might later submit the paper to the arXiv. Today, I am officially on vacation, after having invested all summer working on the paper.

In the course of my research I have come across interesting things. I will comment on them starting tomorrow.

Saturday, August 4, 2012

Using WhatsApp on your PC (II)

First of all, don't use cracks, even if you find no virus in them. They use WhatsApp's verification of your phone number to send messages to I don't know who.

That being said, my fiancée Sali found what I had given up all hope of finding: a free and competitive Android emulator called Bluestacks App Player.

Even though it is little known, Bluestacks is the BEST Android emulator by a long way.

It runs WhatsApp seamlessly. Totally recommended.

I know now what the rascals are saying at any time.

Friday, August 3, 2012

Graph combination software (Shin, Tsuda, Schölkopf)

Some time ago I was driven to write a conference paper on combining positive semi-definite matrices for binary classification, with an application to gene functional prediction. The team leader (yes, AMG) made me program a competing method so we could test against it. The competing method was the one proposed in the paper "Protein functional class prediction with a combined graph" by Hyunjung Shin.

I have worked on a number of papers with my old team, but this one was the one with the most potential. Up until then I had rarely worked with these techniques, and due to poor management the paper got rejected. The underlying idea was great, but I was left with all the work, with little experience, and with a week to do it all. The competing method I programmed was never used, and that weighed on the reviewers' decision to reject the paper.

Considering that the nodes are present in all graphs (they refer to the same objects) but are interconnected differently, the method defines a diffusion process on the combined graph and penalizes the dissimilarity between adjacent nodes. Let $L_k$ be the Laplacian matrix of the $k$-th graph, $K$ the total number of graphs, and $y$ the vector that we try to approximate:
$$
\min_{\beta,f} \sum\limits_{k=1}^K \beta_k f^T L_k f - \log \det \left( I - \sum\limits_{k=1}^K \beta_k L_k \right) + \mu (f-y)^T C (f-y)
$$
subject to $\beta_k\geq 0$ and $\beta^T 1 < 0.5$,
where $\mu$ regularizes the approximation term. Notice that the computed vector $f$ is the same for every Laplacian.

So here I leave it to you. Try it with small graphs, because there is something missing. If you modify it, please let me know.

An example test program is:
load("mat.RData")

source("graph_comb_shin.R")

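# Node degrees of the first graph
d1<-apply(G1,1,sum)
# Inverse degrees, leaving isolated nodes (degree 0) at 0
id1<-d1
id1[id1>0]<-1/id1[id1>0]
# D^(-1/2) and D as diagonal matrices
iD1<-diag(sqrt(id1))
D1<-diag(d1)
# Normalized Laplacian D^(-1/2) (D - G) D^(-1/2) and its eigendecomposition
L1<-iD1%*%(D1-G1)%*%iD1
EL1<-eigen(L1)
rm(d1,id1,D1,iD1,G1)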

d2<-(apply(G2,1,sum))
id2<-d2
id2[id2>0]<-1/id2[id2>0]
iD2<-diag(sqrt(id2))
D2<-diag(d2)
L2<-iD2%*%(D2-G2)%*%iD2
EL2<-eigen(L2)
rm(d2,id2,D2,iD2,G2)

d3<-(apply(G3,1,sum))
id3<-d3
id3[id3>0]<-1/id3[id3>0]
iD3<-diag(sqrt(id3))
D3<-diag(d3)
L3<-iD3%*%(D3-G3)%*%iD3
EL3<-eigen(L3)
rm(d3,id3,D3,iD3,G3)

LL=list(L1,L2,L3)
rm(L1,L2,L3)

for (c in c(1:50)) {gc()}

mu=10
delta=0.4
tol=0.1

y=as.matrix(variables[,1]==1)
res=graph_comb_shin(LL,y,mu,delta,tol)

save.image("shin.Rdata")
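
The three near-identical blocks in the script each build a normalized Laplacian $D^{-1/2}(D-G)D^{-1/2}$ from an adjacency matrix. As a sketch, they could be folded into one helper. The name `normalized_laplacian` is hypothetical and is not part of graph_comb_shin.R; the code assumes a symmetric adjacency matrix with non-negative weights and uses base R only.

```r
# Hypothetical helper, not part of graph_comb_shin.R.
# Assumes G is a symmetric adjacency matrix with non-negative weights.
normalized_laplacian <- function(G) {
  d <- rowSums(G)
  # 1/sqrt(degree), with isolated nodes (degree 0) mapped to 0
  inv_sqrt_d <- ifelse(d > 0, 1 / sqrt(d), 0)
  # Scaling entry (i, j) by inv_sqrt_d[i] * inv_sqrt_d[j] is the same as
  # computing iD %*% (D - G) %*% iD with iD = diag(inv_sqrt_d)
  (diag(d, nrow = length(d)) - G) * outer(inv_sqrt_d, inv_sqrt_d)
}

# Toy check on a 3-node path graph 1 - 2 - 3
G <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
L <- normalized_laplacian(G)
ev <- eigen(L, symmetric = TRUE)$values   # eigenvalues 2, 1, 0 up to rounding
```

With such a helper, and assuming G1, G2 and G3 are the adjacency matrices loaded from mat.RData, the list of Laplacians could be built in one line as `LL <- lapply(list(G1, G2, G3), normalized_laplacian)`.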

Download the software.
Check out the paper.