Lejon.GitHub.io

Welcome to Leif Jonsson's GitHub Pages.

For fun I keep myself busy implementing some cool algorithms. I also have a Blog (who doesn't?)
where I post stuff mostly related to computing. When I get time I'll put up some more info here; for now you'll have to
settle for my work on the very cool visualization algorithm by Laurens van der Maaten and Geoffrey Hinton called
t-SNE. As of 2016-11-03 the implementation supports the Barnes-Hut optimization.

Java t-SNE graph (MNIST large)

t-SNE on 60000 images from MNIST using v2.4.0 of Barnes-Hut t-SNE (execution took 18.3 minutes on a 2013 MacBook Pro).

Tips for visualizing with t-SNE

If you have problems getting any results from t-SNE (typically getting the 't-SNE ball'), there are some tricks you can try. Note that this is not a problem specific to t-SNE, but rather
a standard problem in machine learning.

The first thing you can try is to center and scale your data. This can be done with the MatrixOps.centerAndScale()
method, which subtracts the mean from each element of the matrix and divides the result by the standard deviation. This is
a common 'trick' for getting the data onto the same scale.
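
To make the idea concrete, here is a minimal standalone sketch of centering and scaling. It is not the library's MatrixOps implementation: the class name CenterScaleSketch is hypothetical, and the sketch assumes per-column (per-feature) statistics, the sample standard deviation (n - 1), and a non-empty rectangular matrix. Columns with zero spread are only centered, to avoid dividing by zero.

```java
import java.util.Arrays;

/** Illustrative sketch only; NOT the t-SNE package's MatrixOps code. */
final class CenterScaleSketch {

    /**
     * Returns a new matrix in which every column has mean 0 and, where the
     * column is not constant, sample standard deviation 1.
     * Assumes a non-empty, rectangular matrix (rows = samples, columns = features).
     */
    static double[][] centerAndScale(double[][] matrix) {
        int rows = matrix.length;
        int cols = matrix[0].length;
        double[][] result = new double[rows][cols];

        for (int c = 0; c < cols; c++) {
            // Column mean
            double sum = 0.0;
            for (int r = 0; r < rows; r++) {
                sum += matrix[r][c];
            }
            double mean = sum / rows;

            // Sample standard deviation (divides by n - 1; an assumption of this sketch)
            double squares = 0.0;
            for (int r = 0; r < rows; r++) {
                double d = matrix[r][c] - mean;
                squares += d * d;
            }
            double std = rows > 1 ? Math.sqrt(squares / (rows - 1)) : 0.0;

            // A constant column has std 0: only center it instead of producing NaN
            double divisor = std > 0.0 ? std : 1.0;
            for (int r = 0; r < rows; r++) {
                result[r][c] = (matrix[r][c] - mean) / divisor;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 10}, {3, 20}, {5, 30}};
        // Both columns end up on the same scale: -1, 0, 1
        System.out.println(Arrays.deepToString(centerAndScale(data)));
    }
}
```

In the example the two columns differ by a factor of ten before the transformation and are identical afterwards, which is the effect you want before handing the matrix to t-SNE.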

Another trick you can try, typically when the matrix holds counts and many of them are zero (so-called zero-inflated data),
is to take the log of all the values in the matrix. You have to take care that the zeros are preserved when doing this, since log(0) is
-Infinity and a plain log operation will not do this for you. The t-SNE package has a helper method for this, MatrixOps.log(matrix, true),
where 'true' means: keep zeros as zero rather than -Infinity.
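
As a rough illustration of what 'keeping zeros' means, the following standalone sketch applies the natural log to every non-zero entry and leaves zeros untouched. It is not the package's MatrixOps.log code; the class and method names (ZeroPreservingLogSketch, logKeepZeros) are hypothetical, and the sketch assumes non-negative input such as counts.

```java
import java.util.Arrays;

/** Illustrative sketch only; NOT the t-SNE package's MatrixOps.log code. */
final class ZeroPreservingLogSketch {

    /**
     * Returns a new matrix with the natural log of each entry, except that
     * zeros stay zero instead of becoming -Infinity.
     * Assumes non-negative values (e.g. counts); negative input would give NaN.
     */
    static double[][] logKeepZeros(double[][] matrix) {
        double[][] result = new double[matrix.length][];
        for (int r = 0; r < matrix.length; r++) {
            result[r] = new double[matrix[r].length];
            for (int c = 0; c < matrix[r].length; c++) {
                double v = matrix[r][c];
                // Math.log(0.0) is -Infinity, so zeros are passed through unchanged
                result[r][c] = (v == 0.0) ? 0.0 : Math.log(v);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] counts = {{0, 1, 7}, {0, 0, 120}};
        System.out.println(Arrays.deepToString(logKeepZeros(counts)));
    }
}
```

One thing to keep in mind with this scheme: log(1) is also 0, so after the transformation a count of 1 can no longer be told apart from a count of 0.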