Monthly Archives: October 2012

One Paragraph Summary: Always explore your data visually. Whatever specific hypothesis you have when you go out to collect data is likely to be worse than any of the hypotheses you’ll form after looking at just a few simple visualizations of that data. The most effective hypothesis testing framework in existence is the test of...

R-bloggers provides a great service, aggregating a universe of blogs which contribute aRticles on R and using R (marked using an "R"-tag). This is a nice community service, creating a one-stop shop for readers to learn about R, but also a great idea for a...

I gave a short talk today about ggplot. This is what I presented. Additional resources are at the bottom of this post.
ggplot is an R package for data exploration and producing plots. It produces fantastic-looking graphics and allows one to slice and dice one’s data in many different ways.
Comparing with base...

As we noted last month, the new Themes feature in ggplot2 helps you customize the design of R charts to your liking. Now, R user Jeffrey Arnold has built on this feature to create standardized themes that make R graphics look like those from major publications and other software systems. You can use his ggthemes package to make your...

BSMAP is an aligner for bisulfite sequencing reads. It outputs aligned reads as well as methylation ratios per base (via the methratio.py script). The methylation ratios can be read into R via the methylKit package and regular methylKit analysis can ...

Since F-Secure was #spiffy enough to provide us with GeoIP data for mapping the scope of the ZeroAccess botnet, I thought that some aspiring infosec data scientists might want to see how to use something besides Google Maps & Google Earth to view the data. If you look at the CSV file, it’s formatted as

Henry John-Alder told me once that in a marathon, twice as many runners cross the line at 2h 59m as at 3h 00m. He pointed out that this anomaly in the distribution of finishers per minute (roughly normal shaped) is due …

This Gist is mostly for my future self, as a reminder of how to find distances between each row in two different matrices. To create a distance matrix from a single matrix, the function dist(), from the stats package, is sufficient.
There are times, ho...
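
The excerpt above is cut off before it shows the Gist's own code, so the following is only a minimal base-R sketch of one common way to get Euclidean distances between every row of one matrix and every row of another. The function name cross_dist and the example matrices A and B are assumptions made for illustration, not taken from the original post.

```r
# Hypothetical helper (not from the original Gist): Euclidean distance
# between each row of A and each row of B, returned as an
# nrow(A) x nrow(B) matrix.
cross_dist <- function(A, B) {
  stopifnot(is.matrix(A), is.matrix(B), ncol(A) == ncol(B))
  # Uses the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * (a . b)
  sq <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  # Clamp tiny negative values caused by floating-point rounding
  sqrt(pmax(sq, 0))
}

# Made-up example data: 2 points in A, 3 points in B
A <- matrix(c(0, 0,
              3, 4), ncol = 2, byrow = TRUE)
B <- matrix(c(0, 0,
              6, 8,
              3, 0), ncol = 2, byrow = TRUE)

D <- cross_dist(A, B)
print(D)
```

Entry D[i, j] holds the distance from row i of A to row j of B. By contrast, dist() applied to a single matrix returns only the distances among that matrix's own rows.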

Open your sources.list file in gedit:

    sudo gedit /etc/apt/sources.list

and add the following line:

    deb http://cran.cnr.berkeley.edu/bin/linux/ubuntu/ precise/

Note that you don't have to use that mirror. You may use any mirror from the list here: http://cran.r-project.org/mirrors.html

Add the secure APT key to your system with one command:

    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9

Update your sources and upgrade your installation:

    sudo apt-get update...

I have previously described and back-tested the Permanent Portfolio strategy based on the series of posts at the GestaltU blog. Today I want to show how we can improve the Permanent Portfolio strategy performance using the following simple tools:
Volatility targeting
Risk allocation
Tactical market filter
First, let’s load the historical prices for the stocks (SPY), gold (GLD),...
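
The excerpt stops before any code appears, so the following is only a rough base-R sketch of what volatility targeting generally means: scale the exposure so that the position's annualized volatility matches a chosen target. The function name vol_target_weight, the 10% target, the 252-day annualization, the leverage cap, and the simulated returns are all assumptions for illustration. None of them come from the original post or its back-test.

```r
# Hypothetical helper (not the post's code): the portfolio weight that
# would bring realized volatility to the target, capped at max_leverage.
vol_target_weight <- function(returns, target_vol = 0.10,
                              periods_per_year = 252, max_leverage = 1) {
  # Annualize the sample standard deviation of periodic returns
  realized_vol <- sd(returns) * sqrt(periods_per_year)
  min(target_vol / realized_vol, max_leverage)
}

# Simulated daily returns stand in for real price data (an assumption)
set.seed(1)
daily_returns <- rnorm(60, mean = 0, sd = 0.02)

w <- vol_target_weight(daily_returns)
print(w)
```

When realized volatility is above the target, the weight falls below 1 and the remainder would sit in cash. The cap keeps the sketch from taking on leverage when volatility is low.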