Cloud services useful for Stat researchers

This post is not about statistics, but about a few tools that I think many researchers will find convenient. At least some researchers here at CREST use them every day! These tools are part of the cloud computing/service trend.

One thing that anyone working on a computer is afraid of is losing files. It’s probably the #1 nightmare of any PhD student. Many solutions make this nothing but a bad memory, a disaster from ancient times, like the Plagues of Egypt, the Behemoth or Claude François.

A good solution is to synchronize your files with a secure, reliable server that you can count on. For instance, you can presumably count on Google’s servers not to lose your documents (although you cannot count on them not to read them, but that’s a different story). So you can store your files on Google Docs or a similar service. It’s free, but uploading files by hand gets painful if you have to do it every day. An interesting and very convenient alternative is Dropbox. It consists of a little program that synchronizes the files in a given folder on your computer with servers provided by Dropbox Inc, a three-year-old startup. Their free offer lets you store 2 GB, and of course you have to pay if you want more space. The good points are the following:

it works on Windows, Mac OS and Linux, and you don’t need administrator privileges to install it on Windows (therefore you can install it at CREST for instance),

if you have it on all your computers, your working directory gets synchronized seamlessly at startup; if you are used to emailing files to yourself or if you keep sync’ing your files on a USB key, you’ll definitely find it convenient,

you can share a sub-folder with other Dropbox users, which is convenient for a team project,

2 GB is not a lot if you store pictures and music files, but it’s a lot of TeX and program files,

you can make a file “public” and get a URL for it, which is really useful if you want to share a file that is too big to be sent by email.

Overall my (office) life has improved since I started using it, but the big catch here is that your data is stored on a private company’s servers, so you have to trust them at least to some extent. Since I don’t work on sensitive matters I don’t mind, but for others that can obviously be a deal-breaker.
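If trusting a third party is a problem for you, the USB-key routine mentioned above can at least be automated. Here is a minimal sketch in Python of a one-way mirror from a working folder to a backup folder. To be clear, this is only an illustration and not how Dropbox works: the function name mirror and the folder names are made up for the example, nothing is ever deleted from the backup, and it only copies in one direction.

```python
import filecmp
import shutil
from pathlib import Path


def mirror(source, backup):
    """Copy new or changed files from source to backup.

    Returns the list of backup paths that were written.
    Hypothetical helper for illustration; one-way only, never deletes.
    """
    source, backup = Path(source), Path(backup)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = backup / src.relative_to(source)
        # Compare file contents rather than only size and timestamp.
        if dst.exists() and filecmp.cmp(src, dst, shallow=False):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also keeps the modification time
        copied.append(dst)
    return copied
```

You would call it as mirror("thesis", "/media/usbkey/thesis"), where both paths are placeholders for your own folders. Unlike a real sync service, there is no versioning and no conflict handling, so it is a backup of last resort at best.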

Other startups provide equivalent offers, although I haven’t tested them: box.net, SugarSync… I’m surprised that this kind of service is not offered by the main Internet service providers (like Google or Yahoo), but maybe their generally poor reputation when it comes to respecting their customers’ privacy would make it hard for them to offer such a service.

One step further in these cloud offers, and more focused on statistics, is the ability to run stat programs online. Some startups offer this service as well, like Monkey Analytics. On this site you can store Matlab, R and Python programs and run them on their servers. This way they provide both a storage and a computing service. You can then access the results online, even from a smartphone. There is no free plan, only a 30-day free trial. I suppose it’s interesting for travelling statisticians, or for statisticians who need a lot of computing power (though they don’t give much information about their clusters). Then again, you have to trust the private company behind the service…