Archive for March, 2011

Welcome to the March 2011 edition of A Month of Math Software (MMS) where I take you on a brief tour of new things in the world of mathematical software. If you like what you see then you may also be interested in last month’s edition and possibly January’s too. If I’ve missed anything then contact me and let me know.

SAGE, the open-source mathematics package based on Python, has seen a new minor release. Version 4.6.2 was released just after I published February’s edition of MMS so I’ve included it here. For a list of all things new see this thread.

Version 12 of EuMaT (Euler Math Toolbox) has been released but I can’t find a changelog (update: changelog is here). If you’ve never used this software before then it’s a bit MATLAB-like and uses Maxima for symbolic stuff.

Version 4.0 of the free MATLAB toolbox, Chebfun, has been released. Chebfun is a collection of algorithms, and a software system in object-oriented MATLAB, which extends familiar powerful methods of numerical computation involving numbers to continuous or piecewise-continuous functions. Chebfun is a very interesting project as can be seen from the wide array of examples.

Update: In case it isn’t clear, none of the three projects above are my work; they are other people’s work! I just think they are cool!

Want to have a play?
If you’d like to use MATLAB as a way into physical computing then maybe the following resources will help you. I don’t think that they were used in the projects above but if I were to start playing with such things then I would probably begin with one of these.

In my previous blog post I mentioned that I am a member of a team that supports High Throughput Computing (HTC) at The University of Manchester via a 1600+ core ‘condor pool’. In order to make it as easy as possible for our researchers to make use of this resource one of my colleagues, Ian Cottam, created a system called DropAndCompute. In this guest blog post, Ian describes DropAndCompute and how it evolved into the system we use at Manchester today.

The Evolution of “DropAndCompute” by Ian Cottam

DropAndCompute, as used at The University of Manchester’s Faculty of Engineering and Physical Sciences, is an approach to using network (or grid or cloud based) computational resources without having to know the operating system of the resource’s gateway or any command line tools, whether those of the resource itself (Condor in our case) or in general. Most such gateways run a flavour of Unix, often Linux. Many of our users are either unfamiliar with Linux or just prefer a drag-and-drop interface, as I do myself despite using various flavours of Unix since Version 6 in the late 70s.

A simple and uniform drag-and-drop graphical user interface, potentially, to many resource pools.

No use of terminal windows or command lines.

No need to log in to remote hosts or install complicated grid-enabling software locally.

No need for the user to have an account on the remote resources (instead they are accounted by having a shared folder allocated). Of course, nothing stops the users from having accounts should that be preferred.

No need for complicated Virtual Private Networks, IP Tunnelling, connection brokers, or similar, in order to access grid resources on private subnets (provided at least one node is on the public Internet, which is the norm).

Pop-ups notify users of important events (basically, log and output files being created when a job has been accepted, and when the generated result files arrive).

Somewhat increased security as the user only has (indirect) access to a small subset of the computational resource’s commands.

Please do take the time to look at this video as it shows clearly how, for example, Condor can be used via this type of interface.

This version was notable for using the commercial service Dropbox; in fact, my being a Dropbox user inspired the approach and its name. Dropbox is trivial to install on any of the main platforms, on any number of computers owned by a user, and has a free version giving 2GB of synchronised and shared storage. In theory, only the computational resource supplier need pay for a 100GB account with Dropbox, have a local Condor submitting account, and share folders out with users of the free Dropbox-based service.

David De Roure, then at the University of Southampton and now Oxford, reviewed this approach here at blog.openwetware.org/deroure/?p=97, and offers his view as to why it is important in helping scientists start on the ‘ramp’ to using what can be daunting, if powerful, computational facilities.

Version Two

Quickly the approach migrated to our full, faculty-wide Condor Pool and the first modification was made. Now we used separate accounts for each user of the service on our submitting nodes; Dropbox still made this sharing scheme trivial to set up and manage, whilst giving us much better usage accounting information. The first minor problem came when some users needed more –much more in fact– than 2GB of space. This was solved by them purchasing their own 50GB or 100GB accounts from Dropbox.

Problems and objections

However, two more serious problems impacted our Dropbox-based approach. First, the large volume of network traffic across the world to Dropbox’s USA-based servers and then back down to local machines here in Manchester resulted in severe bottlenecks once our Condor Pool had reached the dizzy heights of over a thousand processor cores. We could have ameliorated this with extra resources, such as multiple submit nodes, but the second problem proved to be more of a showstopper.

Since the introduction of DropAndCompute several people, at Manchester and beyond, have been concerned about research data passing through commercial, USA-based servers. In fact, the UK’s National Grid Service (NGS), who have implemented their own flavour of DropAndCompute, did not use Dropbox for this very reason. The US Patriot Act means that US companies must surrender any data they hold if officially requested to do so by Federal Government agencies. One approach to this is to do user-level encryption of the data before it enters the user’s Dropbox folder. I have demonstrated this approach, but it complicates the model and it is not so straightforward to use exactly the same method on all of the popular platforms (Windows, Mac, Linux).

Version Three

To tackle the above issues we implemented a ‘local version’ of DropAndCompute that is not Dropbox-based. It is similar to the NGS approach but, in my opinion, much simpler to set up. The user merely has to mount a folder on the submit node on their local computer(s), and then use the same drag-and-drop approach to get the job initiated, debugged and run (or even killed, when necessary). This solves the above issues, but could be regarded as inferior to the Dropbox-based approach in five ways:

1. The convenience and transparency of ‘offline’ use. That is, Dropbox jobs can be prepared on, say, a laptop with or without net access, and when the laptop next connects the job submission just happens. Ditto for the results coming back.

2. When online and submitting or waiting for results with the local version, the folder windows do not update to give the user an indication of progress.

3. Users must remember to request an email notification for when a job has finished, or poll to check its status.

4. The initial setup is a little harder for the local version compared with using Dropbox.

5. The computation’s result files are not copied back automatically.

So far, only item 5 has been remarked on by some of our users, and it, and the others, could be improved with some programming effort.
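As an illustration of the kind of programming effort items 3 and 5 might need, here is a minimal sketch of a script a user could run on their own machine to poll the mounted folder and copy back any new result files. It is not part of DropAndCompute; the function name copy_new_results, the folder paths and the one-minute polling interval are all assumptions made for the example.

```python
import shutil
import time
from pathlib import Path


def copy_new_results(mounted_dir, local_dir, seen):
    """Copy files that have not been seen before from mounted_dir to local_dir.

    `seen` is a set of file names already copied; it is updated in place.
    Returns the sorted list of file names copied on this pass.
    """
    mounted = Path(mounted_dir)
    local = Path(local_dir)
    local.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(mounted.iterdir()):
        if path.is_file() and path.name not in seen:
            shutil.copy2(path, local / path.name)
            seen.add(path.name)
            copied.append(path.name)
    return copied


def poll_forever(mounted_dir, local_dir, interval_seconds=60):
    """Check the mounted folder once per interval and report new files."""
    seen = set()
    while True:
        for name in copy_new_results(mounted_dir, local_dir, seen):
            print(f"New result file copied back: {name}")
        time.sleep(interval_seconds)


if __name__ == "__main__":
    # Hypothetical paths: the mounted submit-node folder and a local folder.
    poll_forever("/Volumes/condor_jobs/myjob", "results/myjob")
```

A real version would also need to avoid copying files that are still being written, for example by waiting until a file's size stops changing between two passes.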

A movie of this version is shown below; it doesn’t have any commentary, but essentially follows the same steps as the Dropbox based video. You will see the network folder’s window having to be refreshed manually –this is necessary on a Mac (but could be scripted); other platforms may be better– and results having to be dragged back from the mounted folder.

I welcome comments on any aspect of this –still evolving– approach to easing the entry ‘cost’ to using distributed computing resources.

Acknowledgments
Our Condor Pool is supported by three colleagues besides myself: Mark Whidby, Mike Croucher and Chris Paul. Mark, inter alia, maintains the current version of DropAndCompute that can operate locally or via Dropbox. Thanks also to Mike for letting me be a guest on Walking Randomly.

Part of my job at the University of Manchester is to help support the use of High Throughput Computing (HTC) services. I am part of a team that works within Manchester’s Faculty of Engineering and Physical Sciences (Physics, Chemistry, Maths, Engineering, Computer Science, Earth Sciences) but we don’t just work with ‘our’ schools; we also collaborate with teams from all over the University along with the central Research Support team.

For example, say you are a University of Manchester researcher and have written a Monte Carlo simulation in MATLAB where a typical run takes 5 hours to complete. In order to get good results you might want to run this simulation 1000 times, which is going to take your desktop machine quite a while: 5000 hours, or about 7 months in fact! You can either wait for 7 months or you can contact us for assistance. We’ll then do the following for you:

Give you access to our Condor pool (currently peaking at 1600 processor cores, more expected in the near future)

Assist you in modifying your code and writing script wrappers to make best use of those 1600 cores.

If appropriate, put you in touch with colleagues who support traditional HPC (High Performance Computing) supercomputers, GPUs (Graphics Processing Units) and similar technology.

Once we’ve done our bit, you throw your code at our Condor pool and get your results in less than an evening! Obviously we are not confined to MATLAB; we’ve also assisted users of Python, Mathematica, C, FORTRAN, Amber, Morphy and many more. Our job is to help researchers do research more quickly!
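The arithmetic behind those figures can be sketched as follows. This is only a back-of-the-envelope estimate: it assumes each run uses a single core, that every core in the pool is free at the same time, and that queueing and file transfer cost nothing, none of which will be exactly true in practice. The function name estimate_hours is just for illustration.

```python
import math


def estimate_hours(runs, hours_per_run, cores):
    """Return (serial_hours, parallel_hours) for independent runs.

    Serial: one machine does the runs back to back.
    Parallel: runs go out in 'waves' of at most `cores` runs at a time.
    """
    serial_hours = runs * hours_per_run
    waves = math.ceil(runs / cores)
    parallel_hours = waves * hours_per_run
    return serial_hours, parallel_hours


if __name__ == "__main__":
    serial, parallel = estimate_hours(runs=1000, hours_per_run=5, cores=1600)
    days = serial / 24
    months = days / 30  # treating a month as 30 days
    print(f"Desktop: {serial} hours, about {days:.0f} days or {months:.1f} months")
    print(f"Condor pool (ideal case): {parallel} hours")
```

With 1000 runs and 1600 cores everything fits into a single wave, so the ideal-case time is simply the length of one run. If only a few hundred cores happened to be free, the same function shows the job taking several waves instead.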

For a mathy science geek with a penchant for mathematical software it really doesn’t get much better than this! I get to play with high end hardware and the latest software, I get to learn new areas of science and mathematics and I get to work with world-leading experts in a multitude of fields. More importantly, I get to make a real difference.

Yep, this is one part of my job that I really love. Expect to see more articles on High Throughput Computing and technologies such as Condor on WalkingRandomly in the very near future.

I have been a huge fan of the Wolfram Demonstrations project ever since it was launched and have even contributed a few simple demonstrations myself. The project contains thousands of fully interactive mathematical demonstrations that anyone can play with using Wolfram’s free player software and it’s just got a whole lot better.

Now, you can interact with these demonstrations right from the web-browser!

Take the image below, for example. If you don’t have a copy of Wolfram’s new CDF (Computable Document Format) player installed then you’ll just see a static image. Install the player (or Mathematica 8 and the browser plug-in), however, and this image will turn into a fully interactive example.

At the moment the browser plug-in only works on Windows and Macintosh but hopefully a Linux version will be on the way soon. Stay tuned to WalkingRandomly for in-depth tutorials on how to go from Mathematica code to fully interactive mathematics in your browser. The results can be incorporated into the Wolfram Demonstrations project or embedded in your own website like I’ve done here.

Apple make a big deal out of the fact that their app stores for iPhone and iPad contain thousands upon thousands of apps (or applications for relative oldies such as myself). Some of them are free of charge, many of them cost money but I got to wondering how many of them were open source.

When I say ‘open source’ here I mean ‘The source code is available’. If there is a recognised license attached to the source code (such as GPL or BSD) then all the better. So, what do we have?

One of the things you’ll notice about iOS open source apps is that they often cost money, and sometimes quite a lot, which is in stark contrast to what you may be used to. For example, Battle for Wesnoth can be had for no money at all on platforms such as Linux and Windows but the iPad version costs $5.99 at the time of writing. The more serious SCI-15C Scientific calculator costs $19.99 right now, which is rather steep for any iPhone app, let alone an open source one.

Charging money for open source software may upset some people but doing so is usually not against the terms and conditions of the underlying license. The Free Software Foundation (inventors of the GPL, one of the most popular forms of open source license) has the following to say on the matter (original source)

“Free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer.”

Personally, I am happy to pay a few dollars for the iPad version of an open-source app if the developer has done a good job of the port. What does surprise me, however, is that it seems like no one has taken the source-code of these apps, recompiled them and then released free-of-charge versions on the app store. This wouldn’t be against the license conditions of licenses such as the GPL so why hasn’t it been done? I wouldn’t do it because I feel that it would be unfair to the developer of the iOS version but I would be surprised if everyone felt this way.

What’s next?

There are many open source applications that I’d love to see ported to iPad. Here’s my top three wants: