All the software a geoscientist needs. For free!

All of my research for the past 5 years was done with free software. In this post I describe the free programs that I use every day, and what I use them for. I do not use them simply to conform to stereotypes about cheap Scotsmen. As you will see, I use them because they are portable and very powerful.

Free/Libre Open Source Software (FLOSS) allows you to make as many copies of the programs as you need and distribute them as you please. This makes it portable. Any workflows or methods can be taken to different computers, different institutions or sent to friends in different countries without worries about expensive licences. For example, I use GRASS GIS instead of ArcMap, Python instead of Matlab, Zotero instead of Endnote. I also use some free (gratis) proprietary software such as Google Earth. While philosophically different to FLOSS software, for practical purposes the advantages are the same.

My main reason to run Linux is the command line interface (CLI), which can be used to carry out tasks very quickly and precisely. It has the HUGE advantage that once you know the commands to do what you need, you can write them in a script and repeat the task 1000 times with very little extra effort. This makes it very powerful. It feels like your computer is working for you and most of my workflows now take advantage of this.

Of all the different Linux flavours, I chose Linux Mint 14 XFCE. It is based on the popular Ubuntu distribution so it has a wide range of software available in easily-installed packages and there are lots of helpful tutorials for it online. The latest versions of Ubuntu have a tablet-style interface; I prefer the way that Mint sets things up for the desktop. You could also try Xubuntu or Linux Mint Cinnamon instead as both are the same under the hood. Each comes as a LiveCD, so you can try them out without altering your system.

The names of the Ubuntu software packages for each program are given below so that you can install them easily from the Software Centre or via the command line. Windows and Mac versions exist for most and can be found with a quick Google search.

Maps and Geographic Information Systems

GRASS (grass): Fully-featured and extremely-powerful GIS package with both GUI and command line interfaces. It handles raster and vector data in all formats and is easily scriptable to automate workflows. I use it to create new GIS datasets from raw data e.g. by processing LiDAR point clouds, digitising field maps, image analysis of multispectral remote-sensing data.

Quantum GIS (qgis, qgis-plugin-grass): Easy-to-use GUI-based GIS package. It is ideal for making and printing maps from pre-existing datasets. It also has a nice georeferencing tool and can be used as an interface to GRASS GIS.

GDAL (gdal-bin): A command-line swiss-army-knife for GIS files allowing you to convert formats, change projection, join, crop and alter the resolution of raster files and much more. Includes OGR, which does the same with vector files (e.g. shape files). This is what actually does a lot of work behind-the-scenes in GRASS and QGIS.
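Because GDAL is driven from the command line, the "write it once, repeat it 1000 times" idea is easy to sketch. The Python below builds one gdalwarp command per raster file and prints it for review. The file names, the output folder and the target projection (EPSG:32627, UTM zone 27N) are assumptions made for illustration, and the function name build_warp_commands is hypothetical.

```python
import shlex
from pathlib import PurePosixPath


def build_warp_commands(tif_names, out_dir, target_srs="EPSG:32627"):
    """Return one gdalwarp command (as an argument list) per input raster."""
    commands = []
    for name in tif_names:
        src = PurePosixPath(name)
        # Assumed naming scheme: keep the file name, write it into out_dir.
        dst = PurePosixPath(out_dir) / src.name
        # -t_srs sets the target spatial reference system for the reprojection.
        commands.append(["gdalwarp", "-t_srs", target_srs, str(src), str(dst)])
    return commands


if __name__ == "__main__":
    # Hypothetical input tiles; a real script might list a folder instead.
    tiles = ["raw/tile_001.tif", "raw/tile_002.tif"]
    for cmd in build_warp_commands(tiles, "utm"):
        # Print for review; replace with subprocess.run(cmd, check=True) to execute.
        print(shlex.join(cmd))
```

Printing the commands first is a deliberate choice: you can check them by eye before letting the script loose on a whole directory of data.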

Proj4 (proj-bin, proj-data): Command line tools for reprojecting data points in different map projections (cs2cs). This works behind-the-scenes of GDAL.
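To give a feel for what reprojecting a point involves, here is a minimal sketch of one simple case: converting longitude/latitude in degrees to spherical Mercator metres. The choice of projection is mine, for illustration only; it is not Proj4's code, and cs2cs handles far more projections and datums than this. The function name is hypothetical.

```python
import math

# WGS84 semi-major axis in metres, used as the sphere radius in spherical Mercator.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude (degrees) to spherical Mercator x, y (metres)."""
    # x grows linearly with longitude.
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    # y stretches towards the poles; undefined at exactly +/-90 degrees.
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


if __name__ == "__main__":
    # Assumed example point: roughly Edinburgh.
    print(lonlat_to_web_mercator(-3.19, 55.95))
```

With this formula, a longitude of 180° maps to x of about 20,037,508 m, which is half the circumference of the sphere. For real work, let Proj4 do it.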

Generic Mapping Tools (gmt, gmt-coast-low, gmt-doc): These command-line tools for plotting publication-quality maps of geophysical data are very popular among oceanographers and seismologists. You won’t see an issue of Journal of Geophysical Research that doesn’t contain at least one figure made in GMT.

Google Earth (instructions here): A 3D globe in your computer showing everything from the submarine mountains of the Mid-Atlantic ridge to the car parked in your street. Not FLOSS.

GPS Babel (gpsbabel): Communicate with any handheld GPS unit, and convert formats between gpx, kml, garmin and anything else that you can think of. The Windows version has a graphical user interface.
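To show roughly what a gpx-to-kml conversion involves, here is a minimal sketch using only Python's standard library. It handles GPX 1.1 waypoints only (no tracks or routes), and the sample data and function name are made up for illustration. GPS Babel itself covers vastly more formats and edge cases.

```python
import xml.etree.ElementTree as ET

GPX_NS = "{http://www.topografix.com/GPX/1/1}"
KML_NS = "http://www.opengis.net/kml/2.2"

# Made-up sample data: a single waypoint.
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <wpt lat="63.99" lon="-19.67"><name>Sample site A</name></wpt>
</gpx>"""


def gpx_waypoints_to_kml(gpx_text):
    """Convert the <wpt> elements of a GPX 1.1 string to a KML string."""
    root = ET.fromstring(gpx_text)
    kml = ET.Element("kml", xmlns=KML_NS)
    doc = ET.SubElement(kml, "Document")
    for wpt in root.iter(GPX_NS + "wpt"):
        placemark = ET.SubElement(doc, "Placemark")
        name = wpt.find(GPX_NS + "name")
        if name is not None:
            ET.SubElement(placemark, "name").text = name.text
        point = ET.SubElement(placemark, "Point")
        # GPX stores lat and lon as attributes; KML wants "longitude,latitude".
        ET.SubElement(point, "coordinates").text = f"{wpt.get('lon')},{wpt.get('lat')}"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    print(gpx_waypoints_to_kml(SAMPLE_GPX))
```

The main trap is coordinate order: KML puts longitude first.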

GPS Prune (gpsprune): GUI-based tool for editing GPS point and track data. The best feature is the ability to geotag photos then view them in Google Earth (see video here).

Data Processing and Plotting

Python (python): An open source, cross-platform programming language. It is widely-used by scientists and is extremely versatile because it can be easily extended using addon modules such as these below. Some of the other advantages are described here. Everything that I used to do in Matlab, I now do in Python, safe in the knowledge I can take the scripts with me wherever I go. The easiest way to get Python and most of the following packages onto a Windows machine is by installing Python(x,y).

IPython (ipython): Excellent interactive interface for Python. In particular, the IPython Notebook lets you write Python in your web browser, combining it with text, LaTeX, images, hyperlinks and videos. There are great examples that people have shared on the nbviewer website. It is going to revolutionise teaching students how to code.

Spyder (spyder): A development environment for Python, giving it a Matlab-like appearance and with features such as code-checking, command completion and automatic display of documentation for the current command / object.

Numpy and SciPy (python-numpy, python-numpy-doc, python-scipy, python-netcdf): Scientific and numerical computing modules for Python, allowing it to handle arrays of numbers, and the NetCDF data format.

Matplotlib (python-matplotlib, python-matplotlib-doc): Plotting modules for Python allowing you to make all kinds of publication-quality 2D and 3D figures such as these.

Basemap (python-mpltoolkits.basemap, python-mpltoolkits.basemap-data): Add-on for Matplotlib giving Python similar map-plotting functions to those of GMT (e.g. plotting in different projections, adding coastlines or the Blue Marble image). See some examples here. It also contains the pyproj module which allows easy conversion between coordinate systems. See my post for a quick intro.

R (r-recommended): An open source, cross-platform programming environment, with a strong emphasis on statistics. Also very powerful for geospatial data. [Added in 2012 after overwhelming support in comments below. See them for useful links.]

SQLite (sqlite, sqlite3, sqliteman): This is an open source database format. It can be accessed via the same Structured Query Language used by cutting-edge data servers, but the data are stored in a single, portable file. This allows you to perform cool queries such as getting a list of photos of samples that were collected on a Tuesday, in Scotland, and had ash in them. I am switching to storing sample data here because the data can then be accessed directly by GRASS and by Python.
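The Tuesday/Scotland/ash query can be sketched with the sqlite3 module that ships with Python. The table layout, column names and sample rows below are invented for illustration; they are not my real database schema.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical samples table."""
    conn = sqlite3.connect(":memory:")  # pass a file name for a portable database
    conn.execute(
        """CREATE TABLE samples (
               sample_id TEXT PRIMARY KEY,
               collected TEXT,     -- ISO date, YYYY-MM-DD
               country   TEXT,
               has_ash   INTEGER,  -- 1 = ash present, 0 = none
               photo     TEXT
           )"""
    )
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?, ?, ?, ?)",
        [
            ("S001", "2012-06-05", "Scotland", 1, "s001.jpg"),  # Tuesday
            ("S002", "2012-06-06", "Scotland", 1, "s002.jpg"),  # Wednesday
            ("S003", "2012-06-12", "Iceland", 1, "s003.jpg"),   # Tuesday
            ("S004", "2012-06-12", "Scotland", 0, "s004.jpg"),  # Tuesday, no ash
        ],
    )
    return conn


def tuesday_ash_photos(conn, country="Scotland"):
    """Return photo names for ash-bearing samples collected on a Tuesday."""
    # SQLite's strftime('%w', ...) gives the weekday as '0' (Sunday) to '6'.
    rows = conn.execute(
        "SELECT photo FROM samples "
        "WHERE strftime('%w', collected) = '2' AND country = ? AND has_ash = 1 "
        "ORDER BY sample_id",
        (country,),
    )
    return [photo for (photo,) in rows]


if __name__ == "__main__":
    print(tuesday_ash_photos(make_demo_db()))
```

Storing dates as ISO text (YYYY-MM-DD) is what lets SQLite's date functions work out the weekday for you.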

SQLiteManager (Firefox plugin): A nice viewer that lets you edit and perform queries on SQLite databases.

LibreOffice Calc (libreoffice-calc): An open source spreadsheet program, and a viable substitute for Excel. LibreOffice is a slightly more independent version of Open Office. I don’t use spreadsheets that much, but it seems to do everything that I need it to. Gnumeric (gnumeric) is a quicker, but less featured, spreadsheet program.

Writing Journal Articles

Zotero (Firefox plugin): Reference manager software. It runs in Firefox and lets you add articles to the database directly from the journal website or the results page of a Web of Science query. It has a plugin that lets you put references into Word or Writer documents and can export BibTex files, too. Also, it syncs with the cloud, so your reference library is constant across different computers.

LaTeX (texlive, texlive-latex-extra, texlive-fonts-extra, texlive-humanities + others): LaTeX is an open source typesetting program. It is used to produce beautifully laid-out pdf documents from plain text files containing the text and some simple formatting codes e.g. \section{Introduction}. The best thing is that it does referencing, section numbering, figure captions and tables of contents for you automatically. If you are about to write a thesis, then learning LaTeX will be one of the best things that you ever did. For a graphical user interface, try LyX; on Windows, MiKTeX is a convenient way to install LaTeX itself.

LibreOffice Writer (libreoffice-writer): This is an open source word processing program. This is an ideal substitute for Microsoft Word on all platforms, as it can read and write .doc and .docx files. The most important features for me, comments and track changes, work perfectly. I need these to collaborate on work with my co-authors. It also prints straight to pdf, which is nice.

Conference Presentations

Scribus (scribus): I use this professional quality desktop publishing package to make conference posters. It is very easy to create good-looking layouts, align images and set font-themes, but that just scratches the surface of what it can do. The output is a pdf file that you can print anywhere. Read my quick-start guide here.

Beamer (latex-beamer): Make pdf-format conference slides in LaTeX. It has all the advantages of LaTeX e.g. beautiful results, no fussing about layout, referencing and contents all taken care of. Plus the pdf files don’t get messed up between Mac/Windows/Linux versions like Powerpoint slides can.

LibreOffice Impress (libreoffice-impress): This is an open source Powerpoint substitute. It is definitely the weakest of the LibreOffice family. It can read and write Powerpoint files but sometimes the fonts and layouts come out differently, and it is generally a lot less slick. It does a decent job, though, and I have written a couple of lecture courses with it.

Programming Tools

GVIM (vim-gtk): Geeky text editor. Steep learning curve, but if you love keyboard shortcuts then give it a go. Once you discover macros you’ll be flying. Learn more here. So far, I mainly use it for LaTeX, but have recently found that it can connect to iPython to become a Python IDE.

git (git): Distributed version control for working offline and online.

meld (meld): Compare and merge differences between two text files.

Images, Graphics and Photos

Gimp (gimp): The Gnu Image Manipulation Programme is equivalent to Adobe Photoshop or Corel Photopaint. The interface takes some getting used to, but it is very powerful.

Inkscape (inkscape): Inkscape is a vector graphics package equivalent to Adobe Illustrator or Corel Draw. It’s fast, light, and a joy to use.

Shotwell (shotwell): Photo viewing programme a bit like iPhoto on a Mac, allowing you to view your images using tags, ratings and events. Ideal for organising field photos.

Hugin (hugin): Panorama / photo stitching software. If you have to scan a map in many parts, it’s good for joining them up again.

Dia (dia): Software for drawing flowcharts and other structured diagrams.

Videos and Media

VideoLan Player (vlc): Play video files in almost any format that you can think of.

Openshot (openshot, openshot-doc): Simple video editing.

avconv (libav-tools): Command-line tool to change the size, framerate, format etc. of videos. Good for extracting the soundtrack as an mp3. Great for chopping out clips of sounds or videos. This used to be known as ffmpeg.

Get iPlayer (get-iplayer): Command line tool to download BBC iPlayer programmes to watch offline (only works within the UK).

Computer Administration Tools

Ubuntu Restricted Extras (ubuntu-restricted-extras): By default, Ubuntu only ships with open source software. This package installs commonly-used proprietary tools such as Flash video, Microsoft fonts and MP3 codecs.

Open SSH (openssh-client, openssh-server): Connect securely to your machine across the internet without the fuss of a VPN. Log in with a terminal to see how jobs are getting on, or use a secure FTP program such as WinSCP to copy files.

Rsync (rsync): One-way synchronisation over SSH. I use this to automatically back up my desktop to the department server. It knows which files have changed and only sends the differences, so it runs very quickly.

Unison (unison): Two-way synchronisation between computers over SSH. I use this to sync the files on my netbook with my desktop machine each day.

Baobab (baobab): Nice graphical disk usage program. See which folders are taking up most space.

WINE (wine): Lets you run Windows programs on a Linux machine. Some people use it to play games or other complicated software, but it can be a bit hit-and-miss. I use it to run the simple panorama-making software, Autostitch, which works perfectly.

Miscellaneous

Skype (skype): Free phone calls (with video) over the internet. The “Partners” repository should be enabled in the Software Centre before installation. Not FLOSS.

Adobe Acrobat Reader (instructions below): Evince, the pdf reader that comes as standard with Ubuntu is great for reading pdfs. But to add comments, make corrections, or fill in forms you need the Adobe version. Not FLOSS.

Stellarium (stellarium): See what’s in the night sky above. Still cool despite the invention of the Google Sky Map app.

Hotot (hotot): Twitter client that lets you view your lists in different columns.

Adblock Plus (Firefox plugin): The internet is a much faster and less cluttered place without adverts.

Pocket (Firefox plugin): Save articles from the internet to read later, and have them synchronised with your phone.

Installation script

The following script will install most of the above software onto a freshly-installed Ubuntu 12.10 machine. First ensure that the ‘universe’, ‘multiverse’ and ‘partner’ repositories are enabled in the Software Centre.

What have I missed?

These are the tools that I use in my day-to-day work as an academic geologist. I’m sure that there are plenty more for things like processing seismic data that I have missed. If you know any, please add them in the comments.

68 Comments

You seem to be missing a section for internet tools. What web browser are you using? Blogging software? Twitter client? Et cetera. Unless, of course, the internet is not something you use day-to-day for science.

In response to the questions I use:
– Firefox, because I like a lot of the plugins.
– I type up blog posts in Libre-Office Writer then tweak them in WordPress. I’d love recommendations for blogging software, especially if it sorts out images and captions. It can take ages to do the tweaking by hand.
– For Twitter, I used to use the Echofon plugin for Firefox, but since upgrading to Ubuntu 11.10 I use the built-in client, Gwibber.

Looks nice! Have been thinking about getting my Ubuntu on for a while, so will have a play and see.

Other programs you may like for your list are Mendeley, a dynamite referencing program which kicks Endnote’s bum. As soon as I download a pdf through my browser, Mendeley picks it up, copies it to its own referencing subfolders, and searches for the doi. It usually gets the more modern papers correct first time, but the older ones may need a hand. Simply copying the doi from the paper does it quickly. Mendeley is especially useful if you are a LaTeX user, as you can easily use it to compile a BibTex file as you go. I therefore store all my references in one library file and point my LaTeX document to it. The actions I have just described need setting up, but that is easy to do.

Secondly, I hear (but have yet to play with) that Blender as a piece of freeware renders some beautiful graphics, both stills and animations, which I would use in talks to get the ooo’s and ahhh’s from the audience like a collection of candy floss induced kids on fireworks night. I am told the initial learning curve is very steep and I have yet to undertake it myself so I can’t offer help, but having seen the results, it is definitely something that you’d appreciate.

John, a great review. Python is quite a steep learning curve to eclipse Matlab. In this instance I would suggest that R has now come of age, and with ggplot, is now a viable replacement for most of the work that we ever did in Matlab. It is especially powerful in its use of frames for loosely structured data and I also like the way it can be built into a broader, simpler scripting environment such as python if necessary. This abstracts the computational part of the knowledge discovery from the software engineering part.

Awesome summary John, I’m particularly happy to see the OSGeo.org apps in there 🙂 You have really quite a toolkit in your list – imagine how hard it would be to find even a good set of non-free tools and manage them in a similar fashion! The efficiency of setting up a system with Linux tools especially is no comparison for the download, click, install … one .. at .. a .. time .. approach for Windows/Mac.

Thank you John for your post. With gvSIG CE (www.gvsigce.org) you can also use functionalities of Grass GIS and SAGA. gvSIG CE is a community driven Open Source GIS project based on gvSIG OADE.
Functionalities of GRASS GIS and SAGA can be added to the SEXTANTE Toolbox: http://gvsigce.sourceforge.net/forums/viewtopic.php?f=3&t=547. You will see all the available raster- and vector tools from SEXTANTE, SAGA, GRASS GIS and gvSIG CE which all together are currently 760 algorithms.
Best regards!
Jose

The Zotero Style Repository has files for lots of different journals. Most journals that I use are published by the AGU, Springer or Elsevier, so if you can’t find the specific journal, then you can search for publishing house styles e.g. American Geophysical Union, Springer Author Date, Elsevier Harvard with Titles. You can use these within a word processor.

For LaTeX articles, you can download templates and bibliography files from the Instructions for Authors sections of the publishers’ websites. Follow the links to AGU, Springer and Elsevier.

Excellent post, John. There is not much to add here. I tend to think that OpenOffice/LibreOffice is quite bloated (and who needs a spreadsheet anyway when you can use ASCII files and awk). However, whenever I really want to use a spreadsheet program I tend to use gnumeric. Another invaluable tool is meld, a graphical diff tool which can also talk to subversion repos. Since others also mentioned R, I’d like to point out that you can use R from python.
Cheers
magi

Great summary – I use a lot of these myself, but on Mac OSX; also GNU Octave as a Matlab compatible alternative (runs most .m scripts) – amazingly powerful, but command line only. Having used both, I’d say Octave is an easier transition than moving to R unless you’re doing some heavy duty statistical work. Hugin is a nice panorama stitcher, also good for stitching scanned historical maps, and pdfsam (pdf split and merge) is indispensable. I have a largely unused Ubuntu partition – maybe I’ll try your installation script ……. nice
Best, Stu

I can only reiterate and reinforce what Stuart said:
hugin (http://hugin.sourceforge.net/) is a really great stitching program, for both panoramas and mosaics. It gives you full control OR does each step automatically, which is really cool!

Another vote in favour of Hugin as a panorama-editor and photostitcher. Can be a bit of a resource hog in the windows ports, but just use the native Linux packages. Is a photo-stitcher possible that isn’t a resource-hog? FLOSS! Arrr!

Great compilation.
For those who are happy using Windows Operating System and not sure of trying a different OS, GRASS may not be a cake walk, though I personally bet on it.
Starting with OpenJUMP, gvSIG and Qgis may convince the novice, who later might end up using GRASS as it is THE most versatile FOSS GIS.

Excellent post. About the transition to R, I would like to point out that there is an excellent GUI version available to the Debian family (to which Ubuntu belongs, as well as MEPIS 11 that I use) called RKWard.

In the wiki vein, I’d also recommend installing your own local instance of MediaWiki as a way to store all your notes and research. It might sound a bit over-the-top, but I’ve found it a tremendous way to build and maintain a minimally organized, densely linked cluster of notes. It stores and renders images and equations nicely too. There’s a little overhead though, so it’s not recommended for Linux or HTML noobs (then check out TiddlyWiki).

I like this post.
The one bit of geoscience software I haven’t been able to find as FOSS is 3D viewers (not 2.5D), and resource estimation software in the vein of Datamine/Surpac/Vulcan. Has anyone seen anything like that around?

Nice list, John.
For geosci plotting, our Geosci IT team have packaged openstereo. This is written by the Brazilian group in Python, Matplotlib and numpy and *rocks* despite lack of an instruc. man. (that’s open source for you!).

Great list! I just finished teaching a semester-long “research tools” course for first year graduate students in an Ocean Mapping program. Way too much material to really get depth on the topics, and you list many I didn’t get to. Bash, Python, Emacs, org-mode, qgis, google earth, SQlite, XML, proj, gdal, matplotlib/numpy and parsing binary data. Sadly, I missed some big tools like GMT, MB-System, basemap for matplotlib, using checksums to validate data sets, etc.

Dear John
I am from Iraq and am now continuing my PhD study in Geophysics at Mosul University. I need free software which can draw geological cross-sections and geological fence diagrams for sub-surface lithology.

I’m not sure what the best choice for this would be. Matt Hall made a list of other geological software on Wikipedia (see his comment above).

GRASS GIS can plot 3D data e.g. points, planes, volumes, so you might be able to do what you want with that. If you want software to just ‘draw’ the sections, rather than ‘plotting’ real data, then I would use Inkscape, which is a vector graphics package (see above).

Great box of resources. Thanks for sharing.
Do any of you know freeware to draft stratigraphic columns and logs of boreholes to be compiled in cross sections?
I will be posting some when I find the locations…
Thanks.

Such a great list with useful and free software tools, thanks for sharing! In terms of conversions so users can have routes displayed, I would recommend this online program http://kml2gpx.com/ that works also vice versa, to make gpx to kml format.

Flow by Proquest is a new reference manager and by far my favorite that I’ve come across. I’m someone that spends more time researching ways to be productive than actually being productive, and Flow has been the best reference manager I’ve tried so far.

Great resources for geoscientists. Recently came across an application called FreeMAT, a free application similar to commercial applications such as MATLAB. FreeMAT website http://freemat.sourceforge.net