Thursday, March 19, 2015

Introduction to Julia

"Julia is a fresh approach to technical computing," boasts the startup message, flourished with colorful circles hovering above a bubbly ASCII Julia logo. The formatting effort is not wasted: it's an exuberant promise that Julia will make the command line fun again.

Julia was created by four Data Scientists from MIT who began working on it in 2009 and announced it publicly in 2012. The language is beginning to mature at a time when the Data Scientist job title is popping up on resumes as fast as Data Scientist jobs appear. The timing is excellent. R, an implementation of the S language, is the language of choice for today's mathematical programmer. But it feels clunky, like a car from the last century. While Julia may not unseat R in the world of Data Analysis, its creators' plans don't stop there.

If you want to code along with the examples in this article, jump to Getting Started with Julia and choose one of the three options to start coding.

Julia is a general purpose programming language. Its creators have noble goals. They want a language that is fast like C, flexible with cool metaprogramming capabilities like Ruby, capable of parallel and distributed computing like Scala, and able to express true mathematical notation like MATLAB.

Why program in Julia?

1) Julia is Fast

Julia already boasts faster matrix multiplication and sorting than Go and Java in its published micro-benchmarks. It is built on the LLVM compiler framework, which also powers compilers for languages such as C and C++ (via Clang), Rust, and Swift. Julia uses just-in-time (JIT) compilation to machine code, and often achieves C-like performance numbers.

2) Julia is written in Julia

Most of Julia's standard library is written in Julia itself, so contributors can largely work in a single language, which makes it easier for Julia users to become core contributors.

"As a policy, we try to never resort to implementing things in C. This keeps us honest – we have to make Julia fast enough to allow us to do that" -Stefan Karpinski

And, as the language's co-creator Karpinski notes in the comments of the referenced post, writing the language itself in Julia means that when improvements are made to the compiler, both the system and user code get faster.

3) Julia is Powerful

Like most modern programming languages, Julia has an open source implementation. Anyone can work on the language or the documentation. Julia also has extensive metaprogramming support, and its creators credit Lisp as their inspiration:

Like Lisp, Julia represents its own code as a data structure of the language itself.
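To make the "code as data" idea concrete, here is a minimal sketch, assuming a recent Julia 1.x release. The expression and the variable names `ex` and `result` are made up for illustration. It quotes an expression, inspects it as an ordinary data structure, edits it, and only then evaluates it.

```julia
# Quoting with :( ... ) builds an Expr object instead of running the code.
ex = :(1 + 2 * 3)

println(typeof(ex))   # Expr
println(ex.head)      # call
println(ex.args[1])   # + (the function being called, stored as a Symbol)

# Because the code is ordinary data, it can be modified before it runs.
ex.args[1] = :-       # turns 1 + 2 * 3 into 1 - 2 * 3

result = eval(ex)
println(result)       # -5
```

Macros build on the same mechanism: they receive expressions like `ex` and return new ones before the code is compiled.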

a) Optional Typing
Declaring types can help the compiler generate faster code and catch mistakes earlier, but Julia keeps type declarations optional, which frees up programmers who want to write dynamic routines that work on multiple types.
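As a rough illustration (the function names `add` and `add_ints` are hypothetical, and the sketch assumes Julia 1.x), the same routine can be written with or without type annotations:

```julia
# No annotations: works for any argument types that support `+`.
add(a, b) = a + b

# Annotated: only accepts native integers.
add_ints(a::Int, b::Int) = a + b

println(add(1, 2))        # 3
println(add(1.5, 2))      # 3.5
println(add_ints(1, 2))   # 3

# Calling the annotated version with a Float64 finds no matching method.
caught = try
    add_ints(1.5, 2)
    nothing
catch err
    err
end
println(typeof(caught))   # MethodError
```

The unannotated version is not necessarily slower: Julia compiles a specialized version of `add` for each combination of argument types it is actually called with.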

b) Multiple Dispatch

Multiple dispatch allows object-oriented behavior. Each function can have several methods, each designed to operate on a particular combination of parameter types. The appropriate method is dispatched at runtime based on the types of all the arguments, not just the first one.
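A small sketch of the idea, assuming Julia 1.x syntax (the `Animal`, `Dog`, and `Cat` types and the `meet` function are invented for this example):

```julia
abstract type Animal end   # pre-1.0 releases wrote this as `abstract Animal`
struct Dog <: Animal end
struct Cat <: Animal end

# One function, several methods, chosen by the types of BOTH arguments.
meet(a::Dog, b::Dog) = "sniff"
meet(a::Dog, b::Cat) = "chase"
meet(a::Cat, b::Dog) = "hiss"
meet(a::Animal, b::Animal) = "ignore"   # fallback for any other pairing

println(meet(Dog(), Dog()))   # sniff
println(meet(Dog(), Cat()))   # chase
println(meet(Cat(), Dog()))   # hiss
println(meet(Cat(), Cat()))   # ignore
```

In a single-dispatch object-oriented language, the method would be picked from the first argument alone; here swapping the arguments selects a different method.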

c) Data Frames

Data Frames are a fundamental part of any Data Science language. In Julia, they behave similarly to the way they work in R. Speaking of R, the RDatasets package makes many datasets available from R.

To get the subset of rows that have Ozone levels greater than 90, put a comparison on the Ozone column inside the [] brackets. The period before the greater than symbol tells Julia to broadcast the comparison to each member of the Array of Ozone values. Notice that the subset logic between the [] brackets is on the left side of the comma. Filters left of the comma select rows, filters right of the comma select columns, and a bare colon selects all of them.
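The same row and column filtering can be sketched with a plain matrix, which needs no packages. This is a stand-in for a real data frame, assuming Julia 1.x; the sample values and the names `readings`, `ozone`, `mask`, and `high` are made up for illustration.

```julia
# Hypothetical sample data: each row is one reading; columns are Ozone and Temp.
readings = [41  67;
            115 79;
            36  72;
            97  85]

# Right of the comma selects columns: take column 1 (Ozone) for every row.
ozone = readings[:, 1]

# The dot broadcasts `>` over every element, giving one true/false per row.
mask = ozone .> 90
println(count(mask))     # 2

# Left of the comma selects rows: keep rows where the mask is true.
# The bare colon on the right keeps all columns.
high = readings[mask, :]
println(high[:, 1])      # [115, 97]
```

With the DataFrames package the indexing has the same shape, with the column referred to by name instead of by position.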

4) Julia is a general purpose programming language

As impressive as Julia is for Data Science, perhaps the most exciting thing about learning it is the potential to use it often. Packages for web technologies are highlighted well in the Julia Webstack. Hosting a RESTful web service, much like running one with Ruby's Sinatra, takes just a few lines of code.

a) Packages and Deployment

Deployment is somewhat uncharted, but thanks to Docker, deploying Julia can be surprisingly easy. Even so, as the author notes, there is still work to be done creating deployment standards. Deployment best practices will evolve as Julia grows in popularity.

Getting Started with Julia

a) Code in the browser
One of the coolest things you'll see when learning Julia is JuliaBox. You can mix markup and executable code in the same document. If you try JuliaBox, you can upload the Introduction to Julia code used in this article to your own instance.

b) Cloud 9 IDE (https://c9.io/)
Cloud 9 is an excellent IDE. It has some predefined Julia runners and Julia code formatting built in. It's also free to create an account.

c) Install Julia
Julia installs easily on your local machine. I run it on a Chromebook/Linux using Crouton.

2 comments:

Interesting to see Julia is already ruffling feathers: https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/

It's true that the old niche platforms like R weren't really designed for today's multi-core processors and cloud strategies. But Julia is designed to be a general purpose programming language that happens to be very good at technical computing. It's much more than a statistical package. R is more comparable to its statistical peers such as MATLAB, Scilab and Octave.