Which programming language would you recommend to learn about data structures and algorithms in?

Considering the following:

Personal experience

Language features (pointers, OO, etc)

Suitability for learning DS & A concepts

I ask because there are some books out there that are programming language-agnostic (written from a Mathematical perspective, and use pseudocode). If I learn from one of these, I would like to choose a programming language to code and run the algorithms in.

Then, there are other books which introduce DS & A concepts with examples written in a particular programming language - and I would like to code these algorithms as well - thus, to a certain extent, the language picks the book too.

Either way, I have to pick a language, and I would prefer to stick to one throughout. Setting aside personal language preferences, which one is best for this purpose?

There's no possible way to answer this question except in the specific, and that needs more information.
–
David Thornley May 12 '10 at 20:53

@David Thornley : I do understand it is a bit open-ended, but it has gotten a lot of really great answers!
–
bguiz May 17 '10 at 8:07

There is much ado about the energy (it used to be: time) efficiency of algorithms & data structures: if and when you want to measure this, look for systems where meaningful numbers are reproducibly (and easily) obtained.
–
greybeard Dec 24 '14 at 17:07

14 Answers
14

The answer to this question depends on exactly what you want to learn.

Python and Ruby

Python and Ruby are often suggested because they are high level and their syntax is quite readable. However, both languages already have abstractions for the common data structures. There's nothing stopping you implementing your own versions as a learning exercise, but you may find that you're building high level data structures on top of other high level data structures, which isn't necessarily useful.

Also, Ruby and Python are dynamically typed languages. This can be good but it can also be confusing for the beginner and it can be harder (initially) to catch errors since they typically won't be apparent until runtime.

C

C is at the other extreme. It's good if you want to learn really low-level details like how the memory is managed but memory management is suddenly an important consideration, as in correct usage of malloc()/free(). That can be distracting. Also, C isn't object-oriented. That's not a bad thing but simply worth noting.

C++

C++ has been mentioned. As I said in the comment, I think this is a terrible choice. C++ is hideously complicated even in simple usage and has a ridiculous amount of "gotchas". Also, C++ has no common base class. This is important because data structures like hash tables rely on there being a common base class. You could implement a version for a nominal base class but it's a little less useful.

Java

Java has also been mentioned. Many people like to hate Java and it's true that the language is extremely verbose and lacking in some of the more modern language features (eg closures) but none of that really matters. Java is statically typed and has garbage collection. This means the Java compiler will catch many errors that dynamically typed languages won't (until runtime) and there's no dealing with segmentation faults (which isn't to say you can't leak memory in Java; obviously you can). I think Java is a fine choice.

C#

C# the language is like a more modern version of Java. Like Java it is a managed (garbage collected) intermediate compiled language that runs on a virtual machine. Every other language listed here apart from C/C++ also runs on a virtual machine, but Python, Ruby, etc. are interpreted directly rather than compiled to bytecode.

C# has the same pros and cons as Java, basically.

Haskell (etc)

Lastly you have the functional languages: Haskell, OCaml, Scheme/Lisp, Clojure, F#, etc. These think about all problems in a very different way and are worth learning at some point but again it comes down to what you want to learn: functional programming or data structures? I'd stick to learning one thing at a time rather than confusing the issue. If you do learn a functional language at some point (which I would recommend), Haskell is a safe and fine choice.

My Advice

Pick Java or C#. Both have free, excellent IDEs (Eclipse, Netbeans and IntelliJ Community Edition for Java, Visual Studio Express for C#) that make writing and running code a snap. If you use no native data structure more complex than an array and any object you yourself write you'll learn basically the same thing as you would in C/C++ but without having to actually manage memory.

Let me explain: an extensible hash table needs to be resized if sufficient elements are added. In any implementation that will mean doing something like doubling the size of the backing data structure (typically an array) and copying in the existing elements. The implementation is basically the same in all imperative languages, but in C/C++ you have to deal with segmentation faults when you don't allocate or deallocate something correctly.
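To make the resizing step concrete, here is a minimal sketch in Python. It treats a fixed-length list as the "array", uses linear probing, and doubles the backing array before the table gets more than 75% full. The class name HashTable, the method names, and the 75% threshold are assumptions chosen for illustration, and deletion is left out to keep it short.

```python
class HashTable:
    """Toy hash table: linear probing over a fixed-size backing list."""

    def __init__(self, capacity=8):
        self._slots = [None] * capacity   # stands in for a plain array
        self._count = 0

    def __len__(self):
        return self._count

    def capacity(self):
        return len(self._slots)

    def _find(self, slots, key):
        # Probe from the hashed position until we hit the key or an empty slot.
        i = hash(key) % len(slots)
        while slots[i] is not None and slots[i][0] != key:
            i = (i + 1) % len(slots)
        return i

    def put(self, key, value):
        # Grow first if one more entry would push the load factor above 0.75.
        if (self._count + 1) * 4 > len(self._slots) * 3:
            self._resize()
        i = self._find(self._slots, key)
        if self._slots[i] is None:
            self._count += 1
        self._slots[i] = (key, value)

    def get(self, key, default=None):
        entry = self._slots[self._find(self._slots, key)]
        return default if entry is None else entry[1]

    def _resize(self):
        # Double the backing array and re-insert every existing entry,
        # because each entry's position depends on the array length.
        old = self._slots
        self._slots = [None] * (2 * len(old))
        for entry in old:
            if entry is not None:
                self._slots[self._find(self._slots, entry[0])] = entry


table = HashTable()
for i in range(100):
    table.put(f"key{i}", i)
print(len(table), table.capacity())
```

The same doubling-and-copying logic would look almost identical in Java or C#; in C the only extra work would be the explicit allocation and freeing of the two arrays.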

Python or Ruby (it doesn't really matter which) would be my next choice (and very close to the other two) just because the dynamic typing could be problematic at first.

I would NOT use Java and C# mainly because of the strict OO-orientation which is just unnecessary for this. Furthermore: who cares about writing Generic code when the point is learning the data structure? In my mind, either you pick a scripting language (Python) and focus on the high-level or you pick a low-level language C/C++ and try to see how it's implemented at machine level. Stopping in-between does not seem worth it.
–
Matthieu M. Apr 19 '10 at 7:30

In my opinion, C would be the best language to learn data structures and algorithms because it will force you to write your own. It will force you to understand pointers, dynamic memory allocation, and the implementations behind the popular data structures like linked lists, hash tables, etc. Many of which are things you can take for granted in higher level languages (Java, C#, etc.).

I think downvoting this because you don't like Java (which seems to be happening) is irresponsible. You may not like Java but it's simple enough to use as a learning language. So +1 from me.
–
cletus Apr 17 '10 at 6:45

7

+1. Not my choice, but really, it's not terrible. The vote score is like you suggested COBOL.
–
Rob Lachlan Apr 17 '10 at 6:46

2

+1 because for a beginner: 1. Exactly, you don't have to worry about memory alloc/dealloc (for small programs at least). You could instead focus on what you got to learn for the moment. 2. Yes, no sneaky pointers, or pointers to pointers. Don't get me wrong, I love C++. 3. The collections in Java are probably the most refined set of datastructures I've seen. They should really be in the dictionary under datastructures. :)
–
crunchdog Apr 17 '10 at 6:58

3

If you don't understand how to do your own resource management, you haven't learned much about data structures.
–
Alan Apr 17 '10 at 16:12

Python is great. Easy to read, fully featured. If you are going to work with pseudocode, Python will look pretty familiar.
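As a small illustration of how close Python can stay to textbook pseudocode, here is insertion sort written roughly line by line the way many algorithms books present it. The function name insertion_sort and the choice to sort a copy rather than the input are assumptions made for this sketch.

```python
def insertion_sort(items):
    """Return a sorted copy of items using insertion sort."""
    a = list(items)                      # work on a copy, leave the input alone
    for j in range(1, len(a)):
        key = a[j]
        # Shift every element of the sorted prefix a[0..j-1] that is
        # larger than key one position to the right.
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                   # drop key into the gap
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))
```

The main difference from typical pseudocode is the 0-based indexing.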

Python is already the algorithms language of choice at UC Irvine, where it is described like so:
"Python represents an algorithm-oriented language that has been sorely needed in education. The advantages of Python include its textbook-like syntax and interactivity that encourages experimentation."

Python also works in a beginner-friendly way with Gato, a graph-making tool. Learning algorithms and data structures is one topic that benefits from being made visual, something that Gato makes easy to do (without learning any complex graphing libraries).

I have a personal dislike for Python because of its syntactic use of indentation. I find it more difficult to find silly bugs due to misalignment than it would be with syntax based on curly braces and such, supported with additional nroff-like alignment.
–
Michael Dec 22 '13 at 20:55

If the purpose is only to learn about data structures and algorithms, I would say JavaScript. You can run your code in a browser. You have very flexible object handling and you can focus entirely on the data structures and algorithms and not memory management, language constructs or other stuff that will take the focus away from the actual computer science you are learning.

The bonus is also that you can easily visualize various data structures by using the browser to render graphs and trees using DOM and Canvas.

CS courses over the years tend to change the language in which the subject is taught, simply because newer and better implementations of languages that ease learning have arrived, which makes it easier to focus on the actual problem.

I would suggest Ada. It has features for data constructs not found in other languages, such as range checks: type Day is range 1 .. 31; It also has very strict compile-time and run-time checking (unless you choose to turn it off), making it easier to find bugs in your implementation.
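For readers who have not seen a range-constrained type, the idea can be loosely imitated in Python, though only as a run-time check. The class name RangedInt and the error message format below are made up for this sketch; it is an approximation of the concept, not a description of how Ada implements it.

```python
class RangedInt:
    """Callable that accepts only integers within [low, high]."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __call__(self, value):
        # Reject anything that is not an int inside the declared range.
        if not isinstance(value, int) or not (self.low <= value <= self.high):
            raise ValueError(f"{value!r} is outside {self.low}..{self.high}")
        return value


Day = RangedInt(1, 31)    # rough analogue of: type Day is range 1 .. 31;

today = Day(15)           # accepted
try:
    Day(32)               # rejected, but only when this line actually runs
except ValueError as err:
    message = str(err)
    print(message)
```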

Einstein once said "Make it as simple as possible, but not simpler." This phrase was chosen by Prof. Niklaus Wirth as epigraph to the original Oberon language report. And it's true for Oberon's descendants mentioned above.

When it comes to the perfection of a programming language I like to quote Antoine de Saint-Exupéry: "A designer knows he has arrived at perfection not when there is no longer anything to add, but when there is no longer anything to take away." Wirth, even if he has not achieved this, is on the right path. In the "Wirth programming languages line" (Algol -> Pascal -> Modula-2 -> Oberon -> Oberon-2) each subsequent language is simpler and at the same time more powerful than the previous one.

Powerful but simple languages following the principle of least surprise. Strong static typing, easy object-oriented facilities, garbage collection. The feature list is not big but it's enough to be productive and not to complicate things especially on the initial stages.

When you want to learn algorithms and data structures, you mean it. But if your language is "powerful" (has a lot of features, like C++, C#, Java, Python, ...) you will waste a lot of time learning the language, not algorithms and data structures. You will not see the forest for the trees. =) You can think of trees as syntax elements (and any other features) and of the forest as the important concept (any algorithm, data structure, maybe OOP, whatever). The more features (trees) you have in your language, the more complicated becomes the task of stepping back and understanding the concepts (seeing the forest).

But if the language is really powerful (has a small set of well-proven features) the language itself goes to second place. There are not so many trees, so you can take a couple of steps back and ... Well, I think that's enough analogies. =)

Also, many books on algorithms and data structures use Algol/Pascal-like pseudocode, and it will be easy to convert the examples into these languages. And you can directly use examples from Wirth's "Algorithms and Data Structures" book. Oberon edition (2004), PDF (1.2 MB).

If you want to take the path of least resistance, then Python. It'll have the minimum amount of unnecessary boiler plate and such like.

Ideally, I'd want to learn algorithms in C, so you can learn what's going on at the memory level; I'd also want to learn algorithms in a functional language, so you can see how similar algorithms work with persistent data structures.
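To give a flavour of what "persistent" means here, below is a minimal sketch of an immutable stack in Python. Pushing never modifies an existing stack; it creates one new cell that points at the old stack, so older versions stay valid and are shared. The names Cons, push, pop and to_list are assumptions for this sketch, and None stands for the empty stack.

```python
from typing import Any, NamedTuple, Optional


class Cons(NamedTuple):
    """One immutable cell of a singly linked stack."""
    head: Any
    tail: Optional["Cons"]


def push(stack, value):
    # O(1): allocate one cell, reuse the whole existing stack as its tail.
    return Cons(value, stack)


def pop(stack):
    # Returns the top value and the (unchanged) remainder of the stack.
    return stack.head, stack.tail


def to_list(stack):
    items = []
    while stack is not None:
        items.append(stack.head)
        stack = stack.tail
    return items


base = push(push(None, 1), 2)      # stack holding 2, 1
with_three = push(base, 3)         # new version; base is untouched
with_four = push(base, 4)          # another version sharing the same base

print(to_list(base), to_list(with_three), to_list(with_four))
print(with_three.tail is with_four.tail)   # both share the same cells
```

In a language with garbage collection the shared cells are reclaimed automatically once no version refers to them, which is a large part of why this style is comfortable in functional languages.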

Knuth's famous books contain large amounts of assembler code (for an invented platform). This is recommended if you want to be super hardcore. Personally, though, I worked in C when I was working through my algorithms class (disclosure: this was only a couple of years ago). I sometimes work on some problems in Knuth, but I don't know if I'd go with MMIX entirely as my language of choice for learning algorithms. It's a bit overkill, I feel.

EDIT:
It also depends on what you're familiar with. If you want to start working through an algorithms text right now, and you've never worked much with C, then Python is far and away the correct answer. You want the language not to be a huge hurdle to overcome, because you want to enjoy this. I know I did.

Last point: at least when I was learning algorithms, I spent a hell of a lot of time working on paper. I think that's important -- I mean you want to learn about asymptotics, etc. Spending all of your time implementing algorithms in whatever language is not the thing to do.

@Rob Lachlan : Is there a python-based DS & A book you would recommend?
–
bguiz Apr 17 '10 at 7:31

@bguiz: most of the decent books on algorithms that I like are language agnostic -- Cormen et al., Kleinberg and Tardos. I really wouldn't pick one on the basis of language.
–
Rob Lachlan Apr 17 '10 at 15:53

"If your only tool is a hammer then all of your problems will tend to look like nails"

Learn at least a few languages.

Also, your choice depends on your purpose.

Hobby? Job in Windows world? Linux/UNIX family?

Type of applications: business versus scientific; hardware drivers or applications?

Desktop applications or web applications?

I have several suggestions for you.

(a) definitely learn some J (free from jsoftware.com; successor to APL; both J and APL are creations of Ken Iverson, Turing winner ... Turing award is like Nobel prize in computing).

(b) if you are in Windows world, start with c# because so much in .NET runs on c#. If you can, get a copy of Tom Archer's "Inside c#" from Microsoft Press. You can get a free c# development system by downloading Microsoft's express version.

(c) learn to use TDD/BDD ... regardless of language, first you write a small test called a unit test; next you write the production code to pass the unit test; one small step at a time ... it's not just the language that you use, it's also the methodology.

(d) learn some assembler language ... assembler is low level, almost machine language, it will give you a good understanding of what is going on behind the scenes.

(e) outside of the Windows world, I'd recommend c++.
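As a concrete picture of suggestion (c), here is a small test-first sketch in Python using the standard unittest module. The idea is that the two tests in StackTest would be written first, and the Stack class is then the least code that makes them pass. The class and test names are assumptions chosen for this example, and the tests are run through an explicit runner so the result can be inspected.

```python
import unittest


class Stack:
    """Production code, written only after the tests below existed."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()


class StackTest(unittest.TestCase):
    def test_pop_returns_most_recently_pushed_item(self):
        stack = Stack()
        stack.push("a")
        stack.push("b")
        self.assertEqual(stack.pop(), "b")
        self.assertEqual(stack.pop(), "a")

    def test_pop_on_empty_stack_raises(self):
        with self.assertRaises(IndexError):
            Stack().pop()


# Load and run the test case explicitly instead of calling unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StackTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

The same rhythm (one small failing test, then just enough code to pass it) carries over to any data structure you implement while working through a book.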

There is no best language.

If it were only about language, programming would be easier.

Not only do you want to learn algorithms which are very specific, you also want to learn patterns which are more general and can help you in selecting the approach to solving a given problem.

One thing is for certain: you will likely never run out of things to learn if you're going to become a programmer.

@bguiz data structures can be for all intents fully independent of language; this is one reason for learning different languages. You will also encounter subtle differences that can cause frustration and even grief; example, data type naming: bit for SQL Server is bool for c# and Boolean for vb. Data type size varies too; example, int in c# is fixed at 32 bit, whereas in c++ its size and therefore its storage capacity depends on platform. Character sets also impact your data structure size; examples, 7-bit ASCII, 8-bit ASCII, Unicode. Then there's fixed size vs. varying, et cetera.
–
gerryLowry Apr 18 '10 at 15:55

"data structures can be for all intents fully idependent of language". To implement most purely functional data structures in a language that does not provide garbage collection you will basically have to write a garbage collector. That is a serious impediment.
–
Jon Harrop Dec 12 '14 at 21:17

You may appreciate a language with algebraic datatypes and pattern matching such as Standard ML, OCaml, F# or Haskell. For example, here is a function to rebalance a red-black binary search tree written in OCaml/F#:

My first university programming course was in Lisp. Before that I had been writing programs in several languages for 10 years. I thought that the first programming course would be boring, but I was wrong.

Lisp is a very interesting language because it has a very simple syntax. Focus shifts from syntax to functionality. The functional programming style is also an extremely valuable thing to learn. After my Lisp course I found myself writing programs in C++ in a completely new, better way, thanks to the new concepts Lisp had taught me.

Lisp also uses the same representation for code and data, which opens up for interesting algorithm design with code generated on the fly and then executed.
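The "code is data" idea can be approximated in Python by representing expressions as nested lists, much like Lisp's prefix notation. The sketch below builds an expression with ordinary list operations and then evaluates it. The function names evaluate and sum_of_squares_expr and the small operator table are assumptions for this example; real Lisp gets this from the language itself rather than from a hand-written evaluator.

```python
import operator

# Hypothetical operator table for this toy evaluator.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr):
    """Evaluate a nested-list expression such as ["+", 1, ["*", 2, 3]]."""
    if not isinstance(expr, list):
        return expr                        # a number evaluates to itself
    op, first, *rest = expr
    result = evaluate(first)
    for arg in rest:
        result = OPS[op](result, evaluate(arg))
    return result


def sum_of_squares_expr(n):
    """Generate, as plain data, the expression 1*1 + 2*2 + ... + n*n."""
    return ["+"] + [["*", i, i] for i in range(1, n + 1)]


program = sum_of_squares_expr(3)   # built at run time like any other list
print(program)
print(evaluate(program))           # 1 + 4 + 9 = 14
```

Because the program is just a list, an algorithm can inspect, rewrite or extend it before running it.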

I may be wrong, but aren't data structures and algorithms independent of the programming languages?

In the end, data structures are just a way of organizing data; any high level language will support that. Sure, certain languages will have mechanisms implementing basic data structures (such as Collections Framework in Java or C++ STL), but it does not stop you from programming data structure in the programming language of your choice. Moreover, algorithms are written in pseudocode, making them language independent.

I realize it's not really answering your question, but I'm having trouble grasping what you are looking for; learning data structures/algorithms or learning a new language.

@Pran : I know that algorithms are in pseudocode - but pseudocode won't compile. I am a hands-on type learner, so to truly understand the concepts, I would need to code it in a language that can compile and run. Therefore, my question really is what is the best suited language for this, in the sense that each language would have its own pros and cons, making some of them better suited to learning DS&A than others.
–
bguiz Apr 17 '10 at 9:36

@Pran: "I may be wrong, but aren't data structures and algorithms independent of the programming languages?". If the language doesn't provide a GC then you may have to write one.
–
Jon Harrop Dec 12 '14 at 21:18

-1 C++ is a terrible learning language (and arguably you can drop the "learning" qualifier from that statement).
–
cletus Apr 17 '10 at 6:35

3

He need not learn the dark corners of C++ in order to code his algorithms in C++. C++ is perfectly fine.
–
Prasoon Saurav Apr 17 '10 at 6:45

7

so you think it's a good idea to try and code a class for a DS in C++ and get caught up on the differences between a copy constructor vs overriding the equals operator, references and pointers going out of scope, memory leakage from incorrect new/delete usage, etc? All of which are pretty fundamental to C++.
–
cletus Apr 17 '10 at 6:47

Yes, this is just basic stuff and he can learn it in a few days. I just don't see the harm in using C++.
–
Prasoon Saurav Apr 17 '10 at 6:55

5

but the point is you don't HAVE to learn those things in python or java. C++ requires a much larger initial investment, and is in no way more valuable to the OP's requested function
–
mvid Apr 17 '10 at 6:58