As a molecular biology graduate student, I have decided to learn some basic programming and bioinformatics, since everybody says it is crucial. For example, what would you learn if you needed to work with RNA-Seq data and to compare and interpret the results?

I think that you should try to be more specific with your question. Right now it is too broad (you can do an entire degree in bioinformatics). What do you do now? What do you want to be able to do?
– kmm, Feb 23 '14 at 23:11

I know it's too broad, and that's actually why I am asking. Right now I am a master's student in neuroscience, and I will continue with a PhD in the same area. A post-doc from my lab told me that I need to understand the basics of programming and bioinformatics. Let's say I am going to compare RNA-seq results and interpret them.
– golgicik, Feb 23 '14 at 23:13

Then the question might be better suited for academia.SE or biostars. I think you're going to get too many discussion-type answers here.
– kmm, Feb 23 '14 at 23:14

The Coursera classes are a good introduction, and one is starting next week! They focus on Python a lot, I think...
– shigeta, Feb 23 '14 at 23:30

1 Answer

Indeed, the question is broad and, I think, quite hard to answer. I'll give it a try. Edits that improve this answer are very welcome.

Bioinformatics is a big field. Bioinformaticians need basic knowledge of biology, molecular genetics, programming and statistics.

You may find courses on statistics applied to bioinformatics here (R-language) and here (I haven't watched these sources).

You seem to be mostly interested in programming. I think that Python (or Julia) is a very good way to get started. Programming might look a bit scary when you don't really know what it is about, but within a few days you can easily acquire the basics and already solve some pretty neat problems. Many people actually have a lot of fun learning to program, and you'll probably be amazed by the power this tool offers you. I personally really enjoyed learning to program in Python. I did it in a day or two (I was mostly interested in object-oriented programming; you'll learn what that means) with a very good source, but unfortunately that source is not available in English. There are tons of introductory documents, though, so you'll have no difficulty finding a good one. I'd advise you to download Python right away and to look at online courses on Khan Academy or edX (I haven't watched them).
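To give an idea of what a first "neat problem" can look like, here is a minimal sketch in plain Python. The function names `gc_content` and `reverse_complement` and the example sequence are made up for illustration; for real work you would probably reach for a library such as Biopython instead.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0  # avoid dividing by zero on an empty sequence
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    dna = "ATGCGCTA"  # invented example sequence
    print(gc_content(dna))          # 0.5
    print(reverse_complement(dna))  # TAGCGCAT
```

This sketch only handles the four standard bases; ambiguity codes such as N pass through unchanged.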

While Python is a much more powerful programming language than R, I think that, as a biologist, it is very important that you know about R. R is a programming language that is slow (compared to Python, C, Java, …), but it is very useful for statistical analysis and for visualizing data. Many people also use R in bioinformatics (typically for phylogenetic analysis). I think that acquiring basic knowledge of R takes more time than for Python: we tend to use R for its huge number of existing functions, so we have to learn many of these functions before realizing that R can indeed be much more useful than Python for some tasks.

You also ask about the usefulness of programming. Well, it is used in pretty much all areas of biology: analyzing empirical data, computer simulations in population genetics, graph theory, annotating DNA sequences, … I would guess that 98% of biologists have at least some basic knowledge of programming. The main point of programming is that a computer performs calculations much faster than anything you could ever do with your calculator. In bioinformatics, the analysis of DNA sequences typically involves very intensive calculations and requires a lot of computing power. Constructing phylogenetic trees, determining the goodness of fit of evolutionary models, annotating DNA, aligning DNA sequences, analyzing microarrays and many other tasks all require programming.
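Since the question mentions comparing RNA-seq results, here is a rough sketch of the kind of calculation involved: a log2 fold change between two conditions. The gene names and counts are invented toy values, and `log2_fold_change` and `summarize` are hypothetical helpers. A real analysis would also need normalization and a proper statistical test, which dedicated packages such as DESeq2 or edgeR (both in R) take care of, so treat this only as an illustration of the idea.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 of (mean treated + pseudocount) / (mean control + pseudocount).

    The pseudocount is an assumption made here to avoid dividing by zero
    for genes with no reads in one condition.
    """
    mean_control = sum(control) / len(control)
    mean_treated = sum(treated) / len(treated)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


# Invented toy counts: gene -> (control replicates, treated replicates).
toy_counts = {
    "geneA": ([7, 7, 7], [31, 31, 31]),
    "geneB": ([15, 15, 15], [15, 15, 15]),
    "geneC": ([63, 63, 63], [15, 15, 15]),
}


def summarize(counts):
    """Return a dict mapping each gene to its log2 fold change."""
    return {gene: log2_fold_change(c, t) for gene, (c, t) in counts.items()}


if __name__ == "__main__":
    for gene, lfc in summarize(toy_counts).items():
        print(f"{gene}\t{lfc:+.2f}")
```

With these toy numbers, geneA comes out at +2 (four times higher in the treated samples), geneB at 0 (unchanged) and geneC at -2 (four times lower).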

This is the answer I was looking for actually. Thanks a lot!
– golgicik, Feb 24 '14 at 12:19

I'm not at all sure that R is slower than Python. Both are interpreted scripting languages. C and Java are completely different.
– terdon, Feb 24 '14 at 14:00

What you need to remember is that R is more of a domain-specific language (statistics and related fields like visualization), while Python is more of a "generic" programming language like the C superfamily, Java, Ruby, etc. What sets Python apart is its comparative ease of learning and use, the "batteries included" philosophy of the standard library, and the huge number of 3rd-party modules available for everything from bioinformatics (Biopython, etc.) to visualization (matplotlib) to numerical analysis (numpy/scipy) to web frameworks to natural language analysis and more...
– MattDMo, Feb 24 '14 at 16:14

Python also has interfaces to other languages (R, C/C++, Fortran, Java, etc.) so you can do domain-specific work in the best language for that project, and use Python as the "glue" to piece it all together.
– MattDMo, Feb 24 '14 at 16:18