Best Big Data Books: Our Top 5 Choices

Big data has been a huge part of infrastructure in the past couple of years, but it’s new enough that not many people are fully versed in its intricacies. To help out in that regard, here are some of our favorite recent big data books, any of which can help you become your office’s Hadoop Hero (or other alliterative title!):

1. Hadoop in Practice

By Alex Holmes

Hadoop in Practice makes our list for Big Data because it’s not just a Hadoop manual that explains the ins and outs of Hadoop – it’s more of a guide for someone out in the trenches. The book is less a technical reference manual than a collection of techniques, problems, and solutions – something that might benefit someone who’s at their wit’s end trying to deal with LZO compression or serialization.

The book’s only weakness – if it can be called that – is that you do in fact get what it says on the tin. This isn’t for a beginner in any of the subjects covered: you’re expected to know some programming and a good deal of Hadoop, including how to get it working. This isn’t a learning book in the traditional sense – it’s a volume of how-tos and problem solving, something that can be immensely handy when you need to learn something and get it done as quickly as possible! Definitely worth having for anyone currently immersed in a Big Data environment with Hadoop.

2. MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems

By Donald Miner, Adam Shook

In contrast to our first book, Miner and Shook’s MapReduce Design Patterns is a denser text. It’s designed to help you understand and apply MapReduce design patterns – something handy considering that such material can be tough to find outside of a classroom or scattered technical blogs. Miner and Shook go into more general ideas and theories while they explain the patterns – each pattern is contextualized, and they point out a good number of pitfalls and mistakes to avoid as you model your data architecture.

The book is designed to be a bit more generic as far as languages go, so you won’t get as much immediate practical benefit out of it as you might from some others. Where the book excels, however, is in getting you ready for designing big data models: in that sense, it’s a future investment as opposed to an immediate solution. Definitely worth a buy for anyone looking to improve their knowledge of big data frameworks – even if it’s not necessarily Hadoop!

3. Hadoop: The Definitive Guide

By Tom White

This is still one of the best books on Hadoop in print at the moment – Tom White’s guide is comprehensive in the extreme: it goes all the way from what Hadoop is to in-depth examinations of Hadoop’s core features and functions. It even has a great deal of design philosophy, something that’s often neglected or shoved into a separate book entirely. White shows you not only how something works but also why it works that way, an invaluable teaching method.

We have more in-depth coverage of the book here (http://www.learncomputer.com/hadoop-the-definitive-guide/), but it needs to be on our top 5 list as well: Tom White’s excellent tome is invaluable for anyone looking to get into Hadoop and learn how to apply it to their own big data problem in the office!

4. HBase: The Definitive Guide

By Lars George

HBase is often mentioned in the same breath as Hadoop, being the database that very neatly complements Hadoop’s distributed filesystem. HBase is non-relational, and NoSQL has been on the lips of just about every IT executive in recent years due to scalability and cost-effectiveness. Lars George does a great job of giving details about HBase while also teaching you how to integrate it with MapReduce for massive big data deployment scenarios.

Like Hadoop: The Definitive Guide, HBase: The Definitive Guide pulls no punches: it’s essential reading for anyone looking to set up and deploy Hadoop and HBase in a production environment. The only weakness the book has is that it’s a bit dry, but in fairness it’s tough to present such dense subject material in a lighter fashion. Whether you’re simply in the market for a non-relational database or are looking to implement something as soon as possible, Lars George’s book will do you a great deal of good!

5. Big Data: Principles and Best Practices of Scalable Realtime Data Systems

By Nathan Marz, James Warren

Marz and Warren’s book is quite interesting, not least because Marz was one of the three original engineers behind the BackType search engine, which Twitter later acquired. In “Big Data”, Marz and Warren take a hard look at the practical principles behind designing and implementing scalable real-time data systems. In particular, they do their best to teach a method of design that they call “Lambda Architecture”, a first-principles approach to the scalability problem that offers interesting insights into the way Big Data should be tackled.

Of all the books on this list, Big Data is perhaps the most theoretical – but it could also be the most useful. Marz and Warren don’t get into too many specifics when it comes to database, filesystem, or language, which may not be what some people are looking for or need, but they do get into the principles you should consider when choosing your tools and implementing your system – an invaluable lesson for anyone to learn before they start a big data project.

Conclusion

Big Data is becoming a requirement at many companies: as data sets grow to terabytes or even petabytes in size, sophisticated frameworks are needed to handle and process them effectively. Don’t be left out of the loop when it comes to this rapidly growing part of the IT field – grab some of these books and educate yourself today!
