The People Who Support Linux: PhD Student Powers Big Data with Linux

By Libby Clark - December 9, 2013 - 3:36pm

Open source technologies are powering the current trend toward big data, and Michiel Van Herwegen, a PhD student in analytical CRM (customer relationship management) at Ghent University in Belgium, has a front-row seat.

Van Herwegen works in the school's marketing department as a member of the modeling cluster, where he uses Hadoop clusters running on top of Scientific Linux to help companies crunch customer data and make better business decisions. He also teaches the school's new introductory course on big data.

“(The class) fits in our larger trend of shifting attention more and more towards open source technologies,” Van Herwegen said.

At home, both of his machines run Debian: one for backups and maintaining a couple of Git repositories through Gitolite, and the bulkier one for all kinds of tinkering, from modeling with R or Python to figuring out GCC cross-compiling.

“My first Linux experience is now about 10 years behind me when I got hold of a copy of Mandrake and wreaked havoc on the family desktop. At the time it was pure curiosity for the unknown, which did not last long because I did not get X working,” he said.

“But the appeal of finally being in control myself made me come back several years later,” he said. “Debian and - once discovered - its package manager made sure I stayed this time. Certainly in the last 4 to 5 years, Linux has grown on me. Even to the extent that I prefer to get things done via the command line.”

Building an ARM Cluster

Though he finds his marketing work interesting and enjoys applying his knowledge to private sector projects, it does limit his time for research, he said. In his spare time, he has been trying to get a small cluster of ARM devices running a tiny Linux system.

The devices in his ARM cluster should be just capable enough to run as Disco data nodes, he said. He hopes a torture test will show him what to work on next to make the cluster usable as a teaching tool.

“I'm still trying to get my head around cross-compiling, so most of this is still a vague, long-term dream,” he said. “I strongly believe that Linux, MapReduce and ARM are natural bedfellows.”

“Software is more and more developed with distributed workloads in mind. At the hardware side, ARM is a great fit for this,” he said. “With support for the architecture now finding its way into the Linux kernel, I have high hopes for seeing it all come together.”

Van Herwegen recently joined The Linux Foundation as an individual member[1] as one way of contributing to the Linux community.

“Reciprocity is one of the most important principles in life,” Van Herwegen said. “Much has been given by the Linux ecosystem and becoming a member of the Linux Foundation is my way to give back a tiny bit. Although here is hoping that over time, contributing in other ways will become feasible as well.”

Welcome, Michiel!

The Linux Foundation will donate $25 to the World Wildlife Fund for the emperor penguin for each individual member who joins The Linux Foundation through Dec. 10, 2013. Join today! [2]