Roger Magoulas

Roger Magoulas is the director of market research at O'Reilly Media. Magoulas runs a team that is building an open source analysis infrastructure and provides analysis services, including technology trend analysis, to business decision-makers at O'Reilly and beyond. In previous incarnations, Magoulas designed and implemented data warehouse projects for organizations ranging from the San Francisco Opera to the Alberta Motor Club.

How generating conversations can become one of the most important data assets for any organization.

At O’Reilly Research, we focus our attention on trends in technology adoption — which tools are adopted and in which industries. In doing so, we uncover interesting cross-disciplinary opportunities and discover what we can learn from innovations in other fields.

We’ve recently learned about the increasing role of data in the fashion industry, so we set out to uncover some of the players who are making disruptive changes using technology and analytics.

At our successful Strata + Hadoop World conference (which also managed to avoid Sandy), a few themes emerged that resonated with my interests and experience as a hands-on data analyst and as a researcher who tracks technology adoption trends. Keep in mind that these themes reflect my personal biases. Others will have their own key takeaways from the conference.

1. In-memory data storage for faster queries and visualization

Interactive or real-time query for large datasets is seen as a key to analyst productivity (real-time as in query times fast enough to keep the user in the flow of analysis, from sub-second to less than a few minutes). The existing large-scale data management schemes aren’t fast enough and reduce analytical effectiveness when users can’t explore the data by quickly iterating through various query schemes. We see companies with large data stores building out their own in-memory tools, e.g., Dremel at Google, Druid at Metamarkets, and Sting at Netflix. We also see new tools, such as Cloudera’s Impala (announced at the conference), Spark from UC Berkeley’s AMPLab, SAP Hana, and Platfora.

We saw this coming a few years ago when analysts we pay attention to started building their own in-memory data store sandboxes, often in key/value data management tools like Redis, when trying to make sense of new, large-scale data stores. I know from my own work that there’s no better way to explore a new or unstructured data set than to be able to quickly run off a series of iterative queries, each informed by the last.

What does winning look like? No enemy has been vanquished, but open source is now mainstream and a new norm.

I heard the comment a few times at the 14th OSCON: The conference has lost its edge. It resonated with my own experience: a shift in demeanor, a more purposeful, optimistic attitude, less itching for a fight. Yes, the conference has lost its edge; it doesn’t need one anymore.

Open source won. It’s not that an enemy has been vanquished or that proprietary software is dead; there’s just not much left to argue about when it comes to adopting open source. After more than a decade of the low-cost, lean startup culture successfully developing on open source tools, it’s clearly a legitimate, mainstream option for technology tools and innovation.

And open source is not just for hackers and startups. A new class of innovative, widely adopted technologies has emerged from the open source culture of collaboration and sharing — turning the old model of replicating proprietary software as open source projects on its head. Think Git, D3, Storm, Node.js, Rails, Mongo, Mesos or Spark.

We see more enterprise and government folks intermingling with the stalwart open source crowd who have been attending OSCON for years. And these large organizations are actively adopting many of the open source technologies we track, e.g., web development frameworks, programming languages, content management, and data management and analysis tools.

Agility, simplicity, and curiosity will define the next generation of apps and devices.

The speakers at the recent Webstock conference in New Zealand gravitated toward many of the same themes. Taken together, these themes create a framework for building the next generation of services, applications and devices.

How a days-long data process was completed in minutes.

We recently faced the type of big data challenge we expect to become increasingly common: scaling up the performance of a machine learning classifier for a large set of unstructured data. In this post, we explain how a set-oriented approach led to huge performance gains.

Will internal constituencies bias how publishers value print book and ebook business models? Roger Magoulas examines that question and looks at the complementary relationship between print and electronic forms.

Featured Video

The growing role of software architects: “Architecture has become much more interesting now because it’s become more encompassing,” says Neal Ford, software architect and meme wrangler at ThoughtWorks.