Hortonworks’ IPO filing on Monday shows that Hadoop is still a resource- and risk-intensive business, but also suggests it’s one that public market investors will be willing to back. It might also start the ball rolling for long-anticipated moves in Hadoop.

There has been a spate of product announcements and integrations over the past few weeks signaling that many big data workloads — including, and especially, Hadoop — will soon be ready to run reliably in the cloud.

Some members of the Hadoop community are proposing a new object storage environment for Hadoop, which would let the big data platform store data in a manner similar to popular cloud data stores such as Amazon S3, Microsoft Azure Storage and OpenStack Swift.

Teradata and Cloudera have signed a deal that includes some technology integrations as well as joint sales and marketing efforts. Teradata is also launching its own Hadoop cloud service, which will be available by the year’s end.

MapR has integrated the Apache Drill SQL-on-Hadoop engine into its big data platform. MapR led the development of Drill, which is part of a larger movement within the company toward building a stronger open-source culture.

Hortonworks is working on a new initiative called Stinger.next, which it hopes will remake Apache Hive into a much more-capable SQL engine within the next year and a half. A greatly improved Hive could put other vendors on the defensive explaining why their products are better.

Hortonworks has added Apache Kafka to its Hadoop software platform as a technical preview. Kafka isn’t the most popular tool in the world, but it’s widely used among large web companies, making it a useful add-on for luring customers of that ilk.

Report

New features included in Hadoop’s latest releases go some way towards freeing an increasingly capable data platform from the constraints of its early dependence on one specific technical approach: MapReduce.

Hortonworks CEO Rob Bearden came on the Structure Show this week to discuss many things, including HP’s $50 million investment in the Hadoop startup, the competitive landscape in Hadoop and why the time is now right for big data applications to succeed.

MapR has raised $110 million, $80 million of which is equity financing, in order to fuel its growing Hadoop business in the face of better-known rivals Cloudera and Hortonworks. Like those companies, MapR says it has the winning strategy and aims to be a public company.

Hadoop is a complex technology, so it helps to have friends in high places when you’re trying to develop it and integrate webscale tooling into enterprise environments. For Hortonworks, that friend is Yahoo, with which it continues a deep engineering partnership.

Hortonworks has bought a startup called XA Secure that’s focused on securing the entire Hadoop cluster and enforcing compliance with company-wide policies. The technology should be open sourced as an Apache project by the year’s end.

Cloudera and Intel have entered into an agreement that makes Intel Cloudera’s largest strategic investor and makes Cloudera Intel’s preferred partner for Hadoop distributions. Intel will forgo its own distribution and start selling and engineering for Cloudera’s software.

Hadoop vendor Hortonworks has closed a fourth round of venture capital worth $100 million. It follows up on news last week that competitor Cloudera had closed its own $160 million round. Everyone agrees there’s huge opportunity in Hadoop, but capitalizing on it takes capital.

Hadoop pioneer Cloudera is reportedly raising “at least $200 million” from a group of investors that includes Hadoop competitor Intel. If true, it raises some interesting questions about how the two companies might decide to co-exist.

It didn’t take long for the Hadoop market to become a juggernaut, and it won’t take long for it to undergo some significant technological changes. Cloudera co-founder and chief strategy officer Mike Olson came on the Structure Show podcast to break it down.

Cloudera is touting the speed of its Impala query engine compared to Hive and a leading relational database system, but those aren’t really apples-to-apples comparisons. The real question is how all the SQL-on-Hadoop options stack up against one another.

You can’t talk about data without talking Hadoop. That’s why three CEOs — Rob Bearden of Hortonworks, Tom Reilly of Cloudera and Paul Maritz of Pivotal — will take the stage to talk about where the market is headed and how their companies are helping steer its direction.

Another day, another set of choice words hurled at one Hadoop vendor by another. This time, it’s Hortonworks doing the hurling, claiming that Cloudera’s business model isn’t designed for today’s big data market.

Hadoop 2 has brought along with it new ways of processing data and storing even more data. It suggests the Hadoop community understands the challenge in front of it — to keep innovating so users don’t have to look elsewhere.

Rackspace is now doing Hadoop, Cloudera just announced a handful of partners — Hadoop is everywhere in the cloud these days. Here’s a quick breakdown of which cloud providers are offering which distributions of Hadoop as managed services.

Rackspace has opened its Hortonworks-powered Hadoop service for early access customers, about a year after announcing it would be building the offering. It’s neither the first nor the last managed Hadoop service we’ll see this week.

Hortonworks is working to integrate the Storm stream-processing engine with its Hadoop distro, and hopes to have it ready for enterprise apps within a year’s time. It’s the latest non-batch functionality for Hadoop thanks to YARN, which lets Hadoop run all sorts of processing frameworks.

Cloudera, Hortonworks, MapR and others are battling to lock down market share for commercial Hadoop software, but they’re inherently limited when it comes to innovation. Why not take advantage of the work already done by big Hadoop users like Facebook, Twitter and LinkedIn?

Hortonworks has released a set of icons for illustrating the roles of various Hadoop-ecosystem components in flow charts and other architectural diagrams. Earth-shattering? No. Helpful if you’re stuck trying to build a PowerPoint slide about your big data environment? Probably.

In a candid interview last week, Hortonworks CEO Rob Bearden discussed a variety of topics — including personnel, profitability and a public offering — in some detail. Hortonworks is a Hadoop startup that spun out of Yahoo in June 2011.

Hortonworks lost both a co-founder and a CTO this week: Who’s going to right the ship? GE supports industrial data in the cloud, but only to a point: Hear why the company thinks AWS won’t ever be the place for nuclear power plant data.

In our second cloud-and-data podcast, we hear from Facebook’s top analytics guy about how the company deals with all that data; we discuss the drama at Hortonworks and IBM; oh, and why Infochimps and CSC may be a match made in heaven.

Hortonworks CEO Rob Bearden has confirmed that co-founder and CTO Eric Baldeschwieler has left the company. No word as to why, but his departure is the latest event in a busy few months at Hortonworks.

If the corporate website is any indication, Hortonworks co-founder Eric Baldeschwieler is no longer with the company. The former Hadoop boss at Yahoo was Hortonworks’ first CEO and was most recently CTO.

Incumbent database vendors aren’t exactly struggling to make ends meet, but the smart ones know that resting on their laurels might get them there someday. That’s because open source technologies like NoSQL and Hadoop are coming after their business.

The largest players in the Hadoop market are already raising money at sky-high valuations, employing hundreds of people and, in some cases, looking at nine-figure revenues. If you’re trying to get a sense of whether Hadoop is for real, these details might help.

Cloudera has joined the fray of Hadoop companies trying to turn the big data platform into an engine for exploring data interactively using standard SQL. As the biggest company in the space, its new technology called Impala could go a long way toward changing Hadoop’s image.

Now six years old, the Apache Hadoop platform for storing and processing huge amounts of data, perhaps the catalyst of the current big data movement, appears ready for its closeup. According to the companies leading the Hadoop charge, they’re already beating away customers with a stick.

One year after launching into the Hadoop market with much anticipation, Yahoo spinoff Hortonworks finally has a product available. The company announced version 1.0 of its flagship Hortonworks Data Platform on Tuesday, as well as a High Availability version designed with new partner VMware.

It’s neither easy nor glamorous — data scientists get all the love — but making sure your Hadoop cluster is properly configured and applications are running optimally is necessary, especially as applications move into production. Here are five tools to help you do it.

Market research firm IDC released the first legitimate market forecast for Hadoop on Monday, claiming the ecosystem around the de facto big data platform will sell almost $813 million worth of software by 2016. But Hadoop’s actual economic impact is likely much, much larger.

IBM’s big data platform will support the Cloudera Hadoop distribution, a surprising decision given the reservations the two companies had expressed about each other before. That gives IBM and rival Oracle at least one thing in common: Oracle’s Big Data Appliance runs Cloudera too.

Last summer, Yahoo’s investments in big data brought us Hadoop startup Hortonworks, and now those same investments brought us predictive-marketing startup InsightsOne. The company, founded by the team responsible for building Yahoo’s consumer analytics platform, launched on Thursday with $4.3 million in Series A funding.

Report

One major solution to the big data skills shortage has been the emergence of consulting and outsourcing firms specializing in deploying big data systems that companies need in order to actually derive value from their information. These companies will continue to play a vital role in helping the greater corporate world make sense of the mountains of data they are collecting. However, if the current wave of democratizing big data lives up to its ultimate potential, today’s consultants and outsourcers will have to find a way to keep a few steps ahead of the game in order to remain relevant.

Matt Howard of Norwest Venture Partners predicts that 2012 and 2013 will be Hadoop’s breakout years. Howard gives us insight into the five factors that will accelerate Hadoop’s mainstream adoption over the next 18 months.

Rob Bearden, CEO of Hortonworks, the Hadoop startup that spun out of Yahoo in June 2011, knows a thing or two about making open source software profitable. And he thinks Hadoop has an opportunity to be bigger than the markets for JBoss, SpringSource and MySQL combined.

Data-warehouse veteran Teradata has tightened its embrace of the Hadoop big data platform via a partnership with Hortonworks. The goal is to give customers big data environments that integrate everything from the Teradata Database for advanced SQL analytics to the Hortonworks Data Platform Hadoop distribution.

There has been a series of significant but unannounced management changes at Hortonworks, the Hadoop startup that Yahoo spun off in June. Former COO Rob Bearden is taking over the top executive role, with some thinking his task will be whipping the company into shape.

Hadoop features front and center in the discussion of how to implement a big data strategy, one of the biggest trends in IT. There’s just one problem that keeps cropping up: many people don’t seem to know exactly what it means when somebody says “Hadoop.”

Although the first couple of years of commercial Hadoop attention have been characterized by an attitude of “Hadoop is great, but …”, the tone is changing as Hadoop vendors increase the platform’s palatability with each new iteration. No longer is Hadoop necessarily an epic undertaking rife with pitfalls.

Oracle’s Big Data Appliance is now for sale, featuring Cloudera’s Hadoop distribution and management tools. Regardless of what anybody thinks about Oracle’s strategy, the deal is a coup for Cloudera as it tries to fend off competition from fellow Hadoop startups Hortonworks and MapR.

Money has turned the Hadoop community, once united under the Apache banner and the cuddly stuffed-toy-elephant logo, into something resembling a frat house: Everyone’s under the same roof, but there’s plenty of machismo to go around. If it’s not good business, it is good theater.

Hadoop gets plenty of attention from investors and the IT press, but it’s very possible we haven’t seen anything yet. All the action of the last year has just been setting the stage for what should be a big year.

Targeting customers demanding more-reliable and efficient Hadoop clusters to power their big data efforts, NetApp has partnered with Cloudera to deliver a Hadoop storage system called the NetApp Open Solution for Hadoop. It combines Cloudera’s Hadoop distribution and management software with a NetApp-built RAID architecture.

Hortonworks is getting into software after all with today’s release of the open-source Hortonworks Data Platform. This is a smart move by Hortonworks, which will need more than name recognition to displace Hadoop elder statesman Cloudera as well as mega-vendors EMC, Oracle and IBM.

Cloudera and Hortonworks have been playing a game of one-upmanship over the past few weeks in an attempt to prove whose contributions to the Apache Hadoop project matter most. Reputation matters to both companies, but maybe not as much as fending off encroachments on their turf.

All the speculation about how Yahoo’s Hadoop spinoff company, Hortonworks, will affect Cloudera and other companies providing Hadoop-based products might have been overblown. The company is still figuring out its strategy around offering a Hadoop distribution, which could be good news for competitors such as Cloudera.

The fight for Hadoop dominance is officially on. While Hortonworks is busy answering questions about its product strategy, Cloudera and MapR will demonstrate new versions of their distributions overflowing with bells and whistles. And there are several other competitive products lurking in the background.