
Big data workloads have mostly run on bare-metal servers. That could change as containers and microservices gain traction in application development circles.

Both containers and microservices break up monolithic application code into more finely grained pieces. That streamlines development and makes for easier testing, which is one of the keys to more flexible application deployment and code reuse.
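The decomposition the article describes can be sketched in a few lines. This is a generic illustration, not code from any company mentioned here; the function and service names are hypothetical.

```python
# Monolithic style: every step is bundled into one deployable unit, so a
# change to any step means retesting and redeploying the whole thing.
def monolithic_checkout(order):
    subtotal = sum(item["price"] * item["qty"] for item in order["items"])
    tax = round(subtotal * 0.19, 2)  # tax logic baked into the monolith
    return {"total": subtotal, "tax": tax, "status": "priced"}

# Microservices style: each step becomes a small unit with a narrow
# contract, so it can be tested in isolation and reused elsewhere.
def pricing_service(items):
    return sum(item["price"] * item["qty"] for item in items)

def tax_service(subtotal, rate=0.19):
    return round(subtotal * rate, 2)

def checkout(order):
    subtotal = pricing_service(order["items"])
    return {"total": subtotal, "tax": tax_service(subtotal), "status": "priced"}
```

In a real deployment each small function would sit behind its own network endpoint (often in its own container), but the testing and reuse benefits show up even at this scale: `tax_service` can be exercised, versioned and reused without touching checkout logic.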

It is still early for such techniques to be applied to big data, but for new jobs like data streaming, microservices show promise. For a technology manager at a leading European e-commerce company, the microservices approach simplifies development and enables code reuse.

With microservices, "you can very much economize on what you're doing," according to Rupert Steffner, chief platform architect for business intelligence systems at Otto GmbH, a multichannel retailer based in Hamburg, Germany. He goes further: For some types of applications, not using microservices "is stupid. You're building the same functionality over and over again."

The types of applications Steffner is talking about are multiple artificial intelligence (AI) bots that run various real-time analytics jobs on the company's online retail site. Otto uses a combination of microservices, Docker containers and stream processing technologies to power these AI bots.

Containers and microservices, oh my

Cloud computing has been one of the drivers edging Hadoop, Spark and other big data technologies toward virtualization, containers and microservices. There is still much infrastructure to build out, but companies are working on technologies to ease the evolution.

"Hadoop was largely run on bare metal, but it runs also on virtual machines; for example, on the Amazon cloud and Azure cloud and via OpenStack. Now it is moving to containers," said Tom Phelan, co-founder and chief architect at BlueData Software Inc., maker of a platform that automatically spawns Hadoop or Spark clusters.

"It used to be that performance of Hadoop clusters on bare metal was better, but that is changing," he said. Containers need to gain maturity, he acknowledged, adding that Hadoop, as it was originally designed, is not a microservices-style architecture. Santa Clara, Calif.-based BlueData recently updated its software to improve container support, rolling out automated Kerberos setups for Hadoop clusters and Linux privileged access management tools.

Agility and streaming are other drivers of microservices interest, according to a manager at Hadoop distribution vendor MapR Technologies Inc. Jack Norris, senior vice president of data and applications at MapR, said customers building bots and the like need to adapt quickly to data and machine learning models.


That is especially true in applications that include what he described as "event-driven" architectures. Such architectures increasingly include data streaming components.

Norris said that, as Hadoop and Spark application flows become more complex, they become harder to update. But, he continued, microservices narrowly focused on events in the data pipeline can bring more flexibility to such developments. This is a change from the original Hadoop development style.
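A microservice "narrowly focused on events" typically registers a handler per event type, so one handler can be updated or replaced without touching the rest of the pipeline. Below is a minimal, hedged sketch of that dispatch pattern; the event names and handlers are illustrative, not drawn from MapR's products.

```python
# Minimal event-driven dispatch: each handler owns exactly one event type.
HANDLERS = {}

def handles(event_type):
    """Register a function as the handler for one event type."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register

@handles("page_view")
def count_view(event, state):
    state["views"] = state.get("views", 0) + 1

@handles("purchase")
def record_sale(event, state):
    state["revenue"] = state.get("revenue", 0.0) + event["amount"]

def run_pipeline(events):
    """Feed a stream of events through whichever handlers are registered."""
    state = {}
    for event in events:
        handler = HANDLERS.get(event["type"])
        if handler:  # unrecognized event types simply pass through
            handler(event, state)
    return state
```

The flexibility Norris describes comes from the narrow contract: adding a new event type means registering one new handler, rather than modifying a monolithic application flow.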

"We see a need to open up to a broader set of applications," Norris said. At the same time, he pledged that MapR will continue to support the existing style of monolithic applications as well.

Last month, MapR sought to further the microservices cause in big data with microservices-specific volumes for application versioning, and dedicated microservices for A/B testing of machine learning models. Also, a new reference architecture is available to guide developers through microservices for converged streaming data and real-time analytics applications, according to Norris.
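A/B testing of machine learning models usually hinges on a small routing layer that splits traffic deterministically, so each user consistently sees the same model variant. The sketch below shows that routing idea only; the stand-in models and the hash-based split are assumptions, not MapR's implementation.

```python
import hashlib

def model_a(features):
    return 0.2  # placeholder score from the incumbent model

def model_b(features):
    return 0.8  # placeholder score from the challenger model

def route(user_id, features, split=0.5):
    """Send a fixed fraction of users to model B, deterministically.

    Hashing the user id (rather than choosing randomly per request)
    keeps each user pinned to one variant for the test's duration.
    """
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 1000
    variant = "A" if bucket < split * 1000 else "B"
    score = model_a(features) if variant == "A" else model_b(features)
    return variant, score
```

Packaging the router and each model version as separate services is what makes side-by-side comparison and rollback cheap: swapping the challenger model means redeploying one small service, not the whole pipeline.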

Each of the AI bots in the Otto data architecture handles a particular task, said Steffner, who spoke at the Strata + Hadoop World 2016 conference in New York last month. For example, one AI bot looks for fraudulent transactions, another does analytical modeling to drive real-time ad placements and a third checks for empty online shopping carts to trigger last-gasp promotional offers before customers leave the site without buying anything.
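The single-purpose bot idea can be illustrated with the cart-abandonment case. The sketch below is a stand-alone approximation, not Otto's code: event shapes, field names and the promotion trigger are all assumptions.

```python
def abandoned_cart_bot(events):
    """Watch a stream of shopping events; emit a promotional offer for any
    user whose session ends with items still in the cart.
    """
    carts = {}   # user_id -> number of items currently in the cart
    offers = []
    for event in events:
        user, kind = event["user"], event["type"]
        if kind == "add_to_cart":
            carts[user] = carts.get(user, 0) + 1
        elif kind == "purchase":
            carts.pop(user, None)  # cart converted; no offer needed
        elif kind == "session_end" and carts.pop(user, 0):
            offers.append({"user": user, "offer": "10% off"})
    return offers
```

Because the bot's scope is one task over one slice of the event stream, it can be containerized, scaled and updated independently of the fraud-detection and ad-placement bots running beside it.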

The company moved to the Docker-based microservices architecture in October 2015, after a more conventional big data platform launched two years earlier didn't fully meet its needs, according to Steffner.

The Docker containers are also a good fit for the bot concept, Steffner said. At the back end, Otto has installed a mix of open source stream processing engines, including Storm, Spark Streaming, Flink and Ignite. But Steffner said Ignite, an in-memory data fabric technology originally developed by GridGain Systems Inc., is handling the bulk of the real-time processing work in the current environment.
