How Podium Data Speeds Self-Service Access to Data Lakes

Podium 3.0 is the latest edition of the company's enterprise data-lake management platform that works primarily with Hadoop installations.

It's one thing to coordinate, monitor and secure all of a company's data streams to servers, cloud applications and storage in an enterprise IT system. It's another thing to make that huge lake of raw data become easy and fast to access and process by employees who need the data "yesterday."

There are apps for this, as one might imagine. This article focuses on one of them: Podium Data.

Ostensibly, the 3-year-old company's software can become a Chief Data Officer's best non-human friend very quickly because it makes the connection between line-of-business users and their worlds of big data fast and user-friendly without having to detour through the IT department.

Self-Service Way to Obtain Data for Doing the Job

Lowell, Mass.-based Podium empowers data analysts and other analytically oriented business people to obtain the data they need to do their jobs on a self-service, on-demand basis quickly and efficiently by eliminating hurdles in the data delivery process.

To this end, Podium Data on Jan. 19 released Podium 3.0, the latest edition of its enterprise data-lake management platform that works primarily with Hadoop installations. The release features expanded data preparation and publishing capabilities that enable business users and data analysts to bring new data sources into a secure enterprise lake in less than a week and build and retrieve custom data sets in an hour. It offers a range of improved capabilities over previous versions: self-service publishing, a new intuitive user interface, user-defined datasets, support for Spark and Spark SQL, and operational, security, and governance reporting based on Podium's metadata.

"My colleagues and I started Podium Data three years ago because we saw an opportunity to totally change the way enterprises approached data management," CEO and co-founder Paul Barth told eWEEK. "We decided to deal with their really big problem, which was always agility.

Common Problems: 'Politics and Dirty Data'

"They would come up with an idea about how to use information analytics to improve a business process, whether it was customer service or marketing, or even financials. Then they'd go to IT and [find out] it's 18 months and $10 million and they'd have to buy a new data warehouse appliance, rewire everything, and so on. It turns a business opportunity into a massive project and an investment.

"We also saw that when you made those investments, they atrophied pretty quickly, because business requirements were changing all the time," Barth said.

Barth and his team determined that the real key to success here was not Hadoop itself but deploying the open source, massively parallel technology Podium selected, he said.

"What I saw as the most common problems [in getting a data lake/analytics initiative up and running] were two things: politics and 'dirty data,'" Barth said. "Those were always what got in the way. You were spending all of your time trying to clean it up.

"When we looked at Hadoop, we said we could reverse the way that data management is deployed in an enterprise. We said there's room for a product here that can automate and leverage all the Hadoop economics but make it accessible to mere mortals and not just programmers."

Seeing Major Speed-Up of Data Movement

As a result, Barth contends, Podium's customers have achieved a 25-fold acceleration in delivery of new data to business users (from six months to less than a week) and a 40 percent reduction in data delivery costs by simplifying and speeding up the delivery of data to the business through a secure, enterprise-scale data lake.
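As a rough sanity check on those figures, the arithmetic can be sketched as follows. This is only an illustration: the function names are hypothetical, and it assumes six months is about 26 weeks and treats "less than a week" as a one-week upper bound.

```python
def speedup_factor(weeks_before: float, weeks_after: float) -> float:
    """Return how many times faster delivery is after the change."""
    if weeks_after <= 0:
        raise ValueError("weeks_after must be positive")
    return weeks_before / weeks_after


def remaining_cost_fraction(reduction_percent: float) -> float:
    """Return the fraction of the original cost left after a percentage cut."""
    return 1.0 - reduction_percent / 100.0


# Assumption: six months is roughly 26 weeks; "less than a week" is capped at 1 week.
weeks_before = 26
weeks_after = 1

factor = speedup_factor(weeks_before, weeks_after)
cost_left = remaining_cost_fraction(40)

print(f"Speed-up: about {factor:.0f}x")
print(f"Cost after reduction: {cost_left:.0%} of the original")
```

Under these assumptions the speed-up comes out close to the 25-fold figure Barth cites, and a 40 percent reduction leaves delivery costs at about 60 percent of their previous level.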

--replicate to another Hadoop cluster, including partitions and Hive objects; and

--publish to RDBMS supporting any protocol via the Podium Open Connector.

Podium offers enterprise-grade security and ensures that access to data, as well as data protection and authentication, is in accordance with organizational protocols.

Podium 3.0 integrates with Active Directory and Kerberos, and it provides entity-level authorization via Podium impersonation in combination with Sentry/Ranger policies and/or HDFS access control lists.