Overview

As its name implies, data is core to the business of Automatic Data Processing (ADP). The 60-year-old Fortune 500 provider of human capital management (HCM) solutions is responsible for getting one in six Americans paid today. This puts a tremendous amount of data in ADP’s hands, and payroll is only one small piece of the work it does.

ADP is now putting that data to use and generating a new revenue stream by aggregating information across its 600,000 clients into a disruptive product offering, powered by data science on Apache Hadoop, that helps clients prevent employee churn, ensure salary equality, and maximize human resources. The product is DataCloud.

Impact

“Whenever we speak to HR professionals at our clients, I ask the same question... What is the absolute most important part of your job? Why are you here? The answer is always the same: to find and keep the best talent possible.”

Marc Rind, Vice President of Product Development and Chief Data Scientist at ADP

For employees it identifies as at risk of leaving, DataCloud:

Provides information about the at-risk employee’s job type, location, duration in current role, and management organization

Delivers tools and information to share with the employee’s manager so they can address the situation

In a pilot at one account, employee turnover was 17 percent. Using DataCloud, ADP identified the top one percent of at-risk employees and learned that turnover within that group was actually 50 percent. With that top one percent removed from the analysis, average turnover dropped to nine percent. DataCloud helped the client focus on a small population of at-risk employees where it could make a meaningful impact and drastically improve the company’s overall churn. Without this insight, the client would have spread its retention efforts across the whole employee base, spending more time and resources on a less targeted approach with a lower overall impact.
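The kind of segmentation described in the pilot can be pictured with a small sketch. This is not ADP’s model or code: the `Employee` record, the `risk_score` field, and the sample data are all hypothetical, and the numbers are synthetic rather than the pilot’s figures. It only shows how turnover can be compared between a highest-risk slice and everyone else.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    risk_score: float  # hypothetical model output; higher means more likely to leave
    left: bool         # whether the employee actually left during the period


def turnover(group):
    """Fraction of a group that left; 0.0 for an empty group."""
    return sum(e.left for e in group) / len(group) if group else 0.0


def segment_turnover(employees, top_fraction):
    """Compare turnover overall, in the highest-risk slice, and in the rest."""
    ranked = sorted(employees, key=lambda e: e.risk_score, reverse=True)
    cutoff = max(1, round(len(ranked) * top_fraction))
    at_risk, rest = ranked[:cutoff], ranked[cutoff:]
    return {
        "overall": turnover(ranked),
        "at_risk": turnover(at_risk),
        "rest": turnover(rest),
    }


# Synthetic data for illustration only: 10 employees, 3 of whom left.
staff = [
    Employee(0.9, True), Employee(0.8, True), Employee(0.4, False),
    Employee(0.3, True), Employee(0.3, False), Employee(0.2, False),
    Employee(0.2, False), Employee(0.1, False), Employee(0.1, False),
    Employee(0.1, False),
]
result = segment_turnover(staff, top_fraction=0.2)
```

When turnover in the `at_risk` slice is much higher than in `rest`, retention effort aimed at that slice reaches most of the people who would otherwise leave, which is the targeting benefit the pilot describes.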

Reducing employee churn has far-reaching business impacts. The cost of losing one employee is more than the cost of hiring a replacement. Recruiting and interviewing for that person’s replacement is costly. Productivity is lost while the new hire gets up to speed. The risk of others on the team leaving increases when they’re forced to pick up the slack. It’s a ripple effect.

DataCloud not only allows clients to look at their own data on human capital and employee retention, but also helps them understand how they compare to similar organizations and where and how they can make improvements. They can answer questions like:

Are we paying our workers appropriately based on industry averages for similar jobs, regardless of race, age, and gender?

What are bonus expectations?

What should our overtime rate be?

The value DataCloud offers is evidenced by the massive growth ADP has seen throughout its client base, driving greater success for ADP via this new revenue channel.

Business Drivers

DataCloud stemmed from a strategic shift at ADP to move from primarily processing transactions to also providing insights based on its greatest asset: data.

When it considered building this product, ADP reached out to clients to:

Gauge their interest in gaining insights based on aggregated and anonymized benchmarks developed from the data spanning ADP’s customer base

Give them the opportunity to opt out of participating in such a program

When only seven percent opted out, ADP knew the opportunity was real.

Making the vision a reality presented a technological challenge. The data was spread across data centers and applications, and it needed to be brought together for processing, exploration, and analysis. Doing so wouldn’t be feasible using traditional relational database technology.

Solution

DataCloud runs on Cloudera Enterprise, comprising a 200-terabyte (TB) lab and two 400-TB production data centers, each with replication for disaster recovery. Tools including Apache Impala, Apache Spark, and Tableau are used to process data and benchmarks, and to facilitate data exploration and analysis.

The data brought together on the platform includes:

600,000-plus client databases capturing information on 29 million people

Mainframe-based data from the 30 to 35 million pay cycles ADP executes annually, including compensation, time card punches, bonuses, overtime, and salary increases

Oracle-based data from the 15 million HR functions managed by ADP annually, such as benefit deductions and elections, performance scores, and recruiting processes

Data from 15 other ADP departments (such as Marketing, Sales, Implementations, and Service), which use the platform as their enterprise data hub (EDH) to build their own data products

Client data sets such as point-of-sale transactions and revenues

DataCloud conforms job title and role categorizations across 600,000 companies into a comparable standard from which 500 billion aggregates are created. Those aggregates are used to build the benchmarks that are delivered to clients.
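As a rough illustration of conforming titles to a standard and aggregating across clients, consider the sketch below. Everything in it is an assumption made for the example: the `STANDARD_ROLES` mapping, the `MIN_GROUP_SIZE` threshold, the record layout, and the choice of a median are hypothetical and are not taken from ADP’s actual pipeline, which runs at a very different scale on Hadoop tooling.

```python
from collections import defaultdict
from statistics import median

# Hypothetical mapping from client-specific titles to a shared standard.
STANDARD_ROLES = {
    "sr. software eng": "Software Engineer",
    "software developer ii": "Software Engineer",
    "hr generalist": "HR Specialist",
    "people ops associate": "HR Specialist",
}

MIN_GROUP_SIZE = 3  # assumed threshold that keeps very small groups out of benchmarks


def normalize_title(raw_title):
    """Map a raw title to a standard role, or None if it is unmapped."""
    return STANDARD_ROLES.get(raw_title.strip().lower())


def build_benchmarks(records):
    """Aggregate (client_id, raw_title, salary) rows into per-role medians."""
    by_role = defaultdict(list)
    for _client_id, raw_title, salary in records:
        role = normalize_title(raw_title)
        if role is not None:
            # The client identifier is dropped here, so only pooled values remain.
            by_role[role].append(salary)
    return {
        role: median(salaries)
        for role, salaries in by_role.items()
        if len(salaries) >= MIN_GROUP_SIZE
    }


# Synthetic rows for illustration only.
records = [
    ("a", "Sr. Software Eng", 120000),
    ("b", "Software Developer II", 100000),
    ("c", " software developer ii ", 110000),
    ("a", "HR Generalist", 70000),
    ("b", "People Ops Associate", 65000),
    ("c", "Chief Fun Officer", 90000),
]
benchmarks = build_benchmarks(records)
```

In this sketch the two HR rows are withheld because the group is smaller than the assumed threshold, and the unmapped title is skipped rather than guessed at.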

“The data is drawing everybody together. ... Sometimes I call it ‘the little cluster that can’ because it’s just amazing what goes on in there in a day.”

Jim Haas, Principal Architect, ADP

Why Cloudera

Stability and support: Cloudera packages, integrates, tests, and maintains many Hadoop components as a single stable distribution, which lets ADP’s technical staff focus on what matters to them: building product.

Security: ADP deals with highly sensitive data, so ensuring its data would be safe, anonymized, and secure was critical.

Partnership: With the right blend of technical acumen and business vision, Cloudera provides a strategic partnership, not just a product. Together, ADP and Cloudera built and deployed a platform within a few months that delivered immediate value and supports future use cases.

“Cloudera has been one of the best companies I’ve ever seen in terms of customer support,” said Haas. “Whether talking about architecture or if there are issues with a framework, I get communicated to immediately. ... If I need a fix, they provide it overnight.”