Meet Kedro, McKinsey’s first open-source software tool

June 6, 2019

QuantumBlack, the advanced analytics firm we acquired in 2015, has now launched Kedro, an open-source tool created specifically for data scientists and engineers. It is a library of code that can be used to create data and machine-learning pipelines. For our non-developer readers, these are the building blocks of an analytics or machine-learning project.

“Kedro can change the way data scientists and engineers work,” explains product manager Yetunde Dada, “making it easier to manage large workflows and ensuring a consistent quality of code throughout a project.”

McKinsey has never before created a publicly available, open-source tool. “It represents a significant shift for the firm,” notes Jeremy Palmer, CEO of QuantumBlack, “as we continue to balance the value of our proprietary assets with opportunities to engage as part of the developer community, and accelerate as well as share our learning.”

The name Kedro, which derives from the Greek word meaning center or core, signifies that this open-source software provides crucial code for ‘productionizing’ advanced analytics projects.

Kedro has two major benefits. First, it allows teams to collaborate more easily by structuring analytics code in a uniform way so that it flows seamlessly through all stages of a project. These stages can include consolidating data sources, cleaning data, creating features, and feeding the data into machine-learning models for explanatory or predictive analytics.
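For developer readers, here is a minimal sketch in plain Python of what a uniformly structured pipeline can look like. It does not use Kedro's actual API: the `Node` class, the `run_pipeline` function, and all dataset names are hypothetical, chosen only to illustrate how named steps can carry data from raw sources through to a model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    """One pipeline step: a function plus the named datasets it reads and writes."""
    func: Callable
    inputs: List[str]
    output: str


def run_pipeline(nodes: List[Node], catalog: Dict[str, object]) -> Dict[str, object]:
    """Run nodes in the order listed, storing each result in the catalog by name."""
    data = dict(catalog)  # copy so the caller's catalog is not mutated
    for node in nodes:
        args = [data[name] for name in node.inputs]
        data[node.output] = node.func(*args)
    return data


# Hypothetical steps mirroring the stages described above.
def consolidate(sales_a, sales_b):
    return sales_a + sales_b


def clean(rows):
    return [r for r in rows if r is not None]


def make_features(rows):
    return [(r, r * r) for r in rows]


def fit_mean_model(features):
    # Stand-in for a real model: the mean of the first feature.
    return sum(f[0] for f in features) / len(features)


pipeline = [
    Node(consolidate, ["sales_a", "sales_b"], "combined"),
    Node(clean, ["combined"], "cleaned"),
    Node(make_features, ["cleaned"], "features"),
    Node(fit_mean_model, ["features"], "model"),
]

results = run_pipeline(pipeline, {"sales_a": [1, None, 3], "sales_b": [5]})
```

Because every step declares its inputs and output by name, any team member can read the list of nodes and see how the project fits together, which is the kind of uniformity the paragraph above describes.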

Kedro also helps deliver code that is ‘production-ready,’ making it easier to integrate into a business process. “Data scientists are trained in mathematics, statistics and modeling—not necessarily in the software engineering principles required to write production code,” explains Yetunde. “Often, converting a pilot project into production code can add weeks to a timeline, a pain point with clients. Now, they can spend less time on the code, and more time focused on applying analytics to solving their clients’ problems.”

“More importantly, the same code can make the transition from a single developer’s laptop to an enterprise-level project using cloud computing,” explains Ivan Danov, Kedro’s technical lead. “And it is agnostic, working across industries, models and data sources.”

Two years in the making, Kedro was the brainchild of two QuantumBlack engineers, Nikolaos Tsaousis and Aris Valtazanos, and a QuantumBlack alumnus, Peteris Erins, who created it to manage their numerous workstreams. Kedro started as a prototype library and was quickly being adopted by different teams when they brought it to QuantumBlack Labs, a technical innovation group led by Michele Battelli.


The Kedro team over two years

“Client teams can rotate into our lab and have the resources to convert a one-off piece of software or database [such as Kedro] into a viable product that can be used across industries, and that will be continually improved,” explains Michele. “It is a powerful way of innovating; our tech teams can move faster, more efficiently, and make a lasting contribution.”

McKinsey has used Kedro on more than 50 projects to date. According to Nikolaos, clients especially like its pipeline visualization. He explains that Kedro makes conversations much easier, as clients can immediately see the different transformation stages and the types of models involved, and can trace outputs all the way back to the raw data source.
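The idea of tracing an output back to its raw data can be sketched in a few lines of Python. This is an illustration only, not Kedro's visualization feature or API: the `LINEAGE` map, the `raw_sources` function, and the dataset names are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical lineage map: each dataset name -> the datasets it was built from.
LINEAGE: Dict[str, List[str]] = {
    "combined": ["sales_a", "sales_b"],
    "cleaned": ["combined"],
    "features": ["cleaned"],
    "model": ["features"],
}


def raw_sources(dataset: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Walk upstream from a dataset and return the raw inputs it depends on."""
    parents = lineage.get(dataset)
    if not parents:
        return {dataset}  # no recorded parents: treat it as a raw source
    found: Set[str] = set()
    for parent in parents:
        found |= raw_sources(parent, lineage)
    return found


sources = raw_sources("model", LINEAGE)
```

Here `sources` contains the two raw inputs that the hypothetical model ultimately depends on. A visual tool would present the same dependency information as a diagram rather than a set of names.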

“Kedro began as a proprietary program, but when a project was over, clients couldn’t access the tool any more. We had created a technical debt,” Nikolaos said. “By converting Kedro into an open-source tool, clients can use it after we leave a project. It is one way we are giving back.”

“There is a lot of work ahead, but our hope and vision is that Kedro should help advance the standard for how data and modelling pipelines are built around the world, while enabling continuous and accelerated learning. There are huge opportunities for organizations to improve their performance and decision-making based on data, but capturing these opportunities at scale, and safely, is extremely complex and requires intense collaboration,” says Jeremy. “We’re keenly interested to see what the community does with this and how we can work and learn faster together.”

Learn more about Kedro on GitHub, where you can engage with our team and watch for new features in the coming months.
