Vertical Trail and Dataiku: Partners in Data Science

Vertical Trail announces a formal partnership with Dataiku, a collaborative data science software platform enabling self-service analytics. Vertical Trail is leveraging Dataiku to operationalize machine learning and analytics and generate high-value business insights for our clients. We are pleased to announce that we are now officially a Dataiku Certified Partner!

“We are very excited to be stepping up our partnership with Dataiku to offer our clients a complete data science solution,” says Larry Pollastrini, Managing Partner at Vertical Trail. “By integrating our Troodon Analytics Hub with their data science platform, we can rapidly operationalize and scale data science capabilities for our experienced data science clients, while also offering a rapid path of entry for organizations that are just getting into the data science game.”

Maturing to Modern Data Science

Traditionally, analytics was seen as either the process of data exploration, or the expression of data discovery via standard reporting. With the rapid expansion of Machine Learning and other advanced analytics techniques, many organizations find themselves in a morass of analytic models, model versions, interconnections, and data quality problems.

Vertical Trail believes that solving these problems requires maturing the traditional business intelligence (BI) function into a modern Data Science practice. Maturity in this area requires improved processes, skills, and technology. Vertical Trail relies on Dataiku and their Data Science Studio (DSS) to provide the central tooling to support some core requirements, including:

Model Interdependency – As models are layered on other models, managing the dependencies and ensuring the quality of data is critical. Using DSS, we can automatically apply changes to one model and seamlessly propagate those changes through all dependent models, quickly identifying and repairing issues as necessary.
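
To make the idea of propagating a change through dependent models concrete, here is a minimal Python sketch. It does not use the DSS API; the model names and the `rebuild_order` helper are hypothetical and stand in for whatever dependency graph a real project defines. It assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical models; each one maps to the set of models it depends on.
dependencies = {
    "customer_features": set(),
    "churn_score": {"customer_features"},
    "retention_offer": {"churn_score"},
    "revenue_forecast": {"customer_features"},
}


def rebuild_order(dependencies, changed):
    """Return the changed model and everything downstream of it,
    ordered so that each model is rebuilt after its inputs."""
    # Collect every model that depends, directly or indirectly, on `changed`.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for model, upstream in dependencies.items():
            if model not in affected and upstream & affected:
                affected.add(model)
                grew = True
    # Topologically sort the whole graph, then keep only the affected models.
    order = TopologicalSorter(dependencies).static_order()
    return [model for model in order if model in affected]


if __name__ == "__main__":
    print(rebuild_order(dependencies, "churn_score"))
```

In this sketch, a change to `churn_score` triggers a rebuild of `churn_score` and then `retention_offer`, while the unrelated `revenue_forecast` is left alone.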

Model Development Life Cycle (MDLC) – Models are developed and deployed into a complex environment of other models. Managing model quality, data quality, and user acceptance is a key part of the new MDLC. DSS supports the MDLC by leveraging modern source control (Git) for code versioning and rollbacks. Additionally, DSS supports the process of model promotion through development, testing, and production to ensure that changes to production analytics are of the highest quality.

Monitoring and Alerting – Models or analytics in production need a mature monitoring platform. Failure in one area may impact the quality of other models. We rely on the monitoring and alerts built into the Dataiku Automation Nodes to notify us of issues related to any of the analytic models or data flows that we build there. These Nodes can notify us of an issue with data ingestion, data enrichment, or model execution. We rely on traditional tools to monitor key performance indicators for infrastructure components.

Model Democratization – Tools for building complex models are now accessible to a wider user base. We use DSS to provide an intuitive workbench for a variety of clients – from data scientists and data engineers to business analysts – allowing them to experiment and discover new insights. DSS allows users to work with a variety of data sources; using the DSS security model along with Amazon Web Services (AWS) means they can do so without jeopardizing existing production models and data.

Community-Building – One of the less well-known areas of maturity is how builders and consumers come together to collaborate on models, data, and more. DSS provides an online community space within the tool itself, encouraging consumers to communicate with model builders and allowing builders to respond. It even gives users the ability to rate models, allowing the wider community of analytics consumers to more readily discover and evaluate models.

Security and Data Governance – Security and data governance determine how data quality is guaranteed and how data access policies are set and enforced. As data privacy and access control become more and more critical, the ability to implement fine-grained controls that meet specific business needs is a necessity. One successful technique is integrating existing customer frameworks with Kerberos and AWS security roles to provide the necessary controls.

Together, We Ascend.

We have been working with Dataiku for some time to continually innovate and deliver new data-driven solutions to our clients, and we’re thrilled to make this partnership formal. Together with Dataiku, we create advanced analytics solutions for the maturation of our clients’ data science practice.