What is Data Integration

Data integration is the process of consolidating data from different sources. Data integration is often a prerequisite to other processes including analysis, reporting, and forecasting.

Data integration vs. application integration vs. ETL

Data integration is often confused with application integration and ETL. While they are closely related, there are important distinctions between the three terms.

Data integration is a process where data from many sources goes to a single centralized location, which is often a data warehouse. The end location needs to be flexible enough to handle many different kinds of data at potentially large volumes. Data integration is ideal for powering analytical use cases.

Application integration involves moving data back and forth between individual applications to keep them in sync. Typically, each individual application has a particular way it emits and accepts data, and this data moves in smaller volumes. Application integration is ideal for powering operational use cases. One example is ensuring that a customer support system has the same customer records as the accounting system.

ETL stands for extract, transform, and load. This refers to the process of extracting data from source systems, transforming it into a different structure or format, and loading it into a destination. Data integration and application integration are two types of ETL.
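
The three ETL steps can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the source rows, the `users` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the destination.

```python
import sqlite3

# Hypothetical raw rows, standing in for an API export or a CSV dump.
SOURCE_ROWS = [
    {"email": " Ada@Example.com ", "signup_date": "2019-03-01"},
    {"email": "bob@example.com", "signup_date": "2019-03-02"},
]


def extract():
    """Extract: read raw rows from the source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: trim and lowercase emails so they match across systems."""
    return [(row["email"].strip().lower(), row["signup_date"]) for row in rows]


def load(conn, rows):
    """Load: write the transformed rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, signup_date TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO users VALUES (?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run extract, transform, and load in order, then return what was stored."""
    load(conn, transform(extract()))
    return conn.execute(
        "SELECT email, signup_date FROM users ORDER BY email"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

Keeping each step in its own function makes it possible to swap the source or the destination without touching the other two steps.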

Want to learn about setting the data strategy for your organization?

Sign up for a free 30-day course to learn what you need in order to succeed with data. We've worked with more than 800 companies of all sizes and helped them build their data infrastructure, run analytics, and make data-driven decisions. Learn how the data landscape has changed and what that means for your company.

Data integration example

Let's take the example of a company called See Food, Inc. (SFI). SFI's product is a mobile app where users can take pictures of different items and identify whether the item in the picture is, or is not, a hot dog. SFI uses many different applications to run its business.

Each of those applications has a silo of information about SFI's operations. For SFI to get a 360-degree view of the business, all of that data needs to be combined in one place. That process is data integration.

Data integration ROI

Getting a 360-degree view sounds nice, but before undertaking any data integration project, it's important to understand what the return on investment will be. Your use case will vary, but here's an example of the value data integration can bring.

Suppose SFI is considering increasing its advertising budget, but it's not sure whether it should spend more on Facebook or Google. It could ask which channel has the lower cost of acquisition, but that ignores whether the two channels bring in different kinds of users. Some additional questions the company might want to ask are:

Do users from Facebook post more photos of hot dogs?

Do users from Google file more customer support tickets?

Which users are more likely to refer friends?

These questions can be combined and further segmented by individual campaign and by variations in ad creative. They can only be answered when the data is integrated.
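
As a rough sketch of what answering such questions looks like once the data sits in one place, the example below joins three hypothetical tables by user. The table names, columns, and sample rows are invented for illustration, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3


def build_warehouse():
    """Create a stand-in warehouse holding data from three separate tools."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE signups (user_id INTEGER, channel TEXT);      -- from ad platforms
        CREATE TABLE photos (user_id INTEGER, is_hot_dog INTEGER); -- from the app
        CREATE TABLE tickets (user_id INTEGER);                    -- from the support tool
        INSERT INTO signups VALUES (1, 'facebook'), (2, 'facebook'), (3, 'google');
        INSERT INTO photos VALUES (1, 1), (1, 1), (2, 0), (3, 1);
        INSERT INTO tickets VALUES (3), (3), (2);
        """
    )
    return conn


def per_channel_stats(conn):
    """Return (channel, users, hot dog photos, support tickets) per channel."""
    query = """
        SELECT s.channel,
               COUNT(*) AS users,
               COALESCE(SUM(p.n), 0) AS hot_dog_photos,
               COALESCE(SUM(t.n), 0) AS tickets
        FROM signups s
        -- Pre-aggregate per user so the two joins do not multiply rows.
        LEFT JOIN (SELECT user_id, COUNT(*) AS n FROM photos
                   WHERE is_hot_dog = 1 GROUP BY user_id) p
               ON p.user_id = s.user_id
        LEFT JOIN (SELECT user_id, COUNT(*) AS n FROM tickets
                   GROUP BY user_id) t
               ON t.user_id = s.user_id
        GROUP BY s.channel
        ORDER BY s.channel
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in per_channel_stats(build_warehouse()):
        print(row)
```

None of these tables alone can relate acquisition channel to photo or ticket counts. The query only becomes possible once all three share a common user identifier in one database.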

In-house data integration

Note - This section is for companies that are comfortable writing code and using the command line. If that's not for you, skip to the section on simple data integration with Stitch.

If you have software engineers on your team, you may want to initiate an in-house data integration project. Software engineers who specialize in building the systems that transmit data throughout a company are often called data engineers.

While you can start your data integration project from scratch, it's often helpful to leverage an open source project to save time. One option is the Singer open source ETL project. Singer uses reusable components for pulling from data sources (taps) and sending to destinations (targets). That means if you need one additional source that hasn't been built before, you only need to build a single tap, and it will automatically work with all of the existing targets.
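
To give a feel for what a tap involves, here is a minimal sketch. It assumes the message format described in the Singer specification, in which a tap writes one JSON message per line to standard output using the types SCHEMA, RECORD, and STATE. The stream name, schema, sample rows, and the shape of the state value are all hypothetical, and the sketch uses only the standard library rather than any Singer helper package.

```python
import json
import sys

STREAM = "users"  # hypothetical stream name

# Hypothetical JSON Schema describing each record in the stream.
SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
    },
}


def fetch_rows():
    """Stand-in for a call to the real data source."""
    return [{"id": 1, "name": "Chris"}, {"id": 2, "name": "Mike"}]


def build_messages():
    """Build the messages a tap would emit: schema first, then records, then state."""
    rows = fetch_rows()
    messages = [
        {"type": "SCHEMA", "stream": STREAM, "schema": SCHEMA, "key_properties": ["id"]}
    ]
    for row in rows:
        messages.append({"type": "RECORD", "stream": STREAM, "record": row})
    # The state records how far this run got so a later run could resume from there.
    messages.append({"type": "STATE", "value": {STREAM: rows[-1]["id"]}})
    return messages


def main(out=sys.stdout):
    """Write each message as a single line of JSON."""
    for message in build_messages():
        out.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    main()
```

Because the tap only writes lines of JSON to standard output, its output can be piped into any target that reads the same format. That shared format is what lets a single new tap work with every target.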

Simple data integration with Stitch

Stitch is a cloud data integration service. Stitch connects to today’s most popular business tools – including Salesforce, Facebook Ads, and more than 60 others – and automatically replicates the raw data to a data warehouse. There's no code to write, and it automatically keeps your data up to date.

Stitch was built to solve data integration. With just a few clicks, Stitch will extract your data from wherever it lives and get it ready to be analyzed, understood, and acted upon.