Data Loading with Relationships

The Data Loader is an easy to use graphical tool that helps you to get your data into Salesforce objects. The Data Loader can also be used to extract data from database objects into any of the destinations mentioned above. You can even use the Data Loader to perform bulk deletions by exporting the ID fields for the data you wish to delete and using that source to specify deletions through the Data Loader.via DeveloperForce

One of the main challenges when working with the Data Loader is having to deal with relationships. Loading in objects that have no relationships is simple. You simply generate the CSV file and Insert those records. However, when these objects have relationships, it can be a bit more difficult. To make it simpler, Salesforce provides an external ID. An external ID can be used in an upsert to properly avoid creating duplicate records and it can be used to properly associate child records with parent records. Let’s take a look at it in action.

Example using Load Ids

In this example, we will look at moving data from one Salesforce org to another. This is incredibly useful when generating a new sandbox, for instance a developer sandbox, and then adding some sample data from production. Now, you can also use this to import data from other systems. This first thing you are going to want to do is make sure both the parent and the child object have a field called Load Id. That field should be Text field with 18 length with the most crucial piece being that the field is marked as an External Id.

In this very simple scenario, I will be loading in two objects: Test Object and Child Object. That means two separate CSVs. The export from production of Test Object looks like:

Id

Name

Load_Id__c

a00i0000005pRiNAAU

T-006

a00i0000005pRjoAAE

T-005

While the Child Object export looks like:

Id

Test_Object__c

Name

Load_Id__c

a01i0000005wYVIAA2

a00i0000005pRiNAAU

CO-0001

a01i0000005wYVNAA2

a00i0000005pRiNAAU

CO-0002

a01i0000005wYVSAA2

a00i0000005pRjoAAE

CO-0003

a01i0000005wYVXAA2

a00i0000005pRjoAAE

CO-0004

There are two important things to notice here. First, the Load_Id__c does not have a value in production. If we were importing data into production from an external system, this field may already be populated but for this example we are just putting production data into the sandbox. Second, I just wanted to point out that CO-0001 and CO-0002 are child objects of T-006 and CO-0003 and CO-0004 are child objects of T-005. Now, let’s take a look at how we would load this data into the Sandbox using the Data Loader.

The first thing we need to do is load in the parent object, or in this case, Test Object. It should be a fairly simple import, with one very important modification. When loading in Test Object, we need to ensure that the Id field is mapped to load in as the Load_Id__c field. This will allow us to upsert on the External Id properly.

For the first object, in this scenario Test_Object__c, select to Insert it. Select Test_Object__c and select the CSV you will be uploading.

The Data Loader will confirm the CSV format and verify the amount of records that are about to be uploaded.

On the next screen, select Create or Edit Map to open up the Mapping Dialog. Here is where it is important to map the Id field to the Load_Id__c field.

In this scenario, the final mapping looks like:

Once you are ready you can select Next to specify where your results should be saved and then click Finish to start the upload process. In this scenario, both objects were uploaded successfully.

We have now successfully loaded in the objects changing the original unique Ids Salesforce assigned the object from production as the Load Id. This is key as it will provide us with an easy way to look those objects up when we load in their respective child objects. Let’s take a look at that in action.

For any child objects, you must select Upsert.

You now need to select Child_Object__c and the CSV with the data.

That CSV will be verified with 4 objects.

Now, differently than your standard Insert, Upsert requires that you select either your Id or External Id fields to use as the mapping between the data that is being loaded in and the data that is currently in the system. In this scenario, it doesn’t really matter since there isn’t any data in the system and we aren’t using this piece of functionality.

The next step is where we select the Id or External Id we want to use to map our relationships. It is important to select Load_Id__c here to ensure that we match our previously mapped field.

The rest is pretty much the same as before. We need to Select or Edit a Map to set up the mapping. Note that the field we map the Test_Object__c with is the Test_Object__r:Load_Id__c and not the Test_Object__c field (which would try to map on the Id)

We can verify our mapping and then proceed.

We quickly verify where we want to save the success/error reports and then proceed to upload the four objects successfully.

We have successfully loaded two parent objects with four child objects in a relatively pain free manner.

Final Thoughts

Now, this may have appeared to be overkill for this simple scenario, and to be honest, I agree. We could have recreated the data manually quicker. However, when the data gets slightly larger and more objects come into play, it can be very useful. I recently had to load in 67 different custom objects, several of which had 4+ lookup relationships on them. On top of that, I loaded in 400,000+ records. Oh, did I mention, I did this all in a single day? Doing that manually, or without an external Id like above, would be impossible. Good luck!

Note: All screenshots were taken of LexiLoader, the unofficial Mac OS X equivalent of Salesforce’s Data Loader.

7 Responses to “Data Loading with Relationships”

take a look at this ANT tool that I’ve been using for the past few years. It will save you a lot of time with this technique. The doc is a bit sparse but, if you know ANT, you should be able to work it out.

If you are interested in lookup relationships check out dataloader.io if you haven’t already. It provides ability to do lookups across objects using any field. Neat trick is using a formula field for related object lookups.