RapidMiner to Salesforce Using DataDirect Cloud JDBC

RapidMiner is a powerful analytics provider, but it becomes more powerful with Salesforce. Learn how to connect your data sources and use RapidMiner to the fullest.

RapidMiner is a leading predictive analytics platform. Its GUI makes it easy to visualize, transform and analyze data, with a focus on data mining, text mining, and predictive analytics.

RapidMiner is empowering organizations to include predictive analytics in any business process. Salesforce, which is now almost synonymous with SaaS, is a web-based CRM solution. It specializes in managing the sales cycle, which includes managing leads and customers, organizing marketing campaigns, analyzing performance and tracking revenues.

Salesforce holds a lot of customer information that an organization will need in its data analysis. Our DataDirect Cloud connectivity service allows you to bring a number of SaaS and on-premises sources into RapidMiner, greatly extending what RapidMiner can do. The tutorial below shows how to integrate Salesforce data into RapidMiner.

Setting Up a DataDirect Cloud Salesforce Connection

Setting up a Salesforce account

If you do not have a Salesforce account, you can register for a free trial here.

Reset your security token by navigating through the following path:

[Your Name] -> Setup -> Personal Information -> Reset Security Token

Save the security token that is sent to your registered email address.

Setting up a DataDirect Cloud account

If you do not already have a DataDirect Cloud account, you can register for a free trial here.

Once you log in to the Progress Pacific dashboard, choose "Connect Data".

Configuring Salesforce access from DataDirect Cloud

The first step is to create a data source definition. Choose ‘Data Sources’ in the left pane, then click the ‘+New Data Source’ button.

From the list of available sources, choose Salesforce as your cloud data source.

In the next window, provide:

Name: SalesforceDB (or a name of your choice)

User Id and Password associated with your Salesforce account

Security token of your account

Salesforce Login URL: login.salesforce.com

Now click Test Connection. If the test succeeds, save this data source.

Your DataDirect Cloud account can now access your Salesforce data. For more information, you can refer to the following blog, which provides more details on Salesforce connectivity.

In RapidMiner, create a database connection with the following settings:

Database System: D2C Salesforce Driver (or the name you gave to your driver)

Host: service.datadirectcloud.com

Port: 443

Database Scheme: databaseName=SalesforceDB (replace SalesforceDB with the name you chose while configuring the Salesforce data source in DataDirect Cloud)

Provide the Username and Password for your DataDirect Cloud account

Test this connection. Once you have successfully established a connection with DataDirect Cloud, your Salesforce data is available for analysis in RapidMiner.
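To make the relationship between these fields concrete, here is a minimal sketch of how they could combine into a single JDBC URL. The host, port and database scheme come from the settings above. The URL prefix `jdbc:datadirect:ddcloud://` and the `;` separator are assumptions that are not stated in this article, so check them against your driver configuration. The helper name `build_jdbc_url` is hypothetical.

```python
def build_jdbc_url(host, port, database_scheme,
                   url_prefix="jdbc:datadirect:ddcloud://",  # assumed prefix, not from this article
                   separator=";"):                           # assumed separator, not from this article
    """Join the connection fields into one JDBC URL string."""
    if not host or not database_scheme:
        raise ValueError("host and database_scheme are required")
    # int(port) rejects values that are not valid port numbers, such as "abc".
    return f"{url_prefix}{host}:{int(port)}{separator}{database_scheme}"


# Values taken from the connection settings listed above.
jdbc_url = build_jdbc_url(
    host="service.datadirectcloud.com",
    port=443,
    database_scheme="databaseName=SalesforceDB",
)
print(jdbc_url)
```

If the connection test fails, comparing a string like this against the URL your driver expects is a quick way to spot a mistyped host or data source name.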

Analyzing Salesforce Data in RapidMiner

Create a new process: File -> New Process.

In the repository pane, navigate to the data you are looking for. In this example I will access Salesforce Account information: DB -> Salesforce D2C -> Example Sets -> SFORCE.ACCOUNT

You can click the table to view its contents. Note the BILLINGCITY and BILLINGSTATE values in Row No. 1.

Go back to the design view. Next, drag this table into the Process panel.
Note: You may have to open the Process panel from View -> Show Panel -> Process.

You can rename the Retrieve operator if you like.

Next, type Set Data in the Operators panel. Drag the ‘Set Data’ operator into the Process panel.

Connect the output of the Retrieve operator to this operator.

Set the example index to 1, the attribute name to BILLINGCITY, and the value to Austin.

Click the Edit List(0) and add entries as shown below:

Connect the output of Set Data to the res port at the right edge of the Process panel.

Click the play button to run the process. The example table now shows the modified data.
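For readers who prefer code to screenshots, the effect of the Set Data step can be sketched in plain Python. This is an illustration of the idea, not RapidMiner's implementation. The function `set_data` and the sample rows are hypothetical, and the sketch assumes the example index is 1-based to match Row No. 1 above.

```python
def set_data(example_set, example_index, attribute_name, value):
    """Return a copy of example_set with one cell changed.

    example_index is 1-based (assumption: it matches the "Row No." column).
    """
    if not 1 <= example_index <= len(example_set):
        raise IndexError(f"example index {example_index} is out of range")
    if attribute_name not in example_set[example_index - 1]:
        raise KeyError(f"unknown attribute: {attribute_name}")
    # Copy each row so the input example set is left untouched.
    updated = [dict(row) for row in example_set]
    updated[example_index - 1][attribute_name] = value
    return updated


# Hypothetical rows standing in for SFORCE.ACCOUNT; not real Salesforce data.
accounts = [
    {"NAME": "Example Corp", "BILLINGCITY": "Raleigh", "BILLINGSTATE": "NC"},
    {"NAME": "Sample LLC", "BILLINGCITY": "Boston", "BILLINGSTATE": "MA"},
]

modified = set_data(accounts, 1, "BILLINGCITY", "Austin")
print(modified[0]["BILLINGCITY"])
```

In this sketch only the returned copy changes. The original rows stay as they were, which makes it easy to compare the table before and after the step.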

This application leverages our powerful DataDirect Cloud Connectivity Service. Whether you are connecting to other SaaS sources or to an on-premises data source behind a firewall, DataDirect Cloud can handle it. The connectivity service currently supports 50+ different data sources, including SaaS/cloud sources, relational databases and Big Data sources. You can connect any of those sources to RapidMiner without changing any application code. For more information, please get in touch with one of our experts.

Nishanth Kadiyala is a Technical Marketing Manager at Progress. He has worked as a Software IT professional for 3 years during which he actively pursued several technologies including database designing, SQL querying and Cloud Computing. He is currently pursuing his MBA at UNC Chapel Hill and concentrating in Marketing. At UNC, he became proficient with data analytic tools such as MEXL and R. He is interested in the SaaS, Cloud and data integration technologies that are revolutionizing the world we live in.
