Tom's playground

Sqoop

Sqoop is a tool that allows you to ship data from an RDBMS to a Hadoop platform. Let us take an example to clarify this. One may have some data in a MySQL table persons, within database thom. This database is stored on server 62.51.51.999. The data can be accessed with the knowledge of the server (62.51.51.999), the database (thom), the table name (persons), the user id (thom) and a password. One may then transfer these data with a sqoop import command, after which the result can be verified on HDFS:
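
The sqoop command itself is not reproduced above; a minimal sketch of what such an import would look like, built from the connection details in the text (flag spelling follows sqoop's standard CLI; the target directory is taken from the HDFS listing below, and the password is prompted for with -P rather than typed inline):

```shell
# Import the MySQL table persons from database thom on 62.51.51.999 into HDFS.
# Connection details come from the text; the exact flags are a sketch.
sqoop import \
  --connect jdbc:mysql://62.51.51.999/thom \
  --username thom -P \
  --table persons \
  --target-dir /user/gpadmin/persons
```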

[pivhdsne:sqoop]$ hadoop dfs -cat /user/gpadmin/persons/part-m-00000
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Peter,Griffin,35
Glenn,Quagmire,33
tom,vanmaanen,99

After this, I tried to retrieve data from Oracle into a file on HDFS. This proved quite similar. The only differences were that another jar was used (understandable) and that the columns had to be listed explicitly (less obvious).
However, I got it working via:
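
The working command is not preserved here; a sketch of an Oracle import of this shape, where the host, SID, credentials and column names are placeholders rather than values from the original post:

```shell
# Oracle imports go through the ojdbc driver; when sqoop cannot derive the
# column list itself, the columns have to be named explicitly with --columns.
# All identifiers below are placeholders.
sqoop import \
  --connect jdbc:oracle:thin:@dbhost:1521:ORCL \
  --username scott -P \
  --table EMP \
  --columns "EMPNO,ENAME,SAL" \
  --target-dir /user/gpadmin/emp
```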

In another environment I undertook a similar exercise, transferring an Oracle table. The first step was to store an ojdbc.jar in the correct location; I stored this jar in the same directory as the mysql*.jar:
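
The exact paths from the original are not shown; on a typical installation the JDBC driver jars live in sqoop's lib directory, so the step would look something like this (paths are an assumption, not from the post):

```shell
# Place the Oracle JDBC driver next to the existing MySQL driver jar.
# /usr/lib/sqoop/lib/ is a common default location, assumed here.
cp ojdbc6.jar /usr/lib/sqoop/lib/
ls /usr/lib/sqoop/lib/ | grep -E 'ojdbc|mysql'
```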

I understood that the addition "-m 1" was necessary because the target table did not have a primary key. Without a primary key sqoop cannot split the work across mappers, so the number of mappers must be limited to one.
I then added one record to the Oracle table. This one record could be copied to HDFS with:
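
That copy command is not reproduced; a sketch of what it would look like, with table and connection details as placeholders (the -m 1 flag is the one point taken from the text):

```shell
# Single-mapper import: -m 1 is required because the table has no primary
# key for sqoop to split on. Host, SID, user and table are placeholders.
sqoop import \
  --connect jdbc:oracle:thin:@dbhost:1521:ORCL \
  --username scott -P \
  --table TOMTEST \
  --target-dir /user/gpadmin/tomtest \
  -m 1
```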

Introducing SQL Server meant another jar had to be added. It is the sqljdbc4.jar that can be downloaded from Microsoft. There are two such files: the sqljdbc.jar class library, which provides support for JDBC 3.0, and the sqljdbc4.jar class library, which provides support for JDBC 4.0. After downloading this jar file and storing it in the sqoop lib directory, I could fire the next line:
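
The line itself is not preserved; a sketch assembled from the details given below (host 192.168.2.26, port 1433, database master, table tomtest come from the text; the username and target directory are placeholders):

```shell
# SQL Server connect strings use the jdbc:sqlserver:// form with the
# database named inside the URL. Username and target-dir are assumptions.
sqoop import \
  --connect "jdbc:sqlserver://192.168.2.26:1433;database=master" \
  --username sa -P \
  --table tomtest \
  --target-dir /user/gpadmin/tomtest_mssql \
  -m 1
```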

This allowed me to download data from table tomtest, which is stored in the master database of a SQL Server that resides on 192.168.2.26, listening on port 1433.
And this is how you get data from a PostgreSQL server (using the postgresql-9.4-1203-jdbc4.jar):
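
The PostgreSQL command is likewise not shown; a sketch, where the host, database, user and table are placeholders and only the driver jar name comes from the text:

```shell
# PostgreSQL import; assumes postgresql-9.4-1203-jdbc4.jar is already in
# sqoop's lib directory. All connection identifiers are placeholders.
sqoop import \
  --connect jdbc:postgresql://dbhost:5432/mydb \
  --username postgres -P \
  --table mytable \
  --target-dir /user/gpadmin/mytable \
  -m 1
```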