Configure RDBMS Writer

Last Updated: Mar 21, 2018

Overview

The RDBMSWriter plug‑in provides the ability to write data into the target table of the master RDBMS database. At the underlying implementation level, RDBMSWriter connects to a remote RDBMS database through JDBC, and runs the “insert into…” SQL statement to write data into RDBMS. RDBMSWriter is a common plug-in for writing data into relational databases, and you can add any write support for relational databases by registering a database driver.

RDBMSWriter is designed for ETL developers to import data from data warehouses to RDBMS. RDBMSWriter can also be used as a data migration tool by DBA and other users.

How it works

RDBMSWriter obtains the protocol data generated by Reader by using the data synchronization framework. Then it connects to a remote RDBMS database through JDBC, and runs the “insert into…” SQL statement to write data into RDBMS.

Features

Configuration samples

Configure a job for writing data into RDBMS.

{

"job":{

"setting":{

"speed":{

"channel":1

}

},

"content":[

{

"reader":{

"name":"streamreader",

"parameter":{

"column":[

{

"value":"DataX",

"type":"string"

},

{

"value":19880808,

"type":"long"

},

{

"value":"1988-08-08 08:08:08",

"type":"date"

},

{

"value":true,

"type":"bool"

},

{

"value":"test",

"type":"bytes"

}

],

"sliceRecordCount":1000

}

},

"writer":{

"name":"rdbmswriter",

"parameter":{

"connection":[

{

"jdbcUrl":"jdbc:dm://ip:port/database",

"table":[

"table"

]

}

],

"username":"username",

"password":"password",

"table":"table",

"column":[

"*"

],

"preSql":[

"delete from XXX;"

]

}

}

}

]

}

}

Parameter description

jdbcUrl

Description: Information of the JDBC connection to the opposite-end database. The format of jdbcUrl is in accordance with the RDBMS official specification, and the URL attachment control information can be entered. The JDBC format varies with different databases. DataX selects an appropriate database driver for data reading based on the specific JDBC format.

DM jdbc:dm://ip:port/database

db2 jdbc:db2://ip:port/database

PPAS jdbc:edb://ip:port/database

How to add new supported database with rdbmswriter:

Go to the rdbmswriter directory. The ${DATAX_HOME} here is the home directory of DataX. ${DATAX_HOME}/plugin/writer/rdbmswriter

Under the rdbmswriter directory, you can find the plugin.json configuration file. Use this file to register your specific database driver, which is placed in the drivers array. The RDBMSWriter plug-in dynamically selects the appropriate database driver to connect to the database when running the job.

{

"name":"rdbmswriter",

"class":"com.alibaba.datax.plugin.reader.rdbmswriter.RdbmsWriter",

"description":"useScene: prod. mechanism: Jdbc connection using the database, execute select sql, retrieve data from the ResultSet. warn: The more you know about the database, the less problems you encounter.",

"developer":"alibaba",

"drivers":[

"dm.jdbc.driver.DmDriver",

"com.ibm.db2.jcc.DB2Driver",

"com.sybase.jdbc3.jdbc.SybDriver",

"com.edb.Driver"

]

}

The rdbmswriter directory contains the libs sub-directory, under which you must put your specific database driver.

$tree

.

|-- libs

||--Dm7JdbcDriver16.jar

||-- commons-collections-3.0.jar

||-- commons-io-2.4.jar

||-- commons-lang3-3.3.2.jar

||-- commons-math3-3.1.1.jar

||-- datax-common-0.0.1-SNAPSHOT.jar

||-- datax-service-face-1.0.23-20160120.024328-1.jar

||-- db2jcc4.jar

||-- druid-1.0.15.jar

||-- edb-jdbc16.jar

||-- fastjson-1.1.46.sec01.jar

||-- guava-r05.jar

||-- hamcrest-core-1.3.jar

||-- jconn3-1.0.0-SNAPSHOT.jar

||-- logback-classic-1.0.13.jar

||-- logback-core-1.0.13.jar

||-- plugin-rdbms-util-0.0.1-SNAPSHOT.jar

|`-- slf4j-api-1.7.10.jar

|-- plugin.json

|-- plugin_job_template.json

`-- rdbmswriter-0.0.1-SNAPSHOT.jar

Required: Yes

Default value: None

username

Description: Username for the data source.

Required: Yes

Default value: None

password

Description: Password corresponding to the specified username for the data source.

Required: Yes

Default value: None

table

Description: Target table name. If the schema information of table is not consistent with the username in the preceding configuration, enter the table information in the schema.table format.

Required: Yes

Default value: None

column

Description: The column name set to be synchronized in the configured table, separated by commas (,).It is not recommended to use the default columns.

Required: Yes

Default value: None

preSql

Description: The SQL statement that is run before the data synchronization task is run. Currently, you can run only one SQL statement, for example, clear old data.

Required: No

Default value: None

postSql

Description: The SQL statement that is run before the data synchronization task is run. Currently, you can run only one SQL statement, for example, add a timestamp.

Required: No

Default value: None

batchSize

Description: The quantity of records submitted in one operation. Setting this parameter can greatly reduce the interactions between DataX and RDBMS over the network, and increase the overall throughput. However, an excessively large value may cause the running process of DataX to become out of memory (OOM).

Required: No

Default value: 1024

Type conversion

RDBMSReader supports most general-purpose relational database types such as numbers and characters. Make your decision based on your specific database type.