Dataduct is a wrapper built on top of AWS Data Pipeline that makes it
easy to create ETL jobs. Each job is specified as a series of steps in
a YAML file, which is automatically translated into a Data Pipeline
definition with the appropriate pipeline objects.
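
As a rough illustration of the idea of translating a list of steps into chained pipeline objects, here is a minimal Python sketch. It is not Dataduct's actual API: the `translate` function, the step type names (`extract`, `transform`, `load`), the bucket, the script name, and the output fields are all hypothetical, and the dictionary stands in for what a parsed YAML file might contain.

```python
# Illustrative only: every name here is hypothetical, not Dataduct's API.

# A pipeline definition as it might look after parsing a YAML file.
definition = {
    "name": "example_pipeline",
    "steps": [
        {"step_type": "extract", "source": "s3://example-bucket/input/"},
        {"step_type": "transform", "command": "python clean.py"},
        {"step_type": "load", "table": "dev.example_table"},
    ],
}


def translate(definition):
    """Turn an ordered list of steps into objects chained by dependency."""
    objects = []
    previous_id = None
    for index, step in enumerate(definition["steps"]):
        object_id = "{}_{}".format(step["step_type"], index)
        pipeline_object = {
            "id": object_id,
            "type": step["step_type"],
            # Each step depends on the one before it, so steps run in order.
            "depends_on": [previous_id] if previous_id else [],
            # Everything except the step type is kept as step configuration.
            "config": {k: v for k, v in step.items() if k != "step_type"},
        }
        objects.append(pipeline_object)
        previous_id = object_id
    return objects


pipeline_objects = translate(definition)
for obj in pipeline_objects:
    print(obj["id"], "depends on", obj["depends_on"])
```

The point of the sketch is only that the author writes an ordered list of steps, while the dependency wiring between the resulting objects is derived automatically.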

Features include:

Visualizing pipeline activities

Extracting data from different sources such as RDS, S3, and local files

Transforming data using EC2 and EMR

Loading data into Redshift

Transforming data inside Redshift

Running QA checks on data between the source system and the warehouse

Custom steps are easy to create when the DSL needs to be extended to
meet specific requirements, and backfills can be run from the
command-line interface.