io.parsek

Parsek is a Scala library for building ETL pipelines in a functional way.

Overview

The main goal is to provide tools for working with data in a generic form, independent of source or target formats. The initial idea was borrowed from libraries like Circe and Play JSON: a JSON AST and methods for working with it.

So why not use Circe and JSON directly? The main problem with JSON is its limited type support. For example, it lacks types that are important for ETL, such as Date, DateTime, and byte arrays. In addition, common ETL tasks include data cleaning, validation, and transformation from one form to another. With Circe, and especially Play JSON, these tasks require a lot of boilerplate code.
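
To make the type-support point concrete, here is a minimal sketch of what a JSON-like AST extended with ETL-oriented types could look like. All names here (Value, Date, DateTime, Bytes, parseDate) are hypothetical and chosen for illustration only; they are not Parsek's actual API.

```scala
import java.time.{Instant, LocalDate}
import scala.util.Try

// Hypothetical AST for illustration, NOT Parsek's actual API:
// the usual JSON-like nodes plus date, date-time and byte-array nodes.
sealed trait Value
object Value {
  case object Null extends Value
  final case class Bool(value: Boolean) extends Value
  final case class Num(value: BigDecimal) extends Value
  final case class Str(value: String) extends Value
  final case class Date(value: LocalDate) extends Value
  final case class DateTime(value: Instant) extends Value
  final case class Bytes(value: Vector[Byte]) extends Value
  final case class Arr(items: Vector[Value]) extends Value
  final case class Obj(fields: Map[String, Value]) extends Value
}

object Example {
  import Value._

  // A small cleaning step (hypothetical helper): turn an ISO-8601 date
  // string into a typed Date node, or report why it could not be converted.
  def parseDate(v: Value): Either[String, Value] = v match {
    case Str(s) =>
      Try(LocalDate.parse(s.trim)).toEither
        .map(Date(_))
        .left.map(_ => s"not an ISO date: $s")
    case d: Date => Right(d)
    case other   => Left(s"expected a string or a date, got $other")
  }
}
```

With plain JSON, the date would have to remain a string and every consumer would need to re-parse it. With a typed node, validation happens once, at the edge of the pipeline.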

Parsek has a modular architecture with minimal external dependencies. Yes, we know what dependency hell is! That is why there is no dependency on Scalaz/Cats or Monocle.