Meltano provides a command line interface (CLI) to kick-start and help you manage the configuration and orchestration of all the components in the data life cycle. It provides a single source of truth for the entire data pipeline, and makes it easy to develop, run, and debug every step of the data life cycle.

meltano add [transform | transformer]: Adds a transform to your meltano.yml and updates the dbt packages and project configuration so that the transform can run. Also used to install the dbt transformer, enabling transformations to run after data is extracted and loaded.
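
For example, to install the dbt transformer (assuming dbt is the plugin name used by your Meltano version):

meltano add transformer dbt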

meltano add model [name_of_model]: Adds a model bundle to your meltano.yml so that you can interactively generate SQL. Model bundles are installed inside the .meltano directory and are available for use in the Meltano UI.

Transforms in Meltano are implemented by using dbt. All Meltano-generated projects have a transform/ directory, which is populated with the configuration, models, packages, etc. required to run the transformations.

When meltano elt runs with the --transform run option, the default dbt transformations for the extractor in use are run.

As an example, assume that the following command runs:

meltano elt tap-carbon-intensity target-postgres --transform run

After the Extract and Load steps have successfully completed and data have been extracted from the Carbon Intensity API and loaded into a Postgres DB, the dbt transform runs.

Meltano uses the convention that the transform has the same name as the extractor it is for. Transforms are automatically added the first time an elt operation that requires them runs, but they can also be discovered and added to a Meltano project manually:
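
For example, following the naming convention above, the default transform for tap-carbon-intensity could be added with:

meltano add transform tap-carbon-intensity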

Transforms are basically dbt packages that reside in their own repositories. If you want to see in more detail how such a package can be defined, you can check the dbt documentation on Package Management and dbt-tap-carbon-intensity, the project used for defining the default transforms for tap-carbon-intensity.

When a transform is added to a project, it is added as a dbt package in transform/packages.yml, enabled in transform/dbt_project.yml, and loaded for usage the next time dbt runs.

Transform entries in meltano.yml can include additional parameters. For example, the tap-carbon-intensity dbt package requires three variables, which are used for finding the tables where the raw Carbon Intensity data have been loaded during the Extract-Load phase:
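
A sketch of what such an entry can look like; the variable names and pip_url below are illustrative:

transforms:
- name: tap-carbon-intensity
  pip_url: https://gitlab.com/meltano/dbt-tap-carbon-intensity.git  # illustrative URL
  vars:
    entry_table: "{{ env_var('PG_SCHEMA') }}.entry"
    generationmix_table: "{{ env_var('PG_SCHEMA') }}.generationmix"
    region_table: "{{ env_var('PG_SCHEMA') }}.region"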

Those entries can use dbt's syntax for fetching values from environment variables. In this case, $PG_SCHEMA must be available so that the transformations know which Postgres schema holds the tables with the Carbon Intensity data. Meltano uses $PG_SCHEMA by default, as it is also the default schema used by the Postgres Loader.

You can keep those parameters as they are and provide the schema as an environment variable or set the schema manually in meltano.yml:
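
For example, with an illustrative hardcoded schema name instead of the environment variable:

transforms:
- name: tap-carbon-intensity
  vars:
    entry_table: "my_schema.entry"                  # schema set manually
    generationmix_table: "my_schema.generationmix"
    region_table: "my_schema.region"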

When Meltano runs a new transformation, transform/dbt_project.yml is always kept up to date with whatever is provided in meltano.yml.

Finally, dbt can be configured by updating transform/profile/profiles.yml. By default, Meltano sets up dbt to use the same database and user as the Postgres Loader and store the results of the transformations in the analytics schema.
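
A minimal sketch of what such a profile can look like for Postgres, assuming the profile is named meltano and that the PG_* environment variable names match those used by the Postgres Loader:

config:
  send_anonymous_usage_stats: False

meltano:
  target: prod
  outputs:
    prod:
      type: postgres
      host: "{{ env_var('PG_ADDRESS') }}"    # assumed env var names
      port: 5432
      user: "{{ env_var('PG_USERNAME') }}"
      pass: "{{ env_var('PG_PASSWORD') }}"
      dbname: "{{ env_var('PG_DATABASE') }}"
      schema: analytics                      # transformation results are stored here
      threads: 1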

This is an optional tool for users who use Snowflake as their data warehouse and want to granularly configure who has access to which data at the warehouse level. As we improve Meltano, this may become a first-class concept within user roles, but that is not the case today.

Use this command to check and manage the permissions of a Snowflake account.

meltano permissions grant <spec_file> --db snowflake [--dry] [--diff]

Given the parameters to connect to a Snowflake account and a YAML file (a "spec") representing the desired database configuration, this command makes sure that the configuration of that database matches the spec. If there are differences, it will return the SQL commands required to make it match the spec.

We currently support only Snowflake; for managing permissions in a Postgres database, pgbedrock can be used.

The YAML specification file declaratively defines the databases, roles, users, and warehouses in a Snowflake account, together with the permissions for databases, schemas, and tables in the same account.

Its syntax is inspired by pgbedrock, with additional options for Snowflake.

All permissions are abbreviated as read or write permissions, with Meltano generating the proper grants for each type of object.

Tables and views are both listed under tables; Meltano generates the appropriate grants for each behind the scenes.

If * is provided as the parameter for tables, the grant statement will use the ALL <object_type>S IN SCHEMA syntax and will also grant to future tables and views. See the Snowflake documentation on ON FUTURE grants.
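
A minimal illustrative spec; the role, warehouse, database, and schema names below are hypothetical, and the keys follow the pgbedrock-inspired syntax described above:

roles:
  - analyst:                      # hypothetical role
      warehouses:
        - reporting               # hypothetical warehouse
      privileges:
        databases:
          read:
            - analytics           # hypothetical database
        schemas:
          read:
            - analytics.base
        tables:
          read:
            - analytics.base.*    # '*' expands to ALL ... IN SCHEMA plus ON FUTURE grants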

Currently, meltano invoke gives you raw access to the underlying plugin after any configuration hooks.

View the 'meltano' DAGs:

meltano invoke airflow list_dags

Manually trigger a task to run:

meltano invoke airflow run --raw meltano extract_load $(date -I)

Start the Airflow UI (currently opens in a separate browser window):

meltano invoke airflow webserver -D

Start the Airflow scheduler, enabling background job processing:

meltano invoke airflow scheduler -D

Trigger a DAG run:

meltano invoke airflow trigger_dag meltano

Airflow is a full-featured orchestrator with many features that are currently outside of Meltano's scope. As we improve this integration, Meltano will surface more of these features to create a seamless experience with this orchestrator. Please refer to the Airflow documentation for more in-depth knowledge about Airflow.