Design the Pipeline

Let’s assume we have a simple scenario: events are streaming to Kafka, and we want to consume them in our pipeline, apply some transformations, and write the results to BigQuery tables so the data is available for analytics.
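A minimal version of that pipeline, using the Beam Java SDK, might look like the sketch below. The broker address, topic, table reference, and the single payload column are placeholder assumptions, and the transformation step is deliberately trivial.

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.apache.kafka.common.serialization.StringDeserializer;

public class KafkaToBigQuery {
  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    pipeline
        // Consume the raw events from Kafka (broker and topic are placeholders).
        .apply("ReadFromKafka",
            KafkaIO.<String, String>read()
                .withBootstrapServers("kafka:9092")
                .withTopic("events")
                .withKeyDeserializer(StringDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withoutMetadata())
        // A trivial transformation: wrap each value in a TableRow.
        .apply("ToTableRow",
            MapElements.into(TypeDescriptor.of(TableRow.class))
                .via((KV<String, String> record) ->
                    new TableRow().set("payload", record.getValue())))
        // Append the rows to a pre-created BigQuery table.
        .apply("WriteToBigQuery",
            BigQueryIO.writeTableRows()
                .to("my-project:analytics.events")
                .withCreateDisposition(CreateDisposition.CREATE_NEVER)
                .withWriteDisposition(WriteDisposition.WRITE_APPEND));

    pipeline.run();
  }
}
```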

The BigQuery table can be created before the job starts, or Beam itself can create it.
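The difference is only in how BigQueryIO is configured: with a pre-created table the write just needs the table reference, while letting Beam create it requires a schema and the CREATE_IF_NEEDED disposition. A small sketch, with an assumed single-column schema (it additionally uses TableSchema, TableFieldSchema and java.util.Arrays):

```java
// Option 1: the table was created up front; the write fails if it is missing.
BigQueryIO.Write<TableRow> writeToExistingTable =
    BigQueryIO.writeTableRows()
        .to("my-project:analytics.events")
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER);

// Option 2: Beam creates the table on demand, so a schema must be supplied.
TableSchema eventSchema = new TableSchema().setFields(Arrays.asList(
    new TableFieldSchema().setName("payload").setType("STRING")));

BigQueryIO.Write<TableRow> writeAndCreateTable =
    BigQueryIO.writeTableRows()
        .to("my-project:analytics.events")
        .withSchema(eventSchema)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED);
```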

And how can we process elements by tag? Let’s change the pipeline and make the split: the MAIN elements go to the BigQuery table, while the DEADLETTER_OUT elements are sent to the error table.
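One possible shape of that split is sketched below: a DoFn tags rows it can build as MAIN and everything that fails as DEADLETTER_OUT. Here kafkaRecords stands for the PCollection produced by the KafkaIO read in the first sketch, parseToTableRow is a hypothetical helper that throws on malformed input, and the table names and error schema are assumptions.

```java
// Additionally uses TupleTag, TupleTagList, PCollectionTuple, ParDo, DoFn,
// WriteResult and InsertRetryPolicy from the Beam SDK.
final TupleTag<TableRow> MAIN = new TupleTag<TableRow>() {};
final TupleTag<TableRow> DEADLETTER_OUT = new TupleTag<TableRow>() {};

// Split the Kafka records: rows we can build go to MAIN, failures to DEADLETTER_OUT.
PCollectionTuple split = kafkaRecords.apply("SplitByTag",
    ParDo.of(new DoFn<KV<String, String>, TableRow>() {
      @ProcessElement
      public void processElement(ProcessContext context) {
        String payload = context.element().getValue();
        try {
          // parseToTableRow is a hypothetical helper that throws on bad input.
          context.output(parseToTableRow(payload));
        } catch (Exception e) {
          context.output(DEADLETTER_OUT,
              new TableRow().set("payload", payload).set("error", e.getMessage()));
        }
      }
    }).withOutputTags(MAIN, TupleTagList.of(DEADLETTER_OUT)));

// MAIN rows go to the analytics table; keeping the WriteResult also lets us
// recover rows that BigQuery itself rejects at insert time, with their errors.
WriteResult writeResult = split.get(MAIN).apply("WriteToBigQuery",
    BigQueryIO.writeTableRows()
        .to("my-project:analytics.events")
        .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
        .withExtendedErrorInfo()
        .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER));

// DEADLETTER_OUT rows are written straight to the error table.
TableSchema errorSchema = new TableSchema().setFields(Arrays.asList(
    new TableFieldSchema().setName("payload").setType("STRING"),
    new TableFieldSchema().setName("error").setType("STRING")));

split.get(DEADLETTER_OUT).apply("WriteParseFailures",
    BigQueryIO.writeTableRows()
        .to("my-project:analytics.events_errors")
        .withSchema(errorSchema)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED));
```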

In the snippet above, we get the failed TableRows, together with their errors, from BigQueryIO. We can then transform them into new TableRows and write them to an error table. In this case, we let the job create the table when needed.
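Continuing the sketch, the failed inserts arrive as BigQueryInsertError values that carry both the rejected row and the error reported by BigQuery. One way to turn them into error-table rows, reusing the hypothetical errorSchema and error-table name from above, could be:

```java
writeResult.getFailedInsertsWithErr()
    // Flatten each failed insert into a row for the error table.
    .apply("ToErrorRow",
        MapElements.into(TypeDescriptor.of(TableRow.class))
            .via((BigQueryInsertError err) -> new TableRow()
                .set("payload", err.getRow().toString())
                .set("error", err.getError().toString())))
    .apply("WriteInsertFailures",
        BigQueryIO.writeTableRows()
            .to("my-project:analytics.events_errors")
            .withSchema(errorSchema)
            // The job creates the error table the first time it is needed.
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED));
```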