Google BigQuery

EDIS Online User Guide – Google BigQuery

EDIS has built in support for Google BigQuery to import and export data, as well as perform other tasks within BigQuery. To connect to Google BigQuery, you first need to setup a BigQuery service account.

BigQuery Setup Service Account
This task creates a service ID to connect to Google BigQuery.

Remarks
When a new BigQuery connection is created, it is by default accessible to all users in the EDIS_Role. To lock to a specific ID, group, or set of ID’s, see section Multi-Tenant Support.

Procedure:SSISDB.EDIS.usp_config_bigquery_id

Parameters

Name

Description

Default Value

@bq_svc_id

The name of the service ID to use when connecting to Google BigQuery

None; Required

@project_nm

The name of the Google BigQuery project to associate this service ID with.

None; Required

@project_id

The Project ID for the Google BigQuery service account

None; Required

@creds_file_path

This is the file path to the Json service account file provided from Google BigQuery when you setup a service account. This file contains the X509 certificate’s private key, which will be called on when connecting to BigQuery.

None; Required

Example: This example creates a Google BigQuery service ID called “svc_sls”.

EXEC SSISDB.EDIS.usp_config_bigquery_id @bq_svc_id = 'svc_sls'

,@project_nm = 'sales-team-proj'

,@project_id = '12343252972729'

,@creds_file_path = 'C:\temp\svc_sls_creds.json'

BigQuery Import Task
This tasks imports data from Google BigQuery to SQL Server. You simply provide a source query and a destination table and EDIS does the rest.

Procedure:SSISDB.EDIS.usp_bigquery_import_data

Parameters

Name

Description

Default Value

@bq_svc_id

Name of the service ID to use to connect to BigQuery (See proc EDIS.usp_config_bigquery_id to setup a BigQuery service ID)

None; Required

@dest_svc_id

The destination EDIS Service ID to send the results to. Currently, only SQL Server destinations are supported. The default is the host SQL Server for EDIS

EDIS Host SQL Server (@@servername)

@src_qry

The source query to use for extracting the data. Note: The source query needs to be written in Standard SQL. EDIS does not support the source query written in legacy SQL.

None; Required

@dest_tbl_nm

Name of the destination table to import the data to. If the destination table is a temp table, it can be listed as ‘##myTbl’. If the table is not a temporary table, it needs to be 3-part qualified i.e. [database].[schema].[table]

None; Required

@crt_dest_tbl

Flag indicating if you want to have EDIS create the destination table for you if it does not exist.

0 (False)

@load_on_ordinal

Flag indicating if you want to import the data to the destination table based on the column order instead of the column name. The default is that EDIS will map columns based on column names

0 (False)

@bq_fetch_size

The fetch size when streaming rows from BigQuery. If your dataset has a lot of columns and you receive errors on the import, adjust the fetch size down.

1000

@batch_size

Batch size for importing rows.

10000

Example: This example imports data from BigQuery to a temp table called “##bq_import_raw”.

Name of the service ID to use to connect to BigQuery (See proc EDIS.usp_config_bigquery_id to setup a BigQuery service ID)

None; Required

@src_svc_id

The source EDIS Service ID to query against. The source can be any db connection service ID in EDIS. The default is the host SQL Server for EDIS

EDIS Host SQL Server (@@servername)

@src_qry

The source query to use for querying against the @src_svc_id.

None; Required

@dataset_nm

The BigQuery dataset name to export the source data to

None; Required

@dest_tbl_nm

The destination table in BigQuery to load the results to. If the table does not exist, EDIS will create it at runtime.

None; Required

@truncate_existing

Flag indicating if you want to truncate the existing data on the destination table in BigQuery before loading the results

0 (False)

@drop_existing

Flag indicating if you want BigQuery to drop the existing table and recreate it at runtime.

0 (False)

@merge_existing

Flag indicating if you want to update/append the source data to the destination table. This is similar to a SQL MERGE command that will update existing rows and append new ones based on the primary key. Since Google BigQuery does not have primary keys, the next argument @dest_tbl_key_cols is where you supply the key columns so the merge command knows how to merge the results to BigQuery (i.e. what rows get updated, what rows get appended).

0 (False)

@dest_tbl_key_cols

The destination table primary key columns. This argument is only applicable if you set the @merge_existing parameter to 1 (true). When supplying multiple columns for this parameter, seperate them with a comma i.e. @dest_tbl_key_cols = ‘rgn_id, fscl_wk’

Empty String

@work_dir

Temporary local work directory to use for preparing BigQuery data for export. The disk space on this directory needs to be large enough to hold the source query results in a CSV file format. Once the bigquery operation is complete, any temporary files in this directory are automatically deleted.

None; Required

Example: This example exports data to BigQuery.

EXEC SSISDB.EDIS.usp_bigquery_export_data

@bq_svc_id = 'svc_sls'

,@src_qry =

'

SELECT RGN_ID, SUM(SALES) AS SALES

FROM adventureworks.sales.sales_his

GROUP BY RGN_ID

'

,@dataset_nm = 'REGION_SALES_REPO'

,@dest_tbl_nm = 'sales_his'

,@work_dir = 'C:\temp\bq_data\'

BigQuery Append Data Task
This tasks appends data from a BigQuery source query to another table in BigQuery. Columns are matched based on their name.

Procedure:SSISDB.EDIS.usp_bigquery_append_data

Parameters

Name

Description

Default Value

@bq_svc_id

Name of the service ID to use to connect to BigQuery (See proc EDIS.usp_config_bigquery_id to setup a BigQuery service ID)

None; Required

@src_qry

The BigQuery source query. Must be written in StandardSQL.

None; Required

@dataset_nm

The BigQuery dataset name to append the source data to

None; Required

@dest_tbl_nm

The BigQuery destination table name to append the source data to

None; Required

Example: This example appends data to a BigQuery table.

EXEC SSISDB.EDIS.usp_bigquery_append_data

@bq_svc_id = 'svc_sls'

,@src_qry =

'

SELECT RGN_ID, SUM(SALES) AS SALES

FROM `sales-dev.region_sales.weekly_results`

GROUP BY RGN_ID

'

,@dataset_nm = 'REGION_SALES_REPO_HIS'

,@dest_tbl_nm = 'sales_his'

BigQuery Create Table from Query Task
This tasks creates a new BigQuery table based on a BigQuery SELECT statement. This is similar to SQL Server’s SELECT INTO command, which creates a new table with data.

Procedure:SSISDB.EDIS.usp_bigquery_crt_tbl_from_qry

Parameters

Name

Description

Default Value

@bq_svc_id

Name of the service ID to use to connect to BigQuery (See proc EDIS.usp_config_bigquery_id to setup a BigQuery service ID)

None; Required

@src_qry

The BigQuery source query. Must be written in StandardSQL.

None; Required

@dataset_nm

The BigQuery dataset name to append the source data to

None; Required

@dest_tbl_nm

The BigQuery destination table name to append the source data to

None; Required

@drop_existing

Flag indicating if you want BigQuery to drop the table if it exists. If this is set to 0 (False) and the destination table already exists, an error will be raised.

0 (False)

Example: This example creates a new table in BigQuery from a SELECT statement.

EXEC SSISDB.EDIS.usp_bigquery_crt_tbl_from_qry

@bq_svc_id = 'svc_sls'

,@src_qry =

'

SELECT RGN_ID, SUM(SALES) AS SALES

FROM `sales-dev.region_sales.weekly_results`

GROUP BY RGN_ID

'

,@dataset_nm = 'REGION_SALES_REPO_HIS'

,@dest_tbl_nm = 'sales_his'

,@drop_existing = 1

BigQuery Drop Table Task
This tasks drops a table in BigQuery if it exists

Procedure:SSISDB.EDIS.usp_bigquery_drop_tbl

Parameters

Name

Description

Default Value

@bq_svc_id

Name of the service ID to use to connect to BigQuery (See proc EDIS.usp_config_bigquery_id to setup a BigQuery service ID)

None; Required

@dataset_nm

The BigQuery dataset name where the table to drop lives

None; Required

@dest_tbl_nm

The BigQuery destination table name to drop

None; Required

@fail_if_missing

Flag indicating if you want the task to fail if the table to delete does not exist. By default, this task acts in a manner similar to DROP IF EXISTS, meaning the task does not fail if the table is missing. If you want to override this behavior, set this flag to 1

0 (False)

Example: This example drops table RGN_SALES_TEST in dataset RGN_SLS.

EXEC SSISDB.EDIS.usp_bigquery_drop_tbl

@bq_svc_id = 'svc_sls'

,@dataset_nm = 'RGN_SLS'

,@dest_tbl_nm = 'RGN_SALES_TEST'

BigQuery Run SQL Command Task
This tasks executes a SQL Command against a BigQuery table

Procedure:SSISDB.EDIS.usp_bigquery_run_sql_cmd

Parameters

Name

Description

Default Value

@bq_svc_id

Name of the service ID to use to connect to BigQuery (See proc EDIS.usp_config_bigquery_id to setup a BigQuery service ID)