Manually adding an empty column

If you add new columns to an existing table schema, the columns must be
NULLABLE or REPEATED. You cannot add a REQUIRED column to an existing
table schema. If you attempt to add a REQUIRED column to an existing table
schema in the CLI or API, the following error is returned: BigQuery error in
update operation: Provided Schema does not match Table
[PROJECT_ID]:[DATASET].[TABLE]. Cannot add required columns to an existing
schema. REQUIRED columns can be added only when you create a table while
loading data, or when you create an empty table with a schema definition.

After adding a new column to your table's schema definition, you can load data
into the new column using a load job or a query job.

CLI

Issue the bq update command and provide a JSON schema file. If the table
you're updating is in a project other than your default project, add the
project ID to the dataset name in the following format:
[PROJECT_ID]:[DATASET].

bq update [PROJECT_ID]:[DATASET].[TABLE] [SCHEMA]

Where:

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table you're
updating.

[TABLE] is the name of the table you're updating.

[SCHEMA] is the path to the JSON schema file on your local machine.

When you specify the schema on the command line, you cannot include a
RECORD (STRUCT)
type, you cannot include a column description, and you cannot specify the
column's mode. All modes default to NULLABLE.

If you attempt to add columns using an inline schema definition, you must
supply the entire schema definition including the new columns. Because you
cannot specify column modes using an inline schema definition, the update
will attempt to change any existing REQUIRED column to NULLABLE. This
results in the following error: BigQuery error in update operation: Provided Schema does not match Table [PROJECT_ID]:[DATASET].[TABLE]. Field [FIELD] has changed mode from REQUIRED to NULLABLE.

First, issue the bq show command with the --schema flag and write the
existing table schema to a file. If the table you're updating is in a
project other than your default project, add the project ID to the
dataset name in the following format: [PROJECT_ID]:[DATASET].
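For example, a command of this shape writes the schema to a local file, where [SCHEMA_FILE] is a path of your choosing, such as /tmp/myschema.json:

```shell
bq show \
--schema \
--format=prettyjson \
[PROJECT_ID]:[DATASET].[TABLE] > [SCHEMA_FILE]
```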

Add the new columns to the end of the schema definition. If you attempt
to add new columns elsewhere in the array, the following error is returned:
BigQuery error in update operation: Precondition Failed.

Using a JSON file, you can specify descriptions, NULLABLE or REPEATED
modes, and RECORD types for new columns. For example, using the schema
definition from the previous step, your new JSON array would look like the
following. In this example, a new NULLABLE column is added named
column4. column4 includes a description.
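Because the schema from the previous step is not reproduced here, the following is a sketch that assumes the existing table contained three columns named column1 through column3; the names, types, and modes are illustrative:

```json
[
  {
    "name": "column1",
    "type": "STRING",
    "mode": "REQUIRED"
  },
  {
    "name": "column2",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "name": "column3",
    "type": "STRING",
    "mode": "REPEATED"
  },
  {
    "name": "column4",
    "type": "STRING",
    "mode": "NULLABLE",
    "description": "my new column"
  }
]
```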

After updating your schema file, issue the following command to update
the table's schema. If the table you're updating is in a project other than
your default project, add the project ID to the dataset name in the
following format: [PROJECT_ID]:[DATASET].

bq update [PROJECT_ID]:[DATASET].[TABLE] [SCHEMA]

Where:

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table you're
updating.

[TABLE] is the name of the table you're updating.

[SCHEMA] is the path to the JSON schema file on your local machine.

For example, enter the following command to update the schema definition
of mydataset.mytable in your default project. The path to the schema
file on your local machine is /tmp/myschema.json.

bq update mydataset.mytable /tmp/myschema.json

API

Call the tables.patch
method and use the schema
property to add empty columns to your schema definition. Because the
tables.update method replaces the entire table resource, the tables.patch
method is preferred.

Adding a nested column to a RECORD

In addition to adding new columns to a table's schema, you can also add new
nested columns to a RECORD. The process for adding a new nested column is very
similar to the process for adding a new column.

Console

Adding a new nested field to an existing RECORD column is not currently
supported by the console BigQuery web UI.

Classic UI

Adding a new nested field to an existing RECORD column is not currently
supported by the classic BigQuery web UI.

CLI

Issue the bq update command and provide a JSON schema file that adds the
nested field to the existing RECORD column's schema definition. If the table
you're updating is in a project other than your default project, add the
project ID to the dataset name in the following format:
[PROJECT_ID]:[DATASET].

bq update [PROJECT_ID]:[DATASET].[TABLE] [SCHEMA]

Where:

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table you're
updating.

[TABLE] is the name of the table you're updating.

[SCHEMA] is the path to the JSON schema file on your local machine.

When you specify the schema on the command line, you cannot include a
RECORD (STRUCT)
type, you cannot include a column description, and you cannot specify the
column's mode. All modes default to NULLABLE. As a result, if you are
adding a new nested column to a RECORD, you must supply a JSON schema file.

To add a nested column to a RECORD using a JSON schema file:

First, issue the bq show command with the --schema flag and write the
existing table schema to a file. If the table you're updating is in a
project other than your default project, add the project ID to the
dataset name in the following format: [PROJECT_ID]:[DATASET].
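The bq show step can be sketched as follows, where [SCHEMA_FILE] is a path of your choosing, such as /tmp/myschema.json:

```shell
bq show \
--schema \
--format=prettyjson \
[PROJECT_ID]:[DATASET].[TABLE] > [SCHEMA_FILE]
```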

Open the schema file in a text editor. The schema should look like the
following. In this example, column3 is a nested repeated column. The
nested columns are nested1 and nested2. The fields array lists
the fields nested within column3.
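A sketch of that shape, with a new nested column appended to the end of the fields array. The top-level columns column1 and column2, the new field name nested3, and all types are illustrative assumptions; only column3, nested1, and nested2 come from the text above:

```json
[
  {
    "name": "column1",
    "type": "STRING",
    "mode": "NULLABLE"
  },
  {
    "name": "column2",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "name": "column3",
    "type": "RECORD",
    "mode": "REPEATED",
    "fields": [
      {"name": "nested1", "type": "STRING", "mode": "NULLABLE"},
      {"name": "nested2", "type": "STRING", "mode": "NULLABLE"},
      {"name": "nested3", "type": "STRING", "mode": "NULLABLE"}
    ]
  }
]
```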

After updating your schema file, issue the following command to update
the table's schema. If the table you're updating is in a project other than
your default project, add the project ID to the dataset name in the
following format: [PROJECT_ID]:[DATASET].

bq update [PROJECT_ID]:[DATASET].[TABLE] [SCHEMA]

Where:

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table you're
updating.

[TABLE] is the name of the table you're updating.

[SCHEMA] is the path to the JSON schema file on your local machine.

For example, enter the following command to update the schema definition
of mydataset.mytable in your default project. The path to the schema
file on your local machine is /tmp/myschema.json.

bq update mydataset.mytable /tmp/myschema.json

API

Call the tables.patch
method and use the schema
property to add the nested columns to your schema definition. Because the
tables.update method replaces the entire table resource, the tables.patch
method is preferred.

Adding a column when you overwrite or append data

You can add new columns to an existing table when you load data into it and
choose to overwrite the existing table. When you overwrite an existing table,
the schema of the data you're loading is used to overwrite the existing table's
schema. For more information, see the documentation on overwriting a table in
a load job.

Adding a column in a load append job

Adding a column to an existing table during an append operation is not
currently supported by the GCP Console or the classic BigQuery web UI.

When you add columns using an append operation in a load job, the updated
schema can be:

Automatically detected (for CSV and JSON files)

Specified in a JSON schema file (for CSV and JSON files)

Automatically inferred from the data for Avro and Cloud Datastore export files

If you specify the schema in a JSON file, the new columns must be defined in it.
If the new column definitions are missing, the following error is returned when
you attempt to append the data: Error while reading data, error message:
parsing error in row starting at position [INT]: No such field: [FIELD].

When you add new columns during an append operation,
the values in the new columns are set to NULL for existing rows.

To add a new column when you append data to a table during a load job:

Console

You cannot add new columns to an existing table when you load data using the
console BigQuery web UI.

Classic UI

You cannot add new columns to an existing table when you load data using the
classic BigQuery web UI.

CLI

Use the bq load command to load your data and specify the --noreplace
flag to indicate that you are appending the data to an existing table.

If the data you're appending is in CSV or newline-delimited JSON format,
specify the --autodetect flag to use schema auto-detection
or supply the schema in a JSON schema file. The added columns can be
automatically inferred from Avro or Cloud Datastore export files.

Set the --schema_update_option flag to ALLOW_FIELD_ADDITION to indicate
that the data you're appending contains new columns.

If the table you're appending is in a dataset in a project other than your
default project, add the project ID to the dataset name in the following
format: [PROJECT_ID]:[DATASET].
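Putting the flags above together, the command takes roughly this shape; [SCHEMA] is supplied only when you are not using --autodetect:

```shell
bq --location=[LOCATION] load \
--noreplace \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=[FORMAT] \
[PROJECT_ID]:[DATASET].[TABLE] \
[PATH_TO_SOURCE] \
[SCHEMA]
```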

[LOCATION] is the name of your location. The --location flag is
optional if your data is in the US or the EU multi-region
location. For example, if you are using
BigQuery in the Tokyo region, set the flag's value to
asia-northeast1. You can set a default value for the location using the
.bigqueryrc file.

[FORMAT] is NEWLINE_DELIMITED_JSON, CSV, AVRO, or
DATASTORE_BACKUP.

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table.

[TABLE] is the name of the table you're appending.

[PATH_TO_SOURCE] is a fully-qualified Cloud Storage URI,
a comma-separated list of URIs, or the path to a data file on your
local machine.

[SCHEMA] is the path to a local JSON schema file. A schema file is
required only for CSV and JSON files when --autodetect is
unspecified. Avro and Cloud Datastore schemas are inferred from the
source data.

Examples:

Enter the following command to append a local Avro data file,
/tmp/mydata.avro, to mydataset.mytable using a load job. Because added
columns can be automatically inferred from Avro data, you do not need to use
the --autodetect flag. mydataset is in your default project and was
created in the US multi-region location.
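A command matching that description might look like the following sketch:

```shell
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=AVRO \
mydataset.mytable \
/tmp/mydata.avro
```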

Enter the following command to append a newline-delimited JSON data file in
Cloud Storage to mydataset.mytable using a load job. The --autodetect
flag is used to detect the new columns. mydataset is in your default project
and was created in the US multi-region location.
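A plausible rendering of that command; the Cloud Storage URI gs://mybucket/mydata.json is illustrative:

```shell
bq load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=NEWLINE_DELIMITED_JSON \
mydataset.mytable \
gs://mybucket/mydata.json
```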

Enter the following command to append a newline-delimited JSON data file in
Cloud Storage to mydataset.mytable using a load job. The schema containing
the new columns is specified in a local JSON schema file,
/tmp/myschema.json. mydataset is in myotherproject, not your default
project. mydataset was created in the US multi-region location.
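A plausible rendering of that command; the Cloud Storage URI is illustrative:

```shell
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=NEWLINE_DELIMITED_JSON \
myotherproject:mydataset.mytable \
gs://mybucket/mydata.json \
/tmp/myschema.json
```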

Enter the following command to append a newline-delimited JSON data file in
Cloud Storage to mydataset.mytable using a load job. The --autodetect
flag is used to detect the new columns. mydataset is in your default
project and was created in the asia-northeast1 region.
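A plausible rendering of that command; the Cloud Storage URI is illustrative:

```shell
bq --location=asia-northeast1 load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=NEWLINE_DELIMITED_JSON \
mydataset.mytable \
gs://mybucket/mydata.json
```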
Adding a column in a query append job

When you add columns using an append operation in a query job, set the
--schema_update_option flag to ALLOW_FIELD_ADDITION to indicate that the
query results you're appending contain new columns.

[LOCATION] is the name of your location. The --location flag is
optional if your data is in the US or the EU multi-region
location. For example, if you are using
BigQuery in the Tokyo region, set the flag's value to
asia-northeast1. You can set a default value for the location using the
.bigqueryrc file.

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table you're
appending.

[TABLE] is the name of the table you're appending.

[QUERY] is a query in standard SQL syntax.
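Combining the parameters above with the flag descriptions used elsewhere in this document, the command takes roughly this shape (a sketch):

```shell
bq --location=[LOCATION] query \
--destination_table [PROJECT_ID]:[DATASET].[TABLE] \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'[QUERY]'
```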

Examples:

Enter the following command to query mydataset.mytable in your default
project and append the query results to mydataset.mytable2 (also in
your default project). mydataset was created in the US multi-region
location.
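One plausible rendering of that command; the column names in the SELECT list are illustrative:

```shell
bq query \
--destination_table mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'SELECT column1, column2 FROM mydataset.mytable'
```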

Enter the following command to query mydataset.mytable in your default
project and append the query results to mydataset.mytable2 in
myotherproject. mydataset was created in the US multi-region location.
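One plausible rendering of that command; the column names in the SELECT list are illustrative:

```shell
bq query \
--destination_table myotherproject:mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'SELECT column1, column2 FROM mydataset.mytable'
```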

Enter the following command to query mydataset.mytable in your default
project and append the query results to mydataset.mytable2 (also in
your default project). mydataset was created in the asia-northeast1
region.
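One plausible rendering of that command; the column names in the SELECT list are illustrative:

```shell
bq --location=asia-northeast1 query \
--destination_table mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'SELECT column1, column2 FROM mydataset.mytable'
```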

Relaxing a column's mode

Currently, the only supported modification you can make to a column's mode is
changing it from REQUIRED to NULLABLE. Changing a column's mode from
REQUIRED to NULLABLE is also called column relaxation. You can relax
REQUIRED columns manually, when you append data to a table in a load job,
or when you append query results to a table.

Console

You cannot currently relax a column's mode using the console BigQuery web UI.

Classic UI

Expand your dataset and select your table.

On the Table Details page, click the Schema tab.

Click the down arrow to the right of your required column and choose
either:

Make NULLABLE — relaxes the individual column's mode

All REQUIRED to NULLABLE — changes all REQUIRED columns
in the schema definition to NULLABLE

In the Confirm Mode Change dialog, click OK to change the mode to
NULLABLE. Note that this change cannot be undone.

CLI

First, issue the bq show command with the --schema flag and write the
existing table schema to a file. If the table you're updating is in a
project other than your default project, add the project ID to the
dataset name in the following format: [PROJECT_ID]:[DATASET].
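The bq show step can be sketched as follows, where [SCHEMA_FILE] is a path of your choosing, such as /tmp/myschema.json:

```shell
bq show \
--schema \
--format=prettyjson \
[PROJECT_ID]:[DATASET].[TABLE] > [SCHEMA_FILE]
```

Then, in the schema file, change "mode": "REQUIRED" to "mode": "NULLABLE" for each column you want to relax.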

After updating your schema file, issue the following command to update
the table's schema. If the table you're updating is in a project other than
your default project, add the project ID to the dataset name in the
following format: [PROJECT_ID]:[DATASET].

bq update [PROJECT_ID]:[DATASET].[TABLE] [SCHEMA]

Where:

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table you're
updating.

[TABLE] is the name of the table you're updating.

[SCHEMA] is the path to the JSON schema file on your local machine.

For example, enter the following command to update the schema definition of
mydataset.mytable in your default project. The path to the schema file
on your local machine is /tmp/myschema.json.

bq update mydataset.mytable /tmp/myschema.json

API

Call tables.patch and
use the schema property to change a REQUIRED column to NULLABLE in
your schema definition. Because the tables.update method replaces the
entire table resource, the tables.patch method is preferred.

Changing REQUIRED to NULLABLE in a load or query job

You can relax REQUIRED columns to NULLABLE in an existing table's schema
when you load data into it and choose to overwrite the existing table. When you
overwrite an existing table, the schema of the data you're loading is used to
overwrite the existing table's schema. For more information, see the
documentation on overwriting a table in a load job.

Note: Column relaxation does not apply to Cloud Datastore export appends. The
columns in tables created by loading Cloud Datastore export files are always
NULLABLE.

To relax a column from REQUIRED to NULLABLE when you append data to a table
during a load job:

Console

You cannot currently relax a column's mode using the console BigQuery web UI.

Classic UI

You cannot relax an existing column's mode when you append data to a table
in a load job using the classic BigQuery web UI.

CLI

Use the bq load command to load your data and specify the --noreplace
flag to indicate that you are appending the data to an existing table.

If the data you're appending is in CSV or newline-delimited JSON format,
specify the relaxed columns in a local JSON schema file or use the
--autodetect flag to use schema detection
to discover relaxed columns in the source data. For information on relaxing
column modes using a JSON schema file, see
Manually changing REQUIRED columns to NULLABLE.

Relaxed columns can be automatically inferred from Avro files. Column
relaxation does not apply to Cloud Datastore export appends. The columns in
tables created by loading Cloud Datastore export files are always
NULLABLE.

Set the --schema_update_option flag to ALLOW_FIELD_RELAXATION to
indicate that the data you're appending contains relaxed columns.

If the table you're appending is in a dataset in a project other than your
default project, add the project ID to the dataset name in the following
format: [PROJECT_ID]:[DATASET].
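Putting the flags above together, the command takes roughly this shape; [SCHEMA] is supplied only for CSV and JSON files when you are not using --autodetect:

```shell
bq --location=[LOCATION] load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=[FORMAT] \
[PROJECT_ID]:[DATASET].[TABLE] \
[PATH_TO_SOURCE] \
[SCHEMA]
```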

[LOCATION] is the name of your location. The --location flag is
optional if your data is in the US or the EU multi-region
location. For example, if you are using
BigQuery in the Tokyo region, set the flag's value to
asia-northeast1. You can set a default value for the location using the
.bigqueryrc file.

[FORMAT] is NEWLINE_DELIMITED_JSON, CSV, or AVRO.
DATASTORE_BACKUP files do not require column relaxation. The columns
in tables created from Cloud Datastore export files are always
NULLABLE.

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table.

[TABLE] is the name of the table you're appending.

[PATH_TO_SOURCE] is a fully-qualified Cloud Storage URI,
a comma-separated list of URIs, or the path to a data file on your
local machine.

[SCHEMA] is the path to a local JSON schema file. This option is
used only for CSV and JSON files. Relaxed columns are automatically
inferred from Avro files.

Examples:

Enter the following command to append a local Avro data file,
/tmp/mydata.avro, to mydataset.mytable using a load job. Because relaxed
columns can be automatically inferred from Avro data, you do not need to
specify a schema file. mydataset is in your default project and was created
in the US multi-region location.
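A command matching that description might look like the following sketch:

```shell
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=AVRO \
mydataset.mytable \
/tmp/mydata.avro
```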

Enter the following command to append data from a newline-delimited JSON file
in Cloud Storage to mydataset.mytable using a load job. The schema
containing the relaxed columns is in a local JSON schema file —
/tmp/myschema.json. mydataset is in your default project and was created
in the US multi-region location.
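A plausible rendering of that command; the Cloud Storage URI gs://mybucket/mydata.json is illustrative:

```shell
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=NEWLINE_DELIMITED_JSON \
mydataset.mytable \
gs://mybucket/mydata.json \
/tmp/myschema.json
```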

Enter the following command to append data in a CSV file on your local machine
to mydataset.mytable using a load job. The command uses schema auto-detection
to discover relaxed columns in the source data. mydataset is
in myotherproject, not your default project. mydataset was created in
the US multi-region location.
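A plausible rendering of that command; the local file name mydata.csv is illustrative:

```shell
bq load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=CSV \
myotherproject:mydataset.mytable \
mydata.csv
```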

Enter the following command to append data from a newline-delimited JSON file
in Cloud Storage to mydataset.mytable using a load job. The schema
containing the relaxed columns is in a local JSON schema file —
/tmp/myschema.json. mydataset is in your default project and was created
in the asia-northeast1 region.
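A plausible rendering of that command; the Cloud Storage URI is illustrative:

```shell
bq --location=asia-northeast1 load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=NEWLINE_DELIMITED_JSON \
mydataset.mytable \
gs://mybucket/mydata.json \
/tmp/myschema.json
```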

Changing REQUIRED to NULLABLE in a query append job

Relaxing columns during an append operation is not currently supported by
the GCP Console or the classic BigQuery web UI.

When you relax columns using an append operation in a query job, you can relax
all required fields in the destination table by setting the
--schema_update_option flag to ALLOW_FIELD_RELAXATION. You cannot relax
individual columns in a destination table using a query append.

To relax all columns in a destination table when you append data to it during a
query job:

Console

You cannot currently relax a column's mode using the console BigQuery web UI.

Classic UI

You cannot relax columns in a destination table when you append query
results using the classic BigQuery web UI.

CLI

Use the bq query command to query your data and specify the
--destination_table flag to indicate which table you're appending.

To specify that you are appending query results to an existing destination
table, specify the --append_table flag.

Set the --schema_update_option flag to ALLOW_FIELD_RELAXATION to indicate
that all REQUIRED columns in the table you're appending should be changed
to NULLABLE.

Specify the use_legacy_sql=false flag to use standard SQL syntax for the
query.

If the table you're appending is in a dataset in a project other than your
default project, add the project ID to the dataset name in the following
format: [PROJECT_ID]:[DATASET].

[LOCATION] is the name of your location. The --location flag is
optional if your data is in the US or the EU multi-region
location. For example, if you are using
BigQuery in the Tokyo region, set the flag's value to
asia-northeast1. You can set a default value for the location using the
.bigqueryrc file.

[PROJECT_ID] is your project ID.

[DATASET] is the name of the dataset that contains the table you're
appending.

[TABLE] is the name of the table you're appending.

[QUERY] is a query in standard SQL syntax.
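Combining the flags and parameters above, the command takes roughly this shape (a sketch):

```shell
bq --location=[LOCATION] query \
--destination_table [PROJECT_ID]:[DATASET].[TABLE] \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'[QUERY]'
```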

Examples:

Enter the following command to query mydataset.mytable in your default
project and append the query results to mydataset.mytable2 (also in
your default project). The command changes all REQUIRED columns in the
destination table to NULLABLE. mydataset was created in the US
multi-region location.
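One plausible rendering of that command; the column names in the SELECT list are illustrative:

```shell
bq query \
--destination_table mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'SELECT column1, column2 FROM mydataset.mytable'
```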

Enter the following command to query mydataset.mytable in your default
project and append the query results to mydataset.mytable2 in
myotherproject. The command changes all REQUIRED columns in the
destination table to NULLABLE. mydataset was created in the US
multi-region location.
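One plausible rendering of that command; the column names in the SELECT list are illustrative:

```shell
bq query \
--destination_table myotherproject:mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'SELECT column1, column2 FROM mydataset.mytable'
```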

Enter the following command to query mydataset.mytable in your default
project and append the query results to mydataset.mytable2 (also in
your default project). The command changes all REQUIRED columns in the
destination table to NULLABLE. mydataset was created in the
asia-northeast1 region.
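One plausible rendering of that command; the column names in the SELECT list are illustrative:

```shell
bq --location=asia-northeast1 query \
--destination_table mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'SELECT column1, column2 FROM mydataset.mytable'
```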