Overview

The main way to interact with your BaseSpace Sequence Hub data is through the website at www.basespace.illumina.com. However, for some use cases, it can be useful to work with the same data using the Linux command line interface (CLI).

With BaseMount, we introduced a way to mount your BaseSpace Sequence Hub files and explore them on the command line as if they were a file system. Now, we are taking this a step further by introducing a suite of tools, the BaseSpace Sequence Hub Command Line Interface (BaseSpaceCLI), to both read data from your BaseSpace Sequence Hub account and create new data by uploading samples and launching apps. These tools integrate with BaseMount and provide a way to carry out many routine BaseSpace Sequence Hub tasks efficiently at the command line.

MacOS (Yosemite and El Capitan)

Other Unix platforms

The tarball used for the Mac installation should work on any Unix platform that supports Python, but this is unsupported so use at your own risk.

Demonstration Videos

As well as this documentation, there are also video demos of BaseSpaceCLI functionality.

Authentication

Before the CLI tools can be used, an authentication token must be obtained from BaseSpace Sequence Hub. This token will be used each time the CLI tools contact the BaseSpace Sequence Hub API. If you have already used BaseMount, a token will already be present that can also be used by BaseSpaceCLI and you can skip this step.

To authenticate, type bs authenticate at the terminal. BaseSpaceCLI supplies you with a URL. Copy and paste this URL into your web browser and log in to BaseSpace Sequence Hub as normal. Then click the "Accept" button to allow the CLI to access your BaseSpace Sequence Hub account. Once you click "Accept", the word "success!" should appear in your terminal to show that BaseSpaceCLI is now authenticated against your account.

Unlike BaseMount, BaseSpaceCLI does not support access token encryption, because encryption would require you to enter your password after every BaseSpaceCLI command. If you have already authenticated through BaseMount with an encrypted access token, BaseSpaceCLI will ask you to reauthenticate to get an unencrypted access token.

App Launch Access Tokens

BaseSpace needed a platform update to support the generation of access tokens which permit app launch. If you have an older access token, such as one generated when BaseMount was originally released, this token may give an error message if you try to launch an app with the CLI. In this case, you should delete your access token (rm ~/.basespace/default.cfg) and reauthenticate.

You can also authenticate against an alternative BaseSpace instance (e.g. a BaseSpace Onsite System) using the --api-server option:

$ bs --config hoth authenticate --api-server https://api.cloud-hoth.illumina.com/
# use "bs --config hoth" for successive commands to make use of this token

When deriving an access token, the scopes define the permissions a token has to access, create and manipulate BaseSpace data.
By default, bs authenticate requests a fairly broad set of scopes that works well with the supported operations, but sometimes it can be desirable to request a specific set of scopes.
You can achieve this by:

$ bs --config guest --scopes="BROWSE GLOBAL,READ GLOBAL"

Wrapping

All the BaseSpace CLI tools can be accessed through a single command: bs. This command uses the same model as git and similar tools to access the underlying functionality, so in place of git push, git pull, git commit, the BaseSpace CLI has bs launch, bs list, bs upload and so on.

Usage and available commands

To see a usage/help message and the tools that can be executed through bs, run it with the --help option:

$ bs --help
usage: bs [options] <COMMAND> [cmd-options]
BaseSpace's Command Line Interface (CLI).
The BaseSpaceCLI tool suite is a set of command line tools for interacting with BaseSpace,
Illumina's cloud-based sequencing informatics platform.
Optional arguments:
  -v, --verbose               Increase verbosity of output. Can be repeated.
  --log-file LOG_FILE         Specify a file to log output. Disabled by default.
  -q, --quiet                 Suppress output except warnings and errors.
  -h, --help                  Show help message and exit.
  --debug                     Show tracebacks on errors.
  -V, --version               Show program's version number and exit.
  --dry-run                   Rehearsing COMMAND, without actually running it.
  -c CONFIG, --config CONFIG  Configuration id, to be used to access: ~/.basespace/<CONFIG>.cfg
  --terse                     Output relevant BaseSpace IDs to stdout, but nothing else; warnings
                              and errors still appear on stderr
Commands:
  Credentials:
    authenticate     obtain credentials for the BaseSpaceCLI tools to use
    whoami           get information about selected config
  Creating and listing:
    create project   create a project
    history          Get the event history for a user
    list             list BaseSpace entities
    upload sample    upload a FASTQ-based sample into a BaseSpace project.
  Apps:
    import app       import a new app for launch
    kill appsession  abort running appsessions.
    launch app       launch an app.
  Filesystem:
    mount            access BaseSpace as a filesystem.
    unmount          stop accessing BaseSpace as a filesystem.
  Configuration:
    register         register a BaseSpaceCLI tool
    unregister       delete one or more BaseSpaceCLI tool from the registry
See 'bs help COMMAND' for more information on a specific command.

Options

General Options

General options are shown at the top of the help message for the top-level bs command. These are as follows:

Verbose (-v), quiet (-q) and debug (--debug) control the level of output and the log-file option (--log-file) allows this output to be written to a file

The help (-h) and version (-V) options provide information about the bs command itself

The dry-run mode (--dry-run) is used as a cue by the underlying tools to report what they would do without actually doing it. This can also be useful when used with the verbose mode to see how the tools are interacting with the BaseSpace API.

The config option (-c) allows the user to select a different configuration to be used. More information can be found in selecting configurations.

The terse option (--terse) outputs only relevant BaseSpace IDs and nothing else, to assist with scripting bs commands together. Some examples of how the terse option is interpreted:

When listing entities (samples, projects, appresults) only the ID is returned with no header or other output

When launching apps, only the appsession ID is returned

When uploading samples, only the sample ID is returned

Specific Options

Running an underlying command with the --help option shows the usage message for that command. The example below shows the help for the launch command:

Selecting Configurations

The bs command requires some configuration information to provide the BaseSpace API location and the access token that should be used for API calls. By default, commands launched with bs use $HOME/.basespace/default.cfg, but a different configuration can be selected; for example, --config other selects the configuration file $HOME/.basespace/other.cfg. Each config file is also paired with its own app specification file for app launch.

Having separate configurations like this is useful if you have multiple BaseSpace users or work with multiple BaseSpace instances, which might also have different app details, because you can conveniently switch between them.

These configuration files are shared with BaseMount, so if you are already a BaseMount user you may already have configurations set up, or you can set them up with BaseSpaceCLI and they are automatically available to BaseMount.

Verb-noun

The bs commands are all provided in the form bs <verb> <noun>, e.g. bs launch app and bs upload sample. The bs command also allows this ordering to be reversed, so that you can specify bs <noun> <verb>, e.g. bs app launch and bs sample upload. This helps with command discovery, since you can type bs sample and see all the operations that can be performed on samples.

Auto Completion

The bs command supports tab-completion, so you can type bs upl<tab> and it will autocomplete to bs upload.

Adding Additional Tools

Additional tools can be added to BaseSpaceCLI and will immediately be accessible through the bs wrapping command. Tools are added with bs register and removed with bs unregister.

Exit Codes

The BaseSpaceCLI tools follow the convention of issuing exit codes to indicate whether a command was successful or not. The exit code will be 0 if the command was successful and non-zero otherwise. Some examples of commands that might fail are improperly formatted commands or those where the BaseSpace API returns an error, such as if a sample upload or app launch fails.

As with all shell commands, the exit code is stored in the $? variable directly after the command has been run.
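The exit code convention above can be used to stop a script as soon as a step fails. The following is a minimal sketch in plain shell; run_checked is a helper name introduced here for illustration and is not part of BaseSpaceCLI. It works with any command, so it is demonstrated with true and false.

```shell
#!/bin/sh
# Run a command, report success or failure from its exit code,
# and pass that exit code back to the caller.
# run_checked is a hypothetical helper, not a BaseSpaceCLI command.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok: $*"
    else
        echo "failed (exit code $status): $*" >&2
    fi
    return "$status"
}

# Demonstration with standard commands:
run_checked true
run_checked false || echo "stopping early"

# With BaseSpaceCLI, the same pattern could wrap a bs call, for example:
#   run_checked bs list appsessions || exit 1
```

Returning the original status, rather than a fixed 1, keeps whatever non-zero code the underlying command reported available to the calling script.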

Launching an App

The bs launch app command is a tool to launch BaseSpace apps from the command line.

Basic Usage

Apps can only be launched if the app has been "imported", which means setting up a specification of how the app should be launched. Specifications for the BWA, Isaac, TopHat and Cufflinks apps come as standard:

If you run the same command with the --all-apps option, all the apps in BaseSpace are listed including a column for whether they are currently launchable with the CLI.

The parameters entry shows what you need to pass on the command line to launch the app. Each argument is listed in order, with the name (e.g. project-id) and then the type in brackets. If a type has a [] suffix, this means multiple arguments can be supplied either via comma separation or by providing a file (prefixed with an @ sign in the same way as curl) with one entry per line. Further examples are provided in the bulk launches section.
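Both forms of a multi-valued argument can be prepared with standard shell tools. The sketch below is illustrative only: join_ids is a helper name introduced here, the IDs are invented, and the angle-bracket placeholders in the final comment stand for values that depend on the app being launched.

```shell
#!/bin/sh
# Join newline-separated IDs from stdin into one comma-separated string.
# join_ids is a hypothetical helper, not a BaseSpaceCLI command.
join_ids() {
    paste -sd, -
}

# Write some made-up sample IDs to a file, one entry per line.
ids_file=$(mktemp)
printf '%s\n' 25385372 25385373 25385374 > "$ids_file"

# Form 1: a single comma-separated argument.
join_ids < "$ids_file"

# Form 2: the file itself, prefixed with an @ sign.
echo "@$ids_file"

rm -f "$ids_file"

# Either string would then take the place of the [] argument, e.g.
#   bs launch app -n <app-name> <project> "$(join_ids < ids.txt)"
#   bs launch app -n <app-name> <project> @ids.txt
```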

Note that each specification also provides a source. This is either preset to denote the pre-made specifications that ship with BaseSpace CLI or a file containing user-derived specifications. More detail of how these specifications can be derived is provided in "Importing a New App".

You can also see the specification for a specific app by name or by BaseSpace ID:

The description of the app shows the mandatory arguments (those that must be supplied) in order. Launch arguments can be specified as BaseMount paths (recommended) or as BaseSpace IDs, e.g.

# assumes your BaseMount path is in $BASEMOUNT
# when launching with BaseMount, the supplied paths must be of the proper type and in the correct order
# in the case of BWA, a project path followed by a sample path
# the returned string is the name of the launch which can be found through the basespace.com web page or with "bs list appsessions"
$ bs launch app -n BWA $BASEMOUNT/Projects/MyProject $BASEMOUNT/Projects/MyProject/Samples/MySample/
BWA Whole Genome Sequencing v1.0 : MySample
# the same launch, but done with BaseSpace IDs for app, project and sample respectively
$ bs launch app -i 279279 21646627 25385372
BWA Whole Genome Sequencing v1.0 : MySample
# Improper arguments cause an error to be raised. In this example, the arguments are in the wrong order.
$ bs launch app -n BWA $BASEMOUNT/Projects/MyProject/Samples/MySample/ $BASEMOUNT/Projects/MyProject
wrong type of BaseMount path selected: $BASEMOUNT/Projects/MyProject/Samples/MySample needs to be of type project

Project target

A project is required for every app launch. This is where the app data will be written; it does not have to be the same project where input samples or appresults come from. Your access token must have write access to the project.

Access token consistency

If you use multiple configuration files for different BaseSpace accounts, you need to make sure that any bs commands refer to a BaseMount path that is mounted with the same access token as the one referred to in the bs command. An error will be thrown if you try to mix access tokens.

Incorrect configuration options

If you make an error in specifying your options, such as specifying the genome-id as one that is misspelled, the error message you will get from BaseSpace is "Error with API server response: Conflict: Form validation found some errors."

Setting Options

Most BaseSpace apps have additional options that can be specified in the web form at launch. For the command line tool, each of these options has a default value that is set when the app is imported. These can be seen by listing the apps in verbose mode:

Unlike a web launch, you cannot specify sample attributes like strandedness for samples on an individual basis - for example you cannot have half the samples in an app stranded and the other half not stranded.

Discovering possible option values

For fields that would usually be selected with a drop-down, like genome-id, there is currently no way to see from BaseSpaceCLI what those options should be. You need to visit the BaseSpace web page and look at the app launch form to find the possible values.

Importing a New App

To launch an app which is currently not listed, use the "import app" command. This can be used in several ways.

One method is to specify a BaseMount path to an existing AppSession. The import app command will use the details of the AppSession to derive a new app specification:

# this command will return nothing by default. You can view the new app with "bs list apps"
$ bs import app -m $BASEMOUNT/Projects/MyProject/AppSessions/MyAppSession/

When importing an app from an AppSession, it is best to choose an AppSession where as many options as possible have been chosen. If, for example, an app launch has been made with the "Call Novel Transcripts" option unchecked, this option will not appear in the AppSession and will therefore not appear in the app launch specification.

A similar method is to provide an AppSession ID, which you can obtain using the BaseSpace web interface or using bs list appsessions. As with a BaseMount path, the import app command uses the AppSession details to derive the app specification. This is a useful method if you do not have BaseMount installed:

$ bs import app -a 24780760

You can also specify the details of the app yourself. This is an advanced and error-prone technique, but is provided for completeness. It requires the properties (-p), defaults (-e), app name (-n) and app ID (-i). Both properties and defaults are specified as json files - to see the format of these files, look at an existing app specification file.

Finally, you can also import app specifications from an existing app specification file. This facilitates the sharing of app launch specifications between users:

$ bs import app -j input-apps.json

The new app is stored in the user's $HOME directory under .basespace/-apps.json. The app specification files are shown when listing apps with bs list apps:

This allows you to find the file so that you can send it to other users, who can then import the apps with bs import app -j.

Note that the app specifications are paired with a configuration file, so that you can have one set of app specifications for each configuration - this is useful if you work with multiple BaseSpace instances, which might have different app configurations. More details about these configuration files are found in "Multiple Configurations".

If you import an app that is already part of the presets, this will override the preset. This can be useful, for example, if you want to have a different set of defaults to those provided in the preset configuration.

You can also manually rename the app that you import using the -n switch. This is needed to prevent name collisions when importing a different version of the same app.

Bulk Launches

A key advantage of programmatic app launch is the ability to launch batches of apps. The "app launch" command has been designed to take advantage of standard features of the Unix shell to enable a number of bulk launch mechanisms.

There are more examples of using bs commands with shell features in the recipes section.

Multiple Configurations

The bs wrapper command allows users to specify a configuration to be used for the underlying commands. This configuration also selects a set of local app specifications to be used. The app specifications are stored in the user's .basespace directory, adjacent to the configuration file:

# will use app specifications from the file $HOME/.basespace/other-apps.json
$ bs -c other list apps

These additional app specifications will be created and used automatically. In the case of an app specification that is present in both the local and the preset file, the local app specification will override the behaviour of the preset.

Apps and Versions

The app specifications provided with BaseSpace CLI are for specific versions of each app. If another version of the app is released, we cannot guarantee that the specification will work with this new version. Therefore, even if a new version exists, the specification file will remain tied to the older version. If you want to derive an app launch specification for the new version of the app, you can do so by importing an appsession based on launching an app with this new version.

Waiting for apps to finish

To facilitate the chaining of apps, BaseSpace CLI comes with a bs wait command, to wait for an app to finish and then derive the output produced by an app. This can then be used to parameterize a downstream app launch.

Basic Usage

The bs wait command accepts as arguments one or more appsessions and will then wait for these appsessions to finish, polling based on a specified interval (default 60 seconds). Once they have all finished, bs wait returns the appresults that have been generated by the provided appsessions. The default output of bs wait is a tabular format to maintain consistency with bs list.

If the AppSessions provided have already finished, bs wait returns immediately with the above output.

If any of the provided AppSessions have failed (reached an Error or Aborted state) bs wait returns an error. This behaviour can be adjusted using the --ignore-failed switch.

Options

Most options of bs wait are in common with bs list, in particular --terse (to return just the IDs for the AppSession outputs) and the -f formatting options. There are two additional options specific to bs wait:

Polling interval (-i/--interval) - select how often bs wait calls the API to find out if the AppSession has finished

Ignore failed AppSessions (--ignore-failed) - rather than fail if any provided AppSessions have failed, continue work on the remainder

Chaining Example

Below is a script that uses bs wait with bs launch app to implement a chain of TopHat and Cufflinks apps. The script starts by launching TopHat apps on samples based on a filtered bs list command, storing the ID of each appsession in a file. It then uses bs launch to launch the Cufflinks apps, embedding bs wait commands on the IDs stored in the files used in the previous steps. Because files are used to store the AppSession IDs, this chain can be resumed if the script is interrupted for any reason by just rerunning the script.
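The resumable part of this approach, storing each ID in a file and reusing it on a rerun, can be sketched independently of BaseSpace. The helper below is an illustration under stated assumptions, not the script described above: run_once is a name introduced here, and echo stands in for the launch command that would print an AppSession ID in terse mode.

```shell
#!/bin/sh
# Run a command once and keep its output in a file. On a rerun the stored
# value is printed instead, so finished steps are not repeated.
# run_once is a hypothetical helper, not a BaseSpaceCLI command.
run_once() {
    cache_file=$1
    shift
    if [ ! -s "$cache_file" ]; then
        # Write to a temporary file first so that a failed command
        # does not leave a stored result behind.
        if "$@" > "$cache_file.tmp"; then
            mv "$cache_file.tmp" "$cache_file"
        else
            rm -f "$cache_file.tmp"
            return 1
        fi
    fi
    cat "$cache_file"
}

# Demonstration: echo plays the role of a launch that prints an ID.
workdir=$(mktemp -d)
run_once "$workdir/step1.id" echo 12345   # runs the command and stores the ID
run_once "$workdir/step1.id" echo 99999   # reuses the stored ID from the first call
rm -rf "$workdir"
```

A step whose command prints nothing is treated as not yet done and is run again, which suits commands that are expected to return an ID.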

Read 2 would have a 2 in the ReadNum field, like this:
@M00900:62:000000000-A2CYG:1:1101:18016:2491 2:N:0:13

Quality considerations

The number of base calls for each read must equal the number of quality scores

The number of entries for Read 1 must equal the number of entries for Read 2

The uploader will determine if files are paired-end based on the matching file names in which the only difference is the ReadNum

For paired-end reads, the descriptor must match for every entry for both reads 1 and 2

Each read has passed filter

These validation rules can also be displayed from BaseSpaceCLI with the --show-validation-rules option.

Switching off validation

It is possible to relax some of the validation rules to allow a wider variety of fastq files to be uploaded:

Switch off read name validation (--allow-invalid-readnames), so that READ1 and READ2 descriptors can be of any form.

Invalid Fastq Files

The fastq validation rules are in place to ensure the smoothest possible compatibility between uploaded samples and apps. Uploading files where not all validation rules are met may cause certain apps to fail; these options are used at the user's own risk.

Upload Options

The only mandatory configuration switch for sample upload is the BaseSpace project which should be used as the destination (-p). This can be the project ID, the project name or a BaseMount path. The upload tool will check whether the specified token has access to a project with the specified name and issue an error if there are problems.

Bulk Upload

One advantage of uploading fastq files using the command line tool is that it is possible to script the bulk upload of many samples. This section provides some examples of using the standard features of the Unix shell to upload large numbers of samples with a handful of commands.

Using SampleSheet.csv

If your fastq files have been generated using BclToFastq, you will probably have a sample sheet you can use to extract sample names, which facilitates grouping batches of fastq files together:

The code for this is simpler than using a SampleSheet.csv, but it will upload everything from the directory, including (for example) the "Undetermined" sample as derived by a demultiplexing run.

List Entities

As well as functionality to upload samples and launch apps, BaseSpaceCLI includes a group of commands to list the entities present in your BaseSpace account. This functionality overlaps with what is provided by BaseMount but is included to provide another way to access your data, including environments where BaseMount is not available.

Basic Usage

The entity listing commands allow users to view projects, samples, appsessions and appresults. Most of the functionality is common to each entity with a few commands having separate options.

Static Entities

The following commands show the listing of projects, samples, and appresults. Each lists the ID and name.

AppSessions

AppSessions differ slightly from the other entities because they are of most relevance while analysis is still underway. The bs list appsessions command allows you to see details of appsessions based on their status. This is analogous to monitoring tools from High-Performance Computing queuing systems, for example the qstat tool for Sun Grid Engine.

Using Shell Features

Where BaseSpaceCLI requires a single ID, such as a project ID, a bs list command that returns a single ID can be captured with backticks or with $(command substitution) and placed directly on the command line. Where many IDs are required, such as for app launch, a command that returns newline-separated IDs can similarly be enclosed in a <(process substitution).
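When a command substitution is expected to yield exactly one ID, it can help to verify that before using the value. The following is a minimal sketch: single_id is a helper name introduced here, and printf stands in for a bs list command run in terse mode with an invented ID.

```shell
#!/bin/sh
# Read IDs from stdin and print the ID only if exactly one was supplied.
# single_id is a hypothetical helper, not a BaseSpaceCLI command.
single_id() {
    ids=$(cat)
    # Count non-empty lines; "|| true" keeps a count of 0 from being an error.
    count=$(printf '%s\n' "$ids" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one ID, got $count" >&2
        return 1
    fi
    printf '%s\n' "$ids"
}

# Demonstration: printf plays the role of a terse listing command.
project_id=$(printf '%s\n' 21646627 | single_id) && echo "project: $project_id"
```

If the listing matches zero or several entities, the assignment fails and the script can stop instead of passing an empty or multi-line value to the next command.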

Recipes

Complex Filtering

The bs list commands can be used in combination to build lists of IDs for input to other bs commands, such as app launch. In some cases, using name filtering along with --terse will be sufficient to achieve this. However, in other cases you can also use the usual combination of pipes, grep and cut to pull out what is needed:
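As a sketch of the grep and cut combination, the helper below selects rows by name and keeps one column. The input layout is an assumption made for illustration (tab-separated, ID in the first column, invented values); check the actual bs list output and adjust the cut options to match. ids_matching is a helper name introduced here.

```shell
#!/bin/sh
# Print the first column of every row whose text matches a pattern.
# Assumes tab-separated columns with the ID first (an assumption, see above).
# ids_matching is a hypothetical helper, not a BaseSpaceCLI command.
ids_matching() {
    grep -- "$1" | cut -f 1
}

# Demonstration with invented rows in place of real listing output.
printf '101\tTumour_A\n102\tNormal_A\n103\tTumour_B\n' | ids_matching Tumour
```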

BaseSpace Copy

Copying BaseSpace data is possible in BaseSpaceCLI, via the bs cp tool.
This tool copies data to and from BaseSpace instances as well as the local file system.
It has been designed to copy robustly even with high latency or low bandwidth connections, and it will resume downloads if needed.

URIs

bs cp uses Uniform Resource Identifiers (URIs) to select the source and destination locations of your data.
Multiple schemes are available for different methods of authenticating to BaseSpace:

conf://[name]/

Use the supplied configuration with a given name.
If the name is not provided then use 'default'.
These configuration files are created by the BaseSpace CLI.
For example, conf://server/ maps to ~/.basespace/server.cfg
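The mapping from a conf:// URI to a configuration file can be written out as a small shell function. This is only an illustration of the rule stated above, not how bs cp resolves URIs internally; conf_uri_to_cfg is a name introduced here.

```shell
#!/bin/sh
# Print the configuration file that a conf:// URI refers to.
# conf_uri_to_cfg is a hypothetical helper, not a BaseSpaceCLI command.
conf_uri_to_cfg() {
    case $1 in
        conf://*) ;;
        *) echo "not a conf:// URI: $1" >&2; return 1 ;;
    esac
    rest=${1#conf://}   # drop the scheme
    name=${rest%%/*}    # keep the text before the first slash
    # An empty name means the default configuration.
    echo "$HOME/.basespace/${name:-default}.cfg"
}

conf_uri_to_cfg conf://server/               # the "server" configuration
conf_uri_to_cfg conf:///Projects/MyUploads   # no name given, so "default"
```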

env:///

Use the environment variables BASESPACE_API_SERVER and BASESPACE_ACCESS_TOKEN.
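Before relying on env:/// in a script, it can be worth checking that both variables are set. The following is a minimal sketch: require_basespace_env is a helper name introduced here, and the exported values are placeholders rather than working credentials.

```shell
#!/bin/sh
# Succeed only if both variables used by env:/// URIs are set and non-empty.
# require_basespace_env is a hypothetical helper, not a BaseSpaceCLI command.
require_basespace_env() {
    if [ -z "${BASESPACE_API_SERVER:-}" ]; then
        echo "missing environment variable: BASESPACE_API_SERVER" >&2
        return 1
    fi
    if [ -z "${BASESPACE_ACCESS_TOKEN:-}" ]; then
        echo "missing environment variable: BASESPACE_ACCESS_TOKEN" >&2
        return 1
    fi
}

# Placeholder values for illustration; substitute your own server and token.
export BASESPACE_API_SERVER=https://api.basespace.illumina.com
export BASESPACE_ACCESS_TOKEN=replace-with-your-token

require_basespace_env && echo "environment ready for env:/// URIs"
```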

http[s]://[token@]hostname/

Used for interactive authentication: authenticate directly to a given API server URL.
A special case is made for api.basespace.illumina.com where you can simply use basespace.illumina.com.
If a token is supplied then use that for authentication.

Examples

Copy the local directory dataset to the BaseSpace project MyUploads with the AppResult name MyRun1 using environment variables

$ bs cp -v dataset env:///Projects/MyUploads/AppResult/MyRun1

Copy the local directory dataset to the BaseSpace project MyUploads with the AppResult name MyRun1 using the default configuration file (which must already have been created by BaseSpaceCLI)

$ bs cp -v dataset conf:///Projects/MyUploads/AppResult/MyRun1

Copy the local directory dataset to the BaseSpace project MyUploads with the AppResult name MyRun1 using interactive authentication