The Run Monitoring features in BaseSpace™ Sequence Hub (BSSH) enable users to remotely monitor the quality of their sequencing runs and troubleshoot sequencing errors. As part of our efforts to extend real-time Run Monitoring capabilities, we recently released new data quality metrics in BSSH.

% Occupancy for iSeq™ and MiniSeq™ instruments

In a previous release, we added the % Occupied measure to the Charts section of Run Monitoring for NovaSeq™ systems. With this release, the metric is also visible for iSeq and MiniSeq systems in BaseSpace Sequence Hub. This measure can be used to understand loading concentrations on the flow cell.

For patterned and non-patterned flow cells, % Occupancy is the percentage of possible clusters on the flow cell that contain DNA that can ultimately be sequenced. With patterned flow cells (such as iSeq), the number of nanowells on the patterned grid determines the total number of possible clusters. For non-patterned flow cells (such as MiniSeq), the total number of possible clusters is the number of non-duplicated spots identified by Real Time Analysis (RTA) during template generation.
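The definition above reduces to a simple ratio. The following Python sketch illustrates it under stated assumptions: the function name, parameter names, and input numbers are hypothetical and are not taken from BaseSpace Sequence Hub or RTA.

```python
def percent_occupied(occupied: int, total_possible: int) -> float:
    """Return occupied clusters as a percentage of all possible clusters.

    total_possible is the nanowell count for a patterned flow cell, or the
    number of non-duplicated spots found during template generation for a
    non-patterned flow cell (both are assumed inputs here).
    """
    if total_possible <= 0:
        raise ValueError("total_possible must be positive")
    if not 0 <= occupied <= total_possible:
        raise ValueError("occupied must be between 0 and total_possible")
    return 100.0 * occupied / total_possible


# Hypothetical counts, for illustration only.
example = percent_occupied(occupied=850_000, total_possible=1_000_000)
print(f"% Occupied: {example:.1f}")
```

The same calculation applies to both flow cell types; only the source of the denominator differs.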

% Pass Filter (%PF) settings for all instruments

The Flow Cell chart in BaseSpace Sequence Hub has also been updated to include the % Pass Filter (%PF) for all instruments. This additional information allows users to determine whether particular tiles of a flow cell have unusual levels of %PF.
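As a rough illustration of what "unusual" could mean for a tile, the Python sketch below flags tiles whose %PF differs from the flow cell median by more than a chosen number of percentage points. The tile names, values, and the 10-point threshold are assumptions made for this example; they are not how the Flow Cell chart itself decides what to highlight.

```python
from statistics import median


def unusual_pf_tiles(pf_by_tile: dict[str, float], max_deviation: float = 10.0) -> list[str]:
    """Return tiles whose %PF is more than max_deviation points from the median."""
    if not pf_by_tile:
        return []
    center = median(pf_by_tile.values())
    return sorted(
        tile for tile, pf in pf_by_tile.items() if abs(pf - center) > max_deviation
    )


# Hypothetical per-tile %PF values, for illustration only.
pf_values = {"1101": 88.0, "1102": 87.5, "1103": 61.0, "1104": 89.2}
print(unusual_pf_tiles(pf_values))
```

The median is used rather than the mean so that a single poor tile does not shift the baseline it is compared against.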

With these enhancements, we have added capabilities that are currently not available in Sequencing Analysis Viewer (SAV). SAV will be updated in the future so our users have a consistent experience across SAV and BSSH.


BaseSpace™ Sequence Hub is used by investigators around the world to facilitate and scale their sequencing and genomic data analysis operations. At Illumina, we understand that security, privacy, and confidentiality are complex issues, and we are committed to protecting our software-as-a-service (SaaS) customers’ data.

To ensure that our customers remain compliant with upcoming changes to the EU General Data Protection Regulation (GDPR), we’ve made a number of updates to privacy practices, policies, and agreements that are effective May 25, 2018 for all users globally. These changes include explaining in more detail how we use your information, including your choices, rights, and controls.

Privacy and compliance are a shared responsibility between Illumina and our customers. We are responsible for the security of the BaseSpace Sequence Hub platform. Our cloud provider, Amazon Web Services (AWS), is responsible for providing the tools, services, and functionality that enable both the data controller (our customers) and the data processor (Illumina) to be successful.

Figure 1: Shared responsibility Model

A short summary of our changes:

GDPR and Terms & Conditions (T&Cs). GDPR places new obligations on organizations that process EU personal data. As a result, we have updated our business operational practices. The Privacy Policy (Link) and Terms & Conditions (Link) better explain our customers’ and users’ rights, and their relationship with Illumina. In addition, all our NGS product support pages have been updated with a Privacy & Security section (Link).

Improved clarity and transparency. As a key part of GDPR compliance, we’ve described our data processing practices in clear language. For instruments sending Instrument Performance Data (IPD) to BaseSpace Sequence Hub, or connected in the Run Monitoring or Storage and Analysis mode, our updated Illumina® Proactive Technical Note (Link) explains what data is sent to BaseSpace in each of the connectivity modes.

Data Protection Addendum. BaseSpace Sequence Hub leverages AWS to deliver its services. The updated AWS Service Terms (Link) incorporate the GDPR Data Processing Addendum (DPA) and will automatically apply to all customers. Illumina is willing to sign a DPA for customers who ask for it.

Opt-in & Opt-out. Sharing data with BaseSpace Sequence Hub, irrespective of connectivity mode, is entirely controlled by our customers. If you would like to opt out of sharing Instrument Performance Data (IPD), Run Monitoring, or Storage and Analysis mode, you can do so at any time.

In addition, we are continually reviewing and updating our security best practices to safeguard your data and the services we provide. We are ISO 27001 certified, which has a direct emphasis on international compliance and governance. Please review our security and data privacy whitepaper (Link) to learn more about our security practices.

We hope this makes your use of our SaaS products much easier. As always, please contact us at informatics@illumina.com if you have any questions.

BaseSpace Sequence Hub enables users to remotely monitor their sequencing runs with the Run Charts function, whose interface closely resembles that of SAV. We have recently released a synchronized update with SAV to offer an expanded set of metrics for monitoring run quality. At the same time, we have added a few capabilities previously only present in SAV. These enhancements provide a consistent experience and enable users to make informed decisions on the quality of their sequencing runs, whether they are standing in front of their instrument accessing SAV or monitoring the run remotely using BaseSpace Sequence Hub.

Expanded menu of metrics that maintains consistency with SAV

BaseSpace Sequence Hub now includes per-cycle Phasing and Pre-phasing metrics, % No Call, and Median QScore measures in the Charts section of Run Monitoring. These measures were also released as part of SAV 2.4.5. The % No Call and Median QScore measures are available for all sequencing platforms. The new Phasing and Pre-phasing metrics are available for all platforms except MiSeq and HiSeq 2000/2500.

Traditional Phasing (and pre-phasing) metrics, which were calculated once at cycle 25, are now listed as “Legacy Phasing Rate.” The new per-cycle weights are listed as “Phasing Weight” in the Run Charts.

Improved usability

The Charts section of Run Monitoring now includes the same menu structure as SAV 2.4.5. Metrics in the drop-down menus now appear only if they are available for the cycle, which makes the charts easier to use.

Extracted, Called, and Scored cycles have a minimum-maximum range

Run Monitoring now provides Extracted, Called, and Scored cycles as a minimum-maximum range during an instrument run. Previously, Run Monitoring showed only the maximum cycles. A wide spread between the leading and lagging tile might be an indication of a run problem. Now users can easily spot a problem with their run on both SAV and BaseSpace Sequence Hub.
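To make the leading and lagging tile idea concrete, here is a minimal Python sketch that derives the minimum-maximum range from per-tile cycle counts and checks the spread against a threshold. The tile names, cycle counts, and the 5-cycle threshold are hypothetical; neither SAV nor BaseSpace Sequence Hub is known to use this particular cutoff.

```python
def cycle_range(cycles_by_tile: dict[str, int]) -> tuple[int, int]:
    """Return (minimum, maximum) completed cycles across all tiles."""
    if not cycles_by_tile:
        raise ValueError("cycles_by_tile must not be empty")
    values = list(cycles_by_tile.values())
    return min(values), max(values)


def has_wide_spread(cycles_by_tile: dict[str, int], max_spread: int = 5) -> bool:
    """Return True when the leading tile is more than max_spread cycles ahead."""
    low, high = cycle_range(cycles_by_tile)
    return high - low > max_spread


# Hypothetical extracted-cycle counts per tile, for illustration only.
extracted = {"1101": 120, "1102": 119, "1103": 104}
low, high = cycle_range(extracted)
print(f"Extracted cycles: {low}-{high}, wide spread: {has_wide_spread(extracted)}")
```

Showing only the maximum would report 120 for this example and hide the tile that is 16 cycles behind.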

New Metrics in Both SAV and BaseSpace Sequence Hub

In addition to the changes enumerated above, both SAV and BaseSpace Sequence Hub now include Occupied Count (K) and % Occupied measures in the Charts section of Run Monitoring for NovaSeq systems. The Occupied Count is a measure of the number of wells on the flow cell with DNA. Adding these new metrics will help users understand their loading concentrations and identify issues with their sequencing run.

Integration and interoperability between laboratory systems, or the lack thereof, remains a challenge for those performing next-generation sequencing (NGS) or other genomics studies.[i] To address this challenge, we developed version 2.2 of the integration between BaseSpace Clarity LIMS and the NovaSeq 6000 instrument. This integration now supports the NovaSeq S1 flow cell.

The NovaSeq S1 flow cell delivers up to 0.5 TB of output in two days and is ideally suited for high-intensity sequencing applications. Users can now sequence up to 8 human genomes or 80 exomes per run in approximately 24 hours.[ii] And now, users of both BaseSpace Clarity LIMS and the NovaSeq 6000 instrument can access this out-of-the-box integration to quickly get up and running with their system.

The NovaSeq 6000 version 2.0 Workflow in BaseSpace Clarity LIMS that supports the integration version 2.2.1

BaseSpace Cohort Analyzer enables users to automatically aggregate and analyze subjects with genomics and phenotype data in a few clicks. Ultimately, users can analyze and share data for biomarker discovery, translational research, and clinical trials.

One of the most powerful features of BaseSpace Cohort Analyzer is the ability to centralize all available information for a subject into a single record. This includes phenotype data obtained from various phenotypic databases, lab and image data, and genomic, methylation, proteomics, and expression data, to name a few. Breaking down siloed data in this way enables users to perform integrative analyses and make meaningful discoveries in aggregated data. Now, users of BaseSpace Cohort Analyzer can take advantage of a new beta feature: the Data Uploader.

You can now easily import your genomic data (somatic mutation or copy number variations between tumor and normal samples), or RNA-Seq data into BaseSpace Cohort Analyzer for analysis. Either upload your own files or directly import from a BaseSpace Sequence Hub Enterprise account. The uploader supports >500 phenotype and subject measurements.

In an increasingly globalized world, bacteria can spread rapidly and easily. Furthermore, they often contain genes that make them resistant to antibiotics or confer high virulence. Sequencing the entire genome of bacteria enables a thorough characterization and thus makes it possible for researchers to monitor the spread of particular strains of bacteria or sets of genes.

In collaboration with the Illumina BaseSpace Sequence Hub development team, GoSeqIt has published two apps for characterization of bacterial single isolates. Both of these apps are now available to BaseSpace Sequence Hub users:

The Bacterial Analysis Pipeline app first predicts the species of the bacterial draft genome based on the number of k-mers (oligonucleotides of length k) co-occurring between the input genome and bacterial genomes in a reference database (1). Next, acquired antimicrobial resistance genes are identified using a BLAST-based approach, in which the nucleotide sequence of the input genome is compared to the genes in the ResFinder database (2). Depending on the identified species, Multilocus Sequence Typing (MLST) is performed, also using a BLAST-based approach (3). One hundred twenty-five (125) MLST schemes are currently available.
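The k-mer step can be pictured with a toy Python sketch that counts the distinct k-mers a query shares with each reference and picks the reference with the most. This is only an illustration of the co-occurrence idea: the function names, the tiny sequences, and k = 4 are assumptions, and the app's actual scoring, k value, and reference database are not described here.

```python
def kmers(sequence: str, k: int) -> set[str]:
    """Return the set of distinct k-mers (substrings of length k) in a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def predict_species(query: str, references: dict[str, str], k: int = 4) -> str:
    """Return the reference name sharing the most distinct k-mers with the query."""
    if not references:
        raise ValueError("references must not be empty")
    query_kmers = kmers(query, k)
    shared = {
        name: len(query_kmers & kmers(reference, k))
        for name, reference in references.items()
    }
    return max(shared, key=shared.get)


# Toy sequences and placeholder species names, for illustration only.
toy_references = {
    "species_a": "ATGCGTACGTTAGC",
    "species_b": "TTTTGGGGCCCCAAAA",
}
print(predict_species("GCGTACGTTA", toy_references))
```

With real genomes, k would be much larger and the reference k-mers would be indexed ahead of time rather than recomputed for every query.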

If the input genome is recognized as belonging to Enterobacteriaceae or the Gram-positive bacteria (Enterococcus, Streptococcus, or Staphylococcus), BLAST is used to search for plasmid replicons using the PlasmidFinder database (4). Identified plasmids of the IncF, IncHI1, IncHI2, IncI1, IncN, or IncA/C type are further subtyped by plasmid MLST (4). Finally, identified Escherichia coli, Enterococcus sp., Listeria sp., and Staphylococcus aureus are compared to the VirulenceFinder database containing known virulence genes (5). For more information, refer to the article titled “Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance.” Figure 1 illustrates the output for species prediction and MLST, while Figure 2 illustrates the output for the prediction of acquired antimicrobial resistance genes.

Figure 1: Example of output from the Bacterial Analysis Pipeline app for species prediction and MLST of the input genome.

Figure 2: Example of output from the Bacterial Analysis Pipeline app for acquired antimicrobial resistance genes in the input genome.

E. coli Serotyping App

The E. coli Serotyping app uses a BLAST-based approach to predict the serotype of E. coli isolates by comparing the input genome with a database of specific O-antigen processing system genes for O typing and flagellin genes for H typing (7). The app outputs the predicted serotype along with the identified O-antigen genes (wzx, wzy, wzm, and wzt) and flagellin genes (fliC, flkA, fllA, flmA, and flnA).

Figure 3: Example of output from the E. coli Serotyping app. So far, only E. coli isolates can be serotyped in silico in this way.

Using the New Apps

The price for using the Bacterial Analysis Pipeline app is 5 iCredits per uploaded file plus the cost of computing. The E. coli Serotyping app costs 1 iCredit per uploaded file plus the cost of computing.

Both apps use methods that have been thoroughly described and published in renowned scientific journals.

Integration and interoperability between laboratory systems, or the lack thereof, remains a challenge for those performing next-generation sequencing (NGS) or other genomics studies.[1] To address this challenge, we developed version 2.2 of the integration between BaseSpace Clarity LIMS and the NovaSeq 6000 instrument. This integration now supports the NovaSeq S4 flow cell, as well as the NovaSeq Xp protocol.

Figure 1: The NovaSeq 6000 version 2.0 Workflow in BaseSpace Clarity LIMS that supports the integration version 2.2

The NovaSeq S4 flow cell delivers up to 6 TB of output in two days and is ideally suited for high-intensity sequencing applications. Users can now sequence up to 48 human genomes or 384 exomes per run in less than 48 hours. This innovation paves the way for large-population-scale initiatives at the lowest price per sample, and enables labs to cost-effectively perform human whole-genome sequencing.[2] And now, users of both BaseSpace Clarity LIMS and the NovaSeq 6000 instrument can access this out-of-the-box integration to get up and running with their system sooner.

The new integration helps users track samples throughout the workflow. Specifically, it:

Supports S1[3], S2, and S4 flow cells per sample

Supports different applications on the same flow cell

Calculates sample and reagent volumes based on the flow cell type

Creates an output file for use with liquid handling robots

Validates every step in the workflow

The new integration also tracks sequencing run information in BaseSpace Clarity LIMS to help with troubleshooting or trending:

Run recipe files (JSON) are automatically generated to set up and initiate the run

Sample sheets, which are compatible with BaseSpace Sequence Hub and bcl2fastq v2.19, are automatically generated and placed directly on the NovaSeq 6000 instrument

Sequencing runs are tracked and run metrics are parsed per lane and per flow cell