Lab Members:

BGI Genomics

BGI was founded in 1999 to support the Human Genome Project. Since then, BGI has grown into a multinational genomics company with global operations, including sequencing laboratories based in the US, Europe, Hong Kong, and mainland China.

We operate all common NGS platforms, and exclusively offer the best data quality at the lowest cost from DNA Nanoball Sequencing with the BGISEQ-500 system, developed by BGI’s Complete Genomics subsidiary.

Our vast experience with Genomics, Proteomics and Bioinformatics positions BGI uniquely to support academia and pharmaceutical companies with highly reliable genomic data for Basic Research and Drug Development.

Transcriptomics

The transcriptome is the total set of RNA transcripts, mRNAs and non-coding RNAs in one or a population of cells. With next generation sequencing technology, all transcriptional activity, for both coding and non-coding regions, in any organism can be characterized without prior annotation information. This allows the identification of regulatory RNAs, annotation of coding SNPs, determination of the relative abundance of transcripts, and more. BGI provides comprehensive transcriptome sequencing services as well as the bioinformatics expertise essential to conduct extensive analysis of the RNA-Seq data.
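As an illustration of "relative abundance of transcripts", the sketch below converts raw read counts into transcripts per million (TPM), a commonly used length-normalized measure. It is not a description of BGI's pipeline; the transcript names, counts, and lengths are made up for the example.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts and lengths_bp are dicts keyed by transcript name.
    """
    if counts.keys() != lengths_bp.keys():
        raise ValueError("counts and lengths must cover the same transcripts")
    # Reads per base: at equal abundance, longer transcripts yield more reads.
    rates = {t: counts[t] / lengths_bp[t] for t in counts}
    total = sum(rates.values())
    if total == 0:
        return {t: 0.0 for t in counts}
    # Scale so that all values sum to one million.
    return {t: rate / total * 1e6 for t, rate in rates.items()}


# Hypothetical example data (not from any real experiment).
example_counts = {"tx_a": 500, "tx_b": 500, "tx_c": 0}
example_lengths = {"tx_a": 1000, "tx_b": 2000, "tx_c": 1500}
abundance = tpm(example_counts, example_lengths)
for name, value in abundance.items():
    print(f"{name}: {value:.1f} TPM")
```

Here tx_a and tx_b receive the same number of reads, but tx_a is half as long, so it is reported as twice as abundant.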

Whole genome re-sequencing aims to sequence the individual whose reference genome is already known. As reference genome sequences become increasingly available for many species, cataloguing sequence variations and understanding their biological consequences have become major research goals. With next-gen high-throughput sequencing technology, BGI can apply whole genome re-sequencing to your human, plant and animal, and microbe samples in order to understand genetic differences and their implications. BGI’s whole human genome sequencing can be used to study human genetics, population and evolution, human disease and clinical research, and pharmacogenomics.

We make your research budget go further with the introduction of Whole Genome Sequencing services at a lower, industry-leading price.
BGISEQ-500 based WGS services are executed with BGI’s own sequencing platform, building on our world-class sequencing experience: No compromise, just get more for less.

BGISEQ-500 is a state-of-the-art sequencing system, powered by combinatorial Probe-Anchor Synthesis (cPAS) and DNA Nanoball (DNB™) technology, developed by BGI’s Complete Genomics subsidiary in Silicon Valley, California.
The combination of linear amplification and DNB technology reduces the error rate while enhancing the signal.
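One way to see why linear amplification can lower the error rate is a toy model, sketched below. It assumes ideal exponential PCR, in which copies are made from earlier copies, versus an amplification in which every copy is made from the original template. The error rate and cycle count are hypothetical, and the model is not a description of the cPAS or DNB chemistry itself.

```python
def pcr_error_fraction(per_copy_error, cycles):
    """Fraction of molecules with an error at one base after ideal exponential PCR.

    In each cycle half of the pool is freshly copied from the other half, so a
    molecule stays error-free through one cycle with probability 1 - e/2.
    """
    return 1.0 - (1.0 - per_copy_error / 2.0) ** cycles


def linear_error_fraction(per_copy_error):
    """Fraction of copies with an error at one base when every copy is made
    directly from the original template, so errors are never re-copied."""
    return per_copy_error


E = 1e-5  # hypothetical polymerase error rate per base per copy
N = 30    # hypothetical number of PCR cycles

print(f"exponential: {pcr_error_fraction(E, N):.2e}")
print(f"linear:      {linear_error_fraction(E):.2e}")
```

Under these assumptions the error fraction grows roughly in proportion to the number of cycles for exponential amplification, while it stays at the per-copy rate for linear amplification.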

BGI has several bioinformatics data centers, located in Shenzhen, Hong Kong, Beijing, and Wuhan, with a combined peak performance of up to 236.5 TFLOPS, total memory of up to 56.3 TB, and total storage of up to 24.5 PB. Of these, the Shenzhen and Hong Kong data centers rank first and second, respectively, in the bioinformatics field in China. They provide stable and efficient resources for storing, processing, and analyzing massive bioinformatics data sets.

As gene research technologies advance, big-data processing demands computing and storage capacity that grows roughly 10-fold every 12-18 months, far faster than the pace implied by Moore's law. In response to this data explosion, BGI continues to increase its investment in hardware resources. In 2014, BGI's first supercomputing cluster with computing capability above 1 PFLOPS will come into service and become a performance benchmark for bioinformatics research in China and worldwide.
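To put the growth figures on a common footing, the sketch below converts "N-fold every M months" into an annual growth factor. The 10-fold per 12-18 months figure comes from the text; the Moore's law reference of a doubling every 24 months is an assumption (18 months is also commonly quoted).

```python
def annual_growth_factor(fold, period_months):
    """Annualized growth factor for something that grows `fold`-fold
    every `period_months` months."""
    return fold ** (12.0 / period_months)


data_fast = annual_growth_factor(10, 12)  # 10-fold every 12 months
data_slow = annual_growth_factor(10, 18)  # 10-fold every 18 months
moore = annual_growth_factor(2, 24)       # assumed: doubling every 24 months

print(f"data demand: {data_slow:.2f}x to {data_fast:.2f}x per year")
print(f"Moore's law (assumed): {moore:.2f}x per year")
```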

With world-leading bioinformatics computing and analysis capabilities, BGI has worked with the National Supercomputing Center to build the TH-BGI joint bioinformatics laboratory at the center's base in the Binhai New Area, Tianjin, with the aim of creating a world-leading base for genomic data computing and development. The cooperation fully leverages the computing capability of TH-1, starting with the development of high-performance computing applications, including optimizing existing bioinformatics algorithms and developing high-quality bioinformatics analysis tools to address the challenges of processing massive amounts of data.

ChIP-Seq, also known as ChIP-Sequencing, is widely used to analyze protein interactions with DNA. It combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify binding sites of DNA-associated proteins, and can be used to precisely map global binding sites for any protein of interest. ChIP sequencing offers higher resolution and more precise and abundant information in comparison with array-based ChIP-chip.

Benefits:

Wide detection range: Genome-wide protein-DNA interaction studies

Cost-effective: Less data is required to identify binding sites across the whole genome

Low input requirement: As little as 5 ng of ChIP-ed DNA is adequate

Metagenomics is the study of genomes contained within an entire microbial community. This technology has opened up a new era in the study of microbial diversity with direct access to the genomes of numerous non-cultivatable microorganisms in their natural habitat. Metagenomic sequencing analyzes microbial community diversity, gene composition and function, as well as metabolic pathways associated with the specific environment. This approach has been applied to environmental studies as well as biomarker research.

Benefits:

Comprehensive: Enables investigation of all microbes in a given environment in a single experiment, as well as analysis of microbial community diversity and gene function
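Community diversity is often summarized with a single index. The sketch below computes the Shannon diversity index from per-taxon read counts. This is one widely used measure, shown for illustration only; the counts are made up and the text does not state which diversity measures are reported.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one taxon must have a positive count")
    h = 0.0
    for n in abundances:
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts per taxon for two communities.
even_community = [10, 10, 10, 10]
skewed_community = [37, 1, 1, 1]

print(f"even:   {shannon_index(even_community):.3f}")
print(f"skewed: {shannon_index(skewed_community):.3f}")
```

A community dominated by one taxon scores lower than one in which the same number of taxa are equally abundant.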

Metabonomics/metabolomics is a field that emerged after genomics and proteomics became prevalent. It seeks to uncover metabolic mechanisms and etiology through qualitative and quantitative measurement of small molecules under 1000 Da. Metabolite profiling provides a direct and intuitive way to learn how cells respond to various stimuli and gives an instantaneous snapshot of cell physiology; it has therefore been widely used in disease research and in detecting pharmaceutical effects. Our team's research focuses mainly on full-scan metabolite profiling, targeted metabolite quantitation, and the development of metabolomics software and data processing pipelines. In addition, we integrate BGI's high-throughput nucleic acid sequencing and mass spectrometry platforms with genomics, transcriptomics, and proteomics technologies to support trans-omics association studies.

Traditional transcriptome analysis of early-stage embryos, stem cells, cancers, immune cells, and developing neurons is limited because the amount of sample obtained often does not meet the minimum requirements for traditional NGS, typically because very little material is available or because the sample is highly heterogeneous. Thus, the need for single-cell transcriptome analysis is particularly urgent. Single-cell RNA-Seq quantifies the mRNA of a single cell through single-tube reverse transcription and PCR amplification, allowing several micrograms of cDNA to be collected for traditional library construction and HiSeq2000 sequencing. Single-cell RNA-Seq is a novel application of RNA-Seq that is capable of sequencing a single cell or infinitesimal amounts of sample.

Benefits:

Accommodates an infinitesimal amount of sample input, which is not possible with traditional sequencing protocols.

Single-tube amplification greatly reduces sample loss. This benefit allows a minimum amount of sample to be required, ensures high sensitivity for library construction, avoids the loss of transcripts that are expressed with low abundance, and reduces data bias.

Provides reliable data for subsequent analysis.

More than 90% of genes can be detected.

A greater number of low-abundance transcripts can be detected than using microarray.

HiSeq X/4000/2500/Novaseq

HiSeq2000 was launched by Illumina, Inc. in 2010. It works on the same principle as the Genome Analyzer, using a stable reversible-terminator sequencing-by-synthesis method. The technology uses four bases, each carrying a terminal blocking group and a different fluorescent label, to synthesize the complementary strand. This ensures high accuracy and correct base order, and it avoids the sequencing errors caused by repeat sequences and homopolymers. Unlike the Genome Analyzer, the HiSeq2000 differs in its optical system and manufacturing processes: it scans the flow cell with two laser sources while four cameras record the four bases separately, which reduces signal interference between bases and improves sequencing accuracy. The HiSeq2000 also adopts dual-surface imaging, which increases the effective area of the flow cell, thereby increasing throughput and reducing sequencing cost.

Performance Parameters:

Read Length | Single Flow Cell Output | Single Flow Cell Run Time | Dual Flow Cell Output | Dual Flow Cell Run Time
1×35 | 47-52Gb | 1.5 days | 95-105Gb | 2 days
2×50 | 135-150Gb | 4.5 days | 270-300Gb | 5.5 days
2×100 | 270-300Gb | 8.5 days | 540-600Gb | 11 days

Performance: 2×50bp Q30≥85%, 2×100bp Q30≥80%

*Install specifications for HiSeq sequencers with an Illumina PhiX library and cluster densities between 610-678 K/mm2 that pass filtering on a HiSeq system using TruSeq v3 Cluster and SBS kits for HiSeq. Performance may vary based on sample quality, cluster density, and other experimental factors.
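The Q30 figures above use the Phred quality scale, in which a quality score Q corresponds to a base-calling error probability of 10^(-Q/10). A specification such as "Q30≥85%" therefore means that at least 85% of bases have an estimated error probability of 1 in 1000 or lower. A minimal sketch of the conversion:

```python
import math


def phred_to_error_prob(q):
    """Error probability for a Phred quality score Q: p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score for an error probability p: Q = -10 * log10(p)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


for q in (20, 30, 40):
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4%}")
```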

Technical Features:

Dual-surface imaging technology;

Four cameras image the four bases separately, reducing signal interference;

HiSeq2500, the upgraded version of the Illumina HiSeq2000 sequencing instrument, was launched in 2012. BGI has 20 HiSeq2500 instruments distributed around the world. Unlike the HiSeq2000, the HiSeq2500 has two sequencing modes: High Output mode and Rapid mode. High Output mode works the same way as the HiSeq2000, with no change in sequencing reagents or flow cell. Rapid mode uses a new flow cell and new sequencing reagents, shortening sequencing time and supporting read lengths up to 150bp.

Performance Parameters:

High Output Mode*

Read Length | Dual Flow Cell Output | Single Flow Cell Output | Dual Flow Cell Run Time
1×36 | 95-105Gb | 47-52Gb | 2 days
2×50 | 270-300Gb | 135-150Gb | 5.5 days
2×100 | 540-600Gb | 270-300Gb | 11 days
2×150 | - | - | -

Performance: 2×50bp Q30≥85%, 2×100bp Q30≥80%

Rapid Mode*

Read Length | Dual Flow Cell Output | Single Flow Cell Output | Dual Flow Cell Run Time
1×36 | 18-22Gb | 9-11Gb | 7 hours
2×50 | 50-60Gb | 25-30Gb | 16 hours
2×100 | 100-120Gb | 50-60Gb | 27 hours
2×150 | 150-180Gb | 75-90Gb | 40 hours

Performance: 2×50bp Q30≥85%, 2×100bp Q30≥80%, 2×150bp Q30≥75%

*Install specifications are based on the Illumina PhiX control library at supported cluster densities (610-678 K clusters/mm2 passing filter using TruSeq v3 kits, or 700-820 K clusters/mm2 passing filter using TruSeq Rapid kits). Run times for rapid run mode cover onboard cluster generation (1.5 hours) and sequencing; run times for high-output mode cover sequencing only and do not include flow cell preparation time (4-5 hours). Performance may vary based on sample quality, cluster density, and other experimental factors. On upgraded HiSeq2000 instruments with serial numbers prior to 700895, rapid mode run times are extended by approximately 15 hours.

Technical Features:

Rapid mode uses a 2-lane flow cell and new sequencing reagents; sequencing time is shortened to a few hours, and read lengths reach up to 150bp;

Dual flow paths, supporting both high-output and rapid sequencing modes;

Lanes can be deselected to bypass image processing and reduce scan time;

One human genome can be sequenced in one day.
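The mode figures listed above can be compared as throughput per hour, and a run's output can be translated into raw genome depth. The sketch below uses the 2×100 and 2×150 dual flow cell figures from the performance parameters. The human genome size of about 3.1 Gb is an assumption, and the resulting depth is raw depth before any filtering or duplicate removal.

```python
# (low Gb, high Gb, run time in hours) for a dual flow cell.
# High-output run time excludes flow cell preparation, per the footnote.
HIGH_OUTPUT = {"2x100": (540, 600, 11 * 24)}
RAPID = {"2x100": (100, 120, 27), "2x150": (150, 180, 40)}

HUMAN_GENOME_GB = 3.1  # assumed approximate haploid human genome size


def gb_per_hour(low_gb, high_gb, hours):
    """Midpoint of the output range divided by the run time."""
    return (low_gb + high_gb) / 2 / hours


def mean_depth(output_gb, genome_gb=HUMAN_GENOME_GB):
    """Raw mean depth: sequenced bases divided by genome size."""
    return output_gb / genome_gb


print(f"high output 2x100: {gb_per_hour(*HIGH_OUTPUT['2x100']):.2f} Gb/hour")
print(f"rapid 2x100:       {gb_per_hour(*RAPID['2x100']):.2f} Gb/hour")
print(f"rapid 2x100 low-end depth: {mean_depth(RAPID['2x100'][0]):.1f}X")
```

Under these assumptions, a single rapid 2×100 run at the low end of its output range already exceeds 30X raw depth for one human genome in 27 hours.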

Illumina MiSeq

MiSeq (Illumina) is a miniaturized bench-top sequencing instrument launched in February 2011. It uses Illumina TruSeq chemistry, which performs sequencing-by-synthesis with reversible terminators. Compared with the HiSeq2000, its main advantages are the high accuracy of its sequencing reagents and the short run time enabled by a new fluidics system, although the throughput per run is low.

(2) These data are based on the Illumina PhiX control library (a balanced representation of A, T, G, and C nucleotides). The appropriate cluster density is 900-1200K/mm2, or 880-965K/mm2 after passing filter (PF). Results for other samples vary with the properties of the sample.

Technical Features:

Short sequencing cycle: A MiSeq V2 150PE run takes as little as 24 hours, so the instrument suits fast and efficient amplicon sequencing and small-genome sequencing, completing projects with small data requirements in a short time.

Paired-end read lengths up to 250bp: MiSeq read length has reached 2×250bp, which can effectively span highly repetitive regions of complex genomes, thereby improving assembly and supporting gene annotation and gene function discovery.

BGI is a world-leading provider of Human Whole Exome Sequencing, having sequenced more than 45,000 human exomes to date. Our expertise in human exome sequencing comes from our 15 years of experience doing more whole exome and whole genome sequencing than any institution in the world. Our data analysis prowess is derived from our long history and extensive work in creating innovative and highly effective proprietary approaches and algorithms for analyzing data for genome analysis. We have unparalleled experience and expertise in both the process of large scale NGS sequencing and in analyzing the data generated.

In addition to Human Exome Sequencing, we provide mouse exome sequencing service using Agilent Mouse All Exon kit and developed a platform for sequencing the exome of monkeys based on our participation in the Chinese rhesus macaque and Cynomolgus macaque genome projects.

While the exome accounts for about 1% of the human genome, protein-coding regions contain about 85% of pathogenic mutations. By selectively targeting DNA sequences that encode proteins, exome sequencing allows for the identification of novel gene mutations associated with both Mendelian disorders and common diseases.

Cost-Effective: Our proficiency and efficiency in exome sequencing allows us to offer our customers comprehensive analysis of whole exons at lower prices than other providers. Researchers can now do more science and get more answers with limited research budgets.

Industry-leading throughput and turnaround: Our high-throughput sequencing capacity and rapid project turnaround times enable our customers to advance their research more rapidly and effectively and to attain their goals in a more timely manner.

Having multiple sequencing platforms enables BGI to choose (in consultation with the customer) the most appropriate system to solve their scientific challenges.

The Complete Genomics platform provides industry leading accuracy and comprehensive variation detection ideal for the research that needs:

Precise mapping of genetic recombination sites

Accurate identification of de novo disease causing mutations

Sensitive detection of somatic variants in complex cancer genomes

We currently offer accurate exome sequencing at 100X on the Complete Genomics platform. Free SNP validation of 100+ loci is included to ensure the accuracy of your results, and we will redo sequencing for you at no additional cost if the percentage of correct SNP callings among validated sites falls below 95%.
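The 95% criterion above amounts to a concordance check between the called genotypes and the independently validated ones. The sketch below shows one way to express it. The locus names, the genotype encoding, and the choice to count a validated site with no call as incorrect are all assumptions made for illustration.

```python
def validation_concordance(called, validated):
    """Fraction of validated loci whose called genotype matches.

    Both arguments are dicts mapping locus -> genotype string.
    Assumption: a validated locus that is missing from `called` counts as wrong.
    """
    if not validated:
        raise ValueError("no validated loci")
    correct = sum(
        1 for locus, genotype in validated.items()
        if called.get(locus) == genotype
    )
    return correct / len(validated)


def needs_resequencing(called, validated, threshold=0.95):
    """True when concordance falls below the threshold."""
    return validation_concordance(called, validated) < threshold


# Hypothetical data: 100 validated loci, 6 of them called incorrectly.
validated_sites = {f"chr1:{1000 + i}": "A/G" for i in range(100)}
calls = dict(validated_sites)
for i in range(6):
    calls[f"chr1:{1000 + i}"] = "A/A"

print(validation_concordance(calls, validated_sites))
print(needs_resequencing(calls, validated_sites))
```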

The Ion Proton platform has rapid sequencing speeds (only 2-4 hours for every sequencing run), and requires only 50-100 ng input DNA. Therefore, it is suitable for projects with a need for:

Extremely rapid turn-around time

Analysis of very low amounts of DNA

Please check out our hot rapid exome sequencing offer at 100X on the Ion Proton platform, taking you from sample to results in as little as 7 days.

The Illumina platform provides high-throughput analysis, and ample bioinformatics tools are available for a wide range of research analyses. Thus, this platform is suitable for projects with a need for

Single cell sequencing can facilitate the elucidation of cell lineage relationships. The major applications of this technique include profiling scarce clinical samples (e.g., circulating tumor cells), pre-implantation genetic diagnosis, embryonic development research, and tumor progression analysis.

Advancing the possibilities of single cell sequencing in human disease research, we have developed an innovative end-to-end solution for genomics analysis at the single cell level. Within this solution, multiple displacement amplification (MDA) has been further enhanced and incorporated into BGI’s whole genome amplification (WGA) protocol, which enables uniform amplification of genomic DNA from single cells (from as little as one cell) with negligible sequence bias and maximized genome coverage.

BGI has sequenced hundreds of cells, and the results from applying the new single-cell sequencing method to identify the genetic characteristics of essential thrombocythemia and clear cell renal cell carcinoma have been published in the journal Cell.

Benefits:

Smaller quantity of input DNA required

Longer length of amplified product (>10 kb), which is better for CNV and SV detection

We can also perform customized analysis to meet requirements of specific projects, e.g., sub-clone evolution of tumor.

Sample Requirements:

Fresh Tissue: A size of 1-2 cm3 is recommended; as little as 5 mm3 is also acceptable for precious samples. Tissue samples should be stored in liquid nitrogen or at -80℃ immediately after surgical resection, without treatment with other solvents.

Whole Blood (or Bone Marrow): The total volume should be no less than 5 ml. Samples should be collected in anticoagulant tubes and stored at -80℃.

Cell Suspensions: No fewer than 100,000 cells are recommended. It is necessary to follow the standard cell cryopreservation operation protocols, freeze cells gradually with cryopreservation media, and store them in liquid nitrogen or at -80℃.

Isolated Single Cells: Single cells should be stored separately in 3-5 μL of solvent (e.g., PBS) in DNase/RNase-free PCR tubes (200 μL) at -80℃ for no longer than one week. Cells should be free of nucleic acid–binding dyes.

Sequencing Strategy:

91 PE or 101 PE

Recommended Data Amount:

≥ 30X for whole genome resequencing. Sequencing depth can be customized according to the research purpose.

Turnaround Time:

The standard turnaround time for the workflow is approximately 57 business days for whole genome sequencing of 100 samples at 30X coverage.

Completion Indicator:

Effective mean depth of each sample should be no less than required in the contract.
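The sequencing strategy and recommended data amount above can be related through the usual depth formula: depth = read pairs × 2 × read length / genome size. The sketch below estimates how many read pairs a 30X human genome needs with 91 PE or 101 PE reads. The genome size of 3.1 billion bases is an assumption, and the result is raw depth; the effective mean depth named in the completion indicator will be lower once unusable reads are excluded.

```python
import math

HUMAN_GENOME_BP = 3_100_000_000  # assumed approximate haploid human genome size


def read_pairs_needed(depth, genome_bp, read_length):
    """Smallest number of paired-end read pairs such that
    pairs * 2 * read_length / genome_bp >= depth."""
    return math.ceil(depth * genome_bp / (2 * read_length))


for read_length in (91, 101):
    pairs = read_pairs_needed(30, HUMAN_GENOME_BP, read_length)
    print(f"{read_length} PE at 30X: {pairs:,} read pairs per sample")
```

Multiplying the per-sample figure by the number of samples gives the total for a project such as the 100-sample example in the turnaround time.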

Genotyping by sequencing (GBS) is a unique, cost-effective tool for association studies and genomics-assisted breeding. It generates large numbers of single nucleotide polymorphisms (SNPs) for use in genetic analysis. GBS is becoming increasingly important, particularly in plant species with complex genomes that lack reference sequences. We use methylation-sensitive restriction enzymes to reduce genome complexity and avoid the repetitive fraction of the genome, where methylation is more likely to occur, thus allowing lower-copy regions to be targeted with two- to three-fold higher efficiency. Other key advantages of this system include reduced sample handling, fewer PCR and purification steps, no size fractionation, and inexpensive bar-coding. Further applications of GBS to breeding, conservation, and global species and population surveys allow plant breeders to conduct genomic selection on a novel germplasm or species without first having to develop any prior molecular tools, and allow conservation biologists to determine population structure without prior knowledge of the genome or diversity in the species.
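The complexity-reduction step can be pictured as an in silico restriction digest: only sequence next to cut sites is sampled. The sketch below cuts a sequence at every occurrence of a recognition site. The site used here, GCWGC cut after the first base (the ApeKI pattern), is chosen purely for illustration, since the text does not name the enzyme, and methylation sensitivity is not modeled.

```python
import re

# Illustrative recognition site GCWGC (W = A or T); a lookahead is used so
# that overlapping sites are also found. The enzyme choice is an assumption.
SITE = re.compile(r"(?=GC[AT]GC)")
CUT_OFFSET = 1  # cut after the first base of the site (G^CWGC)


def digest(sequence):
    """Return the fragments produced by cutting at every recognition site."""
    seq = sequence.upper()
    cuts = [m.start() + CUT_OFFSET for m in SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical toy sequence with two recognition sites.
fragments = digest("AAGCAGCTTTGCTGCAA")
print(fragments)
```

Sequencing only the ends of such fragments samples a reproducible subset of the genome across individuals, which is what makes the approach usable without a reference sequence.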

Benefits:

Decreased amounts of input DNA: Only 300 ng of each sample is required.

Faster, simpler protocol compared to traditional methods: The protocol takes 40 working days to complete for 96 samples.

Large range of applications: GBS is suitable for species with or without reference sequences.

Low cost: GBS is an attractive and feasible option for large numbers of markers or individuals.

Bioinformatics:

Standard bioinformatics analysis

Basic analysis of sequence data

Read clustering across tagged sites for species without reference sequences; alignment for species with reference sequences

SNP detection

Advanced bioinformatics -- Population analysis

Phylogenetic tree analysis

Population structure analysis

Principal Component Analysis (PCA)

Advanced bioinformatics -- Genetic map construction

Genotyping

Genetic map construction

QTL mapping analysis (phenotype data provided by the customer)

Integration with the original genetic map for the same mapping populations if provided by the customer

Sample Requirements:

Sample type: genomic DNA with no or minimal degradation and contamination

Sample quantity: ≥ 300 ng/sample

Sample concentration: ≥25 ng/µL

Sample number: 96 or in multiples of 96
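The numeric requirements above lend themselves to a simple pre-submission check. The sketch below validates a batch against the stated thresholds; the record layout and field names (`quantity_ng`, `concentration_ng_per_ul`) are hypothetical, and DNA integrity and contamination would still need to be assessed in the lab.

```python
MIN_QUANTITY_NG = 300             # >= 300 ng per sample
MIN_CONCENTRATION_NG_PER_UL = 25  # >= 25 ng/uL
BATCH_SIZE = 96                   # 96 samples or a multiple of 96


def check_submission(samples):
    """Return a list of human-readable problems; an empty list means the
    batch meets the numeric requirements."""
    problems = []
    if len(samples) == 0 or len(samples) % BATCH_SIZE != 0:
        problems.append(
            f"sample number {len(samples)} is not a multiple of {BATCH_SIZE}"
        )
    for index, sample in enumerate(samples, start=1):
        if sample["quantity_ng"] < MIN_QUANTITY_NG:
            problems.append(f"sample {index}: quantity below {MIN_QUANTITY_NG} ng")
        if sample["concentration_ng_per_ul"] < MIN_CONCENTRATION_NG_PER_UL:
            problems.append(
                f"sample {index}: concentration below "
                f"{MIN_CONCENTRATION_NG_PER_UL} ng/uL"
            )
    return problems


# Hypothetical batch of 96 samples that meets the thresholds.
batch = [{"quantity_ng": 450, "concentration_ng_per_ul": 30} for _ in range(96)]
print(check_submission(batch))
```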

Turnaround Time:

The standard turnaround time for the whole workflow (including library construction, sequencing and standard bioinformatics analysis) is 40 business days for 96 samples.

Target region capture enriches specific regions (e.g., the MHC region) or specific genes by hybridization with probes designed for the genomic regions of interest. Targeted region sequencing is a cost-effective way to find variants in large numbers of samples. BGI has completed over 65,000 analyses using target region sequencing and has developed a series of software tools (e.g., SOAPsnp) to analyze the sequencing data and generate precise alignment results, accurate variant results, and custom analysis results. Many successful cases have been published in journals such as Science and Mammalian Genome.

Benefits:

Targeted: Focus on the regions of interest, such as exons, promoters, and enhancers.

Cost effective: Much lower cost for narrowed region sequencing, which is a significant advantage for projects with large sample sizes and deep sequencing.

Rich experience: Our technicians have performed a large number of targeted region sequencing projects; they are familiar with experimental methods and trouble-shooting techniques.

Custom-tailored: Custom bioinformatics analysis is available for specific research purposes and data characteristics. Internal software evaluation is available to update the analysis pipeline and ensure an optimized final analysis result for the customer.