C.savignyi assembly and gene annotation

Assembly

The genome of a single Ciona savignyi individual from
San Francisco Bay was shotgun-sequenced by the Broad Institute and
assembled using Arachne2. The Sidow lab at Stanford used that Arachne2
output as the basis for the assembly.

The assembly consists of 374 reftigs, totalling 174 Mb, with a contig N50 of 141 kb and a reftig N50 of 1,800 kb.
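The N50 figures quoted above follow the usual definition: the length L such that sequences of length L or longer account for at least half of the assembled bases. The sketch below shows that calculation under this assumed definition; the function name `n50` and the toy lengths are illustrative only and are not taken from the C. savignyi assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy lengths in bases (hypothetical, not the real reftigs).
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))  # 5
```

Applied separately to the contig lengths and to the reftig lengths, the same function would give a contig N50 and a reftig N50 respectively.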

Gene annotation

The standard Ensembl mammalian pipeline was modified for annotation of
the Ciona savignyi genome, owing to the lack of genomic information from
closely-related species. Thus, in addition to aligning known Ciona
proteins to the sequence (as per the standard pipeline), we aligned
Ciona-specific cDNA and EST sequences against the genome, and then used
these in conjunction with protein data from other species to build
additional gene models.

Statistics

Summary

Assembly: CSAV 2.0, Oct 2005

Database version: 78.2

Base Pairs: 177,003,750

Golden Path Length: 177,003,750

The golden path is the length of the reference assembly. It consists of the sum of all top-level sequences in the seq_region table, omitting any redundant regions such as haplotypes and PARs (pseudoautosomal regions).

Genebuild by: Ensembl

Genebuild method: Full genebuild

Genebuild started: Apr 2006

Genebuild released: Jun 2006

Genebuild last updated/patched: Apr 2013
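The golden path length defined in the summary above is a sum over top-level sequences with redundant regions left out. A minimal sketch of that sum follows. It works on in-memory records rather than the real seq_region table; the `TopLevelRegion` class, its `redundant` flag, and the region names and lengths are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopLevelRegion:
    """Hypothetical stand-in for one top-level sequence."""
    name: str
    length: int
    redundant: bool = False  # True for haplotypes, PARs and similar


def golden_path_length(regions):
    """Sum the lengths of all top-level regions not marked redundant."""
    return sum(r.length for r in regions if not r.redundant)


# Toy regions (hypothetical names and lengths).
toy_regions = [
    TopLevelRegion("reftig_1", 1_800_000),
    TopLevelRegion("reftig_2", 950_000),
    TopLevelRegion("reftig_2_haplotype", 120_000, redundant=True),
]
print(golden_path_length(toy_regions))  # 2750000
```

The redundant region is skipped, so only the first two lengths contribute to the total.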

Gene counts

Coding genes: 11,616

Genes and/or transcripts that contain an open reading frame (ORF).

Small non coding genes: 340

Small non coding genes are usually fewer than 200 bases long. They may be transcribed but are not translated. In Ensembl, genes with the following biotypes are classed as small non coding genes: miRNA, miscRNA, rRNA, scRNA, snlRNA, snoRNA, snRNA, and also the pseudogenic forms of these biotypes. The majority of the small non coding genes in Ensembl are annotated automatically by our ncRNA pipeline. Please note that tRNAs are annotated separately using tRNAscan. tRNAs are included as 'simple features', not genes, because they are not annotated using aligned sequence evidence.

Pseudogenes: 216

A pseudogene shares an evolutionary history with a functional protein-coding gene but it has been mutated through evolution to contain frameshift and/or stop codon(s) that disrupt the open reading frame.

Gene transcripts

Nucleotide sequence resulting from the transcription of the genomic DNA to mRNA. One gene can have different transcripts or splice variants resulting from the alternative splicing of different exons in genes.
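The small non coding class in the gene counts is defined by a list of biotypes plus their pseudogenic forms. The sketch below shows one way such a rule could be applied when tallying genes. The biotype names are copied from the description above; the `_pseudogene` suffix used to recognise pseudogenic forms is an assumption made for this example, as are the function names and the toy input.

```python
# Biotypes listed in the description of small non coding genes.
SMALL_NONCODING = {"miRNA", "miscRNA", "rRNA", "scRNA", "snlRNA", "snoRNA", "snRNA"}

# Assumption: pseudogenic forms are named "<biotype>_pseudogene".
PSEUDO_SUFFIX = "_pseudogene"


def is_small_noncoding(biotype):
    """Return True for a listed biotype or its pseudogenic form."""
    if biotype.endswith(PSEUDO_SUFFIX):
        biotype = biotype[:-len(PSEUDO_SUFFIX)]
    return biotype in SMALL_NONCODING


def count_small_noncoding(biotypes):
    """Count how many biotypes in the input fall in the small non coding class."""
    return sum(1 for b in biotypes if is_small_noncoding(b))


# Toy input (hypothetical gene biotypes, one per gene).
toy_biotypes = ["miRNA", "protein_coding", "snoRNA", "rRNA_pseudogene", "pseudogene"]
print(count_small_noncoding(toy_biotypes))  # 3
```

tRNAs do not appear in the set because, as noted above, they are stored as simple features rather than genes.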