Mauve (http://gel.ahabs.wisc.edu/mauve/) Mauve is a system for efficiently constructing multiple genome alignments in the presence of large-scale evolutionary events such as rearrangement and inversion. Multiple genome alignment provides a basis for research into comparative genomics and the study of evolutionary dynamics. Aligning whole genomes is a fundamentally different problem from aligning short sequences.

MCSCAN (http://chibba.pgml.uga.edu/duplication/mcscan/) MCscan is a computer program that can simultaneously scan multiple genomes to identify homologous chromosomal regions and subsequently align these regions using genes as anchors. This is the toolset for generating the synteny correspondences in the Plant Genome Duplication Database. It is intended as an easy-to-use and quick way to identify conserved gene arrays both within the same genome and across different genomes.
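The idea of using genes as anchors to find conserved gene arrays can be illustrated with a small sketch. This is not MCscan's own algorithm or scoring scheme; it is a simplified stand-in that assumes each anchor is a hypothetical pair (gene index on chromosome A, gene index on chromosome B) and finds the longest chain of anchors that keeps the same order on both chromosomes. Gap penalties and inverted (anti-collinear) blocks are deliberately ignored.

```python
def longest_collinear_chain(anchors):
    """Return the longest chain of anchors increasing in both coordinates.

    anchors: iterable of (index_a, index_b) pairs, a hypothetical
    representation of homologous gene pairs between two chromosomes.
    Uses a simple O(n^2) dynamic program; fine for small examples only.
    """
    anchors = sorted(set(anchors))
    n = len(anchors)
    if n == 0:
        return []

    best_len = [1] * n   # length of the best chain ending at each anchor
    prev = [-1] * n      # back-pointer to the previous anchor in that chain

    for b in range(n):
        for a in range(b):
            same_order = (anchors[a][0] < anchors[b][0]
                          and anchors[a][1] < anchors[b][1])
            if same_order and best_len[a] + 1 > best_len[b]:
                best_len[b] = best_len[a] + 1
                prev[b] = a

    # Trace back from the anchor that ends the longest chain.
    end = max(range(n), key=best_len.__getitem__)
    chain = []
    while end != -1:
        chain.append(anchors[end])
        end = prev[end]
    return chain[::-1]


# Toy anchors: (3, 9) and (6, 0) break collinearity and are left out.
toy_anchors = [(1, 1), (2, 2), (3, 9), (4, 3), (5, 4), (6, 0)]
print(longest_collinear_chain(toy_anchors))
# → [(1, 1), (2, 2), (4, 3), (5, 4)]
```

A real synteny tool would also score the distance between consecutive anchors and handle reversed orientation; the sketch only shows why ordered gene pairs are a convenient unit for detecting conserved arrays.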

Mugsy (http://mugsy.sourceforge.net/) Mugsy is a multiple whole genome aligner. Mugsy uses Nucmer for pairwise alignment, a custom graph-based segmentation procedure for identifying collinear regions, and the segment-based progressive multiple alignment strategy from Seqan::TCoffee. Mugsy accepts draft genomes in the form of multi-FASTA files and does not require a reference genome.

Software packages for whole genome alignment

More software and information will be added. (Last updated on 28/02/2011)

LASTZ or BLASTZ *recommended (Schwartz et al. 2003) LASTZ is a pairwise aligner for DNA sequences. Originally designed to handle sequences the size of human chromosomes and from different species, it is also useful for sequences produced by NGS technologies such as Roche 454.

LAGAN (Brudno et al. 2003) The Lagan Toolkit is a set of alignment programs for comparative genomics. The three main components are a pairwise aligner (LAGAN), a multiple aligner (M-LAGAN), and a glocal aligner (Shuffle-LAGAN). All three are based on the CHAOS local alignment tool and combine speed (regions up to several megabases can be aligned in minutes) with high accuracy. The results of the alignment can be visualized using the VISTA visualization tool.

MUMMER *recommended (3 papers for the 3 versions: 1.0, 2.1, 3.0) MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form. For example, MUMmer 3.0 can find all 20-basepair or longer exact matches between a pair of 5-megabase genomes in 13.7 seconds, using 78 MB of memory, on a 2.4 GHz Linux desktop computer. MUMmer can also align incomplete genomes; it can easily handle the 100s or 1000s of contigs from a shotgun sequencing project, and will align them to another set of contigs or a genome using the NUCmer program included with the system. If the species are too divergent for a DNA sequence alignment to detect similarity, then the PROmer program can generate alignments based upon the six-frame translations of both input sequences.
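To make the phrase "six-frame translations" concrete, here is a minimal sketch of how a DNA sequence can be translated in all six reading frames (three offsets on each strand). It is not PROmer code; the function names are made up for illustration, the standard genetic code is assumed, and any codon containing a non-ACGT character is written as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
print(frames["+1"])  # → MA*
print(frames["-1"])  # → LGH
```

Comparing protein-level translations rather than nucleotides is useful for divergent species because many nucleotide changes are synonymous and leave the amino-acid sequence unchanged.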

AVID (Bray et al. 2002) AVID is designed to be fast, memory efficient, and practical for sequence alignments of large genomic regions up to megabases long.

Cgaln (Nakato and Gotoh 2010) Cgaln (Coarse grained alignment) is a program designed to align a pair of whole genomic sequences of not only bacteria but also entire chromosomes of vertebrates on a nominal desktop computer. Cgaln performs an alignment job in two steps, at the block level and then at the nucleotide level. The former “coarse-grained” alignment can explore genomic rearrangements and reduce the regions to be analyzed in the next step. The latter is devoted to detailed alignment within the limited regions found in the first stage. The output of Cgaln is ‘glocal’ in the sense that rearrangements are taken into consideration while each alignable region is extended as long as possible. Thus, Cgaln is not only fast and memory-efficient, but can also filter noisy outputs without missing the most important homologous segment pairs.

Calculate the likelihood of chance similarities between random sequences.

Alfresco (Dalca and Brudno 2008) A key feature of the program is to use available analysis programs relevant to comparative genome sequence analysis, combine the results of these, and graphically present them in an intuitive way, thereby facilitating the analysis of large genomic regions.

BLAT (read everything about it on Wikipedia or UCSC Genome Browser and FAQ) (Kent 2002) BLAT (the BLAST-Like Alignment Tool) is a software program developed by Jim Kent at UCSC to identify similarities between DNA sequences and protein sequences. BLAT is much faster than older tools such as BLAST for nucleotide and protein alignments, and it can also perform spliced alignments of RNA to DNA.

BLAST (megablast) OK! Everyone knows it! Just click it for the latest version of BLAST. (Please keep in mind it’s BLAST, not BLAST+; for BLAST+, click here)

SSAHA2 (Ning et al. 2001, paper about SSAHA) SSAHA2 (Sequence Search and Alignment by Hashing Algorithm) is a pairwise sequence alignment program designed for the efficient mapping of sequencing reads onto genomic reference sequences. SSAHA2 supports reads from most sequencing platforms (ABI-Sanger, Roche 454, Illumina-Solexa) and a range of output formats (SAM, CIGAR, PSL etc.). A pile-up pipeline for analysis and genotype calling is available as a separate package.
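The general idea behind hash-based read mapping can be shown with a short sketch. It is a simplified illustration and not the SSAHA2 implementation: the function names and the choice to index every overlapping k-mer are assumptions made here for clarity. The reference is indexed by k-mer, each k-mer of the read is looked up, and every hit votes for the reference offset it implies; the offset with the most votes is the likeliest mapping position.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map each k-mer to the list of positions where it occurs in the reference."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def best_mapping_offset(read, index, k):
    """Return (offset, votes) for the best-supported placement, or None.

    Each k-mer hit at (read_pos, ref_pos) votes for the offset
    ref_pos - read_pos, i.e. where the read would start in the reference.
    """
    votes = Counter()
    for read_pos in range(len(read) - k + 1):
        for ref_pos in index.get(read[read_pos:read_pos + k], ()):
            votes[ref_pos - read_pos] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]


# Toy example: a read cut from position 10 of a made-up reference.
reference = "ACGTTGCATGCCGATAGCTAGGCTTAACG"
read = reference[10:22]
index = build_kmer_index(reference, k=5)
print(best_mapping_offset(read, index, k=5))  # → (10, 8)
```

A production mapper would follow the seed lookup with a gapped alignment around the candidate position and would need to handle reverse-strand reads and sequencing errors; those steps are left out here.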