
Function

Concatenate multiple sequences into a single sequence

Description

union reads in several sequences, concatenates them and writes
them out as a single sequence. The input is typically a list file
containing references to multiple sequences or subsequences (regions
of a sequence). Optionally, feature information will be used.

Source features can optionally be generated in the output; these
document the composite sequence in the EMBL/GenBank feature table.
The -findoverlap option checks for overlaps between adjacent joined
regions and reports them in the overlap file.
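
The idea of an overlap check between adjacent regions can be illustrated with a short Python sketch. This is not union's own code: it assumes that "overlap" means the end of one region repeats the start of the next, which the text above does not spell out, and the function names suffix_prefix_overlap and report_overlaps are hypothetical.

```python
def suffix_prefix_overlap(left, right):
    """Return the length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the longest overlap.
    for size in range(max_len, 0, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def report_overlaps(regions):
    """Return (left_index, right_index, overlap_length) for each overlapping adjacent pair.

    Assumption: only neighbouring regions in join order are compared.
    """
    report = []
    for index in range(len(regions) - 1):
        size = suffix_prefix_overlap(regions[index], regions[index + 1])
        if size > 0:
            report.append((index, index + 1, size))
    return report
```

Calling report_overlaps on the regions in the order they will be joined gives one entry per adjacent pair that shares sequence, which is roughly the kind of information an overlap file would record.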

Usage

Here is a sample session with union

The file 'cds.list' contains a list of the regions making up the coding sequence of 'embl:x65923':

The result is a normal sequence file containing a single sequence
resulting from the concatenation of the input sequences.

Data files

None.

Notes

union is most useful when the input sequences are specified in a "list file". A list file contains references to any number of sequences, which are retrieved from some other file or database. Each sequence reference is a Uniform Sequence Address (USA), which can include the specification of a sub-region of the sequence (for example, em:x65923[20:55]). Specifying several such sub-regions, from one sequence or from several, lets you join disjoint regions into a single sequence.
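
The effect of joining sub-regions can be sketched in Python. This is an illustration only, not how union is implemented: it assumes references of the form db:id[start:end] with 1-based, inclusive coordinates, and it reads from a hypothetical in-memory dictionary (SEQUENCES) rather than a real database. The names parse_reference and join_regions are also hypothetical.

```python
import re

# Hypothetical stand-in for a sequence database; a real run would fetch entries.
SEQUENCES = {"demo": "ACGTACGTAC"}

# Matches e.g. "em:demo[1:4]" or "demo"; the database prefix and range are optional.
REFERENCE = re.compile(
    r"^(?:(?P<db>\w+):)?(?P<id>\w+)(?:\[(?P<start>\d+):(?P<end>\d+)\])?$"
)


def parse_reference(ref):
    """Split a reference into (id, start, end); start and end are None if no range is given."""
    match = REFERENCE.match(ref.strip())
    if match is None:
        raise ValueError(f"unrecognised reference: {ref!r}")
    start = int(match.group("start")) if match.group("start") else None
    end = int(match.group("end")) if match.group("end") else None
    return match.group("id"), start, end


def join_regions(refs, sequences):
    """Concatenate the referenced regions in the order given."""
    parts = []
    for ref in refs:
        seq_id, start, end = parse_reference(ref)
        seq = sequences[seq_id]
        if start is None:
            parts.append(seq)  # no range: use the whole sequence
        else:
            # Assumed 1-based inclusive coordinates, converted to a Python slice.
            parts.append(seq[start - 1:end])
    return "".join(parts)
```

For example, join_regions(["em:demo[1:4]", "em:demo[7:10]"], SEQUENCES) takes positions 1 to 4 and 7 to 10 of the demo sequence and returns them as one string, skipping the two bases in between.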