The ALIGN compendium to the PRINTS protein fingerprint database is now
available via anonymous ftp from
s-ind2.dl.ac.uk pub/database/prints/align
ncbi.nlm.nih.gov repository/PRINTS/align
The readme is appended below.
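For those who prefer to script the retrieval, the following is a minimal sketch using Python's standard ftplib module. The host names and directories are the two listed above; the names MIRRORS and fetch_alignments, and the local folder "align", are illustrative assumptions. It also assumes the servers are reachable and that the remote directory holds only plain files.

```python
from ftplib import FTP
from pathlib import Path

# Hosts and directories as given in the announcement.
# Whether they are reachable is an assumption.
MIRRORS = {
    "s-ind2.dl.ac.uk": "pub/database/prints/align",
    "ncbi.nlm.nih.gov": "repository/PRINTS/align",
}


def fetch_alignments(host, directory, dest="align"):
    """Download every file in `directory` on `host` into the local folder `dest`."""
    target = Path(dest)
    target.mkdir(parents=True, exist_ok=True)
    with FTP(host) as ftp:
        ftp.login()          # no arguments means an anonymous login
        ftp.cwd(directory)
        # Assumes the listing contains only plain files, no subdirectories.
        for name in ftp.nlst():
            with open(target / name, "wb") as handle:
                ftp.retrbinary(f"RETR {name}", handle.write)
    return target
```

A call such as fetch_alignments("s-ind2.dl.ac.uk", MIRRORS["s-ind2.dl.ac.uk"]) would then place the alignment files in ./align.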
Alan Bleasby
SEQNET/EMBnet/BIONET
DRAL Daresbury Laboratory
Warrington WA4 4AD
UK
********************************************************
* *
* PRINTS COMPENDIUM OF PROTEIN SEQUENCE ALIGNMENTS *
* *
********************************************************
PRINTS5.0 ALIGNMENTS
Departments of Biochemistry & Molecular Biology
University College London, London WC1E 6BT, UK
The University of Leeds, Leeds LS2 9JT, UK
attwood at bsm.bioc.ucl.ac.uk
bmb5meb at biovax.leeds.ac.uk
Creation date: 2nd June 1994
Compiled by: T.K.ATTWOOD & M.E.BECK
This compendium of protein sequence alignments is a companion resource to
the PRINTS database of protein motif fingerprints [1]. For each entry in
PRINTS, we have made available a corresponding alignment in NBRF format: the
root name of each of these is identical to the PRINTS identification code.
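For readers without NBRF-aware software to hand, the sketch below shows one way such a file might be read in Python. It assumes the usual NBRF/PIR layout: a header line such as >P1;CODE, a one-line description, then sequence lines terminated by *, with - marking alignment gaps. The function name parse_nbrf is illustrative and is not part of PRINTS.

```python
def parse_nbrf(text):
    """Return a list of (code, description, sequence) tuples from NBRF/PIR text."""
    records = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if not line.startswith(">"):
            i += 1
            continue
        # Header looks like ">P1;CODE": a type code, a semicolon, the entry code.
        _type_code, _, code = line[1:].partition(";")
        description = lines[i + 1].strip() if i + 1 < len(lines) else ""
        i += 2
        chunks = []
        while i < len(lines):
            chunk = lines[i].strip()
            if chunk.startswith(">"):
                break  # missing terminator: leave the next header for the outer loop
            i += 1
            if chunk.endswith("*"):
                chunks.append(chunk[:-1])  # "*" marks the end of the sequence
                break
            chunks.append(chunk)
        # Remove any internal whitespace; gap characters ("-") are kept.
        sequence = "".join("".join(chunks).split())
        records.append((code.strip(), description, sequence))
    return records
```

Since each file holds an alignment, every sequence returned for one file should have the same length once gaps are included; comparing the lengths is a quick sanity check after parsing.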
Fingerprints are derived from groups of conserved motifs in multiple
alignments. These are used to dredge the OWL composite sequence database [2]
in an iterative fashion, so the fingerprint matures with each database pass
[3-5]. Further details of the nature and derivation of fingerprints are given
in the PRINTS readme and documentation files. Both starting alignments and
their resulting fingerprints thus stem directly from OWL.
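The iterative idea can be illustrated with a toy sketch in Python. This is not the ADSP implementation: the scoring (simple residue counts per motif position), the threshold expressed as a fraction of the maximum score, and all function names are assumptions made for illustration only. Motifs taken from a seed alignment are scanned against a small stand-in database, every sequence matching all motifs contributes its matched segments to the next pass, and the loop stops once the set of matching sequences no longer changes.

```python
from collections import Counter


def frequency_matrix(segments):
    """Count residues at each position of equal-length motif segments."""
    return [Counter(column) for column in zip(*segments)]


def best_hit(matrix, sequence):
    """Return (score, window) for the best-scoring window, or (0, None) if nothing scores."""
    width = len(matrix)
    best = (0, None)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[residue] for column, residue in zip(matrix, window))
        if score > best[0]:
            best = (score, window)
    return best


def iterate_fingerprint(seed_motifs, database, threshold=0.8, max_passes=10):
    """Rescan `database` (code -> sequence) until the set of matching codes is stable.

    seed_motifs is a list of motifs, each a list of equal-length segments.
    threshold is a fraction of the maximum possible score (an assumed criterion).
    """
    motifs = [list(segments) for segments in seed_motifs]
    members = set()
    for _ in range(max_passes):
        matrices = [frequency_matrix(segments) for segments in motifs]
        new_motifs = [[] for _ in motifs]
        new_members = set()
        for code, sequence in database.items():
            hits = [best_hit(matrix, sequence) for matrix in matrices]
            # A sequence matches only if every motif scores above the threshold.
            matched = all(
                score >= threshold * len(segments) * len(matrix)
                for (score, _), segments, matrix in zip(hits, motifs, matrices)
            )
            if matched:
                new_members.add(code)
                for collected, (_, window) in zip(new_motifs, hits):
                    collected.append(window)
        if not new_members or new_members == members:
            break  # nothing matched, or membership is stable
        members, motifs = new_members, new_motifs
    return sorted(members), motifs
```

In this sketch the motifs widen from the seed segments to include every matched segment, which is the sense in which a fingerprint "matures" from one pass to the next.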
Within OWL, sequences retain the database identification codes of their primary
sources (except those from NRL-3D, which for convenience are prefixed by NRL_).
These codes often change between source releases, so alignments (and their
fingerprints) derived from early versions of OWL will include original rather
than current database codes. A simple method for retrieving the current code,
if this should prove desirable, is to use OWL's query language DELPHOS, which
is accessible from SEQNET (e.g. within DELPHOS type: /info seq "string" , where
`string' is part of the sequence whose current code you wish to retrieve).
Disclaimer
----------
The alignments are, in the main, only intended to be reliable in the regions
from which fingerprints have been defined, although many are complete over the
full sequence length. Each has been generated manually, using either SOMAP [6]
(part of the ADSP suite [3]), XALIGN or VISTAS [7]. We make no claims for their
`correctness' (if such a thing exists), but provide them in good faith as a
guide to, or as an illustration of, the type of protein families contained in
PRINTS. We hope they will be of use to those wishing to augment the information
contained in PRINTS, or to others who simply seek a convenient starting point
for their own analyses - the files should be accessible to any software that
reads NBRF format.
VISTAS and XALIGN will shortly be available from the DRAL SEQNET service.
References
----------
1. Attwood, T.K. and Beck, M.E. (1994) PRINTS - A protein motif fingerprint
database. Protein Engineering, 7 (7), in press.
2. Bleasby, A.J. and Wootton, J.C. (1990) Constructing validated, non-
redundant composite protein sequence databases. Protein Engineering, 3 (3),
153-159.
3. Parry-Smith, D.J. and Attwood, T.K. (1992) ADSP - A new package for
computational sequence analysis. CABIOS, 8 (5), 451-459.
4. Attwood, T.K. and Findlay, J.B.C. (1994) Fingerprinting G-protein-coupled
receptors. Protein Engineering, 7 (2), 195-203.
5. Attwood, T.K. and Findlay, J.B.C. (1993) Design of a discriminating
fingerprint for G-protein-coupled receptors. Protein Engineering, 6 (2),
167-176.
6. Parry-Smith, D.J. and Attwood, T.K. (1991) SOMAP - A novel interactive
approach to multiple protein sequence alignment. CABIOS, 7 (2), 233-235.
7. Perkins, D.N. and Attwood, T.K. VISTAS - A package for VIsualising
STructures And Sequences. In preparation.