
Computing meta-homologs

Fri, 05/23/2014 - 20:18 — lpryszcz

Trees

MetaPhOrs combines information from multiple strains into a single meta-proteome for each species. As a result, the phylogenetic signals from multiple strains of one species present in a given tree are counted multiple times, and the number of trees reported in the orthology tables may be slightly larger than the number of trees retrieved on the tree page.

Consistency score

Orthology/paralogy assignment in MetaPhOrs is based on the Consistency Score (CS), which ranges from 0 to 1. In brief, the closer the CS is to 1.0, the more confident the prediction.

The consistency score is the ratio of the number of trees confirming a given relationship to the total number of trees used to infer the relationship between a particular protein pair. The orthology Consistency Score (CSo) is calculated for orthology searches and the paralogy Consistency Score (CSp) for paralogy queries, as follows:

CSo = To / (To + Tp)

CSp = Tp / (To + Tp)

where:

To is the number of trees confirming an orthology relationship

Tp is the number of trees confirming a paralogy relationship.

The recommended CSo threshold for orthology prediction is 0.5. The user can change the CS cut-off to adjust the sensitivity and specificity of each query. A CS cut-off of 0.0 returns all homology relationships, while a CS cut-off of 1.0 returns only fully consistent predictions.
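The formulas and the cut-off behaviour described above can be illustrated with a short Python sketch. It is not MetaPhOrs code: the function names consistency_scores and passes_cutoff are hypothetical, and the inclusive comparison (CS greater than or equal to the cut-off) is an assumption chosen so that a cut-off of 0.0 keeps every pair and a cut-off of 1.0 keeps only fully consistent ones.

```python
def consistency_scores(t_o, t_p):
    """Return (CSo, CSp) for one protein pair.

    t_o: number of trees confirming orthology (To)
    t_p: number of trees confirming paralogy (Tp)
    """
    total = t_o + t_p
    if total == 0:
        # No tree covers this pair, so the ratio is undefined.
        raise ValueError("no trees available for this protein pair")
    return t_o / total, t_p / total


def passes_cutoff(cs, cutoff=0.5):
    """Assumed inclusive filter: keep a prediction when CS >= cutoff."""
    return cs >= cutoff


# Example: 8 trees support orthology, 2 support paralogy.
cso, csp = consistency_scores(8, 2)
print(cso, csp)            # 0.8 0.2
print(passes_cutoff(cso))  # True with the recommended 0.5 threshold
```

Because both scores share the denominator To + Tp, CSo and CSp always sum to 1 for a given protein pair.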

Evidence level

The evidence level is the number of independent sources (databases) in which trees confirming a prediction have been found. In general, the higher the evidence level, the more reliable the prediction, as more sources were used to infer it.

The evidence level ranges from 1 to 12 (as trees were retrieved from 12 databases). The Evidence Level cut-off has to be altered with care, as the external databases partially overlap, and for some pairs of species there is only one source of data (Evidence Level of 1). It is recommended to start queries with an Evidence Level cut-off of 1 and then increase it if needed.

Note that in the first releases (200909 and 200911), the evidence level counted different phylomes as independent sources. From release 201405 on, only different databases are counted.
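The difference between the two counting rules can be sketched in Python as follows. This is only an illustration under stated assumptions: the function evidence_level, the (database, phylome) tuple layout, and the names db_A, db_B, phylome_1 and phylome_2 are all hypothetical placeholders, not actual MetaPhOrs identifiers or source databases.

```python
def evidence_level(confirming_trees, count_phylomes=False):
    """Count independent sources among the trees confirming a prediction.

    confirming_trees: iterable of (database, phylome) tuples
                      (hypothetical layout, for illustration only).
    count_phylomes:   False counts distinct databases, as described for
                      release 201405 onward; True counts each distinct
                      phylome separately, as described for the first releases.
    """
    if count_phylomes:
        return len({(db, phylome) for db, phylome in confirming_trees})
    return len({db for db, _ in confirming_trees})


# Three confirming trees: two phylomes from one database, one tree from another.
trees = [("db_A", "phylome_1"), ("db_A", "phylome_2"), ("db_B", None)]
print(evidence_level(trees))                       # 2
print(evidence_level(trees, count_phylomes=True))  # 3
```

Under the database-only rule the same prediction receives a lower evidence level, so cut-offs chosen for the early releases may be stricter than intended when applied to later ones.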