Relationship between phylogenetic distribution and genomic features in Neurospora crassa.

In the post-genome era, insufficient functional annotation of predicted genes greatly restricts the potential of mining genome data. We demonstrate that an evolutionary approach, which is independent of functional annotation, has great potential as a tool for genome analysis. We chose the genome of the model filamentous fungus Neurospora crassa as an example. The phylogenetic distribution of each predicted protein-coding gene (PCG) in the N. crassa genome was used to classify genes into six mutually exclusive lineage specificity (LS) groups: Eukaryote/Prokaryote-core, Dikarya-core, Ascomycota-core, Pezizomycotina-specific, N. crassa-orphans and Others. Functional category analysis revealed that only approximately 23% of PCGs in the two most lineage-specific groups, Pezizomycotina-specific and N. crassa-orphans, have functional annotation. In contrast, approximately 76% of PCGs in the remaining four LS groups have functional annotation. Analysis of chromosomal localization showed that N. crassa-orphan PCGs and genes encoding secreted proteins are enriched in subtelomeric regions. The origin of N. crassa-orphans is not known. We found that 11% of N. crassa-orphans have paralogous N. crassa-orphan genes. Of the paralogous N. crassa-orphan gene pairs, 33% are tandemly located in the genome, suggesting that gene duplication contributed to the origin of N. crassa-orphan PCGs. LS grouping is thus a useful tool to explore and understand genome organization, evolution and gene function in fungi.
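
The LS grouping described above can be illustrated with a minimal sketch. The abstract does not state the exact assignment criteria, so the decision rules below are assumptions chosen only to show how a set of homolog detections can be mapped to six mutually exclusive groups. The taxon labels, the function name `classify_ls_group`, and the toy gene identifiers are all hypothetical.

```python
from collections import Counter

# Hypothetical taxon labels: where a homolog of an N. crassa PCG was detected.
FUNGAL = {"pezizomycotina", "other_ascomycota", "basidiomycota"}
OUTSIDE_FUNGI = {"non_fungal_eukaryote", "prokaryote"}


def classify_ls_group(hits):
    """Map the set of taxa with a detected homolog to one LS group.

    The rules are illustrative assumptions, not the published criteria.
    Each input falls into exactly one branch, so groups are mutually exclusive.
    """
    hits = frozenset(hits)
    if not hits:
        # No homolog detected outside N. crassa itself.
        return "N. crassa-orphans"
    if hits == {"pezizomycotina"}:
        return "Pezizomycotina-specific"
    if hits == {"pezizomycotina", "other_ascomycota"}:
        return "Ascomycota-core"
    if hits == FUNGAL:
        return "Dikarya-core"
    if FUNGAL <= hits and hits & OUTSIDE_FUNGI:
        # Present across the fungal taxa and in at least one non-fungal taxon.
        return "Eukaryote/Prokaryote-core"
    # Patchy distributions that fit none of the patterns above.
    return "Others"


# Toy input with made-up gene identifiers, only to show the call pattern.
toy_hits = {
    "gene_a": set(),
    "gene_b": {"pezizomycotina"},
    "gene_c": FUNGAL | {"prokaryote"},
    "gene_d": {"basidiomycota"},
}
group_counts = Counter(classify_ls_group(h) for h in toy_hits.values())
```

Because every gene receives exactly one label, the group sizes sum to the number of PCGs, which is what allows per-group comparisons such as the fraction of annotated genes in each LS group.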