Abstract

Background:
With the development of rapid and inexpensive DNA sequencing, the genome sequences of more than 100 fungal species have been made available. This dataset provides an excellent resource for comparative genomics analyses, which can be used to discover genetic elements, including noncoding RNAs (ncRNAs). Bioinformatics tools similar to those used to uncover novel ncRNAs in bacteria should likewise be useful for searching fungal genomic sequences, and the relative ease of genetic experiments with some model fungal species could facilitate experimental validation studies.

Results:
We have adapted a bioinformatics pipeline for discovering bacterial ncRNAs to systematically analyze many fungal genomes. This comparative genomics pipeline integrates information on conserved RNA sequence and structural features with alternative splicing information to reveal fungal RNA motifs that are candidate regulatory domains, or that might have other possible functions. A total of 15 prominent classes of structured ncRNA candidates were identified, including variant HDV self-cleaving ribozyme representatives, atypical snoRNA candidates, and possible structured antisense RNA motifs. Candidate regulatory motifs were also found associated with genes for ribosomal proteins, S-adenosylmethionine decarboxylase (SDC), amidase, and HexA protein involved in Woronin body formation. We experimentally confirm that the variant HDV ribozymes undergo rapid self-cleavage, and we demonstrate that the SDC RNA motif reduces the expression of SAM decarboxylase by translational repression. Furthermore, we provide evidence that several other motifs discovered in this study are likely to be functional ncRNA elements.

Conclusions:
Systematic screening of fungal genomes using a computational discovery pipeline has revealed the existence of a variety of novel structured ncRNAs. Genome contexts and similarities to known ncRNA motifs provide strong evidence for the biological and biochemical functions of some newly found ncRNA motifs. Although initial examinations of several motifs provide evidence for their likely functions, other motifs will require more in-depth analysis to reveal their functions.

Figures

Fig. 1

Schematic representation of the workflow for the discovery of novel fungal ncRNA candidates. Noncoding IGRs and intron sequences were analyzed by BLAST to identify clusters of DNA sequences that are similar. The clusters were reduced in number by removing examples that are similar to known RNA motifs present in the Rfam database, and by searching for unannotated protein coding regions by using RNAcode [38]. The remaining clusters were examined by using CMfinder [28] to seek evidence for nucleotide sequence covariation and to develop preliminary secondary structure models. Pre-candidate RNA motifs (P1 designates pairing element 1; arrow identifies covariation site) with only a few covariations and few representatives were discarded. Finally, Infernal was used to search for additional representatives that might have been missed in the initial search, and to refine the consensus sequence and secondary structure model. The refined candidate ncRNA depicted is the novel motif rpl7 reported in this study
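The filtering steps in this workflow can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Cluster` fields, the `keep_candidates` helper, and both numeric thresholds are hypothetical, since the caption only says that motifs with "a few" covariations and "few" representatives were discarded. The upstream tools (BLAST, Rfam comparison, RNAcode, CMfinder) are assumed to have already produced the per-cluster flags and counts.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """One cluster of similar IGR or intron sequences (hypothetical record)."""
    name: str
    matches_rfam: bool      # similar to a known RNA motif in Rfam
    looks_coding: bool      # flagged as a likely unannotated coding region
    covariations: int       # covarying base-pair positions in the structure model
    representatives: int    # number of sequences supporting the motif


# Assumed cutoffs for illustration only; the source gives no numbers.
MIN_COVARIATIONS = 3
MIN_REPRESENTATIVES = 5


def keep_candidates(clusters):
    """Return clusters that survive the known-motif, coding, and support filters."""
    kept = []
    for c in clusters:
        if c.matches_rfam or c.looks_coding:
            continue  # already known, or probably not a noncoding RNA
        if c.covariations < MIN_COVARIATIONS:
            continue  # too little evidence for a conserved structure
        if c.representatives < MIN_REPRESENTATIVES:
            continue  # too few examples to trust the consensus
        kept.append(c)
    return kept
```

Surviving clusters would then go on to the homology-search and refinement step that the caption attributes to Infernal.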

Fig. 2

HDV ribozyme consensus models and the characteristics of a newly found HDV ribozyme representative from fungi. a General consensus for HDV self-cleaving ribozymes as reported previously [49]. N designates any nucleotide, H designates adenosine, cytidine or uridine, and parentheses identify optional nucleotides. Solid lines indicate zero-length connectors with the exception of a variable-length connector labeled var. The arrowhead identifies the site of cleavage (Clv). b Consensus sequence and secondary structure model for HDV ribozyme variants identified in fungi. Yellow boxes encompass nucleotides and structures that are different from the general consensus depicted in a, including a “loop E” motif. Other annotations are as described for Fig. 1. c Sequence and predicted secondary structure of a bimolecular ribozyme construct derived from an HDV ribozyme from the fungus P. chrysogenum. An 18-nucleotide “substrate” strand was separated from a larger “ribozyme” strand by disconnecting the two sub-domains at the junction between P1 and P2. Nucleotides depicted in red match the highly conserved positions in the fungal HDV ribozyme consensus in Fig. 2b. d PAGE separation of 5′ 32P-labeled substrate RNA after partial digestion with RNase T1 (T1), partial digestion with alkali (–OH), or incubation with 100 nM of the ribozyme strand (R) of the P. chrysogenum bimolecular construct under permissive reaction conditions (see Methods). NR designates no reaction. Bands corresponding to the substrate (S), 5′ cleavage product (5′ Clv) and various products generated by RNase T1 cleavage after G residues are also identified
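The consensus notation in panel a (N for any nucleotide, H for A, C or U, parentheses for optional nucleotides) maps directly onto a regular expression. The sketch below shows that mapping only; the function names and the example pattern `G(N)H` are hypothetical and are not taken from the published consensus, and base pairing and connectors are not modeled.

```python
import re

# Symbols as defined in the caption; other IUPAC codes are left out on purpose.
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "N": "[ACGU]",   # any nucleotide
    "H": "[ACU]",    # adenosine, cytidine or uridine
    "(": "(?:",      # start of an optional stretch
    ")": ")?",       # end of an optional stretch
}


def consensus_to_regex(consensus):
    """Translate a linear consensus string into a compiled regular expression."""
    return re.compile("".join(SYMBOLS[ch] for ch in consensus))


def matches(consensus, sequence):
    """True if the whole RNA sequence fits the linear consensus."""
    return consensus_to_regex(consensus).fullmatch(sequence) is not None
```

For example, `matches("G(N)H", "GAU")` and `matches("G(N)H", "GU")` both hold, while `matches("G(N)H", "GAG")` does not, because G is excluded by H.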

Fig. 3

Sequence, structure and activity of the bimolecular HDV construct from Aspergillus niger. a Sequence and secondary structure model for an HDV variant ribozyme from A. niger. Annotations are as described for Fig. 2c. b PAGE separation of 5′ 32P-labeled substrate RNA after partial digestion with RNase T1 (T1), partial digestion with alkali (–OH), or incubation with 100 nM of the ribozyme strand (R) of the A. niger bimolecular construct under permissive reaction conditions (see Materials and methods). NR designates no reaction. Bands corresponding to the substrate (S), 5′ cleavage product (5′ Clv) and various products generated by RNase T1 cleavage after G residues are also identified. c Mass spectrum analysis of the cleavage products. Peaks that are close to calculated masses of the 9-nucleotide 5′ Clv and 10-nucleotide 3′ Clv products are noted on the graph. The calculated (calc.) and observed (obs.) masses for the cleavage products are listed

Fig. 4

Structure and gene control function of the SDC motif. a Consensus model depicting the conserved sequences and predicted secondary structure of SDC motif RNAs. Annotations are as described in Fig. 1. b Sequence and secondary structure model for the SDC motif representative from N. crassa. Nucleotides depicted in red correspond to the most highly conserved nucleotides present in the consensus sequence in a. M1 through M5 identify nucleotide differences at the positions indicated in mutant constructs used to assess the importance of the P1 stem to gene expression. c Schematic representation of the genetic elements present near the SDC motif, including the location of the luciferase reporter gene used for RT-PCR and reporter-fusion gene expression assays. Arrows identify primer binding sites used for RT-PCR. Dashed lines identify splicing variations using one of the two 5′ splice sites (GU) and the 3′ splice site (AG) that can convert the precursor mRNA (Pre) into the alternative splicing products Sp-I and Sp-II. The graphic is not drawn to scale. d Agarose gel separation of RT-PCR products generated from SDC reporter fusion transcript in N. crassa. The absence (–) or presence (+) of reverse transcriptase (RT) in the assay is indicated. The asterisk denotes an RT-PCR product whose identity was not confirmed by DNA sequencing. M indicates double-stranded DNA markers. The two images depict neighboring parts of the same gel. e Gene expression of wild-type (WT) and mutant SDC reporter-fusion constructs. Relative light units were normalized to WT (value of 1). The values are an average of three independent replicates, and error bars represent standard deviation
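The normalization described for panel e can be sketched as follows. This is one plausible reading, not the authors' stated procedure: it assumes each replicate is divided by the mean WT reading and that the error bars are sample standard deviations. The function name and the input layout are hypothetical.

```python
from statistics import mean, stdev


def normalize_to_wt(rlu, wt_key="WT"):
    """Scale raw relative light units so that the WT mean equals 1.

    rlu maps a construct name to its replicate readings.
    Returns {name: (mean, sample standard deviation)} on the WT-relative scale.
    """
    wt_mean = mean(rlu[wt_key])
    summary = {}
    for name, readings in rlu.items():
        scaled = [value / wt_mean for value in readings]
        summary[name] = (mean(scaled), stdev(scaled))
    return summary
```

With three replicates per construct, `stdev` uses the n - 1 denominator; `pstdev` would be the substitute if a population standard deviation were intended.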

Fig. 5

Consensus sequence, structure, and putative uORF peptide products for the amd motif. RNA annotations are as described for Fig. 1. Plots for the peptide sequences are proportional to the amino acids denoted, where black, white and gray bars indicate the most- to least-common amino acid, in the order presented. Numbers in parentheses indicate the numbers of representatives predicted to code for 8- and 6-amino acid peptides

Fig. 6

The ies6 RNA motif and its expression. a Consensus sequence and secondary structure of the ies6 motif. Annotations are as described for Fig. 1. b Agarose gel separation of RT-PCR products generated by using primers specific for the sense or antisense transcripts as designated. Bands corresponding to the double-stranded DNA products derived from the sense and antisense RNA templates are indicated. M designates double-stranded DNA size markers, and lanes containing PCR products generated with (+) or without (–) the use of reverse transcriptase (RT) are identified. The asterisk identifies a spurious PCR product

Fig. 7

Consensus sequences and secondary structure models for the predicted RNA motifs a hexA, b SART-1, c AU-rich hairpin, d variant snoRNAs, and e group I ribozyme region. Annotations are as described for Fig. 1