A computational approach to predict scaffolding lncRNAs at large scale

The human transcriptome contains thousands of long non-coding RNAs (lncRNAs), and characterizing their function remains a challenge. An emerging concept is that lncRNAs serve as protein scaffolds, forming ribonucleoproteins and bringing proteins into proximity. However, only a few scaffolding lncRNAs have been characterized, and the prevalence of this function is unknown. Here, researchers from Aix-Marseille University and Inserm propose the first computational approach aimed at predicting scaffolding lncRNAs at large scale. They predicted the largest human lncRNA-protein interaction network to date using the catRAPID omics algorithm. Combining these predictions with tissue expression data and statistical approaches, the researchers identified 847 lncRNAs (∼5% of the long non-coding transcriptome) predicted to scaffold half of the known protein complexes and network modules. Lastly, they show that the association of certain lncRNAs with disease may involve their scaffolding ability. Overall, these results suggest for the first time that RNA-mediated scaffolding of protein complexes and modules may be a common mechanism in human cells.

Data production and analysis workflows

(A) Protein-lncRNA interactions (PRI) are predicted with catRAPID omics for the human proteome and long non-coding transcriptome, then filtered for co-presence of both partners in the same GTEx tissue. The resulting PRI network contains 6.02 million interactions. (B) Each lncRNA is tested against each protein group (complexes and network modules) for enrichment of its predicted protein targets among the group's members. After noise filtering, a final list of scaffolding lncRNA candidates is produced. (C) Principle of the enrichment test for a lncRNA's protein targets within a protein group. Node colors match those used in the "lncRNA association to protein groups" box in (B).
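
The tissue filter in (A) and the enrichment principle in (C) can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, a common choice for over-representation analysis; the summary above only mentions "statistical approaches", so this is not necessarily the authors' exact statistic. All function names, identifiers, and tissue labels are hypothetical and serve only as illustration.

```python
from math import comb


def co_present(protein_tissues, lncrna_tissues):
    """Tissue filter (assumed form): keep a predicted interaction only if
    the protein and the lncRNA share at least one tissue."""
    return bool(set(protein_tissues) & set(lncrna_tissues))


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N items without replacement from a
    population of M items, n of which are 'successes'."""
    upper = min(n, N)
    total = comb(M, N)
    favourable = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return favourable / total


def enrichment_pvalue(targets, group, background):
    """Test whether a lncRNA's predicted protein targets are
    over-represented in a protein group (complex or network module).

    Returns (overlap size, one-sided p-value).
    """
    background = set(background)
    targets = set(targets) & background  # ignore proteins outside the background
    group = set(group) & background
    overlap = len(targets & group)
    p = hypergeom_sf(overlap, len(background), len(targets), len(group))
    return overlap, p


# Toy data: 20 background proteins, 5 predicted targets, a 4-protein complex.
background = [f"P{i}" for i in range(1, 21)]
targets = ["P1", "P2", "P3", "P4", "P5"]
complex_members = ["P1", "P2", "P3", "P4"]

overlap, p_value = enrichment_pvalue(targets, complex_members, background)
print(overlap, p_value)
```

In a full analysis every lncRNA would be tested against every protein group, so the resulting p-values would need multiple-testing correction before any candidate list is drawn up. That step is left out of this sketch.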