Abstract:

Eukaryotic cellular programs are context dependent, and gene regulation exerts this control through layers of interacting molecules. Genetic variation can alter how these molecules behave by introducing mutations in their underlying DNA sequence. Such variation may affect only one copy of a gene, producing allele-specific expression (ASE). This work investigates the effect of genetic variation on genes exhibiting ASE in order to identify variants with a potential regulatory role. It shortlists and characterizes variants located near protein-coding regions and within introns. The analysis relies on RNA-sequencing and SNP-array data from primary white blood cells of eight healthy European individuals, evaluated in two sets of samples: cells in their basal state and cells after treatment with lipopolysaccharide (LPS). The proposed model first shortlists exonic variants representative of ASE, then pairs them with intronic and upstream variants for linkage disequilibrium analysis. Intronic and upstream variants that co-segregate with an ASE exonic variant at a correlation greater than 0.80 are selected as potential regulatory variants. This process yielded 546 intronic and 80 upstream variants, of which 28.20% and 31.25%, respectively, corresponded to known regulatory elements according to the Ensembl Variant Effect Predictor. The selected variants are also enriched for terms describing an immune response, most clearly in the LPS samples, which reacted as if under a bacterial infection. Finally, the selected upstream variants occur more often proximal to the core promoter than toward the upper limit of 10 kb.
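
The pairing and filtering step can be illustrated with a minimal sketch. It is not the pipeline used in this work: it assumes that the correlation is the squared Pearson correlation between genotype dosages (0, 1 or 2 copies of the alternate allele) across the eight individuals, and that every ASE exonic variant is compared against every candidate variant. The function names, variant identifiers and genotypes below are hypothetical toy values chosen only to show the thresholding at 0.80.

```python
from statistics import mean


def genotype_r2(a, b):
    """Squared Pearson correlation between two genotype dosage vectors.

    Assumption: dosages are coded 0/1/2 and both vectors list the same
    individuals in the same order.
    """
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A variant with no variation in the sample carries no LD signal.
        return 0.0
    return cov * cov / (var_a * var_b)


def select_candidates(ase_exonic, candidates, threshold=0.80):
    """Return (exonic_id, candidate_id, r2) for pairs with r2 above threshold."""
    selected = []
    for exon_id, exon_gt in ase_exonic.items():
        for cand_id, cand_gt in candidates.items():
            r2 = genotype_r2(exon_gt, cand_gt)
            if r2 > threshold:
                selected.append((exon_id, cand_id, r2))
    return selected


# Hypothetical toy genotypes for eight individuals.
ase_exonic = {"exon_var_1": [0, 1, 2, 1, 0, 2, 1, 0]}
candidates = {
    "intron_var_A": [0, 1, 2, 1, 0, 2, 1, 0],    # co-segregates with exon_var_1
    "upstream_var_B": [2, 0, 1, 1, 0, 0, 2, 1],  # largely independent
}

selected = select_candidates(ase_exonic, candidates)
for exon_id, cand_id, r2 in selected:
    print(f"{exon_id} ~ {cand_id}: r2 = {r2:.2f}")
```

In this toy input only the intronic variant passes the filter. A real analysis would typically restrict pairs to variants of the same gene and might estimate linkage disequilibrium from phased haplotypes rather than from dosages; both choices are left out of the sketch.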