Dezfouli, Mahya

Abstract [en]

The work presented in this thesis describes methodologies developed for the integration and accurate interpretation of barcoded DNA, to empower large-scale omics analysis. The main objective is to enable multiplexed proteomic measurements in a high-throughput format through DNA barcoding and massively parallel sequencing. The thesis is based on four scientific papers that focus on three main aims: (i) to prepare reagents for large-scale affinity proteomics, (ii) to present technical advances in barcoding systems for parallel protein detection, and (iii) to address challenges in complex sequencing data analysis.

In the first part, bio-conjugation of antibodies is assessed at significantly downscaled reagent quantities. This allows affinity binders to be selected without requiring that they be available in large amounts or be free of amine-containing buffers and stabilizers (Paper I). This is followed by DNA barcoding of antibodies using minimal reagent quantities. The procedure additionally enables efficient purification of barcoded antibodies from residual free DNA, which improves the sensitivity and accuracy of the subsequent measurements (Paper II). Because it uses a solid-phase approach on magnetic beads, the procedure is suited to a high-throughput set-up that can be automated. Subsequently, the applicability of the prepared bio-conjugates for parallel protein detection is demonstrated in different types of standard immunoassays (Papers I and II).

In the second part, the method immuno-sequencing (I-Seq) is presented for DNA-mediated protein detection using barcoded antibodies. I-Seq achieved the detection of clinically relevant proteins in human blood plasma by parallel DNA readout (Paper II). The methodology is further developed to track antibody-antigen interaction events on suspension bead arrays within barcoded emulsion droplets (Paper III). The method, denoted compartmentalized immuno-sequencing (cI-Seq), is capable of specific detection with paired antibodies and can provide detailed information on joint recognition events.

Recent technical progress in DNA sequencing has increased the interest in large-scale studies that analyze a higher number of samples in parallel. The third part of this thesis focuses on addressing the challenges of large-scale sequencing analysis. Decoding of a large DNA-barcoded data set is presented, aiming at phase-defined sequence investigation of canine MHC loci in over 3000 samples (Paper IV). The analysis revealed new single nucleotide variations and a notable number of novel haplotypes for the second exon of DLA DRB1.

Taken together, this thesis demonstrates emerging applications of barcoded sequencing in protein and DNA detection. Improvements in barcoding systems for assay parallelization, de-convolution of antigen-antibody interactions, sequence variant analysis, and large-scale data interpretation would help biomedical studies achieve a deeper understanding of biological processes. The future of the developed methodologies may therefore lie in advancing large-scale omics investigations, particularly in the promising field of DNA-mediated proteomics, for highly multiplexed studies of numerous samples at a notably improved molecular resolution.