Date of approval

Abstract

Background and aims: Cell factories are currently used to produce biochemicals for the nutrition, pharmaceutical, medical, and energy industries, and interest in them is growing in these and other sectors. Typical cell factory products are complex biochemicals that are difficult to obtain by other means, so robust ways of controlling the flux of carbon through the metabolic reactions of the cell are required. In this work, we propose a three-step synthetic biology workflow for pathway optimization in Saccharomyces cerevisiae, using the tryptophan pathway as an example. First, metabolic modelling is performed on the chosen metabolic pathway to identify suitable genes to regulate. Second, the promoters of the selected genes are replaced by synthetic ones, which sets gene expression to a chosen level, and the resulting S. cerevisiae strains are screened for production of the chosen molecule. Third, machine learning is applied: a regression model is fitted to the data from the screening measurements, and new strains are constructed according to the predictions of the model.

Methods: We identified tryptophan pathway enzymes worth regulating by running flux balance analysis (FBA) on a stoichiometric metabolic model of yeast. The capability of the system to grow and produce tryptophan under anaerobic conditions was investigated in simulations where essential fatty acids were left unconstrained. Using CRISPR/Cas9 genome editing, the promoters of the three selected genes were replaced by synthetic ones in a randomized manner. The promoters were designed to carry 0-8 binding sites for the bacterial LexA domain, which in turn is part of a chimeric synthetic transcription factor. We used an S. cerevisiae strain harbouring a VioABE cassette to convert tryptophan into our screening compound, the green pigment prodeoxyviolacein. We obtained yeast strains with clearly different levels of prodeoxyviolacein and screened them in an in-house robot facility. Attempts to determine the genotypes of the produced strains by capillary electrophoresis were unsuccessful, so we fitted linear models and generalized additive models (GAMs) using simulated genotypes.
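The model-fitting step can be illustrated with a minimal sketch, which is not the implementation used in this work. It assumes that a genotype is encoded as the number of LexA binding sites (0-8) in the promoter of each of the three target genes, draws simulated genotypes at random, and fits an ordinary least-squares linear model with NumPy. The function names, the random seed, and the coefficients used to generate the absorbance values are hypothetical and chosen only for illustration; they are not measured results.

```python
import random
import numpy as np

GENES = ("G6PDH", "PGI1", "TKL1")  # regulation targets named in the Results
MAX_SITES = 8                      # promoters carry 0-8 LexA binding sites

def simulate_genotypes(n_strains, seed=0):
    """Draw a random binding-site count per gene for each strain (assumed encoding)."""
    rng = random.Random(seed)
    return [[rng.randint(0, MAX_SITES) for _ in GENES] for _ in range(n_strains)]

def fit_linear(genotypes, absorbance):
    """Ordinary least squares: absorbance ~ intercept + one slope per gene."""
    X = np.column_stack([np.ones(len(genotypes)),
                         np.asarray(genotypes, dtype=float)])
    y = np.asarray(absorbance, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict(coef, genotype):
    """Predicted absorbance for one genotype under the fitted linear model."""
    return float(coef[0] + np.dot(coef[1:], genotype))

# Simulated genotypes for as many strains as were screened (268).
genotypes = simulate_genotypes(268)

# Hypothetical, noise-free absorbance values, invented for illustration only.
true_coef = np.array([1.0, 0.30, -0.10, 0.05])
absorbance = [predict(true_coef, g) for g in genotypes]

coef = fit_linear(genotypes, absorbance)
```

With real screening data, `absorbance` would hold the measured values and `genotypes` the experimentally determined binding-site counts. A generalized additive model would replace the straight-line terms with smooth functions of each count.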

Results: We identified three genes, G6PDH, PGI1, and TKL1, as suitable regulation targets on the tryptophan pathway. In two screening rounds we obtained a total of 268 S. cerevisiae strains that produce different levels of tryptophan, the highest showing a 4.5-fold change in absorbance relative to the parental strain. Using simulated genotypes, we implemented three regression models as the basis for a machine learning program that can predict the best promoter combinations once it is supplied with real genotypes.

Conclusions: Based on our experiments, the FBA-derived predictions of genes to regulate appeared to be valid, and to our current knowledge we succeeded in implementing three simultaneous CRISPR/Cas9-mediated genome edits to modify the promoters of the selected genes. Generalized additive models described the absorbance values well with the simulated genotypes. In the future, the workflow must nonetheless be refined by determining and validating the genotypes derived from the CRISPR/Cas9 experiments, so that the machine learning models can be used to predict the genotypes for the next round of screening.