Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, United States of America.


Department of Pediatrics, University of California, San Diego, La Jolla, CA, United States of America.

Abstract

Genome-scale models of metabolism and macromolecular expression (ME-models) explicitly compute the optimal proteome composition of a growing cell. ME-models expand upon the well-established genome-scale models of metabolism (M-models), and they enable a new fundamental understanding of cellular growth. Because they include the biosynthetic costs of the machinery of life, ME-models offer greater predictive scope and accuracy than M-models, but at the price of a significant increase in model size and complexity. The resulting models are both difficult to compute and challenging to understand conceptually. As a result, ME-models exist for only two organisms (Escherichia coli and Thermotoga maritima) and are still used by relatively few researchers. To address these challenges, we have developed a new software framework called COBRAme for building and simulating ME-models. It is written in Python and built on COBRApy, a popular platform for using M-models. COBRAme streamlines the computation and analysis of ME-models, and it provides tools that simplify constructing and editing ME-models so that they can be reconstructed for new organisms. We used COBRAme to reconstruct a condensed E. coli ME-model called iJL1678b-ME. This reformulated model gives functionally identical solutions to previous E. coli ME-models while using one-sixth the number of free variables and solving in less than 10 minutes, a marked improvement over the 6-hour solve time of previous ME-model formulations. Errors in previous ME-models were also corrected, leading to 52 additional genes that must be expressed in iJL1678b-ME to grow aerobically in glucose minimal in silico media. This manuscript outlines the architecture of COBRAme and demonstrates how ME-models can be created, modified, and shared most efficiently using the new software framework.

Multi-scale processes modeled in an ME-model, depicted in a dividing E. coli cell.

ME-models expand upon underlying M-models by explicitly accounting for the reactions involved in expressing the genes required to catalyze enzymatic processes. The synthesis of each major macromolecule is coupled to the reaction in which it participates by accounting for its dilution to daughter cells during cell division. Each dilution is a function of growth rate (μ).

The flow of information from input data to the ME-model, as facilitated by the ‘build_me_model’ script.

The ‘build_me_model’ workflow uses the ECOLIme package to load and process the E. coli M-model along with all supplied files that define gene expression processes and reactions. This information is then used to populate the different ProcessData classes (shown in turquoise boxes) and link them to the appropriate MEReaction classes (shown in red ovals), all of which are defined in the COBRAme package. Together, the MEReactions make up a working ME-model. Not all input data, ProcessData classes, and MEReaction classes are shown; for a complete list, see the COBRAme Documentation.
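The separation between data holders and the reactions built from them can be illustrated with a small, self-contained sketch. The class names `ToyComplexData` and `ToyComplexFormation`, their fields, and the `update` method are hypothetical stand-ins chosen for illustration and are not the COBRAme API; the sketch only shows the general idea that a reaction derives its stoichiometry from a linked data object.

```python
from dataclasses import dataclass, field


@dataclass
class ToyComplexData:
    """Hypothetical data holder: which subunits form a complex."""
    id: str
    subunits: dict  # protein id -> copies per complex


@dataclass
class ToyComplexFormation:
    """Hypothetical reaction whose stoichiometry is derived from linked data."""
    id: str
    data: ToyComplexData
    metabolites: dict = field(default_factory=dict)

    def update(self):
        # Consume each subunit and produce one copy of the complex.
        stoich = {p: -n for p, n in self.data.subunits.items()}
        stoich[self.data.id] = 1
        self.metabolites = stoich


# Made-up complex and subunit names, for illustration only.
complex_data = ToyComplexData("complex_X", {"protein_A": 2})
formation = ToyComplexFormation("formation_complex_X", complex_data)
formation.update()
initial = dict(formation.metabolites)

# Editing the data object, then updating, propagates to the reaction.
complex_data.subunits["protein_B"] = 1
formation.update()
```

In a design of this kind, information is edited in one place (the data object) and every reaction that links to it can be rebuilt from that single source.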

Previous ME-models implemented coupling constraints explicitly as model pseudo-metabolites. COBRAme instead accounts for the dilution of coupled macromolecules to the daughter cell by applying a coupling coefficient directly in the reaction in which the macromolecule is used. For example, for the metabolic reaction shown above, a small amount (μ/k_eff) of the catalyzing enzyme is consumed by the reaction it catalyzes. In other words, for a given flux v_metabolic_reaction carried by the metabolic reaction, (μ/k_eff) · v_metabolic_reaction of the catalyzing enzyme must be synthesized. A subset of the major macromolecular couplings in iJL1678b-ME is also shown, along with their representation in the ME-matrix. See the COBRAme Documentation for derivations and further explanation of the coupling coefficients.
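The arithmetic behind the enzyme coupling can be made concrete with a minimal sketch. The function names, metabolite identifiers, and all numerical values below are assumptions made for illustration and are not taken from iJL1678b-ME or from the COBRAme code; the growth rate and the effective turnover rate are both assumed to be expressed per hour so that their ratio is dimensionless.

```python
def enzyme_coupling_coefficient(mu, keff):
    """Enzyme consumed per unit of metabolic flux: mu / keff.

    mu   -- growth rate in 1/hr
    keff -- effective turnover rate in 1/hr (assumed units for this sketch)
    """
    if keff <= 0:
        raise ValueError("keff must be positive")
    return mu / keff


def coupled_stoichiometry(base, enzyme_id, mu, keff):
    """Return a copy of `base` with the enzyme consumed at mu/keff per unit flux."""
    stoich = dict(base)
    stoich[enzyme_id] = -enzyme_coupling_coefficient(mu, keff)
    return stoich


# Illustrative numbers only.
mu = 0.5                 # growth rate, 1/hr
keff = 65.0 * 3600.0     # 65 1/s converted to 1/hr
pgi = coupled_stoichiometry({"g6p_c": -1.0, "f6p_c": 1.0}, "enzyme_Pgi", mu, keff)

flux = 8.0               # mmol/gDW/hr carried by the metabolic reaction
enzyme_synthesis_required = -pgi["enzyme_Pgi"] * flux  # mmol/gDW/hr
```

Because the coefficient sits in the reaction's own stoichiometry, no separate pseudo-metabolite or constraint row is needed for the enzyme in this toy representation.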

Flux variability analysis of reactions representing the expression of the Pgi enzyme and the PGI metabolic reaction.

The variability becomes negligible (the maximum and minimum possible fluxes converge) for metabolic and translation fluxes at a μ precision of 10⁻⁵ and for transcription fluxes at a μ precision of 10⁻¹⁵. There are two transcription reactions for pgi because this gene can be transcribed using two different sigma factors. The lower limit of reaction flux values is set to 10⁻¹⁵ mmol • gDW⁻¹ • hr⁻¹, as this is close to the relative precision (machine epsilon, about 2.2 × 10⁻¹⁶) of the double-precision floating-point numbers used by Python. Note that the maximum reaction flux for the reverse direction of PGI does not drop to 10⁻¹⁵ mmol • gDW⁻¹ • hr⁻¹ by this μ precision. However, considering the general scale of metabolic reaction fluxes (see ), the maximum flux effectively drops to zero for practical purposes. High μ precision can be achieved without sizeable increases in total solve time using qMINOS. The ME-model simulations were repeated nine times for each precision, and the error bars represent the standard deviation of the solve times.
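One way to picture what a μ precision means is a bisection search over the growth rate, sketched below. This is an assumption made for illustration: the caption does not describe the search procedure, and `is_feasible` is a hypothetical callback standing in for a single solve of the model at a fixed growth rate (for example with a solver such as qMINOS). The toy feasibility test and its maximum growth rate are made up.

```python
def bisect_growth_rate(is_feasible, mu_low, mu_high, precision):
    """Narrow [mu_low, mu_high] until its width is no larger than `precision`.

    `is_feasible(mu)` is a hypothetical callback that returns True when the
    model can sustain growth rate `mu`.
    """
    if not is_feasible(mu_low):
        raise ValueError("mu_low must be feasible")
    iterations = 0
    while mu_high - mu_low > precision:
        mu_mid = 0.5 * (mu_low + mu_high)
        if is_feasible(mu_mid):
            mu_low = mu_mid   # growth at mu_mid is possible; search higher
        else:
            mu_high = mu_mid  # growth at mu_mid is impossible; search lower
        iterations += 1
    return mu_low, iterations


TOY_MU_MAX = 0.8765432109876543  # made-up maximum growth rate, 1/hr


def toy_feasible(mu):
    """Toy stand-in for a model solve: feasible up to TOY_MU_MAX."""
    return mu <= TOY_MU_MAX


coarse_mu, coarse_iters = bisect_growth_rate(toy_feasible, 0.0, 2.0, 1e-5)
fine_mu, fine_iters = bisect_growth_rate(toy_feasible, 0.0, 2.0, 1e-15)
```

In this sketch the number of feasibility checks grows only with the logarithm of the requested precision, so each tenfold tightening adds roughly three to four extra halvings.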

Comparison of the simulated fluxes of iOL1650-ME to the COBRAme-constructed version of the same model at transcription, translation, and metabolic flux scales.

All fluxes are shown in pairwise comparison on the left using a log-scale axis. On the right, the fluxes are separated into the major reaction types and shown on linear axes. So that fluxes of 0 mmol/gDW/hr appear on the log-scale plot on the left, they have been replaced with 10⁻¹⁴. At each level, the models provided comparable flux predictions (R² > 0.98). The models cannot be expected to give completely identical flux predictions due to the ME-model updates outlined in Nonequivalent Changes. Since iJL1678b-ME does not contain membrane surface area constraints, iOL1650-ME was used for comparison.
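The zero-replacement step and a pairwise R² can be sketched as follows. The flux values are made up, the helper names are hypothetical, and computing R² on log-transformed values is an assumption of this sketch; the caption does not state whether raw or log-transformed fluxes were used. Fluxes are assumed to be non-negative.

```python
import math

ZERO_FLOOR = 1e-14  # stand-in value so zero fluxes can be placed on a log axis


def log10_with_floor(fluxes, floor=ZERO_FLOOR):
    """Log10-transform non-negative fluxes, replacing exact zeros with `floor`."""
    return [math.log10(floor if v == 0 else v) for v in fluxes]


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Made-up flux values (mmol/gDW/hr) for two hypothetical model versions.
model_a = [8.0, 1.2e-3, 4.5e-7, 0.0, 2.0e-9]
model_b = [7.9, 1.3e-3, 4.1e-7, 0.0, 2.4e-9]

r2_log = r_squared(log10_with_floor(model_a), log10_with_floor(model_b))
```

Without the floor, `math.log10(0)` raises a `ValueError`, which is why zero fluxes need a placeholder before they can be drawn on a log axis.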