Analysis of biological data often requires an understanding of components of
pathways and/or networks and their mutual dependency relationships. Such systems
are often analyzed and understood from datasets made up of the states of the
relevant components and a set of discrete outcomes or results. The analysis of
these systems can be assisted by models that are consistent with the available
data while being maximally predictive for untested conditions. Here, we present a
method to construct such models for these types of systems. To maximize
predictive capability, we introduce a set of “don’t care” (dc) Boolean variables
that must be assigned values in order to obtain a concrete model. When a dc
variable is set to 1, this indicates that the information from the corresponding
component does not contribute to the observed result. Intuitively, setting more
dc variables to 1 increases both the potential predictive capability of the model
and the risk of obtaining an inconsistent model. We thus formulate our
problem as maximizing the number of dc variables that are set to 1, while
retaining a model that is consistent and explains all of the known
data. This amounts to solving a quantified Boolean formula (QBF) with three
levels of quantifier alternations, with a maximization goal for the dc variables.
We have developed a prototype implementation to support our new modeling approach
and are applying our method to part of a classical system in developmental
biology describing fate specification of vulval precursor cells in the C. elegans
nematode. Our work indicates that biological instances can serve as challenging
and complex benchmarks for the formal-methods research community.
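
To make the maximization goal concrete, the following is a minimal brute-force sketch in Python. It is not the QBF-based prototype described above. It rests on a simplifying assumption of ours: there is one dc variable per component, and an assignment is consistent when any two observations that agree on every non-ignored component have the same outcome. The names `is_consistent`, `max_dont_care`, and the toy `observations` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def is_consistent(data, dc):
    """Check that the outcome is a function of the components with dc == 0.

    Assumption (ours): a dc assignment is consistent if no two observations
    agree on all non-ignored components yet have different outcomes.
    """
    seen = {}
    for state, outcome in data:
        # Project the state onto the components that are NOT marked don't-care.
        key = tuple(v for v, ignore in zip(state, dc) if not ignore)
        if seen.setdefault(key, outcome) != outcome:
            return False
    return True

def max_dont_care(data, n):
    """Return a consistent dc assignment with as many 1s as possible.

    Exhaustive search from the largest number of ignored components down;
    exponential in n, so suitable only for toy inputs.
    """
    for k in range(n, -1, -1):
        for ignored in combinations(range(n), k):
            dc = tuple(1 if i in ignored else 0 for i in range(n))
            if is_consistent(data, dc):
                return dc
    return None  # the data contradict themselves even with no dc variables set

# Toy dataset (invented): three component states and a discrete outcome.
observations = [
    ((0, 0, 1), "A"),
    ((0, 1, 1), "A"),
    ((1, 0, 0), "B"),
    ((1, 1, 0), "B"),
]
best = max_dont_care(observations, 3)
print(best, sum(best))
```

On this toy dataset, at most two of the three components can be ignored, because ignoring all three would map outcomes "A" and "B" to the same empty state. The exhaustive search only illustrates the objective; it does not scale to realistic instances and does not capture the quantifier alternations of the QBF encoding.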