The below figure depicts the healthcare analytics challenge as the order of complexity is scaled.

1. Introduction Beta Release 1.0

It is our pleasure to introduce startup venture Ingine; Inc that brings to market The BioIngine.com™; Cognitive Computing Platform for the Healthcare market, delivering Medical Automated Reasoning Programming Language Environment (MARPLE) capability based on the mathematics borrowed from several disciplines and notably from late Prof Paul A M Dirac’s Quantum Mechanics.

The BioIngine.com™; is a High Performance Cloud Computing Platformdelivering HealthCare Large-Data Analytics capability derived from an ensemble of bio-statistical computations. The automated bio-statistical reasoning is a combination of “deterministic” and “probabilistic” methods employed against both structured and unstructured large data sets leading into Cognitive Reasoning.

The BioIngine.com™ is a result of several years of efforts with Dr. Barry Robson; former Chief Scientific Officer, IBM Global Healthcare, Pharmaceutical and Life Science. His research has been in developing quantum math driven exchange and inference language achieving semantic interoperability, while also enabling Clinical Decision Support System, that is inherently Evidence Based Medicine (EBM). The solution, besides enabling EBM, also delivers knowledge graphs for Public Health surveys including those sought by epidemiologists. Based on Dr Robson’s experience in the biopharmaceutical industry and pioneering efforts in bioinformatics, this has the data mining driven potential to advance pathways planning from clinical to pharmacogenomics.

The BioIngine.com™; brings the machinery of Quantum Mechanics to Healthcare analytics; delivering a comprehensive data science experience that covers both Patient Health and Population Health (Epidemiology) analytics, driven by a range of bio-statistical methods from descriptive to inferential statistics, leading into evidence driven medical reasoning.

The BioIngine.com™; transforms the large clinical data sets generated by interoperability architectures, such as in Health Information Exchange (HIE) into “semantic lake” representing the Health ecosystem that is more amenable to bio-statistical reasoning and knowledge representation. This capability delivers evidence-based knowledge needed for Clinical Decision Support System, better achieving Clinical Efficacy by helping to reduce medical errors.

The BioIngine.com™; platform working against large clinical data sets or while residing within the large Patient Health Information Exchange (HIE) works in creating opportunity for Clinical Efficacy, while it also facilitates in the better achievement of “Efficiencies in the Healthcare Management” that Accountable Care Organization (ACO) seeks.

Our endeavors have resulted in the development of revolutionary Data Science to deliver Health Knowledge by Probabilistic Inference. The solution developed addresses critical areas in both scientific and technical, notably the healthcare interoperability challenges of delivering semantically relevant knowledge both at patient health (clinical) and public health level (Accountable Care Organization).

2. WhyThe BioIngine.com™?

The basic premise in engineering The BioIngine.com™ is in acknowledging the fact that in solving knowledge extraction from the large data sets (both structured and unstructured), one is confronted by very large data sets riddled by high-dimensionality and uncertainty.

Generally in solving insights from the large data sets the order in complexity is scaled as follows:-

A. Insights around :- “what”

For large data sets, descriptive statistics are adequate to extract a “what” perspective. Descriptive statistics generally delivers statistical summary of the ecosystem and the probabilistic distribution.

B. Univariate Problem :- “what”

Considering some simplicity in the variables relationships or is cumulative effects between the independent variables (causing) and the dependent variables (outcomes):-

All the above are still discussing the “what” aspect. When the complexity increases the notion of independent and dependent variables become non-deterministic, since it is difficult to establish given the interactions, potentially including cyclic paths of influence in a network of interactions, amongst the variables. A very simple example in just a simple case is that obesity causes diabetes, but the also converse is true, and we may also suspect that obesity causes type 2 diabetes cause obesity… In such situation what is best as “subject” and what is best as “object” becomes difficult to establish. Existing inference network methods typically assume that the world can be represented by a Directional Acyclic Graph, more like a tree, but the real world is more complex than that that: metabolism, neural pathways, road maps, subway maps, concept maps, are not unidirectional, and they are more interactive, with cyclic routes. Furthermore, discovering the “how” aspect becomes important in the diagnosis of the episodes and to establish correct pathways, while also extracting the severe cases (chronic cases which is a multivariate problem). Indeterminism also creates an ontology that can be probabilistic, not crisp.

Most ACO analytics addresses the above based on the PQRS clinical factors, which are all quantitative. Barely useful for advancing the ACO into solving performance driven or value driven outcomes most of which are qualitative.

The above discussed challenges of analyzing multivariate pushes us into techniques such as Neural Net; which is the next level to Multivariate Regression Statistical Approach…. where multiple regression models are feeding into the next level of clusters, again an array of multiple regression models.

The Neural Net method still remains inadequate in exposing “how” probably the human mind is organized in discerning the health ecosystem for diagnostic purposes, for which “how”, “why”, “when” etc becomes imperative to arrive at accurate diagnosis and target outcomes efficiently. Its learning is “smudged out”. A little more precisely put: it is hard to interrogate a Neural Net because it is far from easy to see what are the weights mixed up in different pooled contributions, or where they come from.

“So we enter Probabilistic Computations which is as such Combinatorial Explosion Problem”.

Note:- Beta Release 1.0 only addresses HDN transformation and inference query against the structured data sets and Features A, B and E. However, as a non-packaged solution C and D features can still be explored.

Release 2.0 will deliver full A.I driven reasoning capability MARPLE working against both structured and unstructured data sets. Furthermore, it will be designed to be customized for EBM driven “Point Of Care” and “Care Planning” productized user experience.

The BioIngine.com™; offers a comprehensive bio-statistical reasoning experience in the application of the data science as discussed above that blends descriptive and inferential statistical studies.

The BioIngine.com™; is a High Performance Cloud Computing Platformdelivering HealthCare Large-Data Analytics capability derived from an ensemble of bio-statistical computations. The automated bio-statistical reasoning is a combination of “deterministic” and “probabilistic” methods employed against both structured and unstructured large data sets leading into Cognitive Reasoning.

Given the challenge of analyzing against the large data sets both structured (EHR data) and unstructured data; the emerging Healthcare analytics are around above discussed methods D and E; Ingine Inc is unique in the Hyperbolic Dirac Net proposition.

Results into a Inferential Statistical mechanism suitable for a highly complex system – “Hyperbolic Dirac Net”

Involves an approach that is based on the premise that a Highly Complex System driven by the human social structures continuously strives to achieve a higher order in the entropic journey by continuos discerning the knowledge hidden in the system that is in continuum.

A System in Continuum seeking Higher and Higher Order is a Generative System.

A Generative System; Brings System itself as a Method to achieve Transformation. Similar is the case for National Learning Health System.

A Generative System; as such is based on Distributed Autonomous Agents / Organization; achieving Syndication driven by Self Regulation or Swarming behavior.

It has capabilities to facilitate medical workflow, continuity of care, medical knowledge extraction and representation from vast large sets of structured and unstructured data, automating bio-statistical reasoning leading into large data driven evidence based medicine, that further leads into clinical decision support system including knowledge management and Artificial Intelligence; and public health and epidemiological analysis.

Accountable Care Organization (ACO) driven by Affordability Care Act transforms the present Healthcare System that is adaptive (competitive) into generative (collaborative / co-ordinated) to achieve inclusive success and partake in the savings achieved. This is a generative systemic response contrasting the functional and competitive response of an adaptive system.

Natural selection seems to have resulted in functional transformation, where adaptive is the mode; does not account for diversity.

Self Regulation – seems like is a systemic outcome due to integrative influence (ecosystem), responding to the system constraints. Accounts for rich diversity.

From the above observation, should the theory in self regulation seem more correct and that adheres to laws of nature, in which generative learning occurs. Then, the assertion is “method” is offered by the system itself. System’s ontology has an implicate knowledge of the processes required for transformation (David Bohm – Implicate Order)

For very large complex system,

System itself is the method – impetus is the “constraint”.

In the video below, the ability for the cells to creatively create the script is discussed which makes the case for self regulated and generative complex system in addition to complex adaptive system.

Further Notes on Q-UEL / HDN :-

That brings Quantum Mechanics (QM) machinery to Medical Science.

Is derived from Dirac Notation that helped in defining the framework for describing the QM. The resulting framework or language is Q-UEL and it delivers a mechanism for inferential statistics – “Hyperbolic Dirac Net”

In a highly complex system; the causality becomes indeterminable; meaning the correlation or relationships between the independent and dependent variables are not obviously established. Also, they seem to interchange the position. This creates dilemma between :- subject vs object, causes vs outcomes.

Approaching a highly complex system, since the priori and posterior are not definable; inferential techniques where hypothesis are fixed before the beginning the study of the system become enviable technique.

Review of Inferential Techniques as the Complexity is Scaled

Step 1:- Simple System (turbulence level:-1)

Frequentist :- simplest classical or traditional statistics; employed treating data random with a steady state hypothesis – system is considered not uncertain (simple system). In Frequentist notions of statistics, probability is dealt as classical measures based only on the idea of counting and proportion. This technique is applied to probability to data, where the data sets are rather small.

Increase complexity: Larger data sets, multivariate, hypothesis model is not established, large variety of variables; each can combine (conditional and joint) in many different ways to produce the effect.

Step 2:- Complex System (turbulence level:-2)

Bayesian :- hypothesis is considered probabilistic, while data is held at steady state. In Bayesian notions of statistics, probability is of the hypothesis for a given sets of data that is fixed. That is, hypothesis is random and data is fixed. The knowledge extracted contains the more subjectivist notions of uncertainty, belief, reliability, or confidence often used in automated inference and decision support systems.

Additionally the hypothesis can be explored only in an acyclic fashion creating Directed Acyclic Graphs (DAG)

Increase the throttle on the complexity: Very large data sets, both structured and unstructured, Hypothesis random, multiple Hypothesis possible, Anomalies can exist, There are hidden conditions, need arises to discover the “probabilistic ontology” as they represent the system and the behavior within.

Step 3: Highly Chaotic Complex System (turbulence level:-3)

Certainly DAG is now inadequate, since we need to check probabilities as correlations and also causations of the variables, and if they conform to a hypothesis producing pattern, meaning some ontology is discovered which describes the peculiar intrinsic behavior among a specific combinations of the variables to represent a hypothesis condition. And, there are many such possibilities within the system, hence very chaotic and complex system.

Now the System itself seems probabilistic; regardless of the hypothesis and the data. This demands Multi-Lateral Cognitive approach

BioIngine.com Platform

Is High Performance Cloud Computing Platform delivering both probabilistic and deterministic computations; while combining HDN Inferential Statistics and Descriptive Statics.

The bio-statistical reasoning algorithm have been implemented in the Wolfram Language; which is a knowledge based programming unified symbolic language. As such symbolic language has a good synergy in implementing Dirac Notational Algebra.

The Bioingine.com; brings the Quantum Mechanics machinery to Healthcare analytics; delivering a comprehensive data science experience that covers both Patient Health and Public Health analytics driven by a range of bio-statistical methods from descriptive to inferential statistics, leading into evidence driven medical reasoning.

The Bioingine.com transforms the large clinical data sets generated by interoperability architectures, such as in Health Information Exchange (HIE) into semantic lake representing the Health ecosystem that is more amenable to bio-statistical reasoning and knowledge representation. This capability delivers evidence based knowledge needed for Clinical Decision Support System better achieving Clinical Efficacy by helping to reduce medical errors.

Algorithm based on Hyperbolic Dirac Net (HDN)

An HDN is a dualization procedure performed on a given inference net that consists of a pair of split-complex number factorizations of the joint probability and its dual (adjoint, reverse direction of conditionality). Hyperbolic Dirac Net is derived from Dirac Notational Algebra that forms the mechanism to define Quantum Mechanics.

A Hyperbolic Dirac Net (HDN) is a truly Bayesian model and a probabilistic general graph model that includes cause and effect as players of equal importance. It is taken from the mathematics of Nobel Laureate Paul A. M. Dirac that has become standard notation and algebra in physics for some 70 years. It includes but goes beyond the Bayes Net that is seen as a special and (arguably) usually misleading case. In attune with nature, the HDN does not constrain interactions and may contain cyclic paths in the graphs representing the probabilistic relationships between all things (states, events, observations, measurements etc.). In the larger picture, HDNs define a probabilistic semantics and so are not confined to conditional relationships, and they can evolve under logical, grammatical, definitional and other relationships. It is also, in its larger context, a model of the nature of natural language and human reasoning based on it that takes account of uncertainty.

Explanation: An HDN is an inference net, but it is also best explained by showing that it stands in sharp contrast to the current notion of an inference net that, for historical reasons, is today often taken as meaning the same thing as a Bayes Net. “A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.” [https://en.wikipedia.org/ wiki/Bayesian_ network]. In practice, such nets have little to do with Bayes, nor Bayes’ rule, law, theorem or equation that allows verification that probabilities used are consistent with each other and all other probabilities that can be derived from data. Most importantly, in reality, all things interact in the manner of a general graph, and a DAG is in general a poor model of reality since it consequently may miss key interactions.

DiracMiner

Is a machine learning based biostatistical algorithm that transforms Large Data Sets such as Millions of Patient Records into Semantic Lake as defined by HDN driven computations that is a mix of Numbers theory (Riemann Zeta) and Information Theory (Dual Bayesian or HDN)

DiracBuilder

Send an HDN query to KRS to seek HDN probabilistic inference / estimate. The Query for the inference contains the HDN that the user would like to have, and DiracBuilder helps get the best similar dual net by looking at what Billions of QUEL tags and joint probabilities are available.

High Performance Cloud Computing

The Bioingine.com Platform computes (probabilistic computations) against the billions of Q-UEL tags employing extended in-memory processing technique. The creation of the billions of Q-UEL tags and querying against them is combinatorial explosionproblem.

The Bioingine platform working against large clinical data sets or while residing within the large Patient Health Information Exchange (HIE) works in creating opportunity for Clinical Efficacy and also facilitates in the better achievement of “Efficiencies in the Healthcare Management” that ACO seeks.

Our endeavors have resulted in the development of revolutionary Data Science to deliver Health Knowledge by Probabilistic Inference. The solution developed addresses critical areas both scientific and technical, notably the healthcare interoperability challenges of delivering semantically relevant knowledge both at patient health (clinical) and public health level (Accountable Care Organization).

This case is very similar, because high BP and diabetes are each comorbidities with high BMI and hence to some extent with each other. Consequently we just substitute diabetes by BP throughout.

(0) We can in f act test the strength of the above with the following RR, which in effect reads as “What is the relative risk of needing to take BP medication if you are diabetic as opposed to not diabetic?

<‘Taking BP medication’:=’1’ | ‘Taking diabetes medication’:= ‘1’>

/<‘Taking BP medication’:=’1’ | ‘Taking diabetes medication’:= ‘0’>

The following predictive odds PO make sense and are useful here:-

<‘Taking BP medication’:=’1’ | ‘BMI’:= ’50-59’ >

/<‘Taking BP medication’:=’0’ | ‘BMI’:= ’50-59’ >

and (separately entered)

<‘Taking diabets medication’:=’1’ | ‘BMI’:= ’50-59’ >

/<‘Taking diabetes medication’:=’0’ | ‘BMI’:= ’50-59’ >

And the odds ratio OR would be a good measure here (as it works in both directions). Note Pfwd = Pbw theoretically for an odds ratio.

The mission of Ingine Inc as a startup is to bring advancement in data science as applicable to medical knowledge extraction from large data sets.

Particularly following are the differentiators owing to which Ingine Inc is a candidate startup in hope of advancing science in difficult to solve areas; driven by decades of research by Dr. Barry Robson.

Introducing Hyperbolic Dirac Net (HDN); a machinery created borrowing from Quantum Mechanics to advance data mining and deep learning beyond what Bayesian could deliver; against the backdrop of very large data sets riddled with uncertainty and high-dimentionality. Most importantly, HDN based non-hypothesis approach allows us to create a learning system workbench that is also amenable to research and discovery related efforts based on deep learning techniques.

Integrate Patient centric studies with epidemiological studies to achieve a comprehensive framework to advance integrated large data driven bio-statistical approach which addresses both systemic and also functional concerns. This means blending both descriptive and inferential (HDN) statistical approaches.

Introduce a comprehensive notational and symbolic programming framework that allows us to create a unified mathematical framework to deliver both probabilistic and deterministic methods of reasoning which allows us to create varieties of cognitive experience from large sets of data riddled with uncertainty.

Use all of the above in creating a Point of Care platform experience that delivers EBM in a PICO format as followed by the industry as a gold standard.

While PICO is employed as a framework to create EBM driven diagnosis process as a consequence of both qualitative and quantitative methods that better achieves systematic review; medical exam setting is used as a specification to define the template for enacting the EBM process. This is based on the caveat that for a system to qualify as an expert system in the medical area, it should also be able to pass medical exams based on the knowledge the learning system has acquired that is scientifically curated by both automated machine learning and manual intervention efforts.

As part of the overall architecture, that employs some ingenious design techniques such as non-predicated, non -hypothesis driven and schema-less design; semantic lake a tag driven knowledge repository is created from which the cognitive experience is created employing inferential statistics. Furthermore the capability can be delivered as a cloud computing platform where parallelization, in-memmory processing, high performance computing (HPC) and elastic scaling are addressed.

Bioingine.com employs algorithmic approach based on Hyperbolic Dirac Net that allows inference nets that are a general graph (GC), including cyclic paths, thus surpassing the limitation in the Bayes Net that is traditionally a Directed Acyclic Graph (DAG) by definition.

The Bioingine.com approach thus more fundamentally reflects the nature of probabilistic knowledge in the real world, which has the potential for taking account of the interaction between all things without limitation, and ironically this more explicitly makes use of Bayes rule far more than does a Bayes Net.

It also allows more elaborate relationships than mere conditional dependencies, as a probabilistic semantics analogous to natural human language but with a more detailed sense of probability. To identify the things and their relationships that are important and provide the required probabilities, the Bioingine.com scouts the large complex data of both structured and also information of unstructured textual character.

It treats initial raw extracted knowledge rather in the manner of potentially erroneous or ambiguous prior knowledge, and validated and curated knowledge as posterior knowledge, and enables the refinement of knowledge extracted from authoritative scientific texts into an intuitive canonical “deep structure” mental-algebraic form that the Bioingine.com can more readily manipulate.

In the below referenced articles on the employ of Bayesian Net to model Clinical Pathways as probabilistic inference net, replace Bayesian Net to achieve stress tested Hyperbolic Dirac Net (HDN) which is a non-acyclic Bayesian resolving both correlation and causation in both the direction; etymology –> outcomes and outcomes –> etymology

1. Elements of Q-UEL

Q-UEL is based on the Dirac Notation and associated algebra The notation was introduced into later editions of Dirac’s book to facilitate understanding and use of quantum mechanics (QM) and it has been a standard notation in physics and theoretical chemistry since the 1940s

QM is a system for representing observations and measurements, and drawing probabilistic inference from them. The Q in Q-UEL refers to QM, but a simple mathematical transformation of QM gives classical everyday behavior. Q-UEL inherits the machinery of QM by replacing the more familiar imaginary number i (such that ii = -1), responsible for QM as wave mechanics, by the hyperbolic imaginary number h (such that hh=+1). Hence our inference net in general is called the Hyperbolic Dirac Net (HDN)

In probability theory A, B, C, etc. represent things, states, events, observations, measurements, qualities etc. In this paper we mean medical factors, including demographic factors such as age and clinical factors such as systolic blood pressure value or history of diabetes.

They can also stand for expressions containing many factors, so note that by e.g.

2) Employing Q-UEL preliminary inference net as the query can be created.

“Will my female patient age 50-59 taking diabetes medication and having a body mass index of 30-39 have very high cholesterol if the systolic BP is 130-139 mmHg and HDL is 50-59 mg/dL and non-HDL is 120-129 mg/dL?”.

This forms a preliminary inference net as the query, which may be refined and to which probabilities must be assigned

The real answers of interest here are not qualitative statements, but the final probabilities. The protocols involved map to what data miners often seem to see as two main options in mining, although we see them as the two ends of a continuum.

Method (A) may be recognized as Unsupervised (or unrestricted) data mining and post-filtering, and is the method mainly used here. In this approach

we (1) mine data (“observe”),(2) compute a very large number of the more significant probabilities and render them as tags and maintained as Knowledge Representative Store (KRS) or Semantic Lake (“evaluate”), (3) use a propose inference net as a query to search amongst the probabilities represented by those tags, but only looking for those relevant to complete the net and assign probabilities to it, assessing what is available, and seeing what can be substituted (“interpret”), and (4) compute the overall probability of the final inference net in order to make a decision (“decide”). Unsupervised data mining is preferred because it generates many tags for an SW-like approach, and may uncover new unexpected relationships that could be included in the net.

Method (B)uses supervised (or restricted) data mining and prefiltering. Data mining considers only what appears in the net. The down-stream user interested in inference always accesses the raw database, while in (A) he or she may never see it.

The advantage of (B) is that mining is far less computationally demanding both in terms of processing and memory. Useful to computing HDN for a specified Hypothesis.

The Popular Bayes Net BN Compared with our Hyperbolic Dirac Net HDN.

Each probabilities of any kind can also be manipulated for inference in a variety of ways, according to philosophy (which is a matter of concern ). The BN is probably the most popular method, perhaps because it does seem to be based on traditional, conservative, principles of probability. However, the BN is traditionally (and, strictly speaking, by definition) confined to a probability network that is a directed acyclic graph (DAG).

In general, reversibility, cyclic paths and feedback abound in the real world, and we need probabilistic knowledge networks that are general graphs, or even more diffuse fields of influence, not DAGs. In our response as the Hyperbolic Dirac Net (HDN), “Dirac” relates to use of Paul A. M. Dirac’s view of quantum mechanics (QM).

QM is not only a standard system for representing probabilistic observation and inference from it in physics, but also it manages and even promotes concepts like reversibility and cycles. The significance of “hyperbolic” is that it relates to a particular type of imaginary number rediscovered by Dirac. Dirac notation entities, Q-UEL tags, and the analogous building blocks of an HDN all have complex probabilities better described as probability amplitudes. This means that they have the form x + jy where x and y are real numbers and j is an imaginary number, though they can also be vectors or matrices with such forms as elements.

Q-UEL is seen as a Lorentz rotation i → h of QM as wave mechanics. The imaginary number involved is now no longer the familiar i such that ii = -1, but the hyperbolic imaginary number, called h in Q-UEL, such that hh = +1.

This renders the HDN to behave classically. A basic HDN is an h-complex BN.

Both BN and basic HDN may use Predictive Odds in which conditional probabilities (or the HDN’s comparable h-complex notions) are replaced by ratios of these.