Commitment ordering (CO)

- This page provides resources that can help a wider audience of software developers, beyond academic database researchers, to understand CO and utilize it for guaranteeing Global serializability in distributed (multi process/thread) environments. CO is the only serializability based general concurrency control method that allows building large-scale reliable transactional-based distributed Web applications (in an environment with shared resources, even for atomic transactions with only two processes/participants). CO allows optimistic implementations. In most cases atomic transactions are preferred over other non-transactional ad-hoc methods for reliability.

- Generalized CO (VO; which includes CO and all its variants) is necessary for guaranteeing Global serializability under the following two assumptions:

1. Utilization of atomic transactions for reliability (in a failure prone environment).2. Requiring database concurrency control autonomy in a distributed environment with multiple databases and other transactional objects(i.e., databases do not communicate concurrency control information to each other).

Necessity in VO means thatif VO is violated under autonomy (independence of external concurrency control information), then serializability cannot be guaranteed (may be violated, which typically quickly happens);If VO is violated under local (defined by the boundary of an atomic commitment protocol (ACP) participant) serializability, then autonomy (of the ACP participants) cannot be maintained (should be violated) for sustaining global serializability.The necessity follows mathematically from the above assumptions.

- The CO patents should have been expired already, meaning CO is in the public domain.

Commitment ordering (or Commit ordering; CO) is a general concurrency control (Serializability) class of techniques that I discovered in 1990, while at Digital Equipment Corporation (DEC). It provides an effective general solution to the Global Serializability Problem. This problem in the area of transactional systems has been characterized as Open problem still in October 1991, and has been considered unsolvable also years after I publicly disclosed CO in May 1991. CO is also an effective general Distributed serializability technique, i.e., it allows the effective distribution of every other known concurrency control technique. CO is the only known general serializability method that provides global and distributed serializability without the costly distribution of concurrency control information (e.g., local precedence relations, locks, timestamps, tickets, etc.), and thus provides unbounded scalability in network size and number of database systems. It also provides automatic global deadlock resolution (a fact true also in the common all-SS2PL environment, but unnoticed in the research literature). This makes CO especially effective in highly distributed environments, e.g., as with Cloud computing.

Major conclusions of the CO research:

1. CO can made any concurrency control mechanism serializable (serializability compliant) without interfering with the mechanism's scheduling methods.

Any
serializability mechanism can be used locally in each local
transactional object/database system (not necessarily made CO compliant; a
complete heterogeneity). In addition to any given serializability mechanism only a certain vote ordering strategy (voting according to local precedence order) needs to be maintained
locally (for distributed transactions), which makes each local
transactional object/database system Vote ordering (VO) compliant, and the entire environment Global VO compliant (with Global serializability and automatic Global deadlock resolution).

Special cases: When only a single concurrency control (serializability) method is utilized locally for all objects. I.e., the techniques allow the effective
distribution (low overhead, scalable, with neither mechanism's interference nor even necessarily making it CO compliant) of every
known serializability method while
guaranteeing Distributed/Global serializability and automatic Global deadlock resolution.

Evidently CO has been misunderstood by some researchers for years (see Quotations in Wikipedia - Global serializability), as I found a while ago, to my surprise, after years of not following the related research area. With all the interest earlier in Global serializability problem it is quite strange that almost immediately after I introduced the CO solution it has almost "disappeared," or often mentioned with accompanying incorrect information when mentioned. The Global serializability problem has continued to be considered "unsolvable" by some, with explanations and excuses as before. However the quite intensive rate of publications about the subject of concurrency control of databases in conferences and journals has declined quickly. Proposed solutions still have continued to appear, but usually not in major publications. SS2PL has practically been quite satisfactory for distribution for years with a quite limited utilization in "shared nothing" environments (without a distributed lock manager; e.g., in a common (SS2PL based) multi-database system with a transactional environment that provides 2PC). As a result people from other areas, different from databases, looking for not-SS2PL Global serializability solutions (e.g., optimistic), have not known about CO, and often ended with ineffective proposals based on other earlier database techniques. Another reason is the fast technological development of computing: With limited distribution over fast, low-latency networks the difference between local and distributed CC is shrinking, i.e., local techniques are useful over increasing sizes of computing environments and geographical distances. However, with distribution beyond a certain modest size only CO (including SS2PL) scales, also for using possibly needed optimistic methods. For example, with Cloud computing local methods cannot support the high distribution, and CO can provide substantial benefits. Current wisdom proposes ways to avoid distributed atomic transactions, but in many cases (applications) distributed transactions are unavoidable.

Commitment ordering has been quite widely known inside the
transaction processing and databases communities at DEC since 1990 (primarily pursued in order to acquire buy-in for utilization in products and approval for patenting; DEC was largely the "decision-by-consensus company" at its lower-medium organizational levels). It was under company confidentiality due to patenting processes of CO
methods for guaranteeing global serializability (it had an email distribution of approximately 40 people, probably too many to keep a secret about a technique with a self-explanatory name in a company of ~130,000...). Intended trial and possible utilization of SCO (replacement for SS2PL) in Rdb, DEC's database system, which was sold to Oracle - I do not know if ever happened. CO was disclosed
outside of DEC by lectures and technical reports' submission and
distribution to leading database researches (tens) in May 1991,
immediately after its first patent filing. In September 1991 it was
presented during a poster session in the Asilomar High Performance
Transaction Systems (HPTS} workshop in Pacific Grove, California. It was later presented with articles at VLDB 92 (CO), PODS 93 (ECO), and RIDE-IMS 93 (MVCO). An additional article (submitted Aug. 91) appeared in Sep. 94 in the journal IPL (see references below).

The discovery of CO resulted from my attempt to find the Global serializability "secret" of SS2PL, a hidden property (if existed), and by that to generalize SS2PL (SS2PL is constraining, which often negatively impacts performance). At that time SS2PL was the only available solution and the de-facto standard for Global serializability. The consensus among the database experts was that "nothing more (beyond SS2PL) can be done." I first generalized it slightly to a property I named "Separation" (SEP), thought this was the end of the way, and wrote a paper (by Phil Bernstein's suggestion, to prove my claims, which turned out to be the right thing to do). Later, after finding a mistake in SEP's necessity proof and being mad about the broken proof and confused for three days, I suddenly "saw" CO which corrected the proof! CO generalized SEP, was the necessary property, and everything came into place. "Commitment ordering" was the natural name for the new property. It was simple, beautiful, non-blocking, surprisingly general, and a new paradigm in concurrency control (serializability). It provided a general effective, scalable solution for the global serializability problem. Then Extended SEP was generalized into Extended CO (ECO). Algorithms emerged as part of the CO discovery, where SEP algorithms turned into CO algorithms. As a matter of fact the correct necessary (CO) algorithm for the necessity proof was discovered just before the resulting property (CO) was formulated. A generalization of conflicts and CO for multi-version (MVCO) followed. The combination of CO and all its generalizations that do not need the distribution of CC information (e.g., local precedence relations, locks, timestamps, tickets, etc.) are now called Vote ordering (VO), or Generalized CO (GCO). As ECO, it consists of Local serializability and a Vote ordering strategy. VO achieves Global serializability and Automatic global deadlock resolution also with no need in local CO (but rather local serializability).

Atomic commit protocol (ACP) has been needed to bring Global serializability also for SS2PL based environments (to sync the end of a distributed transaction). After explicitly being mentioned in the first CO publication, the need in ACP has been repeated in many places, and ACP (specifically, Two-phase commit protocol - 2PC) alone has become in time the folklore Global serializability solution among practitioners, forgetting that assuming the popular SS2PL has been the key in existing solutions (that do not use CO in its more general forms). However, another aspect, CO's side-benefit of automatic Global deadlock resolution in SS2PL based environments, has never been mentioned in the literature until today (except in the CO publications), though global deadlocks have been a quite intensive research area.

Remarks about distributed VO, CO, and SS2PL (also named Rigorous scheduling, and previously Strict 2PL): Without ACP and a related vote ordering strategy the scalable distributed CO (and distributed VO) solution cannot be explained in general, and cannot work at all for autonomous database systems. SS2PL is the only known CO variant for which the vote ordering strategy is automatic (implied). Thus Global SS2PL can be trivially explained without knowing about CO, and indeed this has been noticed earlier, and SS2PL has been utilized for Global serializability in multi-database systems also several years before the discovery of CO in 1990. Global SS2PL has become the de-facto standard for Global serializability, and is such until today. The added value of CO (VO): Every CO (VO) compliant database system (with any CC mechanism, that may outperform SS2PL in some situations, especially optimistic or semi-optimistic variants; CO compliance is easy to enforce; VO requires only local serializability and Vote ordering strategy) can join any such SS2PL based distributed solution and maintain Global serializability. With the needed (for CO, VO) Vote ordering strategy also Local recoverability (strictnes) implies Global recoverability (strictness).

Wikipedia - Commitment ordering is probably the broadest and most comprehensive existing article on CO (with no mathematical notation, but quite mathematical) .

Unified conflict theory: Materialized conflicts and non-materialized conflicts; . A conflicting operation blocked by a lock, issued by transaction T and released by T at its end, generates a non-materialized conflict. A cycle in the augmented conflict graph with at least one non-materialized conflict is not a cycle in the regular conflict graph and does not affect serializability; however, any cycle represents a deadlock

Generic local enhanced CO algorithm, which uses the Local augmented conflict graph (rather than the regular Local conflict graph, as in the original Generic local CO algorithm), and locally both guarantees serializability, and resolves (local) deadlocks (Vs. only serializability in the original algorithm)

Database generalized autonomy: Generalizes autonomy (which includes extended knowledge autonomy) by allowing any local information to be used (not using CC information from another participant in Atomic commitment). This is a necessary condition for database systems to generate a unified schedule with Global serializability that captures both CO, its generalizing variants ECO and MVCO, and possibly additional unknown useful generalizing variants. The necessity is proven the same way as for (regular) autonomy and CO. It assumes using a generalized CDDDC condition defined with a transitive conflict (a conflict graph path between two voted-on (global) transactions), e.g., as in ECO, and a general (commutativity based) conflict, e.g., as in MVCO (October 2010). This can be summarized in the concept of

A mechanism for Global serializability that uses local Dynamic
atomicity in the framework of Abstract Data Types is described there,
which provides a result similar to a result in the CO articles. This mechanism is used to
prove this result in terms of history property, and does not comprise a
general algorithm to enforce CO and Global CO in a distributed environment as the CO work does (e.g.,
it cannot invalidate the CO patents). The CO articles, unlike the Dynamic atomicity work, also provide
additional results about CO: They generalize CO to MVCO (for Multi-version CC - MVCC) and ECO which relax local CO, and provide algorithms,
distribution, and optimality results.
The above definition of Dynamic atomicity is different from the one I have been referred to and referencing (Weihl 1989)
as prior art since I heard about Dynamic atomicity from Maurice Herlihy (at DEC at that time; after he saw my article) in December 1990 (or later?).
This definition defines a special case of CO, which is difficult to implement as an effective general mechanism, and I have not suspected that two definitions (by
the same people) have existed at approximately the same period (though
it could happen due to the time needed to publish a journal article).
The original Dynamic atomicity definition I knew (and its related optimality result) is from: