PropLogic is a Haskell package for propositional logic. It also contains a standalone executable of the same name, which is able to run a small collection of the functions defined in the package. Most of them are Prime Normal Form converters of some sort.

For example, consider the spdnf function that returns the Simplified Prime Disjunctive Normal Form of a given propositional formula.

The power of a normalizer like spdnf and its relatives lies first of all in the fact that it solves all the traditional problems of propositional logic. Need to know if a formula is satisfiable? Need to know if two formulas are equivalent? All of this is immediately answered with spdnf: the example input is satisfiable, because the result is not false, and it is not valid, because the result is not true.

Another feature of our prime form normalizers lies hidden in the underlying algorithms: they are fast. In a straightforward, traditional, or default fashion, the mentioned problems are of exponential complexity and the computational costs to solve them explode. The PropLogic implementations, however, are intended to remain feasible for inputs of realistic size.
The second part of this introduction contains some data that translate "fast" into real numbers of seconds on a current standard computer.

The PropLogic package is written in the Haskell programming language, but no prior understanding of Haskell is required to use the command or this document. It is also written for users who are just looking for a tool for solving problems of propositional logic. In particular, it can be used as a fast SAT solver.

Unix-like systems, however, will not look in the working directory for the PropLogic command by default; they must be told to do so with the prefix ./ for the working directory. In other words, we have to modify the previous input line to

$ ./PropLogic pdnf "[p <-> [-q + r]]"

We continue this habit in the examples below.

Windows users, however, should remember to omit the prefix and replace the previous line by something like

C:\working\directory> PropLogic pdnf "[p <-> [-q + r]]"

where PropLogic may also be written proplogic, as Windows is case-insensitive in this respect.

Con-, dis-, sub- and equijunctions are multiary or n-ary, i.e. they may have any number of arguments. A unary conjunction, i.e. a conjunction of only one formula phi, is written [* phi]. The nullary conjunction is [*]. The same holds for the other junctors, although one usually doesn't need these cases.

The outer brackets [...] around the n-ary junctions are mandatory. That disambiguates the parsing of formulas and makes any
preference rules obsolete.

The notation here is called "fancy" because it is different from the "pure" notation, which is the real Haskell data type in the package implementation. In our context here however, we will only be dealing with the fancy version.
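As an illustration of what such a "pure" data type might look like, here is a minimal Haskell sketch of a formula type with multiary junctions. The constructor names are our own assumptions and not necessarily the ones used in the package:

```haskell
-- Hypothetical sketch of a "pure" formula data type with
-- multiary (n-ary) con- and disjunctions; the real PropLogic
-- type may differ in names and detail.
data PropForm a
  = A a                 -- atomic formula
  | F                   -- false
  | T                   -- true
  | N (PropForm a)      -- negation
  | CJ [PropForm a]     -- n-ary conjunction  [phi1 * ... * phiN]
  | DJ [PropForm a]     -- n-ary disjunction  [phi1 + ... + phiN]
  deriving (Show, Eq)

-- For example, the fancy disjunction [-q + r] could be
-- expressed with these constructors as:
example :: PropForm String
example = DJ [N (A "q"), A "r"]

main :: IO ()
main = print example
```

The nullary conjunction [*] would simply be CJ [], and the unary [* phi] would be CJ [phi], in this sketch.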

You will probably be familiar with the notion of a Disjunctive Normal Form (or DNF) and the Conjunctive Normal Form (or CNF).

A DNF is a disjunction of literal conjunctions, where a literal is either an atomic formula, or a negated atomic formula.
For example,

[[-v * -w * x * -y * -z] + [v * w * -x] + [v * -x * z] + [-x * y]]

Our definition of a DNF also requires each literal conjunction to be a Normal Literal Conjunction (or NLC) in the sense that the atoms of its literals are in strictly ascending order. For example, the second literal conjunction [v * w * -x] is an NLC, because v < w < x.
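The NLC condition is easy to check mechanically. A minimal Haskell sketch, with a literal represented as an atom/sign pair (a representation we assume here for illustration only):

```haskell
-- A literal as an (atom, sign) pair; sign False means negated.
-- This representation is our own, chosen for illustration.
type Literal = (String, Bool)

-- A literal conjunction is a Normal Literal Conjunction (NLC)
-- iff the atoms of its literals are in strictly ascending order.
isNLC :: [Literal] -> Bool
isNLC lits = and (zipWith (<) atoms (drop 1 atoms))
  where atoms = map fst lits

main :: IO ()
main = do
  print (isNLC [("v",True),("w",True),("x",False)])  -- [v * w * -x]
  print (isNLC [("x",True),("v",True)])              -- atoms not ascending
```

Note that strict ascension also rules out repeated atoms, so an NLC can never contain a complementary pair of literals.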

Similarly (or dually, as it is called in the context of boolean algebras), a CNF is a conjunction of Normal Literal Disjunctions (or NLDs), i.e. disjunctions of literals with atoms in strictly ascending order. An example CNF is

[[-1 + -2 + 3 + -4 + -5] * [1 + 2 + -3] * [1 + -3 + 5] * [-3 + 4]]

In the sequel, we write <=> for the usual equivalence relation and => for the subvalence (or consequence or entailment) relation on formulas.
For example, --x <=> [x * true] and [x * y] => x are two true statements.

It is a commonly known fact that each formula φ has an equivalent DNF Δ and an equivalent CNF Γ.

γ is a prime factor of Δ if γ is a factor of Δ and there is no different factor γ' of Δ with γ => γ' => Δ. In other words, if we delete any of the λi in γ = [λ1 * ... * λk], then the resulting NLC γ' = [λ1 * ... * λi-1 * λi+1 * ... * λk] is no longer a factor of Δ.

Δ is a Prime DNF or PDNF, if the components γ1, ..., γn are exactly the prime factors of Δ and if γ1 < ... < γn (where < is a standard linear order on NLCs).

The formal definition of DNFs and CNFs necessarily involves nullary and unary junctions like [*] and [+ x]. But most users would prefer a simplified version, as we call it. For example, one would rather write

etc. [SimpRules]. These abbreviated formulas are no longer DNFs and CNFs in the strict formal sense. We call them simplified PDNFs and PCNFs, or SPDNFs and SPCNFs, respectively. Accordingly, we have two command options

Each of these functions pdnf, pcnf, spdnf and spcnf is a canonical normalizer or canonizer for the equivalence relation <=> on propositional formulas in the sense that (1.) every formula is equivalent to its canonization and (2.) two formulas are equivalent if and only if their canonizations are identical.

We can exploit this property of canonizers for the solution of the traditional problems of propositional logic, e.g. the question of whether a formula is satisfiable. So in particular, each of the four canonizers is also a SAT solver.
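To illustrate the canonizer idea itself, here is a toy Haskell sketch that uses a sorted truth table as a stand-in canonization. It is exponential and not the package's prime form algorithm, but it shows how the two defining properties turn any canonizer into an equivalence test and a SAT solver:

```haskell
-- Toy stand-in canonizer: canonize a boolean function over a
-- fixed atom list by its sorted list of satisfying assignments.
-- (Exponential, and NOT how PropLogic works internally.)
import Data.List (sort)

type Assignment = [(String, Bool)]

canon :: [String] -> (Assignment -> Bool) -> [Assignment]
canon atoms f = sort (filter f (assignments atoms))
  where
    assignments []     = [[]]
    assignments (a:as) = [ (a,b):r | b <- [False,True], r <- assignments as ]

-- Two formulas are equivalent iff their canonizations coincide;
-- a formula is satisfiable iff its canonization is non-empty.
equivalent :: [String] -> (Assignment -> Bool) -> (Assignment -> Bool) -> Bool
equivalent atoms f g = canon atoms f == canon atoms g

satisfiable :: [String] -> (Assignment -> Bool) -> Bool
satisfiable atoms f = not (null (canon atoms f))

-- Look up the value of an atom in an assignment.
val :: String -> Assignment -> Bool
val a asg = maybe False id (lookup a asg)

main :: IO ()
main = do
  -- --x is equivalent to [x * true]:
  print (equivalent ["x"] (\s -> not (not (val "x" s))) (\s -> val "x" s && True))
  -- [x * -x] is a contradiction, hence unsatisfiable:
  print (satisfiable ["x"] (\s -> val "x" s && not (val "x" s)))
```

With one of the real canonizers such as spdnf in place of canon, the same two definitions answer the equivalence and satisfiability questions directly.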

This concludes our introduction to the PropLogic command as a tool for propositional logic.

Note that you can always call for immediate help from the command itself by typing

$ ./PropLogic help

If you do so, you will find that PropLogic has some more options we haven't covered yet.
These are just some more variations of the Prime Normal Form idea, and we explain them in the next chapter.

The last chapter below then attempts to give an impression of the performance of the PropLogic command.
To this end, the command also has a testing option that generates a series of random forms, computes their canonizations and reports some measurements for the whole process.

It is easy to see that this table displays very much the same information as the PDNF. In the first row, there is the ordered list of atoms: x, y, z. The other three rows are representations of the three NLCs of the PDNF, namely [-1,-2] stands for [-x * -y], [-1,3] stands for [-x * z], and [2,3] stands for [y * z].
The form is "indexed" in the sense that 1, 2, ... refer to the corresponding first, second, ... element in the ordered atom list, in this example case [x,y,z].

The output of the xpdnf call is a two-dimensional table display. The underlying data structure in Haskell for these XPDNFs has the following components:

1. The X-Form, here ([x,y,z],[[-1,-2],[-1,3],[2,3]]), which is made of

   - the ordered atom list, here [x,y,z], and

   - the I-Form, here [[-1,-2],[-1,3],[2,3]], which is a list of the three I-Lines [-1,-2], [-1,3], and [2,3] [COSTACK].

2. An indication (i.e. a type constructor in Haskell), here "XPDNF", that tells how the X-Form needs to be translated into a propositional formula.

In the example, the X-Form ([x,y,z],[[-1,-2],[-1,3],[2,3]]) is supposed to be an XPDNF, so the X-Form translated into a formula is the DNF [[-x * -y] + [-x * z] + [y * z]]. If the X-Form had been an XPCNF, then its conversion into a formula would have been the CNF [[-x + -y] * [-x + z] * [y + z]].
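The decoding of an X-Form into a formula is a simple index lookup in the atom list. A Haskell sketch for the XPDNF reading (the function names here are ours, not the package's):

```haskell
-- Decoding an X-Form into a fancy formula string, as described:
-- index i (1-based) refers to the i-th atom; a negative index
-- means the negated atom. Names here are illustrative only.
type XForm = ([String], [[Int]])

literal :: [String] -> Int -> String
literal atoms i
  | i < 0     = '-' : atoms !! (abs i - 1)
  | otherwise = atoms !! (i - 1)

-- Read the X-Form as an XPDNF: I-Lines become conjunctions "*",
-- joined by the disjunction "+" (dually "*"/"+" for an XPCNF).
asDNF :: XForm -> String
asDNF (atoms, iform) = junction "+" (map (junction "*" . map (literal atoms)) iform)
  where junction op xs = "[" ++ unwords (interleave op xs) ++ "]"
        interleave _  [x]    = [x]
        interleave op (x:xs) = x : op : interleave op xs
        interleave _  []     = []

main :: IO ()
main = putStrLn (asDNF (["x","y","z"], [[-1,-2],[-1,3],[2,3]]))
-- prints [[-x * -y] + [-x * z] + [y * z]]
```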

The deeper reason for these X-Forms is the gain in speed. The "fast" canonizers that do the real work under the surface translate propositional formulas into X-Forms, and by far the most work is done on the I-Forms. In particular, this involves a lot of atom comparisons, and it is much "cheaper" to compare integers than arbitrary atoms, e.g. strings.

Yet another advantage of the X- and I-Form abstraction is the fact that each function can be applied twice, for DNFs and CNFs, due to the dual character of propositional logic.

The primForm function will probably not be very interesting for the normal user. But it is the core function of all the other prime canonizers and does most of the work behind the scenes. It is therefore the candidate we are going to study for the performance of our whole Prime Normal Form approach to propositional logic.

In this part we are going to present some empirical data on the computational costs of our primForm function.
The actual testing function generates random input for primForm according to certain size parameters, and measures how long primForm takes to produce the results.

This concentration on just one function may need some justification. After all, the propositional algebras we implemented in the PropLogic package consist of about thirty functions. A full performance study should check them all.
Besides, primForm works on I-Forms, but the average user would rather work with propositional formulas.
However, note that

most of the functions in the "fast" instances of propositional algebras are indeed of trivial or polynomial complexity. The harder ones, at least the theoretically harder ones, all use the prime form conversion, which does the essential and biggest part of the work.

There is also a function to produce random propositional formulas. But it turns out that the majority of these formulas are extreme, i.e. either tautologies or contradictions. This is not a flaw of the random generator, but a property of the data structure itself. The prime normal forms of extreme formulas are produced more or less instantaneously. But we are interested in the difficult cases, and random I-Forms are much better input material, as we will see. Recall that I-Forms are just abstractions of DNFs and CNFs, and that converting a formula into either a DNF or a CNF can be done in polynomial time [POLY]. The real work comes when DNFs and CNFs are converted into their prime normal forms.

There is an infinite variety of NP-complete problems, but they are all related, and the SAT or boolean satisfiability problem is a distinguished representative [SAT]. The primForm function is also a SAT solver, and our data here is also designed to take part in the discussion about fast SAT solvers.

Recall that this SAT solver works as follows:
given an I-Form q, let q' := primForm(q) be the prime normal form of q. Then q is satisfiable if and only if q' is not the zero (i.e. false) form.

We generate random I-Forms, apply primForm and then report the time it took. We do that many times, and thus obtain data for both the average and the worst-case performance, all on an average personal computer [Hardware].

Obviously, this description first recalls the input parameters and then displays the result of the test series, namely:
the numberOfTests again, which was the input 1000, and then the measurements for the worst (maximal) and the average results of the primForm conversion.

More precisely, if the N random forms were p_1,...,p_N, if q_1,...,q_N are the corresponding results, i.e. q_i = primForm(p_i), and if these were obtained in t_1,...,t_N seconds, respectively, then the test result shows:

The computational complexity, i.e. the performance of an algorithm, is usually expressed as a functional dependency of the time and space resources on the size of the input. The input of the primForm function is an I-Form. But we already have different notions to express the size of an I-Form: atomNumber, formLength, averageLineLength and volume. So which one is it?

Actually, one might argue that volume is the "real" size measure. But in fact, as we are going to demonstrate below, it is not the most suitable one for an analysis of the situation we consider. From a theoretical point of view, the atomNumber is the real parameter that expresses the complexity of the problem, and it is also the predominant one in the literature on complexity theory. However, given an atom number X, it makes a huge performance difference which length Y and average I-Line length Z we choose when we run PropLogic testing X Y Z N,

where X is the atomNumber, Y is the formLength and Z is the averageLineLength of the N randomly generated I-Forms.
But instead of simply accumulating data, let us establish some facts about the general performance behaviour of the primForm function.

Recall that this means: we generated 1000 random I-Forms p_1,...,p_1000 of length 10, average I-Line length 4, and thus a volume of approximately 10*4 = 40. For each of these I-Forms, the Prime Normal Form q_i = primForm(p_i) was computed in t_i seconds. The maximal length of all these q_1,...,q_1000 was 17, the average length was 8. The maximal time of the t_1,...,t_1000 was about 20 milliseconds; the average time was around 2 milliseconds, with a standard deviation of about 2 milliseconds as well.

This example already reveals a general phenomenon: although the Prime Normal Form q_i is most of the time shorter than the original I-Form p_i (in this case an average length of 8 compared to the length 10 of p_i), it may as well become longer, namely up to 17 in the given test trial.

In other words, Prime Normal Form and Minimal Normal Form are different, most of the time!

Actually, if q = [l_1,...,l_k] is a Prime Normal Form, an equivalent Minimal Normal Form q' = [m_1,...,m_h] is a selection of the I-Lines of q, i.e. {m_1,...,m_h} is a subset of {l_1,...,l_k}.

For example, [[1,2],[1,3],[2,-3]] is a Prime Normal Form and [[1,3],[2,-3]] is an equivalent Minimal Normal Form.
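This example can be verified by brute force over the 2^3 assignments of the three atoms. A small Haskell sketch (the helper names are our own):

```haskell
-- Brute-force check that dropping the prime I-Line [1,2] from
-- [[1,2],[1,3],[2,-3]] preserves equivalence, illustrating that
-- a Minimal Normal Form selects a subset of the prime I-Lines.
import Data.Bits (testBit)

-- Evaluate an I-Form, read as a DNF, under an assignment given
-- as a bit mask over the atoms 1..n (bit i-1 is the value of atom i).
evalDNF :: [[Int]] -> Int -> Bool
evalDNF iform mask = any (all lit) iform
  where lit i | i > 0     = testBit mask (i - 1)
              | otherwise = not (testBit mask (abs i - 1))

-- Two I-Forms over atoms 1..n are equivalent iff they agree on
-- all 2^n assignments.
equivalentOn :: Int -> [[Int]] -> [[Int]] -> Bool
equivalentOn n p q = all (\m -> evalDNF p m == evalDNF q m) [0 .. 2^n - 1]

main :: IO ()
main = print (equivalentOn 3 [[1,2],[1,3],[2,-3]] [[1,3],[2,-3]])
-- prints True
```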

Well, if there are indeed Minimal Normal Forms, why do we not use them as standard canonizations instead of the Prime Normal Forms? A full discussion and answer is given in [PNFCanon], but two crucial arguments are:
1. Minimal Normal Forms are not unique in general; there may be several equivalent ones, and
2. for the generation of Minimal Normal Forms it seems to be necessary to generate the Prime Normal Forms first, so that there is neither a reduction of space nor a reduction of time in the process.

Note in particular that a length increase from 10 to 1000 results in a time increase from milliseconds to almost half an hour.

Nevertheless, this increasing tendency is only local. From a certain point onwards, a further increase of the length results in smaller Prime Normal Forms. Recall that for an atom number of X, there are only 2^X different I-Lines of length X. The more I-Lines we add, the greater is the chance that the I-Form collapses into the unit Prime Normal Form.

To demonstrate that, let us use an atomNumber of 4, i.e. we run

$ ./PropLogic testing 4 Y 4 10000

for different formLength=Y. The following graph shows the decline of both the maximal and the average form length of the Prime Normal Forms after a peak around formLength=14.

As the values of maxFormLength and averageFormLength suggest, the input form is hardly changed, i.e. the random I-Forms are already in Prime Normal Form; we don't go into a detailed explanation here.
For an atomNumber of 1000, however, there are I-Forms of formLength 2^1000 in which all I-Lines are different, and the Prime Normal Forms are not much smaller either [PrimNum]; that is beyond any limits we can hope for here.

The following test tries to reveal the worst-case behavior for a given atomNumber=X. We put

probably looks awkward. But it is what more conventional mathematicians would rather write as

pdnf : PropForm(a) -> PDNF(a)

i.e. the function pdnf has the propositional formulas over the parametric atom type a as its domain and the PDNFs over a as its codomain. In Haskell, the type operator ":" is written "::" (because ":" is reserved for list construction; "x:l" is the same as "(cons x l)" in LISP). The prefix (Ord a) => means that this function is only defined for an atom type a that has a (linear) order structure defined on it.
(This is an example of the type class concept, which is a pretty unique Haskell feature.)

In the Haskell implementation, the built-in Haskell lists are replaced by so-called costacks, i.e. I-Lines are costacks of integers and I-Forms are costacks of I-Lines.

Functional programming languages with LISP as a common ancestor implement lists as stacks, i.e. lists can only be modified by adding or removing a head component. Therefore, the concatenation of two lists is not a primitive operation; one first needs to walk through the entire first list to then attach the second one. A costack or concatenable stack is a list-like data structure where concatenation is a primitive operation, too.

The "fast" normalizers in the PropLogic package do their main work on I-Lines and I-Forms, and they do apply concatenations frequently. Therefore, the costack data structure is implemented in a separate Costack module.
(In fact however, the real implementation of costacks in the Costack module is currently still done with ordinary Haskell lists. Different, more effective implementations are intended for future releases of the PropLogic package.)
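As an illustration of the idea (not of the module's actual implementation, which, as just noted, currently uses plain lists), a costack can be sketched as a binary tree of fragments, which makes concatenation a constant-time node construction:

```haskell
-- Illustrative costack: a "concatenable stack" where the
-- concatenation of two costacks is a constant-time operation.
-- (This sketch only illustrates the idea; the PropLogic
-- Costack module currently uses plain lists.)
data Costack a = Empty
               | Single a
               | Cat (Costack a) (Costack a)

-- O(1) concatenation: just build a Cat node.
cat :: Costack a -> Costack a -> Costack a
cat = Cat

-- O(1) push of a head component, as on an ordinary stack.
push :: a -> Costack a -> Costack a
push x s = Cat (Single x) s

-- Flatten a costack back into an ordinary list.
toList :: Costack a -> [a]
toList Empty      = []
toList (Single x) = [x]
toList (Cat l r)  = toList l ++ toList r

fromList :: [a] -> Costack a
fromList = foldr push Empty

main :: IO ()
main = print (toList (cat (fromList [1,2]) (fromList [3,4 :: Int])))
-- prints [1,2,3,4]
```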

In this introduction, we concentrate only on the "indexed" interpretation of the "X", i.e. how the atoms are replaced by positive integers to make the data structures more efficient. We neglect the "extended" meaning of the "X".
On a deeper level, this has much to do with our theory-algebraic reconstruction of propositional logic, as opposed to the traditional algebraization as a boolean algebra (see the other literature on http://www.bucephalus.org/PropLogic).

But also in the use of the command line, it shows that the XPDNF is not just another representation of the PDNF, but carries a little more information. Consider the formula [x -> y -> [z + true]]. The atom z is what we call redundant in this formula, in the sense that there are equivalent formulas without z. Prime Normal Forms only contain irredundant atoms, and we see that z has disappeared in the PDNF

$ ./PropLogic pdnf "[x -> y -> [z + true]]"
[[* -x] + [* y]]

But the XPDNF preserves all the atoms; it is equiatomic to the original formula

Actually, the statement that the DNF and CNF conversion of propositional formulas can be done in polynomial time is questionable, at least for our special versions of propositional formulas, which also allow sub- and equijunctions. But this is not the main focus in complexity theory, and we ignore the possible objections here. Besides, the "fast" instances of propositional algebras in the PropLogic package are all based on data structures in which propositions are represented as DNFs or CNFs of some sort.

In fact, the test for the satisfiability of a DNF (and the test for the validity of a CNF) is trivial: due to the semi-ordered definition of our DNFs, every DNF other than [+] is satisfiable anyway, because if it has at least one NLC [L1 * ... * Lm], then this NLC is satisfiable (the L1,...,Lm cannot contain a complementary pair of literals), and so is the disjunction of multiple NLCs. So in fact, the satisfiability test is really only significant for CNF interpretations of I-Forms. But that doesn't change the general argument.
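Expressed in code, the DNF satisfiability test is indeed just a non-emptiness check. A one-line Haskell sketch, with our own names:

```haskell
-- For I-Forms read as DNFs whose I-Lines satisfy the NLC
-- property (atoms strictly ascending within each line, hence
-- no complementary pair of literals), satisfiability reduces
-- to non-emptiness. Names here are ours, not the package's.
type IForm = [[Int]]

satisfiableDNF :: IForm -> Bool
satisfiableDNF = not . null

main :: IO ()
main = do
  print (satisfiableDNF [])                -- [+], the false form
  print (satisfiableDNF [[-1,-2],[2,3]])
```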

A call of the testing function produces very verbose output: for each of the N test cases it reports the full situation. This is done because, when the random formulas become bigger, each primForm call becomes harder to perform, and running test series may take long periods of time in which nothing happens; the verbose version gives a better feeling for the behavior. On the other hand, this output production adds to the time which is measured. But we neglect this artificial brake on the speed.

shows two more typical features of common Haskell functions: the three arguments are not given as a triple, i.e. the domain of the function is not a ternary cartesian product. The function is "curried" into a higher-order function of third degree.

And the type IO IForm is the type of "actions" on I-Forms, encapsulated in IO to preserve the purely functional character of the language. Otherwise, randomIForm wouldn't be a proper function, because its result is not deterministic, as is typical for a random generator. (The theoretical framework for all this is Haskell's famous "monad" concept.)
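To illustrate why such a generator lives in IO, here is a toy sketch: a pure core that threads a seed explicitly (a hand-rolled linear congruential generator standing in for real randomness), wrapped in an IO action that hides the seed state. All names and parameters here are our own, not those of the package's actual randomIForm:

```haskell
-- Toy illustration only: the pure core threads a seed
-- explicitly, and the IO wrapper hides that seed in an IORef,
-- so the action's result is not a deterministic function of
-- its visible arguments. Not the package's implementation.
import Data.IORef

type IForm = [[Int]]

-- Hand-rolled LCG step, standing in for real randomness.
step :: Int -> Int
step s = (1103515245 * s + 12345) `mod` 2147483648

-- Pure core: formLength I-Lines of lineLength literals over
-- the atoms 1..atomNumber, threading the seed through.
pureIForm :: Int -> Int -> Int -> Int -> (IForm, Int)
pureIForm atomNumber formLength lineLength seed0 = go formLength seed0
  where
    go 0 s = ([], s)
    go k s = let (line, s')  = lits lineLength s
                 (rest, s'') = go (k - 1) s'
             in (line : rest, s'')
    lits 0 s = ([], s)
    lits k s = let s'           = step s
                   i            = s' `mod` atomNumber + 1
                   s''          = step s'
                   l            = if even s'' then i else negate i
                   (rest, s''') = lits (k - 1) s''
               in (l : rest, s''')

-- IO wrapper: the hidden seed makes this an "action".
randomIForm :: IORef Int -> Int -> Int -> Int -> IO IForm
randomIForm ref x y z = do
  s <- readIORef ref
  let (form, s') = pureIForm x y z s
  writeIORef ref s'
  return form

main :: IO ()
main = do
  ref  <- newIORef 2024
  form <- randomIForm ref 5 3 2
  print (length form)
```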

http://www.bucephalus.org/text/PNFCanon/PNFCanon.html
The 36-page paper called
Theory and implementation of efficient canonical systems for sentential calculus, based on Prime Normal Forms
is the main source for the mathematical background of the approach and algorithms in the PropLogic package.