Reprinted with corrections, December 1997.

Access the latest information about Addison-Wesley titles from our World Wide Web site: http://www.awl.com/cseng

Reproduced by Addison-Wesley from camera-ready copy supplied by the author.

Cover image courtesy of the National Museum of American Art, Washington DC/Art Resource, NY

Copyright © 1998 by Addison Wesley Longman, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.

2 3 4 5 6 7 8 9 10-MA-0100999897

PREFACE

Theoretical computer science covers a wide range of topics, but none is as fundamental and as useful as the theory of computation. Given that computing is our field of endeavor, the most basic question that we can ask is surely "What can be achieved through computing?" In order to answer such a question, we must begin by defining computation, a task that was started last century by mathematicians and remains very much a work in progress at this date. Most theoreticians would at least agree that computation means solving problems through the mechanical, preprogrammed execution of a series of small, unambiguous steps. From basic philosophical ideas about computing, we must progress to the definition of a model of computation, formalizing these basic ideas and providing a framework in which to reason about computation. The model must be both reasonably realistic (it cannot depart too far from what is perceived as a computer nowadays) and as universal and powerful as possible. With a reasonable model in hand, we may proceed to posing and resolving fundamental questions such as "What can and cannot be computed?" and "How efficiently can something be computed?" The first question is at the heart of the theory of computability and the second is at the heart of the theory of complexity. In this text, I have chosen to give pride of place to the theory of complexity. My basic reason is very simple: complexity is what really defines the limits of computation. Computability establishes some absolute limits, but limits that do not take into account any resource usage are hardly limits in a practical sense. Many of today's important practical questions in computing are based on resource problems. For instance, encryption of transactions for transmission over a network can never be entirely proof against snoopers, because an encrypted transaction must be decrypted by some means and thus can always be deciphered by someone determined to do so, given sufficient resources. 
However, the real goal of encryption is to make it sufficiently "hard" (that is, sufficiently resource-intensive) to decipher the message that snoopers will be discouraged or that even determined spies will take too long to complete the decryption. In other words, a good encryption scheme does not make it impossible to decode the message, just very difficult: the problem is not one of computability but one of complexity.

As another example, many tasks carried out by computers today involve some type of optimization: routing of planes in the sky or of packets through a network so as to get planes or packets to their destination as efficiently as possible; allocation of manufactured products to warehouses in a retail chain so as to minimize waste and further shipping; processing of raw materials into component parts (e.g., cutting cloth into pattern pieces or cracking crude oil into a range of oils and distillates) so as to minimize waste; designing new products to minimize production costs for a given level of performance; and so forth. All of these problems are certainly computable: that is, each such problem has a well-defined optimal solution that could be found through sufficient computation (even if this computation is nothing more than an exhaustive search through all possible solutions). Yet these problems are so complex that they cannot be solved optimally within a reasonable amount of time; indeed, even deriving good approximate solutions for these problems remains resource-intensive. Thus the complexity of solving problems (exactly or approximately) is what determines the usefulness of computation in practice. It is no accident that complexity theory is the most active area of research in theoretical computer science today.

Yet this text is not just a text on the theory of complexity. I have two reasons for covering additional material: one is to provide a graduated approach to the often challenging results of complexity theory and the other is to paint a suitable backdrop for the unfolding of these results. The backdrop is mostly computability theory: clearly, there is little use in asking what is the complexity of a problem that cannot be solved at all! The graduated approach is provided by a review chapter and a chapter on finite automata. Finite automata should already be somewhat familiar to the reader; they provide an ideal testing ground for ideas and methods needed in working with complexity models.

On the other hand, I have deliberately omitted theoretical topics (such as formal grammars, the Chomsky hierarchy, formal semantics, and formal specifications) that, while interesting in their own right, have limited impact on everyday computing: some because they are not concerned with resources, some because the models used are not well accepted, and grammars because their use in compilers is quite different from their theoretical expression in the Chomsky hierarchy. Finite automata and regular expressions (the lowest level of the Chomsky hierarchy) are covered here but only by way of an introduction to (and contrast with) the universal models of computation used in computability and complexity.
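The earlier remark that optimization problems are computable by exhaustive search, yet impractical to solve that way, can be made concrete with a small sketch. The Python code below is not part of the original text; it assumes a toy routing problem (a shortest closed tour through points in the plane), and the function names `brute_force_tour` and `tours_examined` are hypothetical, chosen only for illustration.

```python
from itertools import permutations
from math import dist, factorial


def brute_force_tour(points):
    """Return (length, order) of a shortest closed tour through all points.

    This is plain exhaustive search: every ordering of the points is tried.
    """
    n = len(points)
    if n < 2:
        return 0.0, tuple(range(n))
    best_length, best_order = float("inf"), None
    # Fix point 0 as the start so that rotations of one tour are not recounted.
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        length = sum(
            dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n)
        )
        if length < best_length:
            best_length, best_order = length, order
    return best_length, best_order


def tours_examined(n):
    """Number of orderings the search above enumerates for n >= 2 points."""
    return factorial(n - 1)


if __name__ == "__main__":
    square = [(0, 0), (0, 1), (1, 1), (1, 0)]
    print(brute_force_tour(square))
    for n in (5, 10, 15, 20):
        print(n, tours_examined(n))
```

The search always terminates with an optimal answer, which is the sense in which the problem is computable. The cost is the issue: 4 points require 3! = 6 orderings, while 21 points would require 20!, roughly 2.4 × 10^18, which is why complexity rather than computability is the practical limit.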


Of course, not all results in the theory of complexity have the same impact on computing. Like any rich body of theory, complexity theory has applied aspects and very abstract ones. I have focused on the applied aspects: for instance, I devote an entire chapter to how to prove that a problem is hard but less than a section to the entire topic of structure theory (the part of complexity theory that addresses the internal logic of the field). Abstract results found in this text are mostly in support of fundamental results that are later exploited for practical reasons.

Since theoretical computer science is often the most challenging topic studied in the course of a degree program in computing, I have avoided the dense presentation often favored by theoreticians (definitions, theorems, proofs, with as little text in between as possible). Instead, I provide intuitive as well as formal support for further derivations and present the idea behind any line of reasoning before formalizing said reasoning. I have included large numbers of examples and illustrated many abstract ideas through diagrams; the reader will also find useful synopses of methods (such as steps in an NP-completeness proof) for quick reference.

Moreover, this text offers strong support through the Web for both students and instructors. Instructors will find solutions for most of the 250 problems in the text, along with many more solved problems; students will find interactive solutions for chosen problems, testing and validating their reasoning process along the way rather than delivering a complete solution at once. In addition, I will also accumulate on the Web site addenda, errata, comments from students and instructors, and pointers to useful resources, as well as feedback mechanisms: I want to hear from all users of this text suggestions on how to improve it. The URL for the Web site is http://www.cs.unm.edu/~moret/computation/; my email address is moret@cs.unm.edu.

Using This Text in the Classroom
I wrote this text for well-prepared seniors and for first-year graduate students. There is no specific prerequisite for this material, other than the elusive "mathematical maturity" that instructors expect of students at this level: exposure to proofs, some calculus (limits and series), and some basic discrete mathematics, much of which is briefly reviewed in Chapter 2. However, an undergraduate course in algorithm design and analysis would be very helpful, particularly in enabling the student to appreciate the other side of the complexity issues: what problems do we know that can be solved efficiently? Familiarity with basic concepts of graph theory is also
useful, inasmuch as a majority of the examples in the complexity sections are graph problems. Much of what an undergraduate in computer science absorbs as part of the culture (and jargon) of the field is also helpful: for instance, the notion of state should be familiar to any computer scientist, as should be the notion of membership in a language. The size of the text alone will indicate that there is more material here than can be comfortably covered in a one-semester course. I have mostly used this material in such a setting, by covering certain chapters lightly and others quickly, but I have also used it as the basis for a two-course sequence by moving the class to the current literature early in the second semester, with the text used in a supporting role throughout. Chapter 9, in particular, serves as a tutorial introduction to a number of current research areas. If this text is used for a two-course sequence, I would strongly recommend covering all of the material not already known to the students before moving to the current literature for further reading. If it is used in a one-semester, first course in the theory of computation, the instructor has a number of options, depending on preparation and personal preferences. The instructor should keep in mind that the most challenging topic for most students is computability theory (Chapter 5); in my experience, students find it deceptively easy at first, then very hard as soon as arithmetization and programming systems come into play. It has also been my experience that finite automata, while interesting and a fair taste of things to come, are not really sufficient preparation: most problems about finite automata are just too simple or too easily conceptualized to prepare students for the challenges of computability or complexity theory. With these cautions in mind, I propose the following traversals for this text. 
Seniors: A good coverage starts with Chapter 1 (one week), Chapter 2 (one to two weeks), and the Appendix (assigned reading or up to two weeks, depending on the level of mathematical preparation). Then move to Chapter 3 (two to three weeks; Section 3.4.3 can be skipped entirely) and Chapter 4 (one to two weeks, depending on prior acquaintance with abstract models). Spend three weeks or less on Sections 5.1 through 5.5 (some parts can be skipped, such as 5.1.2 and some of the harder results in 5.5). Cover Sections 6.1 and 6.2 in one to two weeks (the proofs of the hierarchy theorems can be skipped along with the technical details preceding them) and Sections 6.3.1 and 6.3.3 in two weeks, possibly skipping the P-completeness and PSPACE-completeness proofs. Finally spend two to three weeks on Section 7.1, a week on Section 7.3.1, and one to two weeks on Section 8.1. The course may then conclude with a choice of material from Sections 8.3 and 8.4 and from Chapter 9.

If the students have little mathematical background, then most of the proofs can be skipped to devote more time to a few key proofs, such as reductions from the halting problem (5.5), the proof of Cook's theorem (6.3.1), and some NP-completeness proofs (7.1). In my experience, this approach is preferable to spending several weeks on finite automata (Chapter 3), because finite automata do not provide sufficient challenge. Sections 9.2, 9.4, 9.5, and 9.6 can all be covered at a non-technical level (with some help from the instructor in Sections 9.2 and 9.5) to provide motivation for further study without placing difficult demands on the students. Beginning Graduate Students: Graduate students can be assumed to be acquainted with finite automata, regular expressions, and even Turing machines. On the other hand, their mathematical preparation may be more disparate than that of undergraduate students, so that the main difference between a course addressed to this group and one addressed to seniors is a shift in focus over the first few weeks, with less time spent on finite automata and Turing machines and more on proof techniques and preliminaries. Graduate students also take fewer courses and so can be expected to move at a faster pace or to do more problems. In my graduate class I typically expect students to turn in 20 to 30 complete proofs of various types (reductions for the most part, but also some less stereotyped proofs, such as translational arguments). I spend one lecture on Chapter 1, three lectures reviewing the material in Chapter 2, assign the Appendix as reading material, then cover Chapter 3 quickly, moving through Sections 3.1, 3.2, and 3.3 in a couple of lectures, but slowing down for Kleene's construction of regular expressions from finite automata. I assign a number of problems on the regularity of languages, to be solved through applications of the pumping lemma, of closure properties, or through sheer ingenuity! 
Section 4.1 is a review of models, but the translations are worth covering in some detail to set the stage for later arguments about complexity classes. I then spend three to four weeks on Chapter 5, focusing on Section 5.5 (recursive and r.e. sets) with a large number of exercises. The second half of the semester is devoted to complexity theory, with a thorough coverage of Chapter 6, and Sections 7.1, 7.3, 8.1, 8.2, and 8.4. Depending on progress at that time, I may cover some parts of Section 8.3 or return to 7.2 and couple it with 9.4 to give an overview of parallel complexity theory. In the last few lectures, I give highlights from Chapter 9, typically from Sections 9.5 and 9.6.

Second-Year Graduate Students: A course on the theory of computation given later in a graduate program typically has stronger prerequisites than one given in the first year of studies. The course may in fact be on complexity theory alone, in which case Chapters 4 (which may just be a review), 6, 7, 8, and 9 should be covered thoroughly, with some material from Chapter 5 used as needed. With well-prepared students, the instructor needs only ten weeks for this material and should then supplement the text with a selection of current articles.

Exercises
This text has over 250 exercises. Most are collected into exercise sections at the end of each chapter, wherein they are ordered roughly according to the order of presentation of the relevant material within the chapter. Some are part of the main text of the chapters themselves; these exercises are an integral part of the presentation of the material, but often cover details that would unduly clutter the presentation.

I have attempted to classify the exercises into three categories, flagged by the number of asterisks carried by the exercise number (zero, one, or two). Simple exercises bear no asterisk; they should be within the reach of any student and, while some may take a fair amount of time to complete, none should require more than 10 to 15 minutes of critical thought. Exercises within the main body of the chapters are invariably simple exercises. Advanced exercises bear one asterisk; some may require additional background, others special skills, but most simply require more creativity than the simple exercises. It would be unreasonable to expect a student to solve every such exercise; when I assign starred exercises, I usually give the students a choice of several from which to pick. A student does well in the class who can reliably solve two out of three of these exercises. The rare challenge problems bear two asterisks; most of these were the subject of recent research articles. Accordingly, I have included them more for the results they state than as reasonable assignments; in a few cases, I have turned what would have been a challenge problem into an advanced exercise by giving a series of detailed hints.

I have deliberately refrained from including really easy exercises, what are often termed "finger exercises." The reason is that such exercises have to be assigned in large numbers by the instructor, who can generate new ones in little more time than it would take to read them in the text. A sampling of such exercises can be found on the Web site.

I would remind the reader that solutions to almost all of the exercises can be found on the Web site; in addition, the Web site stores many additional exercises, in particular a large number of NP-complete problems with simple completeness proofs. Some of the exercises are given extremely detailed solutions and thus may serve as first examples of certain techniques (particularly NP-completeness reductions); others are given incremental solutions, so that the student may use them as tutors in developing proofs.

Acknowledgments
As I acknowledge the many people who have helped me in writing this text, two individuals deserve a special mention. In 1988, my colleague and friend Henry Shapiro and I started work on a text on the design and analysis of algorithms, a text that was to include some material on NP-completeness. I took the notes and various handouts that I had developed in teaching computability and complexity classes and wrote a draft, which we then proceeded to rewrite many times. Eventually, we did not include this material in our text (Algorithms from P to NP, Volume I at Benjamin-Cummings, 1991); instead, with Henry Shapiro's gracious consent, this material became the core of Sections 6.3.1 and 7.1 and the nucleus around which this text grew. Carol Fryer, my wife, not only put up with my long work hours but somehow found time in her even busier schedule as a psychiatrist to proofread most of this text. The text is much the better for it, not just in terms of readability, but also in terms of correctness: in spite of her minimal acquaintance with these topics, she uncovered some technical errors.

The faculty of the Department of Computer Science at the University of New Mexico, and, in particular, the department chairman, James Hollan, have been very supportive. The department has allowed me to teach a constantly-changing complexity class year after year for over 15 years, as well as advanced seminars in complexity and computability theory, thereby enabling me to refine my vision of the theory of computation and of its role within theoretical computer science.

The wonderful staff at Addison-Wesley proved a delight to work with: Lynne Doran Cote, the Editor-in-Chief, who signed me on after a short conversation and a couple of email exchanges (authors are always encouraged by having such confidence placed in them!); Deborah Lafferty, the Associate Editor, with whom I worked very closely in defining the scope and level of the text and through the review process; and Amy Willcutt, the Production Editor, who handled with complete cheerfulness the hundreds of questions that I sent her way all through the last nine months of work. These must be three of the most efficient and pleasant professionals with whom I have had a chance to work: my heartfelt thanks go to all three. Paul C. Anagnostopoulos, the Technical Advisor, took my initial rough design and turned it into what you see, in the process commiserating with me on the limitations of typesetting tools and helping me to work around each such limitation in turn.

The reviewers, in addition to making very encouraging comments that helped sustain me through the process of completing, editing, and typesetting the text, had many helpful suggestions, several of which resulted in entirely new sections in the text. At least two of the reviewers gave me extremely detailed reviews, closer to what I would expect of referees on a 10-page journal submission than reviewers on a 400-page text. My thanks to all of them: Carl Eckberg (San Diego State University), James Foster (University of Idaho), Desh Ranjan (New Mexico State University), Roy Rubinstein, William A. Ward, Jr. (University of South Alabama), and Jie Wang (University of North Carolina, Greensboro).

Last but not least, the several hundred students who have taken my courses in the area have helped me immensely. An instructor learns more from his students than from any other source. Those students who took to theory like ducks to water challenged me to keep them interested by devising new problems and by introducing ever newer material. Those who suffered through the course challenged me to present the material in the most accessible manner, particularly to distill from each topic its guiding principles and main results. Through the years, every student contributed stimulating work: elegant proofs, curious gadgets, new problems, streamlined reductions, as well as enlightening errors. (I have placed a few flawed proofs as exercises in this text, but look for more on the Web site.)

Since I typeset the entire text myself, any errors that remain (typesetting or technical) are entirely my responsibility. The text was typeset in Sabon at 10.5 pt, using the MathTime package for mathematics and Adobe's Mathematical Pi fonts for script and other symbols. I used LaTeX2e, wrote a lot of custom macros, and formatted everything on my laptop under Linux, using gv to check the results. In addition to saving a lot of paper, using a laptop certainly eased my task: typesetting this text was a very comfortable experience compared to doing the same for the text that Henry Shapiro and I published in 1991. I even occasionally found time to go climbing and skiing!

Bernard M.E. Moret
Albuquerque, New Mexico

NOTATION

$S, T, U$    sets
$E$    the set of edges of a graph
$V$    the set of vertices of a graph
$G = (V, E)$    a graph
$K_n$    the complete graph on $n$ vertices
$K$    the diagonal (halting) set
$Q$    the set of states of an automaton
$q_i, q_j$    states of an automaton
$M$    an automaton or Turing machine
$\mathbb{N}$    the set of natural numbers
$\mathbb{Z}$    the set of integer numbers
$\mathbb{Q}$    the set of rational numbers
$\mathbb{R}$    the set of real numbers
$|S|$    the cardinality of set $S$
$\aleph_0$    "aleph nought," the cardinality of countably infinite sets
$O()$    "big Oh," the asymptotic upper bound
$o()$    "little Oh," the asymptotic unreachable upper bound
$\Omega()$    "big Omega," the asymptotic lower bound
$\omega()$    "little Omega," the asymptotic unreachable lower bound
$\Theta()$    "big Theta," the asymptotic characterization
$f, g, h$    functions (total)
$p()$    a polynomial
$\delta$    the transition function of an automaton
$\pi$    a probability distribution
$\chi_S$    the characteristic function of set $S$
$A()$    Ackermann's function (also $F$ in Chapter 5)
$s(k, i)$    an s-m-n function
$K(x)$    the descriptional complexity of string $x$
$IC(x \mid \Pi)$    the instance complexity of $x$ with respect to problem $\Pi$
$\phi, \psi$    functions (partial or total)
$\phi_i$    the $i$th partial recursive function in a programming system
$\mathrm{dom}\,\phi$    the domain of the partial function $\phi$
$\mathrm{ran}\,\phi$    the range of the partial function $\phi$
$\phi(x)\downarrow$    $\phi(x)$ converges (is defined)
$\phi(x)\uparrow$    $\phi(x)$ diverges (is not defined)
$-$    subtraction, but also set difference
$+$    addition, but also union of regular expressions
$S^*$    "S star," the Kleene closure of set $S$
$S^+$    "S plus," $S^*$ without the empty string
$\Sigma$    the reference alphabet
$a, b, c$    characters in an alphabet
$\Sigma^*$    the set of all strings over the alphabet $\Sigma$
$w, x, y$    strings in a language
$\varepsilon$    the empty string
$|x|$    the length of string $x$
$\cup$    set union
$\cap$    set intersection
$\vee$    logical OR
$\wedge$    logical AND
$\bar{x}$    the logical complement of $x$
Zero    the zero function, a basic primitive recursive function
Succ    the successor function, a basic primitive recursive function
$P_i^k$    the choice function, a basic primitive recursive function
$x \# y$    the "guard" function, a primitive recursive function
$x \mid y$    "$x$ is a factor of $y$," a primitive recursive predicate
$\mu x[\,]$    $\mu$-recursion (minimization), a partial recursive scheme
$\langle x, y \rangle$    the pairing of $x$ and $y$
$\Pi_1(z), \Pi_2(z)$    the projection functions that reverse pairing
$\langle x_1, \ldots, x_k \rangle_k$    the general pairing of the $k$ elements $x_1, \ldots, x_k$
$\Pi_i^k(z)$    the general projection functions that reverse pairing
[symbol illegible]    generic classes of programs or problems
$\Pi_k^p$    a co-nondeterministic class in the polynomial hierarchy
$\Sigma_k^p$    a nondeterministic class in the polynomial hierarchy
$\Delta_k^p$    a deterministic class in the polynomial hierarchy
$\#P$    "sharp P" or "number P," a complexity class
$\le_T$    a Turing reduction
$\le_m$    a many-one reduction
$\mathcal{A}$    an algorithm
$R_{\mathcal{A}}$    the approximation ratio guaranteed by $\mathcal{A}$
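The five asymptotic symbols in the list above are only named, not defined. For quick reference, here are the definitions most commonly used in textbooks, written for functions $f, g : \mathbb{N} \to \mathbb{N}$. They are given as an assumption: the text's own definitions appear later and may differ in detail (some authors, for instance, define $\Omega$ with "for infinitely many $n$" rather than "for all large $n$").

```latex
\begin{align*}
f \in O(g)      &\iff \exists c > 0\ \exists n_0\ \forall n \ge n_0:\ f(n) \le c\, g(n) \\
f \in o(g)      &\iff \forall c > 0\ \exists n_0\ \forall n \ge n_0:\ f(n) < c\, g(n) \\
f \in \Omega(g) &\iff \exists c > 0\ \exists n_0\ \forall n \ge n_0:\ f(n) \ge c\, g(n) \\
f \in \omega(g) &\iff \forall c > 0\ \exists n_0\ \forall n \ge n_0:\ f(n) > c\, g(n) \\
f \in \Theta(g) &\iff f \in O(g) \ \text{and}\ f \in \Omega(g)
\end{align*}
```

Read this way, "unreachable" in the list means that the bound is strict for every constant factor: $n \in o(n^2)$ but $n^2 \notin o(n^2)$, whereas $n^2 \in O(n^2)$.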

CHAPTER 1
Introduction

1.1 Motivation and Overview

Why do we study the theory of computation? Apart from the interest in studying any rich mathematical theory (something that has sustained research in mathematics over centuries), we study computation to learn more about the fundamental principles that underlie practical applications of computing. To a large extent, the theory of computation is about bounds. The types of questions that we have so far been most successful at answering are: "What cannot be computed at all (that is, what cannot be solved with any computing tool)?" and "What cannot be computed efficiently?" While these questions and their answers are mostly negative, they contribute in a practical sense by preventing us from seeking unattainable goals. Moreover, in the process of deriving these negative results, we also obtain better characterizations of what can be solved and even, sometimes, better methods of solution.

For example, every student and professional has longed for a compiler that would not just detect syntax errors, but would also perform some "simple" checks on the code, such as detecting the presence of infinite loops. Yet no such tool exists to date; in fact, as we shall see, theory tells us that no such tool can exist: whether or not a program halts under all inputs is an unsolvable problem. Another tool that faculty and professionals would dearly love to use would check whether or not two programs compute the same function: it would make grading programs much easier and would allow professionals to deal efficiently with the growing problem of "old" code. Again, no such tool exists and, again, theory tells us that deciding whether or not two programs compute the same function is an unsolvable problem. As a third example, consider the problem of determining the shortest C program that will do a certain task (not that we recommend conciseness in programs as a goal, since ultimate conciseness often equates with ultimate obfuscation!). Since we cannot determine whether or not two programs compute the same function, we would expect that we cannot determine the shortest program that computes a given function; after all, we would need to verify that the alleged shortest program does compute the desired function. While this intuition does not constitute a proof, theory does indeed tell us that determining the shortest program to compute a given function is an unsolvable problem.

All of us have worked at some point at designing some computing tool, be it a data structure, an algorithm, a user interface, or an interrupt handler. When we have completed the design and perhaps implemented it, how can we assess the quality of our work? From a commercial point of view, we may want to measure it in profits from sales; from a historical point of view, we may judge it in 10 or 20 or 100 years by the impact it may have had in the world. We can devise other measures of quality, but few are such that they can be applied immediately after completion of the design, or even during the design process. Yet such a measure would give us extremely useful feedback and most likely enable us to improve the design. If we are designing an algorithm or data structure, we can analyze its performance; if it is an interrupt handler, we can measure its running time and overhead; if it is a user interface, we can verify its robustness and flexibility and conduct some simple experiments with a few colleagues to check its "friendliness." Yet none of these measures tells us if the design is excellent, good, merely adequate, or even poor, because all lack some basis for comparison.

For instance, assume you are tasked to design a sorting algorithm and, because you have never opened an algorithms text and are, in fact, unaware of the existence of such a field, you come up with a type of bubble sort. You can verify experimentally that your algorithm works on all data sets you test it on and that its running time appears bounded by some quadratic function of the size of the array to be sorted; you may even be able to prove formally both correctness and running time, by which time you might feel quite proud of your achievement. Yet someone more familiar with sorting than you would immediately tell you that you have, in fact, come up with a very poor sorting algorithm, because there exist equally simple algorithms that will run very much faster than yours. At this point, though, you could attempt to reverse the attack and ask the knowledgeable person if such faster algorithms are themselves good. Granted that they are better than yours, might they still not be pretty poor? And, in any case, how do you verify that they are better than your algorithm? After all, they may run faster on one platform, but slower on another; faster for certain data, but slower for others; faster for certain amounts of data, but slower for others; and so forth. Even judging relative merit is difficult and may require the establishment of some common measuring system.

We want to distinguish relative measures of quality (you have or have not improved what was already known) and absolute measures of quality (your design is simply good; in particular, there is no longer any need to look for major improvements, because none is possible). The theory of computation attempts to establish the latter: absolute measures. Questions such as "What can be computed?" and "What can be computed efficiently?" and "What can be computed simply?" are all absolute questions. To return to our sorting example, the question you might have asked the knowledgeable person can be answered through a fundamental result: a lower bound on the number of comparisons needed in the worst case to sort n items by any comparison-based sorting method (the famous n log n lower bound for comparison-based sorting). Since the equally simple (but very much more efficient) methods mentioned (which include mergesort and quicksort) run in asymptotic n log n time, they are as good as any comparison-based sorting method can ever be and thus can be said without further argument to be good. Such lower bounds are fairly rare and typically difficult to derive, yet very useful. In this text, we derive more fundamental lower bounds: we develop tools to show that certain problems cannot be solved at all and to show that other problems, while solvable, cannot be solved efficiently.

Whether we want relative or absolute measures of quality, we shall need some type of common assumptions about the environment. We may need to know about data distributions, about sizes of data sets, and such. Most of all, however, we need to know about the platform that will support the computing activities, since it would appear that the choice of platform strongly affects the performance (the running time on a 70s vintage, 16-bit minicomputer will definitely be different from that on a state-of-the-art workstation) and perhaps the outcome (because of arithmetic precision, for instance). Yet, if each platform is different, how can we derive measures of quality? We may not want to compare code designed for a massively parallel supercomputer and for a single-processor home computer, but we surely would want some universality in any measure. Thus are we led to a major concern of the theory of computation: what is a useful model of computation? By useful we mean that any realistic computation is supported in the model, that results derived in the model apply to actual platforms, and that, in fact, results derived in the model apply to as large a range of platforms as possible. Yet even this ambitious agenda is not quite enough: platforms will change very rapidly, yet the model should not.
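The sorting example from this section also hints at what a platform-independent measure can look like: counting comparisons, rather than timing runs, gives the same numbers on any machine. The Python sketch below is not part of the original text, and its function names are illustrative. It counts comparisons for a bubble sort and a mergesort and computes ceil(log2(n!)), the decision-tree bound behind the n log n lower bound for comparison-based sorting.

```python
from math import factorial


def bubble_sort(items):
    """Sort a copy of items; return (sorted_list, comparisons_used)."""
    a = list(items)
    comparisons = 0
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:  # already sorted: stop early
            break
    return a, comparisons


def merge_sort(items):
    """Top-down mergesort; return (sorted_list, comparisons_used)."""
    a = list(items)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, left_count = merge_sort(a[:mid])
    right, right_count = merge_sort(a[mid:])
    merged, comparisons = [], left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


def comparison_lower_bound(n):
    """ceil(log2(n!)): worst-case comparisons any comparison sort needs."""
    # For an integer x >= 1, ceil(log2(x)) equals (x - 1).bit_length().
    return (factorial(n) - 1).bit_length()


if __name__ == "__main__":
    data = list(range(64, 0, -1))  # 64 items in reverse order
    print("bubble sort:", bubble_sort(data)[1])
    print("mergesort:  ", merge_sort(data)[1])
    print("lower bound:", comparison_lower_bound(len(data)))
```

On a reversed list of 64 items, this bubble sort performs 64 · 63 / 2 = 2016 comparisons, while this mergesort never needs more than 64 · 6 − 64 + 1 = 321 on any 64-item input. These counts depend only on the algorithm and the input, not on the hardware that runs them.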

We shall develop a universal model of computation. the register-addressed
. however. universal models must have many complex characteristics and may prove too big a bite to chew at first. Thus we can identify two major tasks for a useful "theory of computation": * to devise a universal model of computation that is credible in terms of both current platforms and philosophical ideas about the nature of computation. We shall present the Turing machine for such a model. 2. We shall look at the model known as a finite automaton. no matter how sophisticated. finite memory. analytical. In order to develop a universal model and to figure out how to work with it. Turing machines are not really anywhere close to a modern computer. it pays to start with less ambitious models. by their very nature. So. it is very limited in what it can do-for instance. enables us to derive powerful characterizations and to get a taste of what could be done with a model. efficiently solvable. deductive.) and to obtain a model useful for certain limited tasks. scientists and engineers pretty much agree on a universal model of computation. So we need to devise a model that is as universal as possible. so we shall also look at a much closer model. and so on.4
Introduction indeed. but with respect to an abstract notion of computation that will apply to future platforms as well. simply solvable. and * to use such models to characterize problems by determining if a problem is solvable. we shall proceed in three steps in this text: 1. not just with respect to existing computation platforms. As we shall see. it cannot even count! This simplicity. Because a finite automaton (as its name indicates) has only a fixed-size. We shall present a very restrictedmodel of computation and work with it to the point of deriving a number of powerful characterizations and tools. We shall need to justify the claims that it can compute anything computable and that it remains close enough to modern computing platforms so as not to distort the theory built around it. the model should still apply to future platforms. After all. but agreement is harder to obtain on how close such a model is to actual platforms and on how much importance to attach to theoretical results about bounds on the quality of possible solutions. etc. However. The point of this part is twofold: to hone useful skills (logical.

. any true assertion in a mathematical system) could be proved if one was ingenious and persevering enough.
1. he insisted that each proof be written in an explicit and unambiguous notation and that it be checkable in a finite series of elementary. 3. Hilbert and most mathematicians of that period took it for granted that such a proof-checking algorithm existed. however. they cannot be solved efficiently).2
History
Questions about the nature of computing first arose in the context of pure mathematics. most problems of any interest are provably unsolvable and that. most are provably intractable (that is. typically. unfortunately. The German mathematician Gottlob Frege (1848-1925) was instrumental in developing a precise system of notation to formalize mathematical proofs.1. in today's language we would say that Hilbert wanted all proofs to be checkable by an algorithm. In the process. as in the analysis of algorithms. Since mathematicians were convinced in those days that any theorem (that is. they began to study the formalisms themselves-they began to ask questions such as "What is a proof?" or "What is a mathematical system?" The great German mathematician David Hilbert (1862-1943) was the prime mover behind these studies. We shall see that. in terms of both ultimate capabilities and efficiency.2 History
machine (RAM). in particular. Most of us may not realize that mathematical rigor and formal notation are recent developments in mathematics. mechanical steps. It is only in the late nineteenth century that mathematicians started to insist on a uniform standard of rigor in mathematical arguments and a corresponding standard of clarity and formalism in mathematical exposition. time or space). but his work quickly led to the conclusion that the mathematical system of the times contained a contradiction. we shall learn a great deal about the nature of computational problems and. apparently making the entire enterprise worthless. about relationships among computational problems. of the few solvable problems. We shall prove that Turing machines and RAMs have equivalent modeling power. We shall use the tool (Turing machines) to develop a theory of computability (what can be solved by a machine if we disregard any resource bounds) and a theory of complexity (what can be solved by a machine in the presence of resource bounds.

" In order to do this. The 1930s and 1940s saw a blossoming of work on the nature of computation. he stated that the first step would be to show that the arithmetic of natural numbers. where are we to find truth and certitude?" In spite of the fact that Hilbert's program as he first enounced it in 1900 had already been questioned. which no one doubts and where contradictions and paradoxes arise only through our own carelessness. The second condition is intolerable. the Austrian-American logician Kurt Godel (1906-1978) put an end to Hilbert's hopes by proving the incompleteness theorem: any formal theory at least as rich as integer arithmetic is incomplete (there are statements in the theory that cannot be proved either true or false) or inconsistent (the theory contains contradictions). Hilbert had said "If mathematical thinking is defective. The German mathematician Georg Cantor (1845-1918) showed in 1873 that one could discern different "grades" of infinity. since anything can be proved from a contradiction. However. In 1931. but any formal basis for reasoning about such infinities seemed to lead to paradoxes. unambiguous.." He famously pledged that "no one shall drive us out of the paradise that Cantor has created for us" and restated his commitment to "establish throughout mathematics the same certitude for our deductions as exists in elementary number theory. a modest subset of mathematics. thereby reducing reasoning about infinitesimal values to reasoning about the finite values a and E. consistent basis. could be placed on such a firm. discussed the problems associated with the treatment of the infinite and wrote ". He went on to build an elegant mathematical theory about infinities (the transfinite numbers) in the 1890s. a few mathematicians are still trying to find flaws in his reasoning). Godel's result was so sweeping that many mathematicians found it very hard to accept (indeed. 
in an address to the Westphalian Mathematical Society in honor of Weierstrass. deductive methods based on the infinite [must] be replaced by finite procedures that yield exactly the same results. As late as 1925.6
Introduction Much of the problem resided in the notion of completed infinitiesobjects that. Hilbert. if they really exist. The French Augustin Cauchy (1789-1857) and the German Karl Weierstrass (1815-1897) had shown how to handle the problem of infinitely small values in calculus by formalizing limits and continuity through the notorious a and c. including the development of several utterly different and
.. such as the set of all natural numbers or the set of all points on a segment-and how to treat them. his result proved to be just the forerunner of a host of similarly negative results about the nature of computation and problem solving.. the first condition is at least very disappointing-in the 1925 address we just mentioned. are truly infinite.

unrelated models, each purported to be universal. No fewer than four important models were proposed in 1936:
* Gödel and the American mathematician Stephen Kleene (1909-1994) proposed what has since become the standard tool for studying computability, the theory of partial recursive functions, based on an inductive mechanism for the definition of functions.
* The same two authors, along with the French logician Jacques Herbrand (1908-1931), proposed general recursive functions, defined through an equational mechanism. In his Ph.D. thesis, Herbrand proved a number of results about quantification in logic, results that validate the equational approach to the definition of computable functions.
* The American logician Alonzo Church (1903-1995) proposed his lambda calculus, based on a particularly constrained type of inductive definitions. Lambda calculus later became the inspiration for the programming language Lisp.
* The British mathematician Alan Turing (1912-1954) proposed his Turing machine, based on a mechanistic model of problem solving by mathematicians. Turing machines have since become the standard tool for studying complexity.

A few years later, in 1943, the Polish-American logician Emil Post (1897-1954) proposed his Post systems, based on deductive mechanisms; he had already worked on the same lines in the 1920s, but had not published his work at that time. In 1954, the Russian logician A.A. Markov published his Theory of Algorithms, in which he proposed a model very similar to today's formal grammars. (Most of the pioneering papers are reprinted in The Undecidable, edited by M. Davis, and are well worth the reading: the clarity of the authors' thoughts and writing is admirable, as is their foresight in terms of computation.) Finally, in 1963, the American computer scientists Shepherdson and Sturgis proposed a model explicitly intended to reflect the structure of modern computers, the universal register machines; nowadays, many variants of that model have been devised and go by the generic name of register-addressable machines, or sometimes random access machines, or RAMs.

The remarkable result about these varied models is that all of them define exactly the same class of computable functions: whatever one model can compute, all of the others can too! This equivalence among the models (which we shall examine in some detail in Chapter 4) justifies the claim that all of these models are indeed universal models of computation (or problem solving). This claim has become known as the Church-Turing thesis. Even

2 Parallel machines. and others began to define and characterize classes of problems defined through the resources used in solving them. Although they did not define it as such. but also a network connection). Turing's model (and. Juris Hartmanis. how efficiently could it be solved? Work with ballistics and encryption done during World War II had made it very clear that computability alone was insufficient: to be of any use. the class NP was defined much earlier by Gddel in a letter to von Neumann!) In 1971. to this day. researchers in computability theory have been able to characterize quite precisely what is computable. 2 A von Neumann machine consists of a computing unit (the CPU). independently. is devastating: as we shall shortly see.' Building on the work done in the 1930s. and a communication channel between the two (a bus. The answer. they still follow the von Neumann model closely. but are built from CPUs. Church's and Post's models as well) was explicitly aimed at capturing the essence of human problemsolving. however. In the 1960s.8
Introduction as Church enounced it in 1936. and busses. yet had solutions that were clearly easy to verify. thereby formalizing the insights of Cobham and Edmonds. Richard Stearns. several prominent mathematicians and physicists have called into question its applicability to humans. A year later. characterizes all computers ever produced. their work prefaced the introduction of the class NP. In 1965 Alan Cobham and Jack Edmonds independently observed that a number of problems were apparently hard to solve. while accepting its application to machines. memory units. and data flow machines may appear to diverge from the von Neumann model. this thesis (Church called it a definition. Computing pioneers at the time included Turing in Great Britain and John von Neumann (19031957) in the United States. Nowadays. As we shall see. (Indeed. for instance. Leonid Levin in the Soviet Union) proved the existence of NP-complete problems. at the same time. Stephen Cook (and. Richard Karp showed the importance of this concept by proving that over 20 common optimization problems (that had resisted
l Somewhat ironically. a memory unit. alas.1) that established the existence of problems of increasing difficulty. in that sense. the Church-Turing thesis is widely accepted among computer scientists. researchers turned their attention from computability to complexity: assuming that a problem was indeed solvable. the solution had to be computed within a reasonable amount of time. the latter defined a general model of computing (von Neumann machines) that. and Kleene a working hypothesis. but Post viewed it as a natural law) was controversial: much depended on whether it was viewed as a statement about human problem-solving or about mathematics in general.
. tree machines. they proved the hierarchy theorems (see Section 6. most functions are not computable. As actual computers became available in the 1950s.2.

e. theoreticians have turned to related fields of enquiry.1.2 History
all attempts at efficient solutions-some for more than 20 years) are NPcomplete and thus all equivalent in difficulty. they cannot be solved efficiently. approximation. and has enabled researchers to derive extremely impressive results. quantum computing. This new view fits in better with the experience of most scientists and mathematicians. Since then. yet the verifier can assert with high probability that the proof is correct! It is hard to say whether Hilbert would have loved or hated this result. randomized computing. and. can be checked with high probability of success with the help of a few random bits by reading only a fixed number (currently. a bound of 11 can be shown) of characters selected at random from the text of the proof. a proof was considered absolute (mathematical truth). DNA computing). The last is most interesting in that it signals a clear shift in what is regarded to be a proof. Most of the proof remains unread. The most celebrated of these shows that a large class of "concise" (i. proof theory. a rich theory of complexity has evolved and again its main finding is pessimistic: most solvable problems are intractable-thatis. a proof is a communication tool designed to convince someone else of the correctness of a statement. alternate (and perhaps more efficient) models of computation (parallel computing. when suitably encoded. In Hilbert's day. polynomially long) proofs.
. including cryptography. In recent years.. whereas all recent results have been based on a model where proofs are provided by one individual or process and checked by another-that is. in a return to sources. reflects Godel's results.

CHAPTER 2
Preliminaries

2.1 Numbers and Their Representation

The set of numbers most commonly used in computer science is the set of natural numbers, denoted $\mathbb{N}$. This set will sometimes be taken to include 0, while at other times it will be viewed as starting with the number 1; the context will make clear which definition is used. Other useful sets are $\mathbb{Z}$, the set of all integers (positive and negative); $\mathbb{Q}$, the set of rational numbers; and $\mathbb{R}$, the set of real numbers. The last is used only in an idealistic sense: irrational numbers cannot be specified as a parameter, since their description in almost any encoding would take an infinite number of bits. Indeed, we must remember that the native instruction set of real computers can represent and manipulate only a finite set of numbers; in order to manipulate an arbitrary range of numbers, we must resort to representations that use an unbounded number of basic units and thus become quite expensive for large ranges. The basic, finite set can be defined in any number of ways: we can choose to consider certain elements to have certain values, including irrational or complex values. However, in order to perform arithmetic efficiently, we are more or less forced to adopt some simple number representation and to limit ourselves to a finite subset of the integers and another finite subset of the rationals (the so-called floating-point numbers).

Number representation depends on the choice of base, along with some secondary considerations. The choice of base is important for a real architecture (binary is easy to implement in hardware, for instance, as quinary would probably not be), but, from a theoretical standpoint, the only critical issue is whether the base is 1 or larger.

In base 1 (in unary, that is), the value n requires n digits: basically, unary notation simply represents each object to be counted by a mark (a digit) with no other abstraction. In contrast, the value n expressed in binary requires only $\lfloor \log_2 n \rfloor + 1$ digits; in quinary, $\lfloor \log_5 n \rfloor + 1$ digits; and so on. Since we have $\log_a n = \log_a b \cdot \log_b n$, using a different base only contributes a constant factor of $\log_a b$ (unless, of course, either a or b is 1, in which case this factor is either 0 or infinity). Thus number representations in bases larger than 1 are all closely related (within a constant factor in length) and are all exponentially more concise than representation in base 1. Unless otherwise specified, we shall assume throughout that numbers are represented in some base larger than 1; typically, computer scientists use base 2. We shall use $\log n$ to denote the logarithm of n in some arbitrary (and unspecified) base larger than one; when specifically using natural logarithms, we shall use $\ln n$.

2.2 Problems, Instances, and Solutions

Since much of this text is concerned with problems and their solutions, it behooves us to examine in some detail what is meant by these terms. A problem is defined by a finite set of (finite) parameters and a question; the question typically includes a fair amount of contextual information, so as to avoid defining everything de novo. The parameters, once instantiated, define an instance of the problem. A simple example is the problem of deciding membership in a set $S_0$: the single parameter is the unknown element x, while the question asks if the input element belongs to $S_0$, where $S_0$ is defined through some suitable mechanism. This problem ("membership in $S_0$") is entirely different from the problem of membership of x in S, where both x and S are parameters. The former ("membership in $S_0$") is a special case of the latter, formed from the latter by fixing the parameter S to the specific set $S_0$; we call such special cases restrictions of the more general problem. We would expect that the more general problem is at least as hard as, and more likely much harder than, its restriction. After all, an algorithm to decide membership for the latter problem automatically decides membership for the former problem as well, whereas the reverse need not be true.

We consider a few more elaborate examples. The first is one of the most studied problems in computer science and remains a useful model for a host of applications, even though its original motivation and gender specificity

have long been obsolete: the Traveling Salesman Problem (TSP) asks us to find the least expensive way to visit each city in a given set exactly once and return to the starting point. Since each possible tour corresponds to a distinct permutation of the indices of the cities, we can define this problem formally as follows:

Instance: a number $n > 1$ of cities and a distance matrix (or cost function), $(d_{ij})$, where $d_{ij}$ is the cost of traveling from city i to city j.
Question: what is the permutation $\pi$ of the index set $\{1, 2, \ldots, n\}$ that minimizes the cost of the tour, $\sum_{i=1}^{n-1} d_{\pi(i)\pi(i+1)} + d_{\pi(n)\pi(1)}$?

A sample instance of the problem for 9 eastern cities is illustrated in Figure 2.1; the optimal tour for this instance has a length of 1790 miles and moves from Washington to Baltimore, Philadelphia, New York, Buffalo, Detroit, Cincinnati, Cleveland, and Pittsburgh, before returning to its starting point.

The second problem is known as Subset Sum and generalizes the problem of making change:

Instance: a set S of items, each associated with a natural number (its value) $v\colon S \to \mathbb{N}$, and a target value $B \in \mathbb{N}$.
Question: does there exist a subset $S' \subseteq S$ of items such that the sum of the values of the items in the subset exactly equals the target value, i.e., obeying $\sum_{x \in S'} v(x) = B$?

We can think of this problem as asking whether or not, given the collection S of coins in our pocket, we can make change for the amount B.

Perhaps surprisingly, the following is also a well-defined (and extremely famous) problem:

Question: is it the case that, for any natural number $k \geq 3$, there cannot exist a triple of natural numbers $(a, b, c)$ obeying $a^k + b^k = c^k$?

This problem has no parameters whatsoever and thus has a single instance; you will have recognized it as Fermat's conjecture, finally proved correct nearly 350 years after the French mathematician Pierre de Fermat (1601-1665) posed it. While two of the last three problems ask for yes/no answers, there is a fundamental difference between the two: Fermat's conjecture requires only one answer because it has only one instance, whereas Subset Sum, like Traveling Salesman, requires an answer that will vary from instance to instance. Thus we can speak of the answer to a particular instance, but we must distinguish that from a solution to the entire problem, except in the cases (rare in computer science, but common in mathematics) where the

problem is made of a single instance. Clearly, knowing that the answer to the instance of Subset Sum composed of three quarters, one dime, three nickels, and four pennies and asking to make change for 66 cents is "yes" does not entitle us to conclude that we have solved the problem of Subset Sum. Knowing a few more such answers will not really improve the situation: the trouble arises from the fact that Subset Sum has an infinite number of instances! In such a case, a solution cannot be anything as simple as a list of answers, since such a list could never be completed nor stored; instead, the solution must be an algorithm, which, when given any instance, prints the corresponding answer.

This discussion leads us to an alternate, if somewhat informal, view of a problem: a problem is a (possibly infinite) list of pairs, where the first member is an instance and the second the answer for that instance. A solution is then an algorithm that, when given the first member of a pair, prints the second. This informal view makes it clear that any problem with a finite number of instances has a simple solution: search the list of pairs until the given instance is found and print the matching answer. We may not have the table handy and so may not be able to run the algorithm, but we do know that such an algorithm exists; in that sense, the problem is known to be solvable efficiently. For a problem with a single instance such as Fermat's conjecture the algorithm is trivial: it is a one-line program that prints the answer. In the case of Fermat's conjecture, until the mid-1990s, the solution could have been the program "print yes" or the program "print no," but we now know that it is the former. From the point of view of computer science, only problems with an infinite (in practice, a very large) number of instances are of interest, since all others can be solved very efficiently by prior tabulation of the instance/answer pairs and by writing a trivial program to search the table.

The answer to an instance of a problem can be a single bit (yes or no), but it can also be a very elaborate structure. For instance, we could ask for a list of all possible legal colorings of a graph or of all shortest paths from one point to another in a three-dimensional space with obstacles. In practice, we shall distinguish several basic varieties of answers, in turn defining corresponding varieties of problems. When the answers are simply "yes" or "no," they can be regarded as a simple decision to accept (yes) or reject (no) the instance; we call problems with such answers decision problems. When the answer is a single structure (e.g., a path from A to B or a truth assignment that causes a proposition to assume the logical value "true" or a subset of tools that will enable us to tackle all of the required jobs), we call the corresponding problems search problems, signaling the fact that a solution algorithm must search for the correct structure to return. When

the answer is a structure that not only meets certain requirements, but also optimizes some objective function (e.g., the shortest path from A to B rather than just any path or the least expensive subset of tools that enable us to tackle all of the required jobs rather than just any sufficient subset), we call the associated problems optimization problems. When the answer is a list of all satisfactory structures (e.g., return all paths from A to B or return all shortest paths from A to B), we have an enumeration problem. Finally, when the answer is a count of such structures rather than a list (e.g., return the number of distinct paths from A to B), we have a counting problem. The same basic problem can appear in all guises: in the case of paths from A to B, for instance, we can ask if there exists a path from A to B (decision) and we have just seen search, optimization, enumeration, and counting versions of that problem. Among four of these five fundamental types of problems, we have a natural progression: the simplest version is decision; search comes next, since a solution to the search version automatically solves the decision version; next comes optimization (the best structure is certainly an acceptable structure); and hardest is enumeration (if we can list all structures, we can easily determine which is best). The counting version is somewhat apart: if we can count suitable structures, we can certainly answer the decision problem (which is equivalent to asking if the count is nonzero) and we can easily count suitable structures if we can enumerate them.

Given a problem, we can restrict its set of instances to obtain a new subproblem. For instance, given a graph problem, we can restrict instances to planar graphs or to acyclic graphs; given a set problem, we can restrict it to finite sets or to sets with an even number of elements; given a geometric problem based on line segments, we can restrict it to rectilinear segments (where all segments are aligned with the axes) or restrict it to one dimension. In our first example, we restricted the problem of set membership (Does input element x belong to input set S?) to the problem of membership in a fixed set $S_0$ (Does input element x belong to $S_0$?). Be sure to realize that a restriction alters the set of instances, but not the question; altering the question, however minutely, completely changes the problem. Clearly, if we know how to solve the general problem, we can solve the subproblem obtained by restriction. The converse, however, need not hold: the subproblem may be much easier to solve than the general problem. For instance, devising an efficient algorithm to find the farthest-neighbor pair among a collection of points in the plane is a difficult task, but the subproblem obtained by restricting the points to have the same ordinate (effectively turning the problem into a one-dimensional version) is easy to solve in linear time, since it then suffices to find the two points with the smallest and largest abscissae.
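The five varieties of problems can be illustrated on Subset Sum, using the pocket of coins mentioned earlier (three quarters, one dime, three nickels, four pennies, target 66 cents). The sketch below is only an illustration under stated assumptions: all function names are hypothetical, items are identified by their position in the list (so two quarters count as distinct items), "best" in the optimization version is assumed to mean "fewest items", and the brute-force enumeration takes time exponential in the number of items.

```python
from itertools import combinations


def enumerate_subsets(values, target):
    """Enumeration: yield every subset of item indices whose values sum to target.

    Subsets are generated in order of increasing size.
    """
    indices = range(len(values))
    for size in range(len(values) + 1):
        for subset in combinations(indices, size):
            if sum(values[i] for i in subset) == target:
                yield subset


def decide(values, target):
    """Decision: does at least one suitable subset exist?"""
    return next(enumerate_subsets(values, target), None) is not None


def search(values, target):
    """Search: return one suitable subset of indices, or None."""
    return next(enumerate_subsets(values, target), None)


def optimize(values, target):
    """Optimization (assumed objective): a suitable subset with the fewest items."""
    return min(enumerate_subsets(values, target), key=len, default=None)


def count(values, target):
    """Counting: number of suitable subsets of indices."""
    return sum(1 for _ in enumerate_subsets(values, target))


if __name__ == "__main__":
    coins = [25, 25, 25, 10, 5, 5, 5, 1, 1, 1, 1]
    print(decide(coins, 66))         # True
    print(search(coins, 66))         # one tuple of indices summing to 66
    print(len(optimize(coins, 66)))  # 5
    print(count(coins, 66))          # 48
```

Here every version is derived from the enumerator, which mirrors the progression in the text: being able to list all suitable structures is enough to decide, search, optimize, and count, whereas the reverse derivations are not available in general.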

2.3 Asymptotic Notation

In analyzing an algorithm, whether through a worst-case or an average-case (or an amortized) analysis, algorithm designers use asymptotic notation to describe the behavior of the algorithm on large instances of the problem. Asymptotic analysis ignores start-up costs (constant and lower-order overhead for setting up data structures, for instance) and concentrates instead on the growth rate of the running time (or space) of the algorithm. While asymptotic analysis has its drawbacks (what if the asymptotic behavior appears only for extremely large instances that would never arise in practice? and what if the constant factors and lower-order terms are extremely large for reasonable instance sizes?), it does provide a clean characterization of at least one essential facet of the behavior of the algorithm. Now that we are working at a higher level of abstraction, concentrating on the structure of problems rather than on the behavior of algorithmic solutions, asymptotic characterizations become even more important. We shall see, for instance, that classes of complexity used in characterizing problems are defined in terms of asymptotic worst-case running time or space. Thus we briefly review some terminology.

Asymptotic analysis aims at providing lower bounds and upper bounds on the rate of growth of functions; it accomplishes this by grouping functions with "similar" growth rate into families, thereby providing a framework within which a new function can be located and thus characterized. In working with asymptotics, we should distinguish two types of analysis: that which focuses on behavior exhibited almost everywhere (or a.e.) and that which focuses on behavior exhibited infinitely often (or i.o.). For functions of interest to us, which are mostly functions from $\mathbb{N}$ to $\mathbb{N}$, a.e. behavior is behavior that is observed on all but a finite number of function arguments; in consequence, there must exist some number $N \in \mathbb{N}$ such that the function exhibits that behavior for all arguments $n \geq N$. In contrast, i.o. behavior is observable on an infinite number of arguments (for instance, on all perfect squares); in the same spirit, we can only state that, for each number $N \in \mathbb{N}$, there exists some larger number $N' \geq N$ such that the function exhibits the desired behavior on argument $N'$.

Traditional asymptotic analysis uses a.e. behavior, as does calculus in defining the limit of a function when its argument grows unbounded. Recall that $\lim_{n \to \infty} f(n) = a$ is defined by

$$\forall \varepsilon > 0,\ \exists N > 0,\ \forall n \geq N,\ |f(n) - a| \leq \varepsilon$$

In other words, for all $\varepsilon > 0$, the value $|f(n) - a|$ is almost everywhere no larger than $\varepsilon$.
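The distinction between almost-everywhere and infinitely-often behavior can be probed numerically on a finite range. The sketch below is only an illustration: the helper names (`holds_from`, `witnesses`) and the sample predicate `outgrows` are assumptions introduced here, and a check over a finite range is evidence for an a.e. or i.o. claim, never a proof of it.

```python
from math import isqrt


def holds_from(predicate, start, stop):
    """True if predicate(n) holds for every n in [start, stop).

    Finite evidence for an almost-everywhere claim with threshold N = start.
    """
    return all(predicate(n) for n in range(start, stop))


def witnesses(predicate, stop):
    """All arguments n < stop on which the predicate holds.

    For an infinitely-often property this list keeps growing with stop,
    even though the property may fail between consecutive witnesses.
    """
    return [n for n in range(stop) if predicate(n)]


def is_perfect_square(n):
    """The i.o. example from the text: holds on 0, 1, 4, 9, ... only."""
    return isqrt(n) ** 2 == n


def outgrows(n):
    """Sample a.e. property (an assumed example): n^2 >= 10n + 5, true for all n >= 11."""
    return n * n >= 10 * n + 5


if __name__ == "__main__":
    print(witnesses(is_perfect_square, 30))      # [0, 1, 4, 9, 16, 25]
    print(holds_from(is_perfect_square, 0, 30))  # False: fails between squares
    print(holds_from(outgrows, 11, 1000))        # True: holds past the threshold
```

The sample inequality fails at n = 10 and holds from n = 11 onward, so N = 11 serves as the threshold required by the a.e. definition; no such threshold exists for the perfect-square property.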

While a.e. analysis is justified for upper bounds, a good case can be made that i.o. analysis is a better choice for lower bounds. Since most of complexity theory (where we shall make the most use of asymptotic analysis and notation) is based on a.e. analysis of worst-case behavior and since it mostly concerns upper bounds (where a.e. analysis is best), we do not pursue the issue any further and instead adopt the convention that all asymptotic analysis, unless explicitly stated otherwise, is done in terms of a.e. behavior.

Let f and g be two functions mapping the natural numbers to themselves:
* f is $O(g)$ (pronounced "big Oh" of g) if and only if there exist natural numbers N and c such that, for all $n > N$, we have $f(n) \leq c \cdot g(n)$.
* f is $\Omega(g)$ (pronounced "big Omega" of g) if and only if g is $O(f)$.
* f is $\Theta(g)$ (pronounced "big Theta" of g) if and only if f is both $O(g)$ and $\Omega(g)$.

Both $O()$ and $\Omega()$ define partial orders (reflexive, antisymmetric, and transitive), while $\Theta()$ is an equivalence relation. Since $O(g)$ is really an entire class of functions, many authors write "$f \in O(g)$" (read "f is in big Oh of g") rather than "f is $O(g)$." All three notations carry information about the growth rate of a function: big Oh gives us a (potentially reachable) upper bound, big Omega a (potentially reachable) lower bound, and big Theta an exact asymptotic characterization. In order to be useful, such characterizations keep the representative function g as simple as possible. For instance, a polynomial is represented only by its leading (highest-degree) term stripped of its coefficient. Thus writing "$2n^2 + 3n - 10$ is $O(n^2)$" expresses the fact that our polynomial grows asymptotically no faster than $n^2$, while writing "$3n^2 - 2n + 22$ is $\Omega(n^2)$" expresses the fact that our polynomial grows at least as fast as $n^2$. Naturally the bounds need not be tight; we can correctly write "$2n + 1$ is $O(2^n)$," but such a bound is so loose as to be useless. When we use the big Theta notation, we have managed to bring our upper bounds and lower bounds together, so that the characterization is tight. For instance, we can write "$3n^2 - 2n + 15$ is $\Theta(n^2)$."

Many authors and students abuse the big Oh notation and use it as both an upper bound and an exact characterization; it pays to remember that the latter is to be represented by the big Theta notation. However, note that our focus on a.e. lower bounds may prevent us from deriving a big Theta characterization of a function, even when we understand all there is to understand about this function. Consider, for instance, the running time of an algorithm that decides if a number is prime by trying as potential

divisors all numbers from 2 to the (ceiling of the) square root of the given number; pseudocode for this algorithm is given in Figure 2.2. On half of the possible instances (i.e., on the even integers), this algorithm terminates with an answer of "no" after one trial; on a third of the remaining instances, it terminates with an answer of "no" after two trials; and so forth. Yet every now and then (and infinitely often), the algorithm encounters a prime and takes on the order of √n trials to identify it as such. Ignoring the cost of arithmetic, we see that the algorithm runs in O(√n) time, but we cannot state that it takes Ω(√n) time, since there is no natural number N beyond which it will always require on the order of √n trials. Indeed, the best we can say is that the algorithm runs in Ω(1) time, which is clearly a poor lower bound. (An i.o. lower bound would have allowed us to state a bound of √n, since there is an infinite number of primes.)

    divisor := 1
    loop
      divisor := divisor + 1
      if (n mod divisor == 0) exit("no")
      if (divisor * divisor >= n) exit("yes")
    endloop

Figure 2.2 A naive program to test for primality.

When designing algorithms, we need more than just analyses: we also need goals. If we set out to improve on an existing algorithm, we want that improvement to show in the subsequent asymptotic analysis, hence our goal is to design an algorithm with a running time or space that grows asymptotically more slowly than that of the best existing algorithm. To define these goals (and also occasionally to characterize a problem), we need notation that does not accept equality:
* f is o(g) (pronounced "little Oh" of g) if and only if we have lim_{n→∞} f(n)/g(n) = 0.
* f is ω(g) (pronounced "little Omega" of g) if and only if g is o(f).

If f is o(g), then its growth rate is strictly less than that of g. If the best algorithm known for our problem runs in Θ(g) time, we may want to set ourselves the goal of designing a new algorithm that will run in o(g) time, that is, asymptotically faster than the best algorithm known.

When we define complexity classes so as to be independent of the chosen model of computation, we group together an entire range of growth rates;

a typical example is polynomial time, which groups under one name the classes O(n^a) for each a ∈ N. In such a case, we can use asymptotic notation again, but this time to denote the fact that the exponent is an arbitrary positive constant. Since an arbitrary positive constant is any member of the class O(1), polynomial time can also be defined as O(n^O(1)) time. Similarly we can define exponential growth as O(2^O(n)) and polylogarithmic growth (that is, growth bounded by log^a n for some positive constant a) as O(log^O(1) n).

2.4 Graphs

Graphs were devised in 1736 by the Swiss mathematician Leonhard Euler (1707-1783) as a model for path problems. Euler solved the celebrated "Bridges of Königsberg" problem. The city of Königsberg had some parks on the shores of a river and on islands, with a total of seven bridges joining the various parks, as illustrated in Figure 2.3. Ladies of the court allegedly asked Euler whether one could cross every bridge exactly once and return to one's starting point. Euler modeled the four parks with vertices, the seven bridges with edges, and thereby defined a (multi)graph.

Figure 2.3 The bridges of Königsberg.

A (finite) graph is a set of vertices together with a set of pairs of distinct vertices. If the pairs are unordered, the graph is said to be undirected and a pair of vertices {u, v} is called an edge, with u and v the endpoints of the edge. If the pairs are ordered, the graph is said to be directed and a pair of vertices (u, v) is called an arc, with u the tail and v the head of the arc. An undirected graph is then given by the pair G = (V, E), where V is the set of vertices and E the set of edges. A directed graph is then given by the pair G = (V, A), where V is the set of vertices and A the set of arcs. Two vertices connected by an edge or arc are said to be adjacent; an arc is said

to be incident upon its head vertex, while an edge is incident upon both its endpoints. An isolated vertex is not adjacent to any other vertex; a subset of vertices of the graph such that no vertex in the subset is adjacent to any other vertex in the subset is known as an independent set. Note that our definition of graphs allows at most one edge (or two arcs) between any two vertices, whereas Euler's model for the bridges of Königsberg had multiple edges: when multiple edges are allowed, the collection of edges is no longer a set, but a bag, and the graph is termed a multigraph.

Graphically, we represent vertices by points in the plane and edges or arcs by line (or curve) segments connecting the two points; if the graph is directed, an arc (u, v) also includes an arrowhead pointing at and touching the second vertex, v. Figure 2.4 shows examples of directed and undirected graphs. In an undirected graph, each vertex has a degree, which is the number of edges that have the vertex as one endpoint; in a directed graph, we distinguish between the outdegree of a vertex (the number of arcs, the tail of which is the given vertex) and its indegree (the number of arcs pointing to the given vertex). In the graph of Figure 2.4(a), the leftmost vertex has indegree 2 and outdegree 1, while, in the graph of Figure 2.4(b), the leftmost vertex has degree 3. An isolated vertex has degree (indegree and outdegree) equal to zero; each example in Figure 2.4 has one isolated vertex. An undirected graph is said to be regular of degree k if every vertex in the graph has degree k. If an undirected graph is regular of degree n - 1 (one less than the number of vertices), then this graph includes every possible edge between its vertices and is said to be the complete graph on n vertices, denoted K_n.

(a) a directed graph (b) an undirected graph
Figure 2.4 Examples of graphs.

A walk (or path) in a graph is a list of vertices of the graph such that there exists an arc (or edge) from each vertex in the list to the next vertex in the list. A walk may pass through the same vertex many times and may use the same arc or edge many times. A cycle (or circuit) is a walk that returns to its starting point: the first and last vertices in the list

are identical. Both graphs of Figure 2.4 have cycles. A graph without any cycle is said to be acyclic; this property is particularly important among directed graphs. A directed acyclic graph (or dag) models such common structures as precedence ordering among tasks or dependencies among program modules. A simple path is a path that does not include the same vertex more than once, with the allowed exception of the first and last vertices: if these two are the same, then the simple path is a simple cycle. A cycle that goes through each arc or edge of the graph exactly once is known as an Eulerian circuit (such a cycle was the answer sought in the problem of the bridges of Königsberg); a graph with such a cycle is an Eulerian graph. A simple cycle that includes all vertices of the graph is known as a Hamiltonian circuit; a graph with such a cycle is a Hamiltonian graph. Trivially, every complete graph is Hamiltonian.

An undirected graph in which there exists a path between any two vertices is said to be connected. The first theorem of graph theory was stated by Euler in solving the problem of the bridges of Königsberg: a connected undirected graph has an Eulerian cycle if and only if each vertex has even degree. The undirected graph of Figure 2.4(b) is not connected but can be partitioned into three (maximal) connected components. The same property applied to a directed graph defines a strongly connected graph. The requirements are now stronger, since the undirected graph can use the same path in either direction between two vertices, whereas the directed graph may have two entirely distinct paths for the two directions. The directed graph of Figure 2.4(a) is not strongly connected but can be partitioned into two strongly connected components, one composed of the isolated vertex and the other of the remaining six vertices.

A tree is a connected acyclic graph; an immediate consequence of this definition is that a tree on n vertices has exactly n - 1 edges.

Exercise 2.1 Prove this last statement.

It also follows that a tree is a minimally connected graph: removing any edge breaks the tree into two connected components. Given a connected graph, a spanning tree for the graph is a subset of edges of the graph that forms a tree on the vertices of the graph. Figure 2.5 shows a graph and one of its spanning trees.

Many questions about graphs revolve around the relationship between edges and their endpoints. A vertex cover for a graph is a subset of vertices such that every edge has at least one endpoint in the cover; similarly, an edge cover is a subset of edges such that every vertex of the graph is the endpoint of an edge in the cover. A legal vertex coloring of a graph is an assignment of

colors to the vertices of the graph such that no edge has identically colored endpoints; the smallest number of colors needed to produce a legal vertex coloring is known as the chromatic number of a graph.

Exercise 2.2 Prove that the chromatic number of K_n equals n and that the chromatic number of any tree with at least two vertices is 2.

Similarly, a legal edge coloring is an assignment of colors to edges such that no vertex is the endpoint of two identically colored edges; the smallest number of colors needed to produce a legal edge coloring is known as the chromatic index of the graph. In a legal vertex coloring, each subset of vertices of the same color forms an independent set. In particular, if a graph has a chromatic number of two or less, it is said to be bipartite: its set of vertices can be partitioned into two subsets (corresponding to the two colors), each of which is an independent set. (Viewed differently, all edges of a bipartite graph have one endpoint in one subset of the partition and the other endpoint in the other subset.) A bipartite graph is often given explicitly by the partition of its vertices, say {U, V}, and its set of edges and is thus written G = ({U, V}, E). A bipartite graph with 2n vertices that can be partitioned into two subsets of n vertices each and that has a maximum number (n^2) of edges is known as a complete bipartite graph on 2n vertices and denoted K_{n,n}.

(a) the graph (b) a spanning tree
Figure 2.5 A graph and one of its spanning trees.

A matching in an undirected graph is a subset of edges of the graph such that no two edges of the subset share an endpoint; a maximum matching is a matching of the largest possible size (such a matching need not be unique). If the matching includes every vertex of the graph (which must then have an even number of vertices), it is called a perfect matching. In the minimum-cost matching problem, edges are assigned costs; we then seek the maximum matching that minimizes the sum of the costs of the selected edges. When the graph is bipartite, we can view the vertices on one side as

men, the vertices on the other side as women, and the edges as defining the compatibility relationship "this man and this woman are willing to marry each other." The maximum matching problem is then generally called the marriage problem, since each selected edge can be viewed as a couple to be married. A different interpretation has the vertices on one side representing individuals and those on the other side representing committees formed from these individuals; an edge denotes the relation "this individual sits on that committee." A matching can then be viewed as a selection of a distinct individual to represent each committee. (While an individual may sit on several committees, the matching requires that an individual may represent at most one committee.) In this interpretation, the problem is known as finding a Set of Distinct Representatives. If costs are assigned to the edges of the bipartite graph, the problem is often interpreted as being made of a set of tasks (the vertices on one side) and a set of workers (the vertices on the other side), with the edges denoting the relation "this task can be accomplished by that worker." The minimum-cost matching in this setting is called the Assignment problem. Exercises at the end of this chapter address some basic properties of these various types of matching.

Two graphs are isomorphic if there exists a bijection between their vertices that maps an edge of one graph onto an edge of the other. Figure 2.6 shows three graphs; the first two are isomorphic (find a suitable mapping of vertices), but neither is isomorphic to the third. Isomorphism defines an equivalence relation on the set of all graphs.

Figure 2.6 Isomorphic and nonisomorphic graphs.

A graph G' is a homeomorphic subgraph of a graph G if it can be obtained from a subgraph of G by successive removals of vertices of degree 2, where each pair of edges leading to the two neighbors of each deleted vertex is replaced by a single edge in G' connecting the two neighbors directly (unless that edge already exists). Entire chains of vertices may be removed, with the obvious cascading of the edge-replacement mechanism. Figure 2.7 shows a graph and one of its homeomorphic subgraphs. The subgraph was obtained by removing a

single vertex; the resulting edge was not part of the original graph and so was added to the homeomorphic subgraph. A graph is said to be planar if it can be drawn in the plane without any crossing of its edges. An algorithm due to Hopcroft and Tarjan [1974] can test a graph for planarity in linear time and produce a planar drawing if one exists. A famous theorem due to Kuratowski [1930] states that every nonplanar graph contains a homeomorphic copy of either the complete graph on five vertices, K_5, or the complete bipartite graph on six vertices, K_{3,3}.

Figure 2.7 A graph and a homeomorphic subgraph.

2.5 Alphabets, Strings, and Languages

An alphabet is a finite set of symbols (or characters). We shall typically denote an alphabet by Σ and its symbols by lowercase English letters towards the beginning of the alphabet, e.g., Σ = {a, b, c, d}. Of special interest to us is the binary alphabet, Σ = {0, 1}. A string is defined over an alphabet as a finite ordered list of symbols drawn from the alphabet. For example, the following are strings over the alphabet {0, 1}: 001010, 00, 1, and so on. We often denote a string by a lowercase English character, usually one at the end of the alphabet; for instance, we may write x = 001001 or y = aabca. The length of a string x is denoted |x|; for instance, we have |x| = |001001| = 6 and |y| = |aabca| = 5. The special empty string, which has zero symbols and zero length, is denoted ε. The universe of all strings over the alphabet Σ is denoted Σ*. For specific alphabets, we use the star operator directly on the alphabet set; for instance, {0, 1}* is the set of all binary strings, {0, 1}* = {ε, 0, 1, 00, 01, 10, 11, 000, ...}. To denote the set of all strings of length k over Σ, we use the notation Σ^k; for instance, {0, 1}^2 is the set {00, 01, 10, 11} and, for any alphabet Σ, we have Σ^0 = {ε}. In particular, we can also write Σ* = ∪_{k∈N} Σ^k. We define Σ+ to be the set of all non-null strings over Σ; we can write Σ+ = Σ* - {ε} = ∪_{k∈N, k≠0} Σ^k.
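As a small illustration of the notation just introduced, the following Python sketch enumerates Σ^k and a finite portion of Σ*. The function names strings_of_length and strings_up_to are ours, chosen only for this example, and the ordering (by length, then lexicographically) is one convenient convention rather than something the definitions require.

```python
from itertools import product


def strings_of_length(alphabet, k):
    """Return Sigma^k: every string of length k over the alphabet."""
    symbols = sorted(alphabet)
    return ["".join(choice) for choice in product(symbols, repeat=k)]


def strings_up_to(alphabet, max_len):
    """Return the union of Sigma^k for k = 0, 1, ..., max_len.

    Sigma* itself is infinite, so only a finite portion can be listed.
    """
    result = []
    for k in range(max_len + 1):
        result.extend(strings_of_length(alphabet, k))
    return result


if __name__ == "__main__":
    # Sigma^0 contains only the empty string, printed here as ''.
    print(strings_of_length({"0", "1"}, 0))
    print(strings_of_length({"0", "1"}, 2))
    print(strings_up_to({"0", "1"}, 3))
```

Dropping the first element of strings_up_to (the empty string) yields the corresponding finite portion of Σ+.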

The main operation on strings is concatenation. Concatenating string x and string y yields a new string z = xy where, if we let x = a_1 a_2 ... a_n and y = b_1 b_2 ... b_m, then we get z = a_1 a_2 ... a_n b_1 b_2 ... b_m. The length of the resulting string, |xy|, is the sum of the lengths of the two operand strings, |x| + |y|. Concatenation with the empty string does not alter a string: for any string x, we have xε = εx = x. If some string w can be written as the concatenation of two strings x and y, w = xy, then we say that x is a prefix of w and y is a suffix of w. More generally, if some string w can be written as the concatenation of three strings, w = xyz, then we say that y (and also x and z) is a substring of w. Any of the substrings involved in the concatenation can be empty; thus, in particular, any string is a substring of itself, is a prefix of itself, and is a suffix of itself. If we have a string x = a_1 a_2 ... a_n, then any string of the form a_{i_1} a_{i_2} ... a_{i_k}, where we have k ≤ n and i_j < i_{j+1}, is a subsequence of x. Unlike a substring, which is a consecutive run of symbols occurring within the original string, a subsequence is just a sampling of symbols from the string as that string is read from left to right. For instance, if we have x = aabbacbbabacc, then aaaaa and abc are both subsequences of x, but neither is a substring of x. Finally, if x = a_1 a_2 ... a_n is a string, then we denote its reverse, a_n ... a_2 a_1, by x^R; a string that is its own reverse, x = x^R, is a palindrome.

A language L over the alphabet Σ is a subset of Σ*, L ⊆ Σ*; that is, a language is a set of strings over the given alphabet. A language may be empty: L = ∅. Do not confuse the empty language, which contains no strings whatsoever, with the language that consists only of the empty string, L = {ε}; the latter is not an empty set. The key question we may ask concerning languages is the same as that concerning sets, namely membership: given some string x, we may want to know whether x belongs to L. To settle this question for large numbers of strings, we need an algorithm that computes the characteristic function of the set L, that is, an algorithm that returns 1 when the string is in the set and 0 otherwise. Formally, we write c_L for the characteristic function of the set L, with c_L: Σ* → {0, 1} such that c_L(x) = 1 holds if and only if x is an element of L. Other questions of interest about languages concern the result of simple set operations (such as union and intersection) on one or more languages.

These questions are trivially settled when the language is finite and specified by a list of its members. Asking whether some string w belongs to some language L is then a simple matter of scanning the list of the members of L for an occurrence of w. However, most languages with which we work are defined implicitly, through some logical predicate, by a statement of the form {x | x has property P}. The predicate mechanism allows us to define

infinite sets (which clearly cannot be explicitly listed!), such as the language

    L = {x ∈ {0, 1}* | x ends with a single 0}

It also allows us to provide concise definitions for large, complex, yet finite sets (which could be listed only at great expense), such as the language

    L = {x ∈ {0, 1}* | x = x^R and |x| ≤ 10,000}

When a language is defined by a predicate, deciding membership in that language can be difficult, or at least very time-consuming. Consider, for instance, the language

    L = {x | considered as a binary-coded natural number, x is a prime}

The obvious test for membership in L (attempt to divide by successive values up to the square root, as illustrated in Figure 2.2) would run in time proportional to 2^(|x|/2) whenever the number is prime.

2.6 Functions and Infinite Sets

A function is a mapping that associates with each element of one set, the domain, an element of another set, the co-domain. A function f with domain A and co-domain B is written f: A → B. The set of all elements of B that are associated with some element of A is the range of the function, denoted f(A). If the range of a function equals its co-domain, f(A) = B, the function is said to be surjective (also onto). If the function maps distinct elements of its domain to distinct elements in its range, (x ≠ y) ⇒ (f(x) ≠ f(y)), the function is said to be injective (also one-to-one). A function that is both injective and surjective is said to be bijective, sometimes called a one-to-one correspondence. Generally, the inverse of a function is not well defined, since several elements in the domain can be mapped to the same element in the range. An injective function has a well-defined inverse, since, for each element in the range, there is a unique element in the domain with which it was associated. However, that inverse is a function from the range to the domain, not from the co-domain to the domain, since it may not be defined on all elements of the co-domain. A bijection, on the other hand, has a well-defined inverse from its co-domain to its domain, f^-1: B → A (see Exercise 2.33).

How do we compare the sizes of sets? For finite sets, we can simply count the number of elements in each set and compare the values. If two finite sets have the same size, say n, then there exists a simple bijection between the two (actually, there exist n! bijections, but one suffices): just map the first element of one set onto the first element of the other and, in general, the ith element of one onto the ith element of the other. Unfortunately, the counting idea fails with infinite sets: we cannot directly "count" how many elements they have. However, the notion that two sets are of the same size whenever there exists a bijection between the two remains applicable. As a simple example, consider the two sets N = {1, 2, 3, 4, ...} (the set of the natural numbers) and E = {2, 4, 6, 8, ...} (the set of the even numbers). There is a very natural bijection that puts the number n ∈ N into correspondence with the even number 2n ∈ E. Hence these two infinite sets are of the same size, even though one (the natural numbers) appears to be twice as large as the other (the even numbers).

Example 2.1 We can illustrate this correspondence through the following tale. A Swiss hotelier runs the Infinite Hotel at a famous resort and boasts that the hotel can accommodate an infinite number of guests. The hotel has an infinite number of rooms, numbered starting from 1. On a busy day in the holiday season, the hotel is full (each of the infinite number of rooms is occupied), but the manager states that any new guest is welcome: all current guests will be asked to move down by one room (from room i to room i + 1), then the new guest will be assigned room 1. In fact, the manager accommodates that night an infinite number of new guests: all current guests are asked to move from their current room (say room i) to a room with twice the number (room 2i), after which the (infinite number of) new guests are assigned the (infinite number of) odd-numbered rooms.

We say that the natural numbers form a countably infinite set; after all, these are the very numbers that we use for counting! Thus a set is countable if it is finite or if it is countably infinite. We denote the cardinality of N by ℵ_0 (aleph, ℵ, is the first letter of the Hebrew alphabet [1]; ℵ_0 is usually pronounced as "aleph nought").

[1] A major problem of theoreticians everywhere is notation. Mathematicians in particular are forever running out of symbols. Thus having exhausted the lower- and upper-case letters (with and without subscripts and superscripts) of the Roman and Greek alphabets, they turned to alphabets a bit farther afield. However, Cantor, a deeply religious man, used the Hebrew alphabet to represent infinities for religious reasons.

In view of our example, we have ℵ_0 + ℵ_0 = ℵ_0. If we let O denote the set of odd integers, then we have shown that N, E, and O all have cardinality ℵ_0; yet we also have N = E ∪ O, with

E ∩ O = ∅ (that is, E and O form a partition of N) and thus |N| = |E| + |O|, yielding the desired result.

More interesting yet is to consider the set of all (positive) rational numbers, that is, all numbers that can be expressed as a fraction. We claim that this set is also countably infinite, even though it appears to be much "larger" than the set of natural numbers. We arrange the rational numbers in a table where the ith row of the table lists all fractions with a numerator of i and the jth column lists all fractions with a denominator of j. This arrangement is illustrated in Figure 2.8. Strictly speaking, the same rational number appears infinitely often in the table; for instance, all diagonal elements equal one. This redundancy does not impair our following argument, since we show that, even with all these repetitions, the rational numbers can be placed in a one-to-one correspondence with the natural numbers. (Furthermore, these repetitions can be removed if so desired: see Exercise 2.35.)

           1     2     3     4     5    ...
     1    1/1   1/2   1/3   1/4   1/5   ...
     2    2/1   2/2   2/3   2/4   2/5   ...
     3    3/1   3/2   3/3   3/4   3/5   ...
     4    4/1   4/2   4/3   4/4   4/5   ...
     5    5/1   5/2   5/3   5/4   5/5   ...
    ...

Figure 2.8 Placing rational numbers into one-to-one correspondence with natural numbers.

The idea is very simple: since we cannot enumerate one row or one column at a time (we would immediately "use up" all our natural numbers in enumerating one row or one column; or, if you prefer to view it that way, such an enumeration would never terminate, so that we would never get around to enumerating the next row or column), we shall use a process known as dovetailing, which consists of enumerating the first element of the first row, followed by the second element of the first row and the first of the second row, followed by the third element of the first row, the second of the second row, and the first of the third row, and so on. Graphically, we use the backwards diagonals in the table, one after the other; each successive diagonal starts enumerating a new row while enumerating the next element

of all rows started so far, thereby never getting trapped in a single row or column, yet eventually covering all elements in all rows. This process is illustrated in Figure 2.9.

           1     2     3     4
     1     1     2     4     7
     2     3     5     8
     3     6     9
     4    10

Figure 2.9 A graphical view of dovetailing.

We can define the induced bijection in strict terms. Consider the fraction in the ith row and jth column of the table. It sits in the (i + j - 1)st back diagonal, so that all back diagonals before it must have already been listed, a total of 1 + 2 + ... + (i + j - 2) = (i + j - 1)(i + j - 2)/2 elements. Moreover, it is the ith element enumerated in the (i + j - 1)st back diagonal, so that its index is f(i, j) = (i + j - 1)(i + j - 2)/2 + i. In other words, our bijection maps the pair (i, j) to the value (i^2 + j^2 + 2ij - i - 3j + 2)/2. Conversely, if we know that the index of an element is k, we can determine what fraction defines this element as follows. We have k = i + (i + j - 1)(i + j - 2)/2, or 2k - 2 = (i + j)(i + j - 1) - 2j. Let l be the least integer with l(l - 1) > 2k - 2; then we have l = i + j and 2j = l(l - 1) - (2k - 2), which gives us i and j.

Exercise 2.3 Use this bijection to propose a solution to a new problem that just arose at the Infinite Hotel: the hotel is full, yet an infinite number of tour buses just pulled up, each loaded with an infinite number of tourists, all asking for rooms. How will our Swiss manager accommodate all of these new guests and keep all the current ones?

Since our table effectively defines Q (the rational numbers) to be Q = N × N, it follows that we have ℵ_0 · ℵ_0 = ℵ_0. Basically, the cardinality of the natural numbers acts in the arithmetic of infinite cardinals much like 0 and 1 in the arithmetic of finite numbers.

Exercise 2.4 Verify that the function defined below is a bijection between the positive, nonzero rational numbers and the nonzero natural numbers,

and define a procedure to reverse it:

    f(1) = 1
    f(2n) = f(n) + 1
    f(2n + 1) = 1/f(2n)

2.7 Pairing Functions

A pairing function is a bijection between N × N and N that is also strictly monotone in each of its arguments. If we let p: N × N → N be a pairing function, then we require:
* p is a bijection: it is both one-to-one (injective) and onto (surjective).
* p is strictly monotone in each argument: for all x, y ∈ N, we have both p(x, y) < p(x + 1, y) and p(x, y) < p(x, y + 1).

We shall denote an arbitrary pairing function p(x, y) with pointed brackets as ⟨x, y⟩. Given some pairing function, we need a way to reverse it and to recover x and y from ⟨x, y⟩; we thus need two functions, one to recover each argument. We call these two functions projections and write them as Π_1(z) and Π_2(z). Thus, by definition, if we have z = ⟨x, y⟩, then we also have Π_1(z) = x and Π_2(z) = y.

At this point, we should stop and ask ourselves whether pairing and projection functions actually exist! In fact, we have seen the answer to that question. Our dovetailing process to enumerate the positive rationals was a bijection between N × N and N. Moreover, the item in row i, column j of our infinite table was enumerated before the item in row i + 1, column j and before the item in row i, column j + 1, so that the bijection was monotone in each argument. Finally, we had seen how to reverse the enumeration. Thus our dovetailing process defined a pairing function.

Perhaps a more interesting pairing function is as follows: z = ⟨x, y⟩ = 2^x(2y + 1) - 1. We can easily verify that this function is bijective and monotone in each argument; it is reversed simply by factoring z + 1 into a power of two and an odd factor.

Exercise 2.5 Consider the new pairing function given by

    ⟨x, y⟩ = x + (y + ⌊(x + 1)/2⌋)^2

Verify that it is a pairing function and can be reversed with Π_1(z) = z - ⌊√z⌋^2 and Π_2(z) = ⌊√z⌋ - ⌊(Π_1(z) + 1)/2⌋.
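Both pairing functions discussed so far are easy to experiment with. The following Python sketch is an illustration only; the names dovetail_index, dovetail_inverse, pair, and unpair are ours. It assumes, as in the text, that the dovetailing table is indexed from 1, and that the pairing 2^x(2y + 1) - 1 takes arguments starting from 0.

```python
def dovetail_index(i, j):
    """Index of the entry in row i, column j (both counted from 1)."""
    d = i + j - 1                  # number of the back diagonal
    return d * (d - 1) // 2 + i    # (i + j - 1)(i + j - 2)/2 + i


def dovetail_inverse(k):
    """Recover (i, j) from the index k >= 1 using the least-l rule."""
    l = 2
    while l * (l - 1) <= 2 * k - 2:   # find least l with l(l - 1) > 2k - 2
        l += 1
    j = (l * (l - 1) - (2 * k - 2)) // 2
    return l - j, j


def pair(x, y):
    """The pairing <x, y> = 2^x (2y + 1) - 1, for x, y >= 0."""
    return (1 << x) * (2 * y + 1) - 1


def unpair(z):
    """Reverse pair by factoring z + 1 into a power of two and an odd factor."""
    n = z + 1
    x = 0
    while n % 2 == 0:
        n //= 2
        x += 1
    return x, (n - 1) // 2


if __name__ == "__main__":
    # First three back diagonals of the table, in enumeration order.
    cells = [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)]
    print([dovetail_index(i, j) for i, j in cells])
    print([unpair(z) for z in range(6)])
```

The two components of unpair play the role of the projections Π_1 and Π_2 for this particular pairing function.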

So pairing functions abound! However, not every bijection is a valid pairing function; for instance, the bijection of Exercise 2.4 is not monotonic in each argument and hence not a pairing function. As we just saw, a pairing function is at the heart of a dovetailing process. In computability and complexity theory, we are often forced to resort to dovetailing along 3, 4, or even more "dimensions," not just 2 as we used before. For the purpose of dovetailing along two dimensions, a pairing function works well. For three or more dimensions, we need pairing functions that pair 3, 4, or more natural numbers together. Fortunately, a moment of thought reveals that we can define such "higher" pairing functions recursively, by using two-dimensional pairing functions as a base case. Formally, we define the function ⟨., ..., .⟩_n, which pairs n natural numbers (and thus is a bijection between N^n and N that is strictly monotone in each argument) recursively as follows:

    ⟨ ⟩_0 = 0 and ⟨x⟩_1 = x    (trivial cases)
    ⟨x, y⟩_2 = ⟨x, y⟩    (normal base case)
    ⟨x_1, ..., x_(n-2), x_(n-1), x_n⟩_n = ⟨x_1, ..., x_(n-2), ⟨x_(n-1), x_n⟩⟩_(n-1)    (inductive step)

where we have used our two-dimensional pairing function to reduce the number of arguments in the inductive step.

For these general pairing functions, we need matching general projections. We would like to define a function Π(i, n, z), which basically takes z to be the result of a pairing of n natural numbers and then returns the ith of these. We need the n as argument, since we now have pairing functions of various arity, so that, when given just z, we cannot tell if it is the result of pairing 2, 3, 4, or more items. Perhaps even more intriguing is the fact that we can always project back from z as if it had been the pairing of n items even when it was produced differently (by pairing m < n or k > n items). Formally, we define Π(i, n, z), which we shall normally write as Π_i^n(z), recursively as follows:

    Π_i^0(z) = 0 and Π_i^1(z) = z, for all i    (trivial cases)
    Π_i^2(z) = Π_1(z) if i ≤ 1;  Π_2(z) if i > 1    (normal base cases)
    Π_i^n(z) = Π_i^(n-1)(z) if i < n - 1;  Π_1(Π_(n-1)^(n-1)(z)) if i = n - 1;  Π_2(Π_(n-1)^(n-1)(z)) if i ≥ n    (inductive step)
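The recursive definitions above translate almost line for line into code. The sketch below is a hedged illustration, not a definitive implementation: it assumes the particular two-dimensional pairing ⟨x, y⟩ = 2^x(2y + 1) - 1 as the base case and assumes that the inductive step folds the last two arguments together; the names pair_n and project are ours.

```python
def pair(x, y):
    """Assumed base pairing: <x, y> = 2^x (2y + 1) - 1."""
    return (1 << x) * (2 * y + 1) - 1


def unpair(z):
    """Projections of the base pairing, returned as (Pi_1(z), Pi_2(z))."""
    n = z + 1
    x = 0
    while n % 2 == 0:
        n //= 2
        x += 1
    return x, (n - 1) // 2


def pair_n(*xs):
    """Pair n natural numbers into one, following the recursive definition."""
    n = len(xs)
    if n == 0:
        return 0                      # trivial case
    if n == 1:
        return xs[0]                  # trivial case
    if n == 2:
        return pair(xs[0], xs[1])     # normal base case
    # Inductive step: replace the last two arguments by their pairing.
    return pair_n(*xs[:-2], pair(xs[-2], xs[-1]))


def project(i, n, z):
    """Return the ith of n components of z; out-of-range i is clamped."""
    if n == 0:
        return 0
    if n == 1:
        return z
    if n == 2:
        return unpair(z)[0] if i <= 1 else unpair(z)[1]
    if i < n - 1:
        return project(i, n - 1, z)
    last = project(n - 1, n - 1, z)   # holds the pairing of the last two items
    return unpair(last)[0] if i == n - 1 else unpair(last)[1]


if __name__ == "__main__":
    z = pair_n(3, 1, 4, 1, 5)
    print(z, [project(i, 5, z) for i in range(1, 6)])
```

Because project never checks how z was produced, it can be applied with any n, which matches the remark that we can always project back from z as if it had been the pairing of n items.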

Exercise 2.6 Verify that our definition of the projection functions is correct. Keep in mind that the notation $\Pi_i^n(z)$ allows argument triples that do not correspond to any valid pairing, such as $\Pi_{10}^{5}(z)$, in which case we are at liberty to define the result in any convenient way. □

Exercise 2.7 Prove the following simple properties of our pairing and projection functions:
1. $\langle 0, \ldots, 0 \rangle_n = 0$ and $\langle x, 0 \rangle = \langle x, 0, \ldots, 0 \rangle_n$ for any $n \ge 2$ and any x.
2. $\Pi_0^n(z) = \Pi_1^n(z)$ and $\Pi_i^n(z) = \Pi_n^n(z)$ for all n, all z, and all $i \ge n$.
3. $\Pi_i^n(z) \le z$ for all n and all z. □

When we need to enumerate all possible k-tuples of natural numbers (or arguments from countably infinite sets), we simply enumerate the natural numbers, 0, 1, 2, 3, ..., and consider i to be the pairing of k natural numbers, $i = \langle x_1, x_2, \ldots, x_k \rangle_k$; hence the ith number gives us the k arguments, $x_1 = \Pi_1^k(i)$, ..., $x_k = \Pi_k^k(i)$. Conversely, whenever we need to handle a finite number k of arguments, each taken from a countably infinite set, we can just pair the k arguments together to form a single argument. In particular, this k-tuple pairing will allow us to define theories based on one- or two-argument functions and then extend them to arbitrary numbers of arguments by considering one argument to be the pairing of k > 1 "actual" arguments. If we do not know in advance the arity of the tuple (the value of k), we can encode it as part of the pairing: if we need to pair together k values, $x_1, \ldots, x_k$, where k may vary, we begin by pairing the k values together, $\langle x_1, \ldots, x_k \rangle_k$, then pair the result with k itself, to obtain finally $z = \langle k, \langle x_1, \ldots, x_k \rangle_k \rangle$; then we have $k = \Pi_1(z)$ and $x_i = \Pi_i^k(\Pi_2(z))$.

2.8 Cantor's Proof: The Technique of Diagonalization

We now show that no bijection can exist between the natural numbers and the real numbers. Specifically, we will show the somewhat stronger result that no bijection can exist between the natural numbers and the real numbers whose integer part is 0, i.e., those in the semi-open interval [0, 1). This result was first proved in 1873 by Cantor, and provoked quite a furor in the world of mathematics. The proof technique is known as diagonalization, for reasons that will become obvious as we develop the proof. Essentially, such a proof is a proof by contradiction, but with a built-in enumeration/construction process in the middle.

Let us then assume that a bijection exists and attempt to derive a contradiction. If a bijection exists, we can use it to list in order (1, 2, 3, ...) all the real numbers in the interval [0, 1). All such real numbers are of the form 0.wxyz...; that is, they have a zero integer part followed by an infinite decimal expansion. Our list can be written in tabular form as shown in Figure 2.10, where $d_{ij}$ is the jth decimal in the expansion of the ith real number in the list.

    1    0.d_{11} d_{12} d_{13} d_{14} d_{15} ...
    2    0.d_{21} d_{22} d_{23} d_{24} d_{25} ...
    3    0.d_{31} d_{32} d_{33} d_{34} d_{35} ...
    4    0.d_{41} d_{42} d_{43} d_{44} d_{45} ...
    5    0.d_{51} d_{52} d_{53} d_{54} d_{55} ...
    ...

Figure 2.10  The hypothetical list of reals in the range [0, 1).

Now we shall construct a new real number, $0 \le x < 1$, which cannot, by construction, be in the list, yielding the desired contradiction. Our new number x will be the number $0.d_1 d_2 d_3 d_4 d_5 \ldots$, where we set $d_i = (d_{ii} + 1) \bmod 10$, so that $d_i$ is a valid digit and we have ensured $d_i \ne d_{ii}$. Thus x differs from the first number in the list in its first decimal, from the second number in its second decimal, and, in general, from the ith number in the list in its ith decimal, and so cannot be in the list. Yet it should be, because it belongs to the interval [0, 1); hence we have the desired contradiction. We have constructed x by moving down the diagonal of the hypothetical table, hence the name of the technique.

A minor technical problem could arise if we obtained x = 0.9999..., because then we actually would have x = 1 and x would not be in the interval [0, 1), escaping the contradiction. However, obtaining such an x would require that each $d_{ii}$ equal 8, which is absurd, because the interval [0, 1) clearly contains numbers that have no decimal equal to 8, such as the number 0.1. In fact, this problem is due to the ambiguity of decimal notation and is not limited to 0.9999...: any number with a finite decimal expansion (or, alternatively viewed, with a decimal period of 0) has an "identical twin," where the last decimal is decreased by one and, in compensation, a repeating period of 9 is added after this changed decimal. Thus x = 0.1 is the same as y = 0.0999.... This ambiguity gives rise to another minor technical problem with our diagonalization: what if the number x constructed through diagonalization, while not in the table, had its "identical twin" in the table? There would then be no contradiction, since the table would indeed list the number x, even if not in the particular form in which we generated it. However, note that, in order to generate such a number (either member of the twin pair), all diagonal elements in the table beyond some fixed index must equal 8 (or all must equal 9, depending on the chosen twin); but our table would then contain only a finite number of entries that do not use the digit 8 (or the digit 9), which is clearly false, since there clearly exist infinitely many real numbers in [0, 1) that contain neither digit (8 or 9) in their decimal expansion.

A good exercise is to consider why we could not have used exactly the same technique to "prove" that no bijection is possible between the natural numbers and the rational numbers in the interval [0, 1). The reasoning can proceed as above until we get to the construction of x. The problem is that we have no proof that the x as constructed is a bona fide rational number: not all numbers that can be written with an infinite decimal expansion are rational. The rational numbers share the feature that their decimal expansion has a repeating period, while any number with a nonrepeating expansion is irrational. This defect provides a way to escape the contradiction: the existence of x does not cause a contradiction because it need not be in the enumeration, not being rational itself. A proof by diagonalization thus relies on two key pieces: (i) the element constructed must be in the set, and yet (ii) it cannot be in the enumeration.

So the real numbers form an uncountable set; the cardinality of the real numbers is larger than that of the natural numbers (it is sometimes called the cardinality of the continuum).

2.9 Implications for Computability

The set of all programs is countable: it is effectively the set of all strings over, say, the ASCII alphabet. (This set includes illegal strings that do not obey the syntax of the language, but this has only the effect of making our claim stronger.) However, the set of all functions from $\mathbb{N}$ to {0, 1} is easily shown to be uncountable by repeating Cantor's argument.[2] This set is simply a way to view $2^{\mathbb{N}}$, the set of all subsets of $\mathbb{N}$, since 0/1-valued functions can be regarded as characteristic functions of sets; it can also be viewed as the set of all decision problems.

[2] Actually, Cantor had proved the stronger result $|S| < |2^S|$ for any nonempty set S, a result that directly implies that 0/1-valued functions are uncountable.

This time, we write on each successive line the next function in the claimed enumeration; each function can be written as an infinite list of 0s and 1s, where the ith element in the list for f is just f(i). Denoting the jth function in the list by $f_j$, we obtain the scheme of Figure 2.11.

           1        2        3        4        5      ...
    f_1  f_1(1)   f_1(2)   f_1(3)   f_1(4)   f_1(5)   ...
    f_2  f_2(1)   f_2(2)   f_2(3)   f_2(4)   f_2(5)   ...
    f_3  f_3(1)   f_3(2)   f_3(3)   f_3(4)   f_3(5)   ...
    f_4  f_4(1)   f_4(2)   f_4(3)   f_4(4)   f_4(5)   ...
    f_5  f_5(1)   f_5(2)   f_5(3)   f_5(4)   f_5(5)   ...
    ...

Figure 2.11  Cantor's argument applied to functions.

Now we use the diagonal to construct a new function that cannot be in the enumeration: recalling that $f_j(i)$ is either 0 or 1, we define our new function as $f'(n) = 1 - f_n(n)$. (In other words, we switch the values along the diagonal.) The same line of reasoning as in Cantor's argument now applies, allowing us to conclude that the set of 0/1-valued functions (or the set of subsets of $\mathbb{N}$) is uncountable.

Since the number of programs is countable and the number of 0/1-valued functions (and, a fortiori, the number of integer-valued functions) is uncountable, there are many functions (a "large" infinity of them, in fact) for which no solution program can exist. Hence most functions are not computable! This result, if nothing else, motivates our study of computability and computation models. Among the questions we may want to ask are:

* Do we care that most problems are unsolvable? After all, it may well be that none of the unsolvable problems is of any interest to us.
* We shall see that unsolvable problems actually arise in practice. (The prototype of the unsolvable problem is the "halting problem": is there an algorithm that, given any input program and any input data, determines whether the input program eventually halts when run on the input data? This is surely the most basic property that we may want to test about programs, a prerequisite to any correctness proof.) That being the case, what can we say about unsolvable problems? Are they characterizable in some general way?
* How hard to solve are specific instances of unsolvable problems? This question may seem strange, but many of us regularly solve instances of unsolvable problems; for instance, many of us regularly determine whether or not some specific program halts under some specific input.

* Are all solvable problems easy to solve? Of course not, so what can we say about their difficulty? (Not surprisingly, it will turn out that most solvable problems are intractable, that is, cannot be solved efficiently.)

2.10 Exercises

Exercise 2.8 Give a formal definition (question and instance description) for each of the following problems:
1. Binpacking: minimize the number of bins (all of the same size) used in packing a collection of items (each with its own size); sizes are simply natural numbers (that is, the problem is one-dimensional).
2. Satisfiability: find a way, if any, to satisfy a Boolean formula; that is, find a truth assignment to the variables of the formula that makes the formula evaluate to the logical value "true."
3. Ascending Subsequence: find the longest ascending subsequence in a string of numbers; a sequence is ascending if each element of the subsequence is strictly larger than the previous.

Exercise 2.9 Consider the following variations of the problem known as Set Cover, in which you are given a set and a collection of subsets of the set and asked to find a cover (a subcollection that includes all elements of the set) for the set with certain properties.
1. Find the smallest cover (i.e., one with the smallest number of subsets) for the set.
2. Find the smallest cover for the set, given that all subsets have exactly three elements each.
3. Find the smallest cover for the set, subject to all subsets in the cover being disjoint.
4. Find a cover of size n for the set, given that the set has 3n elements and that all subsets have exactly three elements each.
Which variation is a subproblem of another?

Exercise 2.10 The Minimum Test Set problem is given by a collection of classes $\mathcal{C} = \{C_1, \ldots, C_n\}$ and a collection of binary-valued tests $\mathcal{T} = \{T_1, \ldots, T_k\}$. Each test can be viewed as a subset of the collection of classes, $T_i \subseteq \mathcal{C}$: those classes where the test returns a positive answer. (A typical application is in the microbiology laboratory of a hospital, where a battery of tests must be designed to identify cultures.)

The problem is to return a minimum subcollection $\mathcal{T}' \subseteq \mathcal{T}$ of tests that provides the same discrimination as the original collection, i.e., such that any pair separated by some test in the original collection is also separated by some test in the subcollection. Rephrase this problem as a set cover problem (refer to Exercise 2.9).

Exercise 2.11 Prove that, if f is O(g) and g is O(h), then f is O(h).

Exercise 2.12 Verify that $3^n$ is not $O(2^n)$ but that it is $O(2^{\alpha n})$ for some suitable constant $\alpha > 1$. Do these results hold if we replace n by $n^k$ for some fixed natural number k > 1?

Exercise 2.13 Verify that $n^n$ is not $O(n!)$ but that $\log(n^n)$ is $O(\log(n!))$; similarly verify that $n^{\alpha \log n}$ is not $O(n^{\log n})$ for any $\alpha > 1$ but that $\log(n^{\alpha \log n})$ is $O(\log(n^{\log n}))$ for all $\alpha > 1$.

Exercise 2.14 Derive identities for the O( ) behavior of the sum and the product of functions; that is, knowing the O( ) behavior of functions f and g, what can you state about the O( ) behaviors of $h_1(x) = f(x) + g(x)$ and $h_2(x) = f(x) \cdot g(x)$?

Exercise 2.15 Asymptotic behavior can also be characterized by ratios. An alternate definition of $\Theta(\,)$ is given as follows: f is $\Theta(g)$ whenever
\[
\lim_{n \to \infty} \frac{f(n)}{g(n)} = c
\]
holds for some constant c > 0. Is this definition equivalent to ours?

Exercise 2.16 Prove that the number of nodes of odd degree in an undirected graph is always even.

Exercise 2.17* Prove Euler's result. One direction is trivial: if a vertex has odd degree, no Eulerian circuit can exist. To prove the other direction, consider moving along some arbitrary circuit that does not reuse edges, then remove its edges from the graph and use induction.

Exercise 2.18 Verify that a strongly connected graph has a single circuit (not necessarily simple) that includes every vertex.

Exercise 2.19 The complement of an undirected graph G = (V, E) is the graph $\overline{G} = (V, \overline{E})$ in which two vertices are connected by an edge if and only if they are not so connected in G. A graph is self-complementary if it is isomorphic to its complement. Prove that the number of vertices of a self-complementary graph must be a multiple of 4 or a multiple of 4 plus 1.

Exercise 2.20* A celebrated theorem of Euler's can be stated as follows in two-dimensional geometry: if G = (V, E) is a connected planar graph, then the number of regions in any planar embedding of G is |E| - |V| + 2. (Any planar graph partitions the plane into regions, or faces; a region is a contiguous area of the plane bordered by a cycle of the graph and not containing any other region.) Prove this result by using induction on the number of edges of G.

Exercise 2.21 Use the result of Exercise 2.20 to prove that the number of edges in a connected planar graph of at least three vertices cannot exceed 3|V| - 6.

Exercise 2.22 Use the result of Exercise 2.21 to prove that the complement of a planar graph with at least eleven vertices must be nonplanar. Is eleven the smallest value with that property?

Exercise 2.23 Prove that a tree is a critically connected acyclic graph in the sense that (i) adding any edge to a tree causes a cycle and (ii) removing any edge from a tree disconnects the graph.

Exercise 2.24 Verify that, if G = (V, E) is a tree, then the sum of the degrees of its vertices is $\sum_{i=1}^{|V|} d_i = 2(|V| - 1)$. Now prove that the converse is true, namely that, given n natural numbers, $\{d_i \mid i = 1, \ldots, n\}$, with $d_i \ge 1$ for all i and $\sum_{i=1}^{n} d_i = 2(n - 1)$, there exists a tree of n vertices where the ith vertex has degree $d_i$.

Exercise 2.25 A collection of trees is called (what else?) a forest. Prove that every tree has at least one vertex, the removal of which creates a forest where no single tree includes more than half of the vertices of the original tree.

Exercise 2.26 Devise a linear-time algorithm to find a maximum matching in a (free) tree.

Exercise 2.27* The problem of the Set of Distinct Representatives (SDR) is given by a bipartite graph, G = ({U, V}, E), where U is the set of individuals and V the set of committees, with $|V| \le |U|$; we desire a matching of size |V|. A celebrated result known as Hall's theorem states that such a matching (an SDR) exists if and only if, for each collection, $X \subseteq V$, of committees, the number of distinct individuals making up these committees is at least as large as the number of committees in the collection; that is, if and only if the following inequality holds for all collections X:
\[
|\{u \in U \mid \exists v \in X,\ \{u, v\} \in E\}| \ge |X|.
\]

Prove this result; only the sufficiency part needs a proof, since the necessity of the condition is obvious; use induction on the size of X. (In the formulation in which we just gave it, this theorem may be more properly ascribed to König and Hall.)

Exercise 2.28* A vertex cover for an undirected graph is a subset of vertices such that every edge of the graph has at least one endpoint in the subset. Prove the König-Egerváry theorem: in a bipartite graph, the size of a maximum matching equals the size of a minimum cover.

Exercise 2.29 The term rank of a matrix is the maximum number of nonzero entries, no two in the same row or column. Verify that the following is an alternate formulation of the König-Egerváry theorem (see Exercise 2.28): the term rank of a matrix equals the minimum number of rows and columns that contain all of the nonzero entries of the matrix.

Exercise 2.30 A string of parentheses is balanced if and only if each left parenthesis has a matching right parenthesis and the substring of parentheses enclosed by the pair is itself balanced. Assign the value 1 to each left parenthesis and the value -1 to each right parenthesis. Now replace each value by the sum of all the values to its left, including itself (an operation known as prefix sum). Prove that a string of parentheses is balanced if and only if every value in the prefix sum is nonnegative and the last value is zero.

Exercise 2.31 How many distinct surjective (onto) functions are there from a set of m elements to a set of n elements (assuming $m \ge n$)? How many injective functions (assuming now $m \le n$)?

Exercise 2.32 A derangement of the set {1, ..., n} is a permutation $\pi$ of the set such that, for any i in the set, we have $\pi(i) \ne i$. How many derangements are there for a set of size n? (Hint: write a recurrence relation.)

Exercise 2.33 Given a function $f: S \to T$, an inverse for f is a function $g: T \to S$ such that $f \cdot g$ is the identity on T and $g \cdot f$ is the identity on S. We denote the inverse of f by $f^{-1}$. Verify the following assertions:
1. If f has an inverse, it is unique.
2. A function has an inverse if and only if it is a bijection.
3. If f and g are two bijections and $h = f \cdot g$ is their composition, then the inverse of h is given by $h^{-1} = (f \cdot g)^{-1} = g^{-1} \cdot f^{-1}$.

Exercise 2.34 Prove that, at any party with at least two people, there must be two individuals who know the same number of people present at the party. (It is assumed that the relation "a knows b" is symmetric.)

Exercise 2.35 Design a bijection between the rational numbers and the natural numbers that avoids the repetitions of the mapping of Figure 2.8.

Exercise 2.36 How would you pair rational numbers; that is, how would you define a pairing function $p: \mathbb{Q} \times \mathbb{Q} \to \mathbb{Q}$?

Exercise 2.37 Compare the three pairing functions defined in the text in terms of their computational complexity. How efficiently can each pairing function and its associated projection functions be computed? Give a formal asymptotic analysis.

Exercise 2.38* Devise a new (a fourth) pairing function of your own with its associated projection functions.

Exercise 2.39 Consider again the bijection of Exercise 2.4. Although it is not a pairing function, show that it can be used for dovetailing.

Exercise 2.40 Would diagonalization work with a finite set? Describe how or discuss why not.

Exercise 2.41 Prove Cantor's original result: for any nonempty set S (whether finite or infinite), the cardinality of S is strictly less than that of its power set, $2^{|S|}$. You need to show that there exists an injective map from S to its power set, but that no such map exists from the power set to S, the latter through diagonalization. (A proof appears in Section A.3.4.)

Exercise 2.42 Verify that the union, intersection, and Cartesian product of two countable sets are themselves countable.

Exercise 2.43 Let S be a finite set and T a countable set. Is the set of all functions from S to T countable?

Exercise 2.44 Show that the set of all polynomials in the single variable x with integer coefficients is countable. Such polynomials are of the form $\sum_{i=0}^{n} a_i x^i$, for some natural number n and integers $a_i$, i = 0, ..., n. (Hint: use induction on the degree of the polynomials. Polynomials of degree zero are just the set $\mathbb{Z}$; each higher degree can be handled by one more application of dovetailing.)

Exercise 2.45 (Refer to the previous exercise.) Is the set of all polynomials in the two variables x and y with integer coefficients countable? Is the set of all polynomials (with any finite number of variables) with integer coefficients countable?

2.11 Bibliography

A large number of texts on discrete mathematics for computer science have appeared over the last fifteen years; any of them will cover most of the material in this chapter. Examples include Rosen [1988], Gersting [1993], and Epp [1995]. A more complete coverage may be found in the outstanding text of Sahni [1981]. Many texts on algorithms include a discussion of the nature of problems; Moret and Shapiro [1991] devote their first chapter to such a discussion, with numerous examples. Graphs are the subject of many texts and monographs; the text of Bondy and Murty [1976] is a particularly good introduction to graph theory, while that of Gibbons [1985] offers a more algorithmic perspective. While not required for an understanding of complexity theory, a solid grounding in the design and analysis of algorithms will help the reader appreciate the results; Moret and Shapiro [1991] and Brassard and Bratley [1996] are good references on the topic. Dovetailing and pairing functions were introduced early in this century by mathematicians interested in computability theory; we use them throughout this text, so that the reader will see many more examples. Diagonalization is a fundamental proof technique in all areas of theory, particularly in computer science; the reader will see many uses throughout this text, beginning with Chapter 5.

CHAPTER 3
Finite Automata and Regular Languages

3.1 Introduction

3.1.1 States and Automata

A finite-state machine or finite automaton (the noun comes from the Greek; the singular is "automaton," the Greek-derived plural is "automata," although "automatons" is considered acceptable in modern English) is a limited, mechanistic model of computation. Its main focus is the notion of state. This is a notion with which we are all familiar from interaction with many different controllers, such as elevators, ovens, stereo systems, and so on. All of these systems (but most obviously one like the elevator) can be in one of a fixed number of states. For instance, the elevator can be on any one of the floors, with doors open or closed, or it can be moving between floors; in addition, it may have pending requests to move to certain floors, generated from inside (by passengers) or from outside (by would-be passengers). The current state of the system entirely dictates what the system does next, something we can easily observe on very simple systems such as single elevators or microwave ovens. To a degree, of course, every machine ever made by man is a finite-state system; however, when the number of states grows large, the finite-state model ceases to be appropriate, simply because it defies comprehension by its users, namely humans. In particular, while a computer is certainly a finite-state system (its memory and registers can store either a 1 or a 0 in each of the bits, giving rise to a fixed number of states), the number of states is so large (a machine with 32 Mbytes of memory has on the order of $10^{80,000,000}$ states; a mathematician from the intuitionist school would flatly deny that this is a "finite" number!) that it is altogether unreasonable to consider it to be a finite-state machine.

However, the finite-state model works well for logic circuit design (arithmetic and logic units, buffers, I/O handlers, etc.) and for certain programming utilities (such well-known Unix tools as lex, grep, awk, and others, including the pattern-matching tools of editors, are directly based on finite automata), where the number of states remains small.

Informally, a finite automaton is characterized by a finite set of states and a transition function that dictates how the automaton moves from one state to another. At this level of characterization, we can introduce a graphical representation of the finite automaton, in which states are represented as disks and transitions between states as arcs between the disks. The starting state (the state in which the automaton begins processing) is identified by a tail-less arc pointing to it; in Figure 3.1(a), this state is $q_1$. The input can be regarded as a string that is processed symbol by symbol from left to right, each symbol inducing a transition before being discarded. Graphically, we label each transition with the symbol or symbols that cause it to happen. Figure 3.1(b) shows an automaton with input alphabet {0, 1}. The automaton stops when the input string has been completely processed; thus on an input string of n symbols, the automaton goes through exactly n transitions before stopping.

Figure 3.1  Informal finite automata. (a) an informal finite automaton; (b) a finite automaton with state transitions.

More formally, a finite automaton is a four-tuple, made of an alphabet, a set of states, a distinguished starting state, and a transition function. In the example of Figure 3.1(b), the alphabet is $\Sigma = \{0, 1\}$; the set of states is $Q = \{q_1, q_2, q_3\}$; the start state is $q_1$; and the transition function $\delta$, which uses the current state and current input symbol to determine the next state, is given by the table of Figure 3.2. Note that $\delta$ is not defined for every possible input pair: if the machine is in state $q_2$ and the current input symbol is 1, then the machine stops in error.
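The four-tuple view lends itself to a direct table-driven simulation. The sketch below encodes the transition table of Figure 3.2 as we read it from the text (the pair of state q2 and symbol 1 is deliberately left undefined); the names `DELTA` and `run` are hypothetical, and returning `None` is simply one way to model "the machine stops in error."

```python
# Transition table as read from Figure 3.2; (q2, "1") has no entry on purpose.
DELTA = {
    ("q1", "0"): "q2", ("q1", "1"): "q2",
    ("q2", "0"): "q3",
    ("q3", "0"): "q3", ("q3", "1"): "q2",
}


def run(word: str, start: str = "q1"):
    """Return the state reached after reading word, or None on an undefined move."""
    state = start
    for symbol in word:              # one transition per input symbol, left to right
        state = DELTA.get((state, symbol))
        if state is None:            # no transition defined: stop in error
            return None
    return state


final_state = run("0010")   # q1 -> q2 -> q3 -> q2 -> q3
stuck = run("01")           # q1 -> q2, then no move on 1
```

On an input of n symbols that never hits the undefined pair, the loop performs exactly n table lookups, mirroring the n transitions described in the text.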

          0     1
    q1    q2    q2
    q2    q3
    q3    q3    q2

Figure 3.2  The transition function for the automaton of Figure 3.1(b).

As defined, a finite automaton processes an input string but does not produce anything. We could define an automaton that produces a symbol from some output alphabet at each transition or in each state, thus producing a transducer, an automaton that transforms an input string on the input alphabet into an output string on the output alphabet. Such transducers are called sequential machines by computer engineers (or, more specifically, Moore machines when the output is produced in each state and Mealy machines when the output is produced at each transition) and are used extensively in designing logic circuits. In software, similar transducers are implemented for various string-handling tasks (lex, grep, and sed, to name but a few, are all utilities based on finite-state transducers). We shall instead remain at the simpler level of language membership, where the transducers compute maps from $\Sigma^*$ to {0, 1} rather than to $\Delta^*$ for some output alphabet $\Delta$. The results we shall obtain in this simpler framework are easier to derive yet extend easily to the more general framework.

3.1.2 Finite Automata as Language Acceptors

Finite automata can be used to recognize languages, i.e., to implement functions $f: \Sigma^* \to \{0, 1\}$. The finite automaton decides whether the string is in the language with the help of a label (the value of the function) assigned to each of its states: when the finite automaton stops in some state q, the label of q gives the value of the function. In the case of language acceptance, there are only two labels: 0 and 1, or "reject" and "accept." Thus we can view the set of states of a finite automaton used for language recognition as partitioned into two subsets, the rejecting states and the accepting states. Graphically, we distinguish the accepting states by double circles, as shown in Figure 3.3. This finite automaton has two states, one accepting and one rejecting; its input alphabet is {0, 1}; it can easily be seen to accept every string with an even (possibly zero) number of 1s. Since the initial state is accepting, this automaton accepts the empty string.
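A two-state acceptor of this kind is easy to simulate. The following sketch is an illustration rather than a transcription of the figure: the state names "even" and "odd" and the function name `accepts_even_ones` are hypothetical, and the returned Boolean plays the role of the label (accept or reject) of the state in which the automaton stops.

```python
def accepts_even_ones(word: str) -> bool:
    """Accept exactly the strings over {0, 1} with an even number of 1s."""
    state = "even"                   # start state, which is also the accepting state
    for symbol in word:
        if symbol == "1":            # a 1 flips the parity; a 0 leaves it unchanged
            state = "odd" if state == "even" else "even"
        elif symbol != "0":
            raise ValueError("the input alphabet is {0, 1}")
    return state == "even"           # label of the state reached


results = [accepts_even_ones(w) for w in ["", "0110", "1", "10101"]]
```

The empty string is accepted because no transition is taken and the automaton stops in its accepting start state.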

Figure 3.3  An automaton that accepts strings with an even number of 1s.

Figure 3.4  Some simple finite automata. (a) a finite automaton that accepts $\{\varepsilon\}$; (b) a finite automaton that accepts $\{0, 1\}^+$.

As further examples, the automaton of Figure 3.4(a) accepts only the empty string, while that of Figure 3.4(b) accepts everything except the empty string. This last construction may suggest that, in order to accept the complement of a language, it suffices to "flip" the labels assigned to states, turning rejecting states into accepting ones and vice versa.

Exercise 3.1 Decide whether this idea works in all cases. □

A more complex example of finite automaton is illustrated in Figure 3.5. It accepts all strings with an equal number of 0s and 1s such that, in any prefix of an accepted string, the number of 0s and the number of 1s differ by at most one. The bottom right-hand state is a trap: once the automaton has entered this state, it cannot leave it.

Figure 3.5  A more complex finite automaton.

This particular trap is a rejecting state; the automaton of Figure 3.4(b) had an accepting trap.

We are now ready to give a formal definition of a finite automaton.

Definition 3.1 A deterministic finite automaton is a five-tuple, $(\Sigma, Q, q_0, F, \delta)$, where $\Sigma$ is the input alphabet, Q the set of states, $q_0 \in Q$ the start state, $F \subseteq Q$ the final states, and $\delta: Q \times \Sigma \to Q$ the transition function. □

Our choice of the formalism for the transition function actually makes the automaton deterministic, conforming to the examples seen so far. Nondeterministic automata can also be defined; we shall look at this distinction shortly.

Moving from a finite automaton to a description of the language that it accepts is not always easy, but it is always possible. The reverse direction is more complex because there are many languages that a finite automaton cannot recognize. Later we shall see a formal proof of the fact, along with an exact characterization of those languages that can be accepted by a finite automaton; for now, let us just look at some simple examples.

Consider first the language of all strings that end with 0. In designing this automaton, we can think of its having two states: when it starts or after it has seen a 1, it has made no progress towards acceptance; on the other hand, after seeing a 0 it is ready to accept. The result is depicted in Figure 3.6.

Figure 3.6  An automaton that accepts all strings ending with a 0.

Consider now the set of all strings that, viewed as natural numbers in unsigned binary notation, represent numbers divisible by 5. The key here is to realize that division in binary is a very simple operation with only two possible results (1 or 0); our automaton will mimic the longhand division by 5 (101 in binary), using its states to denote the current value of the remainder. Leading 0s are irrelevant and eliminated in the start state (call it A); since this state corresponds to a remainder of 0 (i.e., an exact division by 5), it is an accepting state. Then consider the next bit, a 1 by assumption. If the input stopped at this point, we would have an input value and thus also a remainder of 1; call the state corresponding to a remainder of 1 state B, a rejecting state. Now, if the next bit is a 1, the input (and also remainder) so far is 11, so we move to a state (call it C) corresponding to a remainder of 3; if the next bit is a 0, the input (and also remainder) is 10, so we move to a state (call it D) corresponding to a remainder of 2.
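The remainder-tracking idea can be sketched as a five-tuple in the sense of Definition 3.1. The code below is an illustration under assumptions: states are named by the remainder they stand for (so 0, 1, 3, 2 play the roles of the states called A, B, C, D in the text), and the transition rule uses the arithmetic fact that appending a bit b to a binary number with remainder r yields remainder (2r + b) mod 5. The names `SIGMA`, `Q`, `Q0`, `F`, `DELTA`, and `accepts` are hypothetical.

```python
SIGMA = {"0", "1"}                    # input alphabet
Q = {0, 1, 2, 3, 4}                   # one state per possible remainder modulo 5
Q0 = 0                                # start state: remainder 0
F = {0}                               # accepting states: exact division by 5
# Appending bit b doubles the value read so far and adds b.
DELTA = {(r, b): (2 * r + int(b)) % 5 for r in Q for b in SIGMA}


def accepts(word: str) -> bool:
    """Run the automaton on word and report whether it stops in a final state."""
    state = Q0
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in F


checks = [accepts("101"), accepts("1010"), accepts("11"), accepts("0000")]
```

Because the transition table is total (ten entries, one per state and symbol), this automaton never stops in error; leading 0s simply keep it in the start state.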

From state D, an input of 0 gives us a current remainder of 100, so we move to a state (call it E) corresponding to a remainder of 4; an input of 1, on the other hand, gives us a remainder of 101, which is the same as no remainder at all, so we move back to state A. Moves from states C and E are handled similarly. The resulting finite automaton is depicted in Figure 3.7.

Figure 3.7  An automaton that accepts multiples of 5.

3.1.3 Determinism and Nondeterminism

In all of the fully worked examples of finite automata given earlier, there was exactly one transition out of each state for each possible input symbol. That such must be the case is implied in our formal definition: the transition function $\delta$ is well defined. However, in our first example of transitions (Figure 3.2), we looked at an automaton where the transition function remained undefined for one combination of current state and current input, that is, where the transition function $\delta$ did not map every element of its domain. Such transition functions are occasionally useful; when the automaton reaches a configuration in which no transition is defined, the standard convention is to assume that the automaton "aborts" its operation and rejects its input string. (In particular, a rejecting trap has no defined transitions at all.) In a more confusing vein, what if, in some state, there had been two or more different transitions for the same input symbol? Again, our formal definition precludes this possibility, since $\delta(q_i, a)$ can have only one value in Q; however, once again, such an extension to our mechanism often proves useful. The presence of multiple valid transitions leads to a certain amount of uncertainty as to what the finite automaton will do and thus, potentially, as to what it will accept. We define a finite automaton to be deterministic if and only if, for each combination of state and input symbol, it has at most one transition. A finite automaton that
it E) corresponding to a remainder of 4.

allows multiple transitions for the same combination of state and input symbol will be termed nondeterministic.

Nondeterminism is a common occurrence in the worlds of particle physics and of computers. It is a standard consequence of concurrency: when multiple systems interact, the timing vagaries at each site create an inherent unpredictability regarding the interactions among these systems. While the operating system designer regards such nondeterminism as both a boon (extra flexibility) and a bane (it cannot be allowed to lead to different outcomes, a catastrophe known in computer science as indeterminacy, and so must be suitably controlled), the theoretician is simply concerned with suitably defining under what circumstances a nondeterministic machine can be termed to have accepted its input. The key to understanding the convention adopted by theoreticians regarding nondeterministic finite automata (and other nondeterministic machines) is to realize that nondeterminism induces a tree of possible computations for each input string, rather than the single line of computation observed in a deterministic machine. The branching of the tree corresponds to the several possible transitions available to the machine at that stage of computation. Each of the possible computations eventually terminates (after exactly n transitions, as observed earlier) at a leaf of the computation tree. A stylized computation tree is illustrated in Figure 3.8. In some of these computations, the machine may accept its input; in others, it may reject it, even though it is the same input.

Figure 3.8 A stylized computation tree (labels: branching point, leaf).

We can easily dispose of computation trees where all leaves correspond to accepting states: the input can be defined as accepted; we can equally easily dispose of computation trees where all leaves correspond to rejecting states: the input can be defined as rejected. What we need to address is those computation trees where some computation paths lead to acceptance and others to rejection; the convention adopted by the
(evidently optimistic) theory community is that such mixed trees also result in acceptance of the input. This convention leads us to define a general finite automaton.

Definition 3.2 A nondeterministic finite automaton is a five-tuple, (Σ, Q, q0, F, δ), where Σ is the input alphabet, Q the set of states, q0 ∈ Q the start state, F ⊆ Q the final states, and δ: Q × Σ → 2^Q the transition function.

Note the change from our definition of a deterministic finite automaton: the transition function now maps Q × Σ to 2^Q, the set of all subsets of Q, rather than just into Q itself. This change allows transition functions that map state/character pairs to zero, one, or more next states. We say that a finite automaton is deterministic whenever we have |δ(q, a)| ≤ 1 for all q ∈ Q and a ∈ Σ.

Using our new definition, we say that a nondeterministic machine accepts its input whenever there is a sequence of choices in its transitions that will allow it to do so. We can also think of there being a separate deterministic machine for each path in the computation tree, in which case there need be only one deterministic machine that accepts a string for the nondeterministic machine to accept that string. Finally, we can also view a nondeterministic machine as a perfect guesser: whenever faced with a choice of transitions, it always chooses one that will allow it to accept the input, assuming any such transition is available; if such is not the case, it chooses any of the transitions, since all will lead to rejection.

Figure 3.9 An example of the use of nondeterminism.

Consider the nondeterministic finite automaton of Figure 3.9, which accepts all strings that contain one of three possible substrings: 000, 111, or 1100. The computation tree on the input string 01011000 is depicted in Figure 3.10. (The paths marked with an asterisk denote paths where the automaton is stuck in a state because it had no transition available.) There are two accepting paths out of ten, corresponding to the detection of the substrings 000 and 1100. The nondeterministic finite automaton thus accepts 01011000 because there is at least one way (here two) for it to do
so. For instance, it can decide to stay in state A when reading the first three symbols, then guess that the next 1 is the start of a substring 1100 or 111 and thus move to state D. In that state, it guesses that the next 1 indicates the substring 1100 rather than 111 and thus moves to state B rather than E. From state B, it has no choice left to make and correctly ends in accepting state F when all of the input has been processed. We can view its behavior as checking the sequence of guesses (left, left, left, right, left, -, -, -) in the computation tree. (That the tree nodes have at most two children each is peculiar to this automaton; in general, a node in the tree can have up to |Q| children, one for each possible choice of next state.)

Figure 3.10 The computation tree for the automaton of Figure 3.9 on input string 01011000.

When exploiting nondeterminism, we should consider the idea of choice. The strength of a nondeterministic finite automaton resides in its ability to choose with perfect accuracy under the rules of nondeterminism. For example, consider the set of all strings that end in either 100 or in 001. The deterministic automaton has to consider both types of strings and so uses states to keep track of the possibilities that arise from either suffix or various substrings thereof. The nondeterministic automaton can simply guess which ending the string will have and proceed to verify the guess; since there are two possible guesses, there are two verification paths. The nondeterministic automaton just "gobbles up" symbols until it guesses that there are only three symbols left, at which point it also guesses which ending the string will have and proceeds to verify that guess, as shown in Figure 3.11. Of course, with all these choices, there are many guesses that
lead to a rejecting state (guess that there are three remaining symbols when there are more, or fewer, left, or guess the wrong ending), but the string will be accepted as long as there is one accepting path for it.

Figure 3.11 Checking guesses with nondeterminism.

However, this accurate guessing must obey the rules of nondeterminism: the machine cannot simply guess that it should accept the string or guess that it should reject it, something that would lead to the automaton illustrated in Figure 3.12. In fact, this automaton accepts Σ*, because it is possible for it to accept any string and thus, in view of the rules of nondeterminism, it must then do so.

Figure 3.12 A nondeterministic finite automaton that simply guesses whether to accept or reject.

3.1.4 Checking vs. Computing

A better way to view nondeterminism is to realize that the nondeterministic automaton need only verify a simple guess to establish that the string is in the language, whereas the deterministic automaton must painstakingly process the string, keeping information about the various pieces that contribute to membership. This guessing model makes it clear that nondeterminism allows a machine to make efficient decisions whenever a series of guesses leads rapidly to a conclusion. As we shall see later (when talking about complexity), this aspect is very important. Consider the simple example of
deciding whether a string has a specific character occurring 10 positions from the end. A nondeterministic automaton can simply guess which is the tenth position from the end of the string and check that (i) the desired character occurs there and (ii) there are indeed exactly 9 more characters left in the string. In contrast, a deterministic automaton must keep track in its finite-state control of a "window" of 9 consecutive input characters, a requirement that leads to a very large number of states and a complex transition function. The simple guess of a position within the input string changes the scope of the task drastically: verifying the guess is quite easy, whereas a direct computation of the answer is quite tedious.

In other words, nondeterminism is about guessing and checking: the machine guesses both the answer and the path that will lead to it, then follows that path, verifying its guess in the process. In contrast, determinism is just straightforward computing: no shortcut is available, so the machine simply crunches through whatever has to be done to derive an answer. Hence the question (which we tackle for finite automata in the next section) of whether or not nondeterministic machines are more powerful than deterministic ones is really a question of whether verifying answers is easier than computing them. In the context of mathematics, the (correct) guess is the proof itself! We thus gain a new perspective on Hilbert's program: we can indeed write a proof-checking machine, but any such machine will efficiently verify certain types of proofs and not others. Many problems have easily verifiable proofs (for instance, it is easy to check a proof that a Boolean formula is satisfiable if the proof is a purported satisfying truth assignment), but many others do not appear to have any concise or easily checkable proof. Consider for instance the question of whether or not White, at chess, has a forced win (a question for which we do not know the answer). What would it take for someone to convince you that the answer is "yes"? Basically, it would appear that verifying the answer, in this case, is just as hard as deriving it.

Thus, depending on the context (such as the type of machines involved or the resource bounds specified), verifying may be easier than or just as hard as solving; often, we do not know which is the correct statement. The most famous (and arguably the most important) open question in computer science, "Is P equal to NP?" (about which we shall have a great deal to say in Chapters 6 and beyond), is one such question. We shall soon see that nondeterminism does not add power to finite automata: whatever a nondeterministic automaton can do can also be done by a (generally much larger) deterministic finite automaton; the attraction of nondeterministic finite automata resides in their relative simplicity.
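Two of the machines discussed in this section can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names are hypothetical, the first function encodes the divisibility-by-5 automaton by storing the remainder directly instead of naming states A through E, and the second simulates the "guess the tenth position from the end" automaton by tracking every state that some computation path could be in.

```python
def accepts_multiple_of_5(bits: str) -> bool:
    """Deterministic automaton whose state is the remainder modulo 5."""
    remainder = 0                      # start state (remainder 0, accepting)
    for bit in bits:
        # Reading one more bit doubles the value seen so far and adds the bit.
        remainder = (2 * remainder + int(bit)) % 5
    return remainder == 0


def nfa_accepts_kth_from_end(s: str, target: str = "1", k: int = 10) -> bool:
    """Nondeterministic automaton that guesses the k-th position from the end.

    State 0 loops on every symbol; on `target` it may also guess that this
    symbol is the k-th from the end and move to state 1. States 1..k-1 count
    the remaining symbols; state k accepts and has no transitions.
    """
    current = {0}                      # states reachable along some path
    for ch in s:
        following = set()
        for q in current:
            if q == 0:
                following.add(0)       # keep reading without guessing
                if ch == target:
                    following.add(1)   # guess: this is the k-th from the end
            elif q < k:
                following.add(q + 1)   # verify: one more symbol consumed
            # q == k has no transition, so that computation path is stuck
        current = following
    return k in current                # accept if some path ends in state k


print(accepts_multiple_of_5("1111"))                # 15 in binary
print(nfa_accepts_kth_from_end("1" + "0" * 9))
```

The nondeterministic machine sketched here needs only k + 1 states; the set `current` is what a deterministic simulation has to remember in its place.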

3.2 Properties of Finite Automata

3.2.1 Equivalence of Finite Automata

We see from their definition that nondeterministic finite automata include deterministic ones as a special case: the case where the number of transitions defined for each pair of current state and current input symbol never exceeds one. Thus any language that can be accepted by a deterministic finite automaton can be accepted by a nondeterministic one (the same machine). What about the converse? Are nondeterministic finite automata more powerful than deterministic ones? Clearly there are problems for which a nondeterministic automaton will require fewer states than a deterministic one, but that is a question of resources, not an absolute question of potential.

We settle the question in the negative: nondeterministic finite automata are no more powerful than deterministic ones. Our proof is a simulation: given an arbitrary nondeterministic finite automaton, we construct a deterministic one that mimics the behavior of the nondeterministic machine. In particular, the deterministic machine uses its state to keep track of all of the possible states in which the nondeterministic machine could find itself after reading the same string.

Theorem 3.1 For every nondeterministic finite automaton, there exists an equivalent deterministic finite automaton (i.e., one that accepts the same language).

Proof. Let the nondeterministic finite automaton be given by the five-tuple (Σ, Q, q0, F, δ). We construct an equivalent deterministic automaton (Σ', Q', q0', F', δ') as follows:
* Σ' = Σ
* Q' = 2^Q
* F' = {s ∈ Q' | s ∩ F ≠ ∅}
* q0' = {q0}
The key idea is to define one state of the deterministic machine for each possible combination of states of the nondeterministic one; hence the 2^|Q| possible states of the equivalent deterministic machine. In that way, there is a unique state for the deterministic machine, no matter how many computation paths exist at the same step for the nondeterministic machine. In order to define δ', we recall that the purpose of the simulation is to keep track, in the state of the deterministic machine, of all computation paths of
the nondeterministic one. Let the machines be at some step in their computation where the next input symbol is a. If the nondeterministic machine can be in any of states q_i1, q_i2, ..., q_ik at that step (so that the corresponding deterministic machine is then in state {q_i1, q_i2, ..., q_ik}), then it can move to any of the states contained in the sets δ(q_i1, a), δ(q_i2, a), ..., δ(q_ik, a), so that the corresponding deterministic machine moves to state

δ'({q_i1, q_i2, ..., q_ik}, a) = ⋃_{j=1}^{k} δ(q_ij, a)

Since the nondeterministic machine accepts if any computation path leads to acceptance, the deterministic machine must accept if it ends in a state that includes any of the final states of the nondeterministic machine; hence our definition of F'. It is clear that our constructed deterministic finite automaton accepts exactly the same strings as those accepted by the given nondeterministic finite automaton. Q.E.D.

Example 3.1 Consider the nondeterministic finite automaton given by
Σ = {0, 1}, Q = {a, b}, F = {a}, q0 = a,
δ: δ(a, 0) = {a, b}   δ(b, 0) = {b}
   δ(a, 1) = {b}      δ(b, 1) = {a}
and illustrated in Figure 3.13(a).

Figure 3.13 A nondeterministic automaton and an equivalent deterministic finite automaton: (a) the nondeterministic finite automaton; (b) the equivalent deterministic finite automaton.

The corresponding deterministic finite automaton is given by
Σ' = {0, 1}, Q' = {∅, {a}, {b}, {a, b}}, F' = {{a}, {a, b}}, q0' = {a},
δ': δ'(∅, 0) = ∅            δ'(∅, 1) = ∅
    δ'({a}, 0) = {a, b}     δ'({a}, 1) = {b}
    δ'({b}, 0) = {b}        δ'({b}, 1) = {a}
    δ'({a, b}, 0) = {a, b}  δ'({a, b}, 1) = {a, b}
and illustrated in Figure 3.13(b) (note that state ∅ is unreachable).

Thus the conversion of a nondeterministic automaton to a deterministic one creates a machine, the states of which are all the subsets of the set of states of the nondeterministic automaton. The conversion takes a nondeterministic automaton with n states and creates a deterministic automaton with 2^n states, an exponential increase. However, as we saw briefly, many of these states may be useless, because they are unreachable from the start state; in particular, the empty state is unreachable when every state has at least one transition. In general, the conversion may create any number of unreachable states, as shown in Figure 3.14, where five of the eight states are unreachable. When generating a deterministic automaton from a given nondeterministic one, we can avoid generating unreachable states by using an iterative approach based on reachability: begin with the initial state of the nondeterministic automaton and proceed outward to those states reachable by the nondeterministic automaton. This process will generate only useful states (states reachable from the start state) and so may be considerably more efficient than the brute-force generation of all subsets.

Figure 3.14 A conversion that creates many unreachable states.
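The reachability-based conversion can be sketched as a worklist procedure. The code below is a minimal illustration, not a definitive implementation: the function name `determinize`, the dictionary encoding of δ, and the use of `frozenset` for subset states are all our own assumptions. It is run on the automaton of Example 3.1.

```python
from collections import deque


def determinize(alphabet, delta, start, finals):
    """Subset construction that generates only reachable subsets.

    `delta` maps (state, symbol) to a set of next states; a missing key
    means that no transition is defined for that pair.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}                 # subsets discovered so far
    queue = deque([start_set])         # subsets whose moves are still unknown
    while queue:
        subset = queue.popleft()
        for a in alphabet:
            # Union of the nondeterministic moves of every member state.
            target = frozenset(
                r for q in subset for r in delta.get((q, a), set())
            )
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset accepts if it contains at least one original final state.
    dfa_finals = {s for s in seen if s & set(finals)}
    return seen, dfa_delta, start_set, dfa_finals


# The nondeterministic automaton of Example 3.1.
delta = {
    ("a", "0"): {"a", "b"}, ("a", "1"): {"b"},
    ("b", "0"): {"b"},      ("b", "1"): {"a"},
}
states, dfa_delta, q0, finals = determinize("01", delta, "a", {"a"})
print(sorted(sorted(s) for s in states))
```

On this input the procedure never generates the empty subset, in agreement with the remark that state ∅ is unreachable in Figure 3.13(b).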

3.2.2 ε Transitions

An ε transition is a transition that does not use any input, a "spontaneous" transition: the automaton simply "decides" to change states without reading any symbol. Such a transition makes sense only in a nondeterministic automaton: in a deterministic automaton, an ε transition from state A to state B would have to be the single transition out of A (any other transition would induce a nondeterministic choice), so that we could merge state A and state B, simply redirecting all transitions into A to go to B, and thus eliminating the ε transition. Thus an ε transition is essentially nondeterministic.

Example 3.2 Given two finite automata, M1 and M2, design a new finite automaton that accepts all strings accepted by either machine. The new machine "guesses" which machine will accept the current string, then sends the whole string to that machine through an ε transition.

The obvious question at this point is: "Do ε transitions add power to finite automata?" As in the case of nondeterminism, our answer will be "no."

Assume that we are given a finite automaton with ε transitions; let its transition function be δ. Let us define δ'(q, a) to be the set of all states that can be reached by
1. zero or more ε transitions, followed by
2. one transition on a, followed by
3. zero or more ε transitions.
This is the set of all states reachable from state q in our machine while reading the single input symbol a; we call δ' the ε-closure of δ. In Figure 3.15, for instance, the states reachable from state q through the three steps are:
1. {q, 1, 2, 3}
2. {4, 6, 8}
3. {4, 5, 6, 7, 8, 9, 10}
so that we get δ'(q, a) = {4, 5, 6, 7, 8, 9, 10}.

Theorem 3.2 For every finite automaton with ε transitions, there exists an equivalent finite automaton without ε transitions.

We do not specify whether the finite automaton is deterministic or nondeterministic, since we have already proved that the two have equivalent power.
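The three-step definition of δ' translates directly into code. The sketch below is illustrative only: the names `epsilon_closure` and `delta_prime`, the dictionary encodings, and the small example automaton are hypothetical (the example is not the machine of Figure 3.15, whose transitions are not listed in the text).

```python
def epsilon_closure(states, eps):
    """All states reachable from `states` by zero or more ε transitions.

    `eps` maps a state to the set of states reachable by one ε transition.
    """
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def delta_prime(q, a, delta, eps):
    """States reachable from q while reading the single input symbol a."""
    before = epsilon_closure({q}, eps)           # step 1: ε transitions
    moved = set()
    for p in before:                             # step 2: one transition on a
        moved |= delta.get((p, a), set())
    return epsilon_closure(moved, eps)           # step 3: ε transitions


# Hypothetical automaton: q --ε--> r, q --a--> s, r --a--> t, t --ε--> u.
eps = {"q": {"r"}, "t": {"u"}}
delta = {("q", "a"): {"s"}, ("r", "a"): {"t"}}
print(sorted(delta_prime("q", "a", delta, eps)))
```

Here step 1 yields {q, r}, step 2 yields {s, t}, and step 3 adds u.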

Figure 3.15 Moving through ε transitions.

Proof. Assume that we have been given a finite automaton with ε transitions and with transition function δ. We construct δ' as defined earlier. Our new automaton has the same set of states, the same alphabet, the same starting state, and (with one possible exception) the same set of accepting states, but its transition function is now δ' rather than δ and so does not include any ε moves. Finally, if the original automaton had any (chain of) ε transitions from its start state to an accepting state, we make that start state in our new automaton an accepting state.

We claim that the two machines recognize the same language; more specifically, we claim that the set of states reachable under some input string x ≠ ε in the original machine is the same as the set of states reachable under the same input string in our ε-free machine and that the two machines both accept or both reject the empty string. The latter is ensured by our correction for the start state. For the former, our proof proceeds by induction on the length of strings. The two machines can reach exactly the same states from any given state (in particular from the start state) on an input string of length 1, by construction of δ'. Assume that, after processing i input characters, the two machines have the same reachable set of states. From each of the states that could have been reached after i input characters, the two machines can reach the same set of states by reading one more character, by construction of δ'. Thus the set of all states reachable after reading i + 1 characters is the union of identical sets over an identical index and thus the two machines can reach the same set of states after i + 1 steps. Hence one machine can accept whatever string the other can. Q.E.D.

Thus a finite automaton is well defined in terms of its power to recognize languages: we do not need to be more specific about its characteristics,
since all versions (deterministic or not, with or without ε transitions) have equivalent power. We call the set of all languages recognized by finite automata the regular languages.

Not every language is regular: some languages cannot be accepted by any finite automaton. These include all languages that can be accepted only through some unbounded count, such as {1, 101, 101001, 1010010001, ...} or {ε, 01, 0011, 000111, ...}. A finite automaton has no dynamic memory: its only "memory" is its set of states, through which it can count only to a fixed constant, so that counting to arbitrary values, as is required in the two languages just given, is impossible. We shall prove this statement and obtain an exact characterization later.

3.3 Regular Expressions

3.3.1 Definitions and Examples

Regular expressions were designed by mathematicians to denote regular languages with a mathematical tool, a tool built from a set of primitives (generators in mathematical parlance) and operations. For instance, arithmetic (on nonnegative integers) is a language built from one generator (zero, the one fundamental number), one basic operation (successor, which generates the "next" number; it is simply an incrementation), and optional operations (such as addition, multiplication, etc.), each defined inductively (recursively) from existing operations. Compare the ease with which we can prove statements about nonnegative integers with the incredible lengths to which we have to go to prove even a small piece of code to be correct. The mechanical models (automata, programs, etc.) all suffer from their basic premise, namely the notion of state. States make formal proofs extremely cumbersome, mostly because they offer no natural mechanism for induction.

Another problem of finite automata is their nonlinear format: they are best represented graphically (not a convenient data entry mechanism), since they otherwise require elaborate conventions for encoding the transition table. No one would long tolerate having to define finite automata for pattern-matching tasks in searching and editing text. Regular expressions, on the other hand, are simple strings much like arithmetic expressions, with a simple and familiar syntax; they are well suited for use by humans in describing patterns for string processing. Indeed, they form the basis for the pattern-matching commands of editors and text processors.

Definition 3.3 A regular expression on some alphabet Σ is defined inductively as follows:
* ∅, ε, and a (for any a ∈ Σ) are regular expressions.
* If P and Q are regular expressions, P + Q is a regular expression (union).
* If P and Q are regular expressions, PQ is a regular expression (concatenation).
* If P is a regular expression, P* is a regular expression (Kleene closure).
* Nothing else is a regular expression.

The three operations are chosen to produce larger sets from smaller ones, which is why we picked union but not intersection. For the sake of avoiding large numbers of parentheses, we let Kleene closure have highest precedence, concatenation intermediate precedence, and union lowest precedence.

This definition sets up an abstract universe of expressions, much like arithmetic expressions. Examples of regular expressions on the alphabet {0, 1} include ε, 0, 1, ε + 1, 1*, (0 + 1)*, 10*(ε + 1)1*, etc. However, these expressions are not as yet associated with languages: we have defined the syntax of the regular expressions but not their semantics. We now rectify this omission:
* ∅ is a regular expression denoting the empty set.
* ε is a regular expression denoting the set {ε}.
* a ∈ Σ is a regular expression denoting the set {a}.
* If P and Q are regular expressions, P + Q is a regular expression denoting the set {x | x ∈ P or x ∈ Q}.
* If P and Q are regular expressions, PQ is a regular expression denoting the set {xy | x ∈ P and y ∈ Q}.
* If P is a regular expression, P* is a regular expression denoting the set {ε} ∪ {xw | x ∈ P and w ∈ P*}.

This last definition is recursive: we define P* in terms of itself. Put in English, the Kleene closure of a set S is the infinite union of the sets obtained by concatenating zero or more copies of S. For instance, the Kleene closure of {1} is simply the set of all strings composed of zero or more 1s, i.e., 1* = {ε, 1, 11, 111, 1111, ...}; the Kleene closure of the set {0, 11} is the set {ε, 0, 11, 00, 011, 110, 1111, ...}; and the Kleene closure of the set Σ (the alphabet) is Σ* (yes, that is the same notation!), that is, the set of all possible strings over the alphabet. For convenience, we shall define P+ = PP*; P+ differs from P* in that it must contain at least one copy of an element of P.
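The semantic rules above can be turned into a small evaluator that lists the strings of bounded length denoted by an expression. This is a sketch under our own assumptions: expressions are encoded as nested tuples with hypothetical tags ("empty", "eps", "sym", "union", "cat", "star"), and the length bound n exists only to keep the sets finite.

```python
def language(r, n):
    """Strings of length <= n in the set denoted by regular expression r."""
    kind = r[0]
    if kind == "empty":                # the empty set
        return set()
    if kind == "eps":                  # the set containing the empty string
        return {""}
    if kind == "sym":                  # a single symbol a
        return {r[1]} if len(r[1]) <= n else set()
    if kind == "union":                # P + Q
        return language(r[1], n) | language(r[2], n)
    if kind == "cat":                  # PQ
        left, right = language(r[1], n), language(r[2], n)
        return {x + y for x in left for y in right if len(x + y) <= n}
    if kind == "star":                 # P*, computed as a bounded fixed point
        base = language(r[1], n)
        result = {""}
        while True:
            bigger = result | {
                x + w for x in base for w in result if len(x + w) <= n
            }
            if bigger == result:
                return result
            result = bigger
    raise ValueError(f"unknown constructor: {kind}")


zero, one = ("sym", "0"), ("sym", "1")
# Kleene closure of {0, 11}, restricted to strings of length at most 3.
closure = language(("star", ("union", zero, ("cat", one, one))), 3)
print(sorted(closure, key=lambda s: (len(s), s)))
```

The loop for "star" mirrors the recursive definition of P*: it keeps prepending elements of P to strings already known to be in P* until nothing new of length at most n appears.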

Let us go through some further examples of regular expressions. Assume the alphabet Σ = {0, 1}; then the following are regular expressions over Σ:
* ∅ representing the empty set
* 0 representing the set {0}
* 1 representing the set {1}
* 11 representing the set {11}
* 0 + 1, representing the set {0, 1}
* (0 + 1)1, representing the set {01, 11}
* (0 + 1)1*, representing the infinite set {1, 11, 111, 1111, ..., 0, 01, 011, 0111, ...}
* (0 + 1)* = ε + (0 + 1) + (0 + 1)(0 + 1) + ... = Σ*
* (0 + 1)+ = (0 + 1)(0 + 1)* = Σ+ = Σ* - {ε}

The same set can be denoted by a variety of regular expressions; indeed, when given a complex regular expression, it often pays to simplify it before attempting to understand the language it defines. Consider, for instance, the regular expression ((0 + 1)10*(0 + 1*))*. The subexpression 10*(0 + 1*) can be expanded to 10*0 + 10*1*, which, using the + notation, can be rewritten as 10+ + 10*1*. We see that the second term includes all strings denoted by the first term, so that the first term can be dropped. (In set union, if A contains B, then we have A ∪ B = A.) Thus our expression can be written in the simpler form ((0 + 1)10*1*)* and means in English: zero or more repetitions of strings chosen from the set of strings made up of a 0 or a 1 followed by a 1 followed by zero or more 0s followed by zero or more 1s.

3.3.2 Regular Expressions and Finite Automata

Regular expressions, being a mathematical tool (as opposed to a mechanical tool like finite automata), lend themselves to formal manipulations of the type used in proofs and so provide an attractive alternative to finite automata when reasoning about regular languages. But we must first prove that regular expressions and finite automata are equivalent, i.e., that they denote the same set of languages.

Our proof consists of showing that (i) for every regular expression, there is a (nondeterministic) finite automaton with ε transitions and (ii) for every deterministic finite automaton, there is a regular expression. We have previously seen how to construct a deterministic finite automaton from a nondeterministic one and how to remove ε transitions. Hence, once the proof has been made, it will be possible to go from any form of finite automaton to a regular expression and vice versa. We use nondeterministic finite automata with ε transitions for part (i) because they are a more
expressive (though not more powerful) model in which to translate regular expressions; conversely, we use a deterministic finite automaton in part (ii) because it is an easier machine to simulate with regular expressions.

Theorem 3.3 For every regular expression there is an equivalent finite automaton.

Proof. The proof hinges on the fact that regular expressions are defined recursively, so that, once the basic steps are shown for constructing finite automata for the primitive elements of regular expressions, finite automata for regular expressions of arbitrary complexity can be constructed by showing how to combine component finite automata to simulate the basic operations. For convenience, we shall construct finite automata with a unique accepting state. (Any nondeterministic finite automaton with ε moves can easily be transformed into one with a unique accepting state by adding such a state, setting up an ε transition to this new state from every original accepting state, and then turning all original accepting states into rejecting ones.)

For the regular expression ∅ denoting the empty set, the corresponding finite automaton is

[diagram]

For the regular expression ε denoting the set {ε}, the corresponding finite automaton is

[diagram]

For the regular expression a denoting the set {a}, the corresponding finite automaton is

[diagram]

If P and Q are regular expressions with corresponding finite automata M_P and M_Q, then we can construct a finite automaton denoting P + Q in the following manner:

[diagram]

The ε transitions at the end are needed to maintain a unique accepting state.

If P and Q are regular expressions with corresponding finite automata M_P and M_Q, then we can construct a finite automaton denoting PQ in the following manner:

[diagram]

Finally, if P is a regular expression with corresponding finite automaton M_P, then we can construct a finite automaton denoting P* in the following manner:

[diagram]

Again, the extra ε transitions are here to maintain a unique accepting state.

It is clear that each finite automaton described above accepts exactly the set of strings described by the corresponding regular expression (assuming inductively that the submachines used in the construction accept exactly the set of strings described by their corresponding regular expressions). Since, for each constructor of regular expressions, we have a corresponding constructor of finite automata, the induction step is proved and our proof is complete. Q.E.D.

We have proved that for every regular expression, there exists an equivalent nondeterministic finite automaton with ε transitions. In the proof, we chose the type of finite automaton with which it is easiest to proceed: the nondeterministic finite automaton. The proof was by constructive induction. The finite automata for the basic pieces of regular expressions (∅, ε, and individual symbols) were used as the basis of the proof. By converting the legal operations that can be performed on these basic pieces into finite automata, we showed that these pieces can be inductively built into larger and larger finite automata that correspond to the larger and larger pieces of the regular expression as it is built up. Our construction made no attempt to be efficient: it typically produces cumbersome and redundant machines. For an "efficient" conversion of regular expressions to finite automata, it is generally better to understand what the expression is conveying, and then design an ad hoc finite automaton that accomplishes the same thing. However, the mechanical construction used in the proof was needed to prove that any regular expression can be converted to a finite automaton.
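A construction in the spirit of this proof can be sketched as follows. The exact wiring of the ε transitions here is one common choice and is our assumption, since it need not match the book's diagrams edge for edge; the class and function names are hypothetical. Every fragment has one start state and one accepting state, as in the proof, and the machine is simulated by tracking ε-closed sets of states.

```python
import itertools

_fresh = itertools.count()             # source of new state names


class Fragment:
    """Automaton with one start state, one accepting state, and ε moves.

    `edges` is a list of (source, label, target); label None is an ε move.
    """
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges


def empty():                           # ∅: accepting state is unreachable
    return Fragment(next(_fresh), next(_fresh), [])


def epsilon():                         # ε: one ε move from start to accept
    s, f = next(_fresh), next(_fresh)
    return Fragment(s, f, [(s, None, f)])


def symbol(a):                         # a: one move on a
    s, f = next(_fresh), next(_fresh)
    return Fragment(s, f, [(s, a, f)])


def union(p, q):                       # P + Q: guess which machine to run
    s, f = next(_fresh), next(_fresh)
    extra = [(s, None, p.start), (s, None, q.start),
             (p.accept, None, f), (q.accept, None, f)]
    return Fragment(s, f, p.edges + q.edges + extra)


def concat(p, q):                      # PQ: run P, then Q
    return Fragment(p.start, q.accept,
                    p.edges + q.edges + [(p.accept, None, q.start)])


def star(p):                           # P*: skip P or repeat it
    s, f = next(_fresh), next(_fresh)
    extra = [(s, None, p.start), (p.accept, None, f),
             (s, None, f), (p.accept, None, p.start)]
    return Fragment(s, f, p.edges + extra)


def closure(states, edges):
    """States reachable from `states` using only ε moves."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for u, label, v in edges:
            if u == q and label is None and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def accepts(m, s):
    current = closure({m.start}, m.edges)
    for ch in s:
        moved = {v for u, label, v in m.edges if u in current and label == ch}
        current = closure(moved, m.edges)
    return m.accept in current


# The expression ((0 + 1)10*1*)* from the text, built from fresh fragments.
m = star(concat(concat(concat(union(symbol("0"), symbol("1")), symbol("1")),
                       star(symbol("0"))), star(symbol("1"))))
print(accepts(m, "0100011"), accepts(m, "10"))
```

Each subexpression must be built by its own call, because fragments share state names if reused. As the text warns, the resulting machine is far larger than an ad hoc design would be.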

3.3.3 Regular Expressions from Deterministic Finite Automata

In order to show the equivalence of finite automata to regular expressions, it is necessary to show both that there is a finite automaton for every regular expression and that there is a regular expression for every finite automaton. The first part has just been proved. We shall now demonstrate the second part: given a finite automaton, we can always construct a regular expression that denotes the same language. As before, we are free to choose the type of automaton that is easiest to work with, since all finite automata are equivalent. In this case the most restricted finite automaton, the deterministic finite automaton, best serves our purpose. Our proof is again an inductive, mechanical construction, which generally produces an unnecessarily cumbersome, though infallibly correct, regular expression.

In finding an approach to this proof, we need a general way to talk about and to build up paths, with the aim of describing all accepting paths through the automaton with a regular expression. However, due to the presence of loops, paths can be arbitrarily large; thus most machines have an infinite number of accepting paths. Inducting on the length or number of paths, therefore, is not feasible. The number of states in the machine, however, is a constant; no matter how long a path is, it cannot pass through more distinct states than are contained in the machine. Therefore we should be able to induct on some ordering related to the number of distinct states present in a path. The length of the path is unrelated to the number of distinct states seen on the path and so remains (correctly) unaffected by the inductive ordering.

For a deterministic finite automaton with n states, which are numbered from 1 to n, consider the paths from node (state) i to node j. In building up an expression for these paths, we proceed inductively on the index of the highest-numbered intermediate state used in getting from i to j. Define $R^k_{ij}$ as the set of all paths from state i to state j that do not pass through any intermediate state numbered higher than k. We will develop the capability to talk about the universe of all paths through the machine by inducting on k from 0 to n (the number of states in the machine), for all pairs of nodes i and j in the machine. On these paths, the intermediate states (those states numbered no higher than k through which the paths can pass) can be used repeatedly; in contrast, states i and j (unless they are also numbered no higher than k) can be only left (i) or entered (j). Put another way, "passing through" a node means both entering and leaving the node; simply entering or leaving the node, as happens with nodes i and j, does not matter in figuring k.

This approach, due to Kleene, is in effect a dynamic programming technique, identical to Floyd's algorithm for generating all shortest paths in a graph. The construction is entirely artificial and meant only to yield an ordering for induction. In particular, the specific ordering of the states (which state is labeled 1, which is labeled 2, and so forth) is irrelevant: for each possible labeling, the construction proceeds in the same way.

The Base Case

The base case for the proof is the set of paths described by $R^0_{ij}$ for all pairs of nodes i and j in the deterministic finite automaton. For a specific pair of nodes i and j, these are the paths that go directly from node i to node j without passing through any intermediate states. These paths are described by the following regular expressions:
* $\varepsilon$ if we have i = j ($\varepsilon$ is the path of length 0); and/or
* a if we have $\delta(q_i, a) = q_j$ (including the case i = j with a self-loop).

Consider for example the deterministic finite automaton of Figure 3.16. Some of the base cases for a few pairs of nodes are given in Figure 3.17.

Figure 3.16 A simple deterministic finite automaton.

Figure 3.17 Some base cases in constructing a regular expression for the automaton of Figure 3.16 (columns: Path Sets, Regular Expression).

The Inductive Step

We now devise an inductive step and then proceed to build up regular expressions inductively from the base cases. The inductive step must define $R^k_{ij}$ in terms of lower values of k (in terms of k - 1, for instance). In other words, we want to be able to talk about how to get from i to j without going through states higher than k in terms of what is already known about how to get from i to j without going through states higher than k - 1. The set $R^k_{ij}$ can be thought of as the union of two sets: paths that do pass through state k (but no higher) and paths that do not pass through state k (or any other state higher than k). The second set can easily be recursively described by $R^{k-1}_{ij}$. The first set presents a bit of a problem because we must talk about paths that pass through state k without passing through any state higher than k - 1, even though k is higher than k - 1. We can circumvent this difficulty by breaking any path through state k every time it reaches state k, effectively splitting the set of paths from i to j through k into three separate components, none of which passes through any state higher than k - 1 (remember that entering the state at the end of the path does not count as passing through the state). These components are:
* $R^{k-1}_{ik}$, the paths that go from i to k without passing through a state higher than k - 1;
* $R^{k-1}_{kk}$, one iteration of any loop from k to k, without passing through a state higher than k - 1 (the paths exit k at the beginning and enter k at the end, but never pass through k); and
* $R^{k-1}_{kj}$, the paths that go from state k to state j without passing through a state higher than k - 1.

The expression $R^{k-1}_{kk}$ describes one iteration of a loop, but this loop could occur any number of times, including none, in any of the paths in $R^k_{ij}$. The expression corresponding to any number of iterations of this loop therefore must be $(R^{k-1}_{kk})^*$. We now have all the pieces we need to build up the inductive step from k - 1 to k:

$$R^k_{ij} = R^{k-1}_{ij} + R^{k-1}_{ik}\,(R^{k-1}_{kk})^*\,R^{k-1}_{kj}$$

Figure 3.18 illustrates the second term, $R^{k-1}_{ik}(R^{k-1}_{kk})^*R^{k-1}_{kj}$.

Figure 3.18 Adding node k to paths from i to j.

With this inductive step, we can proceed to build all possible paths in the machine (i.e., all the paths between every pair of nodes i and j for each k from 1 to n) from the expressions for the base cases. Since the $R^k$s are built from the regular expressions for the various $R^{k-1}$s using only operations that are closed for regular expressions (union, concatenation, and Kleene closure; note that we need all three operations!), the $R^k$s are also regular expressions. Thus we can state that $R^k_{ij}$ is a regular expression for any value of i, j, and k, with $1 \le i, j \le n$ and $0 \le k \le n$, and that this expression denotes all paths (or, equivalently, strings that cause the automaton to follow these paths) that lead from state i to state j while not passing through any state numbered higher than k.

Completing the Proof

The language of the deterministic finite automaton is precisely the set of all paths through the machine that go from the start state to an accepting state. These paths are denoted by the regular expressions $R^n_{1j}$, where j is some accepting state. (Note that, in the final expressions, we have k = n; that is, the paths are allowed to pass through any state in the machine.) The language of the whole machine is then described by the union of these expressions, the regular expression $\sum_{j \in F} R^n_{1j}$. Our proof is now complete: we have shown that, for any deterministic finite automaton, we can construct a regular expression that defines the same language. As before, the technique is mechanical and results in cumbersome and redundant expressions: it is not an efficient procedure to use for designing regular expressions from finite automata. However, since it is mechanical, it works in all cases to derive correct expressions and thus serves to establish the theorem that a regular expression can be constructed for any deterministic finite automaton.
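The recurrence can be run directly as a dynamic program. The sketch below is an illustration under stated assumptions: expressions are emitted in Python `re` syntax (so `|` plays the role of +), the empty set is encoded as `None` and $\varepsilon$ as the empty pattern, and the two-state example automaton (strings over {0, 1} with an even number of 0s) is hypothetical, not the automaton of Figure 3.16.

```python
import re
from itertools import product

def union(a, b):
    if a is None:
        return b
    if b is None:
        return a
    if a == b:
        return a
    if a == "":                       # epsilon + b
        return f"(?:{b})?"
    if b == "":                       # a + epsilon
        return f"(?:{a})?"
    return f"(?:{a}|{b})"

def concat(a, b):
    if a is None or b is None:        # concatenation with the empty set
        return None
    return a + b

def star(a):
    if a is None or a == "":          # (empty set)* = epsilon* = epsilon
        return ""
    return f"(?:{a})*"

def dfa_to_regex(n, delta, start, accepting):
    """States are 1..n; delta maps (state, symbol) -> state."""
    # Base case R^0_ij: epsilon on the diagonal, plus single symbols.
    R = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        R[i][i] = ""
    for (i, a), j in delta.items():
        R[i][j] = union(R[i][j], re.escape(a))
    # Inductive step: R^k_ij = R^{k-1}_ij + R^{k-1}_ik (R^{k-1}_kk)* R^{k-1}_kj
    for k in range(1, n + 1):
        new = [[None] * (n + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                via_k = concat(concat(R[i][k], star(R[k][k])), R[k][j])
                new[i][j] = union(R[i][j], via_k)
        R = new
    # Union of R^n over all accepting states (None if the language is empty).
    result = None
    for j in accepting:
        result = union(result, R[start][j])
    return result

# Hypothetical example: state 1 = even number of 0s seen, state 2 = odd.
EVEN_ZEROS = {(1, "0"): 2, (1, "1"): 1, (2, "0"): 1, (2, "1"): 2}
pattern = dfa_to_regex(2, EVEN_ZEROS, 1, {1})
```

Printing `pattern` shows how quickly the expression grows even for two states, which is the redundancy the text mentions; the helper functions only apply the most obvious simplifications for $\emptyset$ and $\varepsilon$.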

In the larger picture, this proof completes the proof of the equivalence of regular expressions and finite automata.

Reviewing the Construction of Regular Expressions from Finite Automata

Because regular expressions are defined inductively, we need to proceed inductively in our proof. Unfortunately, finite automata are not defined inductively, nor do they offer any obvious ordering for induction. Since we are not so much interested in the automata as in the languages they accept, we can look at the set of strings accepted by a finite automaton. Every such string leads the automaton from the start state to an accepting state through a series of transitions. We could conceivably attempt an induction on the length of the strings accepted by the automaton, but this length has little relationship to either the automaton (a very short path through the automaton can easily produce an arbitrarily long string: think of a loop on the start state) or the regular expressions describing the language (a simple expression can easily denote an infinite collection of strings).

What we need is an induction that allows us to build regular expressions describing strings (i.e., sequences of transitions through the automaton) in a progressive fashion, terminates easily, and has simple base cases. The simplest sequence of transitions through an automaton is a single transition (or no transition at all). While that seems to lead us right back to induction on the number of transitions (on the length of strings), such need not be the case. We can view a single transition as one that does not pass through any other state and thus as the base case of an induction that will allow a larger and larger collection of intermediate states to be used in fabricating paths (and thus regular expressions).

Hence our preliminary idea about induction can be stated as follows: we will start with paths (strings) that allow no intermediate state, then proceed with paths that allow one intermediate state, then a set of two intermediate states, and so forth. This ordering is not yet sufficient, however: which intermediate state(s) should we allow? If we allow any single intermediate state, then any two, then any three, and so on, the ordering is not strict: there are many different subsets of k intermediate states out of the n states of the machine and none is comparable to any other. It would be much better to have a single subset of allowable intermediate states at each step of the induction.

We now get to our final idea about induction: we shall number the states of the finite automaton and use an induction based on that numbering. The induction will start with paths that allow no intermediate state, then proceed to paths that can pass (arbitrarily often) through state 1, then to paths that can pass through states 1 and 2, and so on. This process looks good until we remember that we want paths from the start state to an accepting state: we may not be able to find such a path that also obeys our requirements. Thus we should look not just at paths from the start state to an accepting state, but at paths from any state to any other. Once we have regular expressions for all source/target pairs, it will be simple enough to keep those that describe paths from the start state to an accepting state.

Now we can formalize our induction: at step k of the induction, we shall compute, for each pair (i, j) of states, all paths that go from state i to state j and that are allowed to pass through any of the states numbered from 1 to k. If the starting state for these paths, state i, is among the first k states, then we allow paths that loop through state i; otherwise we allow each path only to leave state i but not see it again on its way to state j. Similarly, if state j is among the first k states, each path may go through it any number of times; otherwise each path can only reach it and stop. In effect, at each step of the induction, we define a new, somewhat larger finite automaton composed of the first k states of the original automaton, together with all transitions among these k states, plus any transition from state i to any of these states that is not already included, plus any transition to state j from any of these states that is not already included, plus any transition from state i to state j, if not already included. Think of these states and transitions as being highlighted in red, while the rest of the automaton is blue; we can play only with the red automaton at any step of the induction. However, from one step to the next, another blue state gets colored red along with any transitions between it and the red states and any transition to it from state i and any transition from it to state j. When the induction is complete, k equals n, the number of states of the original machine, and all states have been colored red, so we are playing with the original machine.

To describe with regular expressions what is happening, we begin by describing paths from i to j that use no intermediate state (no state numbered higher than 0). That is simple, since such transitions occur either under $\varepsilon$ (when i = j) or under a single symbol, in which case we just look up the transition table of the automaton. The induction step simply colors one more blue node in red. Hence we can add to all existing paths from i to j those paths that now go through the new node; these paths can go through the new node several times (they can include a loop that takes them back to the new node over and over again) before reaching node j. Since only the portion that touches the new node is new, we simply break any such paths into segments, each of which leaves or enters the new node but does not pass through it. Every such segment goes through only old red nodes and so can be described recursively, completing the induction.

3.4 The Pumping Lemma and Closure Properties

3.4.1 The Pumping Lemma

We saw earlier that a language is regular if we can construct a finite automaton that accepts exactly the strings in that language or a regular expression that represents that language. However, so far we have no tool to prove that a language is not regular. The pumping lemma is such a tool. It establishes a necessary (but not sufficient) condition for a language to be regular. We cannot use the pumping lemma to establish that a language is regular, but we can use it to prove that a language is not regular, by showing that the language does not obey the lemma.

The pumping lemma is based on the idea that all regular languages must exhibit some form of regularity (pun intended: that is the origin of the name "regular languages"). Put differently, all strings of arbitrary length (i.e., all "sufficiently long" strings) belonging to a regular language must have some repeating pattern(s). (The short strings can each be accepted in a unique way, each through its own unique path through the machine. In particular, any finite language has no string of arbitrary length and so has only "short" strings and need not exhibit any regularity.)

Consider a finite automaton with n states, and let z be a string of length at least n that is accepted by this automaton. In order to accept z, the automaton makes a transition for each input symbol and thus moves through at least n + 1 states, one more than exist in the automaton. Therefore the automaton will go through at least one loop in accepting the string. Let the string be $z = x_1 x_2 x_3 \ldots x_{|z|}$; then Figure 3.19 illustrates the accepting path for z.

Figure 3.19 An accepting path for z.

In view of our preceding remarks, we can divide the path through the automaton into three parts: an initial part that does not contain any loop, the first loop encountered, and a final part that may or may not contain additional loops. Figure 3.20 illustrates this partition.

Figure 3.20 The three parts of an accepting path, showing potential looping (no loop: x; loop: y; tail: t).

We used x, y, and t to denote the three parts and further broke the loop into two parts, y' and y'', writing y = y'y''y', so that the entire string becomes xy'y''y't. Now we can go through the loop as often as we want, from zero times (yielding xy't) to twice (yielding xy'y''y'y''y't) to any number of times (yielding a string of the form xy'(y''y')*t); all of these strings must be in the language. This is the spirit of the pumping lemma: you can "pump" some string of unknown, but nonzero length, here y''y', as many times as you want and always obtain another string in the language, no matter what the starting string z was (as long, that is, as it was long enough). In our case the string can be viewed as being of the form uvw, where we have u = xy', v = y''y', and w = t. We are then saying that any string of the form $uv^iw$ is also in the language. We have (somewhat informally) proved the pumping lemma for regular languages.

Theorem 3.4 For every regular language L, there exists some constant n (the size of the smallest automaton that accepts L) such that, for every string $z \in L$ with $|z| \ge n$, there exist $u, v, w \in \Sigma^*$ with $z = uvw$, $|v| \ge 1$, $|uv| \le n$ and, for all $i \in \mathbb{N}$, $uv^iw \in L$. □

Writing this statement succinctly, we obtain

$$L \text{ is regular} \implies (\exists n\ \forall z \in L,\ |z| \ge n,\ \exists u, v, w,\ z = uvw,\ |uv| \le n,\ |v| \ge 1,\ \forall i,\ uv^iw \in L)$$

so that the contrapositive is

$$(\forall n\ \exists z \in L,\ |z| \ge n,\ \forall u, v, w,\ z = uvw,\ |uv| \le n,\ |v| \ge 1,\ \exists i,\ uv^iw \notin L) \implies L \text{ is not regular}$$
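The decomposition used in the proof can be computed from any concrete automaton by tracking the first state that repeats along the path. The sketch below assumes a complete deterministic automaton given as a dictionary `delta` from (state, symbol) to state; the function names and the two-state example (an even number of 0s) are illustrative assumptions.

```python
def pump_decomposition(delta, start, z):
    """Split z into (u, v, w), where v is the input read around the first loop.

    Returns None if no state repeats while reading z, which can only
    happen when z is shorter than the number of states.
    """
    seen = {start: 0}          # state -> number of symbols read on first visit
    state = start
    for pos, symbol in enumerate(z, start=1):
        state = delta[(state, symbol)]
        if state in seen:      # first repeated state: the loop is closed
            first = seen[state]
            return z[:first], z[first:pos], z[pos:]
        seen[state] = pos
    return None

def accepts(delta, start, accepting, z):
    """Run the deterministic automaton on z."""
    state = start
    for symbol in z:
        state = delta[(state, symbol)]
    return state in accepting

# Hypothetical automaton: state 1 = even number of 0s (accepting), 2 = odd.
EVEN_ZEROS = {(1, "0"): 2, (1, "1"): 1, (2, "0"): 1, (2, "1"): 2}
u, v, w = pump_decomposition(EVEN_ZEROS, 1, "0110")
```

Because the loop is closed no later than the n-th symbol, the split always satisfies $|uv| \le n$ and $|v| \ge 1$, and every string `u + v * i + w` follows the same path with the loop repeated i times.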

Thus to show that a language is not regular, all we need to do is find a string z that contradicts the lemma. We can think of playing the adversary in a game where our opponent is attempting to convince us that the language is regular and where we are intent on providing a counterexample. If our opponent claims that the language is regular, then he must be able to provide a finite automaton for the language. Yet no matter what that automaton is, our counterexample must work, so we cannot pick n, the number of states of the claimed automaton, but must keep it as a parameter in order for our construction to work for any number of states. On the other hand, we get to choose a specific string, z, in the language and give it to our opponent. Our opponent, who (claims that he) knows a finite automaton for the language, then tells us where the first loop used by his machine lies and how long it is (something we have no way of knowing since we do not have the automaton). Thus we cannot choose the decomposition of z into u, v, and w, but, on the contrary, must be prepared for any decomposition given to us by our opponent. Thus for each possible decomposition into u, v, and w (that obeys the constraints), we must prepare our counterexample, that is, a pumping number i (which can vary from decomposition to decomposition) such that the string $uv^iw$ is not in the language.

To summarize, the steps needed to prove that a language is not regular are:
1. Assume that the language is regular.
2. Let some parameter n be the constant of the pumping lemma.
3. Pick a "suitable" string z in the language with $|z| \ge n$.
4. Show that, for every legal decomposition of z into uvw (i.e., obeying $|v| \ge 1$ and $|uv| \le n$), there exists $i \ge 0$ such that $uv^iw$ does not belong to L.
5. Conclude that assumption (1) was false.

Failure to proceed through these steps invalidates the potential proof that L is not regular but does not prove that L is regular! If the language is finite, the pumping lemma is useless, as it has to be, since all finite languages are regular: in a finite language, the automaton's accepting paths all have length less than the number of states in the machine, so that the pumping lemma holds vacuously.

Consider the language $L_1 = \{0^i 1^i \mid i \ge 0\}$. Let n be the constant of the pumping lemma (that is, n is the number of states in the corresponding deterministic finite automaton, should one exist). Pick the string $z = 0^n 1^n$; it satisfies $|z| \ge n$. Figure 3.21 shows how we might decompose z = uvw to ensure $|uv| \le n$ and $|v| \ge 1$.

Figure 3.21 Decomposing the string z into possible choices for u, v, and w.

The uv must be a string of 0s, so pumping v

will give more 0s than 1s. It follows that the pumped string is not in $L_1$, which would contradict the pumping lemma if the language were regular. Therefore the language is not regular.

As another example, let $L_2$ be the set of all strings, the length of which is a perfect square. (The alphabet does not matter.) Let n be the constant of the lemma. Choose any z of length $n^2$ and write z = uvw with $|v| \ge 1$ and $|uv| \le n$; in particular, we have $1 \le |v| \le n$. It follows from the pumping lemma that, if the language is regular, then the string $z' = uv^2w$ must be in the language. But we have $|z'| = |z| + |v| = n^2 + |v|$ and, since we assumed $1 \le |v| \le n$, we conclude $n^2 < n^2 + 1 \le n^2 + |v| \le n^2 + n < (n + 1)^2$, or $n^2 < |z'| < (n + 1)^2$, so that $|z'|$ is not a perfect square and thus z' is not in the language. Hence the language is not regular.

As a third example, consider the language $L_3 = \{a^i b^j c^k \mid 0 \le i < j < k\}$. Let n be the constant of the pumping lemma. Pick $z = a^n b^{n+1} c^{n+2}$, which clearly obeys $|z| \ge n$ as well as the inequalities on the exponents, but is as close to failing these last as possible. Write z = uvw, with $|uv| \le n$ and $|v| \ge 1$. Then uv is a string of a's, so that $z' = uv^2w$ is the string $a^{n+|v|} b^{n+1} c^{n+2}$; since we assumed $|v| \ge 1$, the number of a's is now at least equal to the number of b's, not less, so that z' is not in the language. Hence $L_3$ is not regular.

As a fourth example, consider the set $L_4$ of all strings x in $\{0, 1\}^*$ such that, in at least one prefix of x, there are four more 1s than 0s. Let n be the constant of the pumping lemma and choose $z = 0^n 1^{n+4}$; z is in the language, because z itself has four more 1s than 0s (although no other prefix of z does: once again, our string z is on the edge of failing membership). Let z = uvw; since we assumed $|uv| \le n$, it follows that uv is a string of 0s and that, in particular, v is a string of one or more 0s. Hence the string $z' = uv^2w$, which must be in the language if the language is regular, is of the form $0^{n+|v|} 1^{n+4}$; but this string does not have any prefix with four more 1s than 0s and so is not in the language. Hence the language is not regular.

As a final example, let us tackle the more complex language $L_5 = \{a^i b^j c^k \mid i \ne j \text{ or } j \ne k\}$. Let n be the constant of the pumping lemma and choose $z = a^n b^{n!+n} c^{n!+n}$; the reason for this mysterious choice will become clear in a few lines. (Part of the choice is the now familiar "edge" position: this string already has the second and third groups of equal size, so it suffices to bring the first group to the same size to cause it to fail entirely.) Let z = uvw; since we assumed $|uv| \le n$, we see that uv is a string of a's and thus, in particular, v is a string of one or more a's. Thus the string $z' = uv^iw$, which must be in the language for all values of $i \ge 0$ if the language is regular, is of the form $a^{n+(i-1)|v|} b^{n!+n} c^{n!+n}$. Choose i to be $(n!/|v|) + 1$; this value is a natural number, because $|v|$ is between 1 and n, and because n! is divisible by any number between 1 and n (this is why we chose this particular value n! + n). Then we get the string $a^{n!+n} b^{n!+n} c^{n!+n}$, which is not in the language. Hence the language is not regular.

Consider applying the pumping lemma to the language $L_6 = \{a^i b^j c^k \mid i > j > k \ge 0\}$. $L_6$ is extremely similar to $L_3$, yet the same application of the pumping lemma used for $L_3$ fails for $L_6$: it is no use to pump more a's, since that will not contradict the inequality, but reinforce it. In a similar vein, consider the language $L_7 = \{0^i 1^j 0^j \mid i, j > 0\}$; this language is similar to the language $L_1$, which we already proved not regular through a straightforward application of the pumping lemma. Yet the same technique will fail with $L_7$, because we cannot ensure that we are not just pumping initial 0s, something that would not prevent membership in $L_7$.

In the first case, there is a simple way out: instead of pumping up, pump down by one. From uvw, we obtain uw, which must also be in the language if the language is regular. If we choose for $L_6$ the string $z = a^{n+2} b^{n+1}$, then uv is a string of a's and pumping down will remove at least one a, thereby invalidating the inequality. We can do a detailed case analysis for $L_7$, which will work. Pick $z = 0 1^n 0^n$; then uv is $0 1^k$ for some $k \ge 0$. If k equals 0, then uv is just 0, so u is $\varepsilon$ and v is 0, and pumping down once creates the string $1^n 0^n$, which is not in the language, as desired. If k is at least 1, then either u is $\varepsilon$, in which case pumping up once produces the string $0 1^k 0 1^n 0^n$, which is not in the language; or u has length at least 1, in which case v is a string of 1s and pumping up once produces the string $0 1^{n+|v|} 0^n$, which is not in the language either. Thus in all three cases we can pump the string so as to produce another string not in the language, showing that the language is not regular.

But contrast this laborious procedure with the proof obtained from the extended pumping lemma described below.
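The adversary argument used in these examples can be checked by brute force for one fixed value of n. The sketch below is a sanity check only and proves nothing, since a proof must hold for every n; the function names and the bound `max_i` are assumptions. For a chosen string z it enumerates every legal split z = uvw and searches for a pumping number i that takes the string out of the language.

```python
from math import isqrt

def refute_pumping(member, z, n, max_i=3):
    """Try to defeat every split z = uvw with |uv| <= n and |v| >= 1.

    Returns a dict mapping each split (u, v, w) to a pumping number i with
    u v^i w outside the language, or None if some split survives every i
    in 0..max_i (so this choice of z refutes nothing).
    """
    witnesses = {}
    for uv_len in range(1, n + 1):
        for u_len in range(uv_len):
            u, v, w = z[:u_len], z[u_len:uv_len], z[uv_len:]
            i = next((i for i in range(max_i + 1)
                      if not member(u + v * i + w)), None)
            if i is None:
                return None
            witnesses[(u, v, w)] = i
    return witnesses

def in_L1(s):
    """Membership in L1 = { 0^i 1^i | i >= 0 }."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "0" * half + "1" * half

def in_L2(s):
    """Membership in L2: the length of s is a perfect square."""
    return isqrt(len(s)) ** 2 == len(s)

def even_zeros(s):
    """A regular language, for contrast: an even number of 0s."""
    return s.count("0") % 2 == 0

n = 4
l1_witnesses = refute_pumping(in_L1, "0" * n + "1" * n, n)
l2_witnesses = refute_pumping(in_L2, "a" * (n * n), n)
```

For the regular language `even_zeros`, the call `refute_pumping(even_zeros, "0000", 2)` returns `None`: the split with v = 00 can be pumped freely, as the lemma requires of at least one split.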

What we really need is a way to shift the position of the uv substring within the entire string; having it restricted to the front of z is too limiting. Fortunately our statement (and proof) of the pumping lemma does not really depend on the location of the n characters within the string. We started at the beginning because that was the simplest approach and we used n (the number of states in the smallest automaton accepting the language) rather than some larger constant because we could capture in that manner the first loop along an accepting path. However, there may be many different loops along any given path. Indeed, in any stretch of n characters, n + 1 states are visited and so, by the pigeonhole principle, a loop must occur. These observations allow us to rephrase the pumping lemma slightly.

Lemma 3.1 For any regular language L there exists some constant n > 0 such that, for any three strings $z_1$, $z_2$, and $z_3$ with $z = z_1 z_2 z_3 \in L$ and $|z_2| = n$, there exist strings $u, v, w \in \Sigma^*$ with $z_2 = uvw$, $|v| \ge 1$, and, for all $i \in \mathbb{N}$, $z_1 u v^i w z_3 \in L$.

This restatement does not alter any of the conditions of the original pumping lemma (note that $|z_2| = n$ implies $|uv| \le n$, which is why the latter inequality was not stated explicitly); however, it does allow us to move our focus of attention anywhere within a long string. For instance, consider again the language $L_7$: we shall pick $z_1 = 0^n$, $z_2 = 1^n$, and $z_3 = 0^n$; clearly, $z = z_1 z_2 z_3 = 0^n 1^n 0^n$ is in $L_7$. Since $z_2$ consists only of 1s, so does v; therefore the string $z_1 u v^2 w z_3$ is $0^n 1^{n+|v|} 0^n$ and is not in $L_7$, so that $L_7$ is not regular. The new statement of the pumping lemma allowed us to move our focus of attention to the 1s in the middle of the string, making for an easy proof. Although $L_6$ does not need it, the same technique is also advantageously applied: if n is the constant of the pumping lemma, pick $z_1 = a^{n+1}$, $z_2 = b^n$, and $z_3 = \varepsilon$; clearly, $z = z_1 z_2 z_3 = a^{n+1} b^n$ is in $L_6$. Now write $z_2 = uvw$: it follows that v is a string of one or more b's, so that the string $z_1 u v^2 w z_3$ is $a^{n+1} b^{n+|v|}$, which is not in the language, since we have $n + |v| \ge n + 1$. Table 3.1 summarizes the use of (our extended version of) the pumping lemma.

Exercise 3.2 Develop a pumping lemma for strings that are not in the language. In a deterministic finite automaton where all transitions are specified, arbitrarily long strings that get rejected must be rejected through a path that includes one or more loops, so that a lemma similar to the pumping lemma can be proved. What do you think the use of such a lemma would be? □

Table 3.1 How to use the pumping lemma to prove nonregularity.

* Assume that the language is regular.
* Let n be the constant of the pumping lemma; it will be used to parameterize the construction.
* Pick a suitable string z in the language that has length at least n. (In many cases, pick z "at the edge" of membership, that is, as close as possible to failing some membership criterion.)
* Decompose z into three substrings, $z = z_1 z_2 z_3$, such that $z_2$ has length exactly n. You can pick the boundaries as you please.
* Write $z_2$ as the concatenation of three strings, $z_2 = uvw$; note that the boundaries delimiting u, v, and w are not known: all that can be assumed is that v has nonzero length.
* Verify that, for any choice of boundaries, i.e., any choice of u, v, and w with $z_2 = uvw$ and where v has nonzero length, there exists an index i such that the string $z_1 u v^i w z_3$ is not in the language.
* Conclude that the language is not regular.

3.4.2 Closure Properties of Regular Languages

By now we have established the existence of an interesting family of sets, the regular sets. We know how to prove that a set is regular (exhibit a suitable finite automaton or regular expression) and how to prove that a set is not regular (use the pumping lemma). At this point, we should ask ourselves what other properties these regular sets may possess; in particular, how do they behave under certain basic operations? The simplest question about any operator applied to elements of a set is "Is it closed?" or, put negatively, "Can an expression in terms of elements of the set evaluate to an element not in the set?" For instance, the natural numbers are closed under addition and multiplication but not under division-the result is a rational number; the reals are closed under the four operations (excluding division by 0) but not under square root-the square root of a negative number is not a real number; and the complex numbers are closed under the four operations and under any polynomial root-finding. From our earlier work, we know that the regular sets must be closed under concatenation, union, and Kleene closure, since these three operations were defined on regular expressions (regular sets) and produce more regular expressions. We alluded briefly to the fact that they must be closed under intersection and complement, but let us revisit these two results.


The complement of a language $L \subseteq \Sigma^*$ is the language $\bar{L} = \Sigma^* - L$. Given a deterministic finite automaton for L in which every transition is defined (if some transitions are not specified, add a new rejecting trap state and define every undefined transition to move to the new trap state), we can build a deterministic finite automaton for $\bar{L}$ by the simple expedient of turning every rejecting state into an accepting state and vice versa. Since regular languages are closed under union and complementation, they are also closed under intersection by De Morgan's law. To see directly that intersection is closed, consider regular languages $L_1$ and $L_2$ with associated automata $M_1$ and $M_2$. We construct the new machine M for the language $L_1 \cap L_2$ as follows. The set of states of M is the Cartesian product of the sets of states of $M_1$ and $M_2$, and its start state is the pair formed by the two start states; if $M_1$ has transition $\delta'(q'_i, a) = q'_j$ and $M_2$ has transition $\delta''(q''_k, a) = q''_l$, then M has transition $\delta((q'_i, q''_k), a) = (q'_j, q''_l)$; finally, $(q', q'')$ is an accepting state of M if and only if $q'$ is an accepting state of $M_1$ and $q''$ is an accepting state of $M_2$.

Closure under various operations can simplify proofs. For instance, consider the language $L_8 = \{a^i b^j \mid i \ne j\}$; this language is closely related to our standard language $\{a^i b^i \mid i \in \mathbb{N}\}$ and is clearly not regular. However, a direct proof through the pumping lemma is somewhat challenging; a much simpler proof can be obtained through closure. Since regular sets are closed under complement and intersection and since the set $a^*b^*$ is regular (denoted by a regular expression), then, if $L_8$ is regular, so must be the language $\bar{L_8} \cap a^*b^*$. However, the latter is our familiar language $\{a^i b^i \mid i \in \mathbb{N}\}$ and so is not regular, showing that $L_8$ is not regular either.

A much more impressive closure is closure under substitution. A substitution from alphabet $\Sigma$ to alphabet $\Delta$ (not necessarily distinct) is a mapping from $\Sigma$ to $2^{\Delta^*} - \{\emptyset\}$ that maps each character of $\Sigma$ onto a (nonempty) regular language over $\Delta$.
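Before turning to substitution in detail, the two automaton constructions described earlier for complement and intersection can be sketched as follows. The encoding of an automaton as a tuple (states, alphabet, delta, start, accepting) and the two small example machines are assumptions made for this illustration; both machines must be deterministic, complete, and share one alphabet.

```python
from itertools import product

def complement(dfa):
    """Swap accepting and rejecting states of a complete DFA."""
    states, alphabet, delta, start, accepting = dfa
    return states, alphabet, delta, start, states - accepting

def intersection(m1, m2):
    """Product machine: runs m1 and m2 in lockstep on the same input."""
    s1, alphabet, d1, q1, f1 = m1
    s2, _, d2, q2, f2 = m2
    states = {(p, q) for p in s1 for q in s2}
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for (p, q) in states for a in alphabet}
    accepting = {(p, q) for p in f1 for q in f2}
    return states, alphabet, delta, (q1, q2), accepting

def accepts(dfa, s):
    _, _, delta, state, accepting = dfa
    for ch in s:
        state = delta[(state, ch)]
    return state in accepting

# Hypothetical examples over {0, 1}.
EVEN_ZEROS = ({1, 2}, {"0", "1"},
              {(1, "0"): 2, (1, "1"): 1, (2, "0"): 1, (2, "1"): 2},
              1, {1})
ENDS_WITH_ONE = ({"n", "y"}, {"0", "1"},
                 {("n", "0"): "n", ("n", "1"): "y",
                  ("y", "0"): "n", ("y", "1"): "y"},
                 "n", {"y"})
BOTH = intersection(EVEN_ZEROS, ENDS_WITH_ONE)
ODD_ZEROS = complement(EVEN_ZEROS)
```

The product machine has $|Q_1| \cdot |Q_2|$ states whether or not all of them are reachable; pruning unreachable pairs is a straightforward refinement that the sketch omits.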
The substitution is extended from a character to a string by using concatenation as in a regular expression: if we have the string $ab$ over $\Sigma$, then its image is $f(ab)$, the language over $\Delta$ composed of all strings constructed of a first part chosen from the set $f(a)$ concatenated with a second part chosen from the set $f(b)$. Formally, if $w$ is $ax$, then $f(w)$ is $f(a)f(x)$, the concatenation of the two sets. Finally the substitution is extended to a language in the obvious way:

$$f(L) = \bigcup_{w \in L} f(w)$$
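Before moving on, the complement and product constructions given earlier in this section can be made concrete. The following is only a minimal sketch: the tuple layout `(states, alphabet, delta, start, accepting)` and the two sample machines `EVEN_A` and `ENDS_B` are assumptions made for illustration, not notation from the text, and both machines are assumed to be complete (every transition defined).

```python
from itertools import product

def complement(dfa):
    """Swap accepting and rejecting states of a complete DFA."""
    states, alphabet, delta, start, accepting = dfa
    return states, alphabet, delta, start, states - accepting

def intersection(m1, m2):
    """Product construction: both machines read the input in lockstep."""
    q1, sigma, d1, s1, f1 = m1
    q2, _, d2, s2, f2 = m2
    states = set(product(q1, q2))
    delta = {((p, q), a): (d1[p, a], d2[q, a]) for (p, q) in states for a in sigma}
    # A pair is accepting only if both of its components are accepting.
    accepting = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return states, sigma, delta, (s1, s2), accepting

def accepts(dfa, word):
    """Run a complete DFA on a string."""
    _, _, delta, state, accepting = dfa
    for symbol in word:
        state = delta[state, symbol]
    return state in accepting

# Hypothetical sample machines over {a, b}: an even number of a's, and strings ending in b.
EVEN_A = ({0, 1}, {"a", "b"},
          {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}, 0, {0})
ENDS_B = ({0, 1}, {"a", "b"},
          {(0, "a"): 0, (1, "a"): 0, (0, "b"): 1, (1, "b"): 1}, 0, {1})
BOTH = intersection(EVEN_A, ENDS_B)
```

With these definitions, `accepts(BOTH, "aab")` holds while `accepts(BOTH, "ab")` does not, and the complemented machine `complement(BOTH)` accepts exactly the strings that `BOTH` rejects.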

To see that regular sets are closed under this operation, we shall use regular expressions. Since each regular set can be written as a regular expression, each of the $f(a)$ for $a \in \Sigma$ can be written as a regular expression. The language $L$ is regular and so has a regular expression $E$. Simply substitute for each character $a \in \Sigma$ appearing in $E$ the regular (sub)expression for $f(a)$; the result is clearly a (typically much larger) regular expression. (The alternate mechanism, which uses our extension to strings and then to languages, would require a new result. Clearly, concatenation of sets corresponds exactly to concatenation of regular expressions and union of sets corresponds exactly to union of regular expressions. However, $f(L) = \bigcup_{w \in L} f(w)$ involves a countably infinite union, not just a finite one, and we do not yet know whether or not regular expressions are closed under infinite union.)

A special case of substitution is homomorphism. A homomorphism from a language $L$ over alphabet $\Sigma$ to a new language $f(L)$ over alphabet $\Delta$ is defined by a mapping $f\colon \Sigma \to \Delta^*$; in words, the basic function maps each symbol of the original alphabet to a single string over the new alphabet. This is clearly a special case of substitution, one where the regular languages to which each symbol can be mapped consist of exactly one string each.

Substitution and even homomorphism can alter a language significantly. Consider, for instance, the language $L = (a + b)^*$ over the alphabet $\{a, b\}$; this is just the language of all possible strings over this alphabet. Now consider the very simple homomorphism from $\{a, b\}$ to subsets of $\{0, 1\}^*$ defined by $f(a) = 01$ and $f(b) = 1$; then $f(L) = (01 + 1)^*$ is the language of all strings over $\{0, 1\}$ that do not contain a pair of adjacent 0s and (if not equal to $\varepsilon$) end with a 1, a rather different beast. This ability to modify languages considerably without affecting their regularity makes substitution a powerful tool in proving languages to be regular or not regular. To prove a new language $L$ regular, start with a known regular language $L_0$ and define a substitution that maps $L_0$ to $L$. To prove a new language $L$ not regular, define a substitution that maps $L$ to a new language $L_1$ known not to be regular. Formally speaking, these techniques are known as reductions; we shall revisit reductions in detail throughout the remaining chapters of this book.

We add one more operation to our list: the quotient of two languages. Given languages $L_1$ and $L_2$, the quotient of $L_1$ by $L_2$, denoted $L_1/L_2$, is the language $\{x \mid \exists y \in L_2,\ xy \in L_1\}$.

Theorem 3.5 If $R$ is regular, then so is $R/L$ for any language $L$. □

The proof is interesting because it is nonconstructive, unlike all other proofs we have used so far with regular languages and finite automata. (It has to be nonconstructive, since we know nothing whatsoever about L; in particular, it is possible that no procedure exists to decide membership in L or to enumerate the members of L.)

Proof. Let $M$ be a finite automaton for $R$. We define the new finite automaton $M'$ to accept $R/L$ as follows. $M'$ is an exact copy of $M$, with one exception: we define the accepting states of $M'$ differently. Thus $M'$ has the same states, transitions, and start state as $M$, but possibly different accepting states. A state $q$ of $M$ is an accepting state of $M'$ if and only if there exists a string $y$ in $L$ that takes $M$ from state $q$ to one of its accepting states. Q.E.D.

$M'$, including its accepting states, is well defined; however, we may be unable to construct $M'$, because the definition of accepting state may not be computable if we have no easy way of listing the strings of $L$. (Naturally, if $L$ is also regular, we can turn the existence proof into a constructive proof.)
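When $L$ is itself regular, the accepting states of $M'$ can actually be computed, as the closing remark suggests. The sketch below is one possible way to do so, under the assumption that both $R$ and $L$ are given as complete deterministic automata in a hypothetical tuple layout `(states, alphabet, delta, start, accepting)`; it searches the product of the two machines for a string of $L$ that leads $M$ from $q$ to an accepting state. The sample machines `ENDS_01` and `ONLY_1` are illustrative choices.

```python
from collections import deque

def quotient_accepting_states(m, ml):
    """States q of m from which some string accepted by ml reaches an accepting state of m."""
    states, sigma, delta, _, accepting = m
    _, _, delta_l, start_l, accepting_l = ml
    result = set()
    for q in states:
        seen = {(q, start_l)}
        queue = deque(seen)
        while queue:
            p, r = queue.popleft()
            # (p, r) is reached by some string y: y is in L iff r accepts, and
            # y takes m from q to an accepting state iff p accepts.
            if p in accepting and r in accepting_l:
                result.add(q)
                break
            for a in sigma:
                nxt = (delta[p, a], delta_l[r, a])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return result

# Hypothetical example: R = strings over {0, 1} ending in 01, L = {1}.
ENDS_01 = ({0, 1, 2}, {"0", "1"},
           {(0, "0"): 1, (0, "1"): 0, (1, "0"): 1, (1, "1"): 2, (2, "0"): 1, (2, "1"): 0},
           0, {2})
ONLY_1 = ({0, 1, 2}, {"0", "1"},
          {(0, "0"): 2, (0, "1"): 1, (1, "0"): 2, (1, "1"): 2, (2, "0"): 2, (2, "1"): 2},
          0, {1})
NEW_ACCEPTING = quotient_accepting_states(ENDS_01, ONLY_1)
```

Here the only new accepting state is state 1 of `ENDS_01`, so the quotient is the set of strings ending in 0, as expected from the definition.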

Exercise 3.3 Prove the following closure properties of the quotient:
* If $L_2$ includes $\varepsilon$, then, for any language $L$, $L/L_2$ includes all of $L$.
* If $L$ is not empty, then we have $\Sigma^*/L = \Sigma^*$.
* The quotient of any language $L$ by $\Sigma^*$ is the language composed of all prefixes of strings in $L$. □

If $L_1$ is not regular, then we cannot say much about the quotient $L_1/L_2$, even when $L_2$ is regular. For instance, let $L_1 = \{0^n 1^n \mid n \in \mathbb{N}\}$, which we know is not regular. Now contrast these two quotients:
* $L_1/1^+ = \{0^n 1^m \mid n > m \in \mathbb{N}\}$, which is not regular, and
* $L_1/0^+1^+ = 0^*$, which is regular.

Table 3.2 summarizes the main closure properties of regular languages.

In addition to the operators just shown, the regular languages are closed under numerous other operators. Proofs of closure for these are often ad hoc, constructing a (typically nondeterministic) finite automaton for the new language from the existing automata for the argument languages. We now give several examples, in increasing order of difficulty.

Example 3.4 Define the language swap(L) to be

$$\{a_2 a_1 \ldots a_{2n} a_{2n-1} \mid a_1 a_2 \ldots a_{2n-1} a_{2n} \in L\}$$

We claim that swap(L) is regular if L is regular. Let M be a deterministic finite automaton for L. We construct a (deterministic) automaton M' for swap(L) that mimics what M does when it reads pairs of symbols in reverse. Since an automaton cannot read a pair of symbols at once, our new machine, in some state corresponding to a state of M (call it q), will read the odd-indexed symbol (call it a) and "memorize" it-that is, use a new state (call it [q, a]) to denote what it has read. It then reads the even-indexed symbol (call it b), at which point it has available a pair of symbols and makes a transition to whatever state machine M would move to from q on having read the symbols b and a in that order. As a specific example, consider the automaton of Figure 3.22(a). After grouping the symbols in pairs, we obtain the automaton of Figure 3.22(b).
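The memorize-then-step idea just described can be sketched in a few lines. This is an illustrative sketch only: the tuple layout `(states, alphabet, delta, start, accepting)`, the function names, and the sample automaton `ZERO_ONE` for $(01)^*$ are assumptions, and the machine for $L$ is assumed to be a complete deterministic automaton whose states are not themselves pairs.

```python
def swap_dfa(dfa):
    """Build a DFA for swap(L) from a complete DFA for L."""
    states, sigma, delta, start, accepting = dfa
    new_delta = {}
    for q in states:
        for a in sigma:
            # Read the first symbol of a pair and memorize it in the state (q, a).
            new_delta[q, a] = (q, a)
            for b in sigma:
                # Read the second symbol b, then act as M would on b followed by a.
                new_delta[(q, a), b] = delta[delta[q, b], a]
    new_states = set(states) | {(q, a) for q in states for a in sigma}
    return new_states, sigma, new_delta, start, set(accepting)

def accepts(dfa, word):
    """Run a complete DFA on a string."""
    _, _, delta, state, accepting = dfa
    for symbol in word:
        state = delta[state, symbol]
    return state in accepting

# Hypothetical example: a DFA for (01)*, with state 2 as a rejecting trap.
ZERO_ONE = ({0, 1, 2}, {"0", "1"},
            {(0, "0"): 1, (0, "1"): 2, (1, "0"): 2, (1, "1"): 0, (2, "0"): 2, (2, "1"): 2},
            0, {0})
SWAPPED = swap_dfa(ZERO_ONE)
```

For this choice of $L$, swap(L) is $(10)^*$: the machine `SWAPPED` accepts `"1010"` and rejects `"0101"`, and it rejects every odd-length string because the memorizing states are never accepting.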

Our automaton for swap(L) will have a four-state block for each state of the pair-grouped automaton for $L$, as illustrated in Figure 3.23. We can formalize this construction as follows, albeit at some additional cost in the number of states of the resulting machine. Our new machine $M'$ has state set $Q \cup (Q \times \Sigma)$, where $Q$ is the state set of $M$; it has transitions of the type $\delta'(q, a) = [q, a]$ for all $q \in Q$ and $a \in \Sigma$ and transitions of the type $\delta'([q, a], b) = \delta(\delta(q, b), a)$ for all $q \in Q$ and $a, b \in \Sigma$; its start state is $q_0$, the start state of $M$; and its accepting states are the accepting states of $M$. □

Example 3.5 The approach used in the previous example works when trying to build a machine that reads strings of the same length as those read by $M$; however, when building a machine that reads strings shorter than those read by $M$, nondeterministic $\varepsilon$ transitions must be used to guess the "missing" symbols. Define the language odd(L) to be

$$\{a_1 a_3 a_5 \ldots a_{2n-1} \mid \exists a_2, a_4, \ldots, a_{2n},\ a_1 a_2 \ldots a_{2n-1} a_{2n} \in L\}$$

When machine $M'$ for odd(L) attempts to simulate what $M$ would do, it gets only the odd-indexed symbols and so must guess which even-indexed symbols would cause $M$ to accept the full string. So $M'$ in some state $q$ corresponding to a state of $M$ reads a symbol $a$ and moves to some new state not in $M$ (call it $[q, a]$); then $M'$ makes an $\varepsilon$ transition that amounts to guessing what the even-indexed symbol could be. The replacement block of states that results from this construction is illustrated in Figure 3.24. Thus we have $q' \in \delta'([q, a], \varepsilon)$ for all states $q'$ with $q' = \delta(\delta(q, a), b)$ for any choice of $b$; formally, we write

$$\delta'([q, a], \varepsilon) = \{\delta(\delta(q, a), b) \mid b \in \Sigma\}$$
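The effect of these $\varepsilon$ transitions can be checked by simulating the nondeterministic machine directly, keeping the set of states of $M$ that remain possible after each real symbol and its guessed companion. The code below is a sketch under assumptions: the tuple layout `(states, alphabet, delta, start, accepting)` and the sample automata `DOUBLES` (for $(00 + 11)^*$) and `ALTERNATING` (for $(01)^*$) are illustrative choices, and the automaton for $L$ is assumed to be complete and deterministic.

```python
def accepts_odd(dfa, word):
    """Decide membership in odd(L) by simulating the guessing automaton on the fly."""
    _, sigma, delta, start, accepting = dfa
    current = {start}
    for a in word:
        # Real move on a (to the state [q, a]), then an epsilon move that
        # guesses the missing even-indexed symbol b.
        current = {delta[delta[q, a], b] for q in current for b in sigma}
    return bool(current & accepting)

# Hypothetical example: (00 + 11)*, with state 3 as a rejecting trap.
DOUBLES = ({0, 1, 2, 3}, {"0", "1"},
           {(0, "0"): 1, (0, "1"): 2, (1, "0"): 0, (1, "1"): 3,
            (2, "0"): 3, (2, "1"): 0, (3, "0"): 3, (3, "1"): 3},
           0, {0})

# Hypothetical example: (01)*, with state 2 as a rejecting trap.
ALTERNATING = ({0, 1, 2}, {"0", "1"},
               {(0, "0"): 1, (0, "1"): 2, (1, "0"): 2, (1, "1"): 0,
                (2, "0"): 2, (2, "1"): 2},
               0, {0})
```

For `DOUBLES` every input is accepted, since each symbol can be followed by a guessed copy of itself; for `ALTERNATING` only strings of 0s are accepted, since the odd-indexed symbols of a string in $(01)^*$ are all 0.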

In this way, $M'$ makes two transitions for each symbol read, enabling it to simulate the action of $M$ on the twice-longer string that $M$ needs to verify acceptance.

Figure 3.24 The substitute block of states for the odd language.

As a specific example, consider the language $L = (00 + 11)^*$, recognized by the automaton of Figure 3.25(a). For this choice of $L$, odd(L) is just $\Sigma^*$. After grouping the input symbols in pairs, we get the automaton of Figure 3.25(b).

Figure 3.25 The automaton used in the odd language: (a) the original automaton; (b) the automaton after grouping symbols in pairs.

Now our new nondeterministic automaton has a block of three states for each state of the pair-grouped automaton and so six states in all, as shown in Figure 3.26. Our automaton moves from the start state to one of the two accepting states while reading a character from the input (corresponding to an odd-indexed character in the string accepted by $M$) and makes an $\varepsilon$ transition on the next move, effectively guessing the even-indexed symbol in the string accepted by $M$. If the guess is good (corresponding to a 0 following a 0 or to a 1 following a 1), our automaton returns to the start state to read the next character; if the guess is bad, it moves to a rejecting trap state (a block of three states). As must be the case, our automaton accepts $\Sigma^*$, albeit in an unnecessarily complicated way. □

Figure 3.26 The nondeterministic automaton for the odd language.

Example 3.6 As a final example, let us consider the language

$$\{x \mid \exists u, v, w \in \Sigma^*,\ |u| = |v| = |w| = |x| \text{ and } uvxw \in L\}$$

In other words, given $L$, our new language is composed of the third quarter of each string of $L$ that has length a multiple of 4. Let $M$ be a (deterministic) finite automaton for $L$ with state set $Q$, start state $q_0$, accepting states $F$, and transition function $\delta$. As in the odd language, we have to guess a large number of absent inputs to feed to $M$. Since the input is the string $x$, the processing of the guessed strings $u$, $v$, and $w$ must take place while we process $x$ itself. Thus our machine for the new language will be composed, in effect, of four separate machines, each a copy of $M$; each copy will process its quarter of $uvxw$, with three copies processing guesses and one copy processing the real input.

The key to a solution is tying together these four machines: for instance, the machine processing $x$ should start from the state reached by the machine processing $v$ once $v$ has been completely processed. This problem at first appears daunting: not only is $v$ guessed, but it is not even processed when the processing of $x$ starts. The answer is to use yet more nondeterminism and to guess what should be the starting state of each component machine. Since we have four of them, we need a guess for the starting states of the second, third, and fourth machines (the first naturally starts in state $q_0$). Then we need to verify these guesses by checking, when the input has been processed, that the first machine has reached the state guessed as the start of the second, that the second machine has reached the state guessed as the start of the third, and that the third machine has reached the state guessed as the start of the fourth. In addition, of course, we must also check that the fourth machine ended in some state in $F$. In order to check initial guesses, these initial guesses must be retained; but each machine will move from its starting state, so that we must encode in the state of our new machine both the current state of each machine and the initial guess about its starting state.

This chain of reasoning leads us to define a state of the new machine as a seven-tuple, say $(q_i, q_j, q_k, q_l, q_m, q_n, q_o)$, where $q_i$ is the current state of the first machine (no guess is needed for this machine), $q_j$ is the guessed starting state for the second machine and $q_k$ its current state, $q_l$ is the guessed starting state for the third machine and $q_m$ its current state, and $q_n$ is the guessed starting state for the fourth machine and $q_o$ its current state; and where all $q_x$ are states of $M$. The initial state of each machine is the same as the guess, that is, our new machine can start from any state of the form $(q_0, q_j, q_j, q_l, q_l, q_n, q_n)$, for any choice of $j$, $l$, and $n$. In order to make it possible, we add one more state to our new machine (call it $S'$), designate it as the unique starting state, and add $\varepsilon$ transitions from it to the $|Q|^3$ states of the form $(q_0, q_j, q_j, q_l, q_l, q_n, q_n)$. When the input has been processed, it will be accepted if the state reached by each machine matches the start state used by the next machine and if the state reached by the fourth machine is a state in $F$, that is, if the state of our new machine is of the form $(q_j, q_j, q_l, q_l, q_n, q_n, q_f)$, with $q_f \in F$ and for any choices of $j$, $l$, and $n$. Finally, from some state $(q_i, q_j, q_k, q_l, q_m, q_n, q_o)$, our new machine can move to a new state $(q_i', q_j, q_k', q_l, q_m', q_n, q_o')$ when reading character $c$ from the input string $x$ whenever the following four conditions are met:
* there exists $a \in \Sigma$ with $\delta(q_i, a) = q_i'$
* there exists $a \in \Sigma$ with $\delta(q_k, a) = q_k'$
* $\delta(q_m, c) = q_m'$
* there exists $a \in \Sigma$ with $\delta(q_o, a) = q_o'$

Overall, our new machine, which is highly nondeterministic, has $|Q|^7 + 1$ states. While the machine is large, its construction is rather straightforward; indeed, the principle generalizes easily to more complex situations, as explored in Exercises 3.31 and 3.32. □

These examples illustrate the conceptual power of viewing a state of the new machine as a tuple, where, typically, members of the tuple are states from the known machine or alphabet characters. State transitions of the new machine are then defined on the tuples by defining their effect on each member of the tuple, where the state transitions of the known machine can be used to good effect. When the new language includes various substrings of the known regular language, the tuple notation can be used to record starting and current states in the exploration of each substring. Initial state(s) and accepting states can then be set up so as to ensure that the substrings, which are processed sequentially in the known machine but concurrently in the new machine, have to match in the new machine as they automatically did in the known machine.

3.5 Conclusion

Finite automata and regular languages (and regular grammars, an equivalent mechanism based on generation that we did not discuss, but that is similar in spirit to the grammars used in describing legal syntax in programming languages) present an interesting model, with enough structure to possess nontrivial properties, yet simple enough that most questions about them are decidable. (We shall soon see that most questions about universal models of computation are undecidable.) Finite automata find most of their applications in the design of logical circuits (by definition, any "chip" is a finite-state machine, the difference from our model being simply that, whereas our finite-state automata have no output function, finite-state machines do), but computer scientists see them most often in parsers for regular expressions. For instance, the expression language used to specify search strings in Unix is a type of regular expression, so that the Unix tools built for searching and matching are essentially finite-state automata. As another example, tokens in programming languages (reserved words, variable names, etc.) can easily be described by regular expressions and so their parsing reduces to running a simple finite-state automaton (e.g., lex). However, finite automata cannot be used for problem-solving; as we have seen, they cannot even count, much less search for optimal solutions. Thus if we want to study what can be computed, we need a much more powerful model; such a model forms the topic of Chapter 4.

3.6 Exercises

Exercise 3.4 Give deterministic finite automata accepting the following languages over the alphabet $\Sigma = \{0, 1\}$:
1. The set of all strings that contain the substring 010.
2. The set of all strings that do not contain the substring 000.
3. The set of all strings such that every substring of length 4 contains at least three 1s.
4. The set of all strings that contain either an even number of 0s or at most three 0s (that is, if the number of 0s is even, the string is in the language, but if the number of 0s is odd, then the string is in the language only if that number does not exceed 3).
5. The set of all strings such that every other symbol is a 1 (starting at the first symbol for odd-length strings and at the second for even-length strings; for instance, both 10111 and 0101 are in the language). This last problem is harder than the previous four since this automaton has no way to tell in advance whether the input string has odd or even length. Design a solution that keeps track of everything needed for both cases until it reaches the end of the string.

Exercise 3.5 Design finite automata for the following languages over $\{0, 1\}$:
1. The set of all strings where no pair of adjacent 0s appears in the last four characters.
2. The set of all strings where pairs of adjacent 0s must be separated by at least one 1, except in the last four characters.

Exercise 3.6 In less than 10 seconds for each part, verify that each of the following languages is regular:
1. The set of all C programs written in North America in 1997.
2. The set of all first names given to children born in New Zealand in 1996.
3. The set of numbers that can be displayed on your hand-held calculator.

Exercise 3.7 Describe in English the languages (over $\{0, 1\}$) accepted by the following deterministic finite automata. (The initial state is identified by a short unlabeled arrow; the final state, of which these deterministic finite automata have only one each, is identified by a double circle.)

Exercise 3.8 Prove or disprove each of the following assertions:
1. Every nonempty language contains a nonempty regular language.
2. Every language with nonempty complement is contained in a regular language with nonempty complement.

Exercise 3.9 Give both deterministic and nondeterministic finite automata accepting the following languages over the alphabet $\Sigma = \{0, 1\}$, then prove lower bounds on the size of any deterministic finite automaton for each language:
1. The set of all strings such that, at some place in the string, there are two 0s separated by an even number of symbols.
2. The set of all strings such that the fifth symbol from the end of the string is a 1.
3. The set of all strings over the alphabet $\{a, b, c, d\}$ such that one of the three symbols $a$, $b$, or $c$ appears at least four times in all.

Exercise 3.10 Devise a general procedure that, given some finite automaton $M$, produces an equivalent deterministic finite automaton $M'$ (i.e., an automaton that defines the same language as $M$) in which the start state, once left, cannot be re-entered.

Exercise 3.11 Devise a general procedure that, given a deterministic finite automaton $M$, produces the new finite automaton $M'$ such that $M'$ rejects $\varepsilon$, but otherwise accepts all strings that $M$ accepts.

Exercise 3.12* Give a nondeterministic finite automaton to recognize the set of all strings over the alphabet $\{a, b, c\}$ such that the string, interpreted as an expression to be evaluated, evaluates to the same value left-to-right as it does right-to-left, under the following nonassociative operation:

       a  b  c
    a  a  b  c
    b  c  b  b
    c  a  a  b

Then give a deterministic finite automaton for the same language and attempt to prove a nontrivial lower bound on the size of any deterministic finite automaton for this problem.

Exercise 3.13* Prove that every regular language is accepted by a planar nondeterministic finite automaton. A finite automaton is planar if its transition diagram can be embedded in the plane without any crossings.

Exercise 3.14* In contrast to the previous exercise, prove that there exist regular languages that cannot be accepted by any planar deterministic finite automaton. (Hint: Exercise 2.21 indicates that the average degree of a node in a planar graph is always less than six, so that every planar graph must have at least one vertex of degree less than six. Thus a planar finite automaton must have at least one state with no more than five transitions leading into or out of that state.)

Exercise 3.15 Write regular expressions for the following languages over $\{0, 1\}$:
1. The language of Exercise 3.5(1).

2. The language of Exercise 3.5(2).
3. The set of all strings with at most one pair of consecutive 0s and at most one pair of consecutive 1s.
4. The set of all strings in which every pair of adjacent 0s appears before any pair of adjacent 1s.
5. The set of all strings not containing the substring 110.
6. The set of all strings with at most one triple of adjacent 0s.

Exercise 3.16 Let $P$ and $Q$ be regular expressions. Which of the following equalities is true? For those that are true, prove it by induction; for the others, give a counterexample.
1. $(P^*)^* = P^*$
2. $(P + Q)^* = (P^*Q^*)^*$
3. $(P + Q)^* = P^* + Q^*$

Exercise 3.17 For each of the following languages, give a proof that it is or is not regular.
1. $\{x \in \{0, 1\}^* \mid x \neq x^R\}$
2. $\{x \in \{0, 1, 2\}^* \mid x = w2w, \text{ with } w \in \{0, 1\}^*\}$
3. $\{x \in \{0, 1\}^* \mid x = w^R w y, \text{ with } w, y \in \{0, 1\}^+\}$
4. $\{x \in \{0, 1\}^* \mid x \notin \{01, 10\}^*\}$
5. The set of all strings (over $\{0, 1\}$) that have equal numbers of 0s and 1s and such that the number of 0s and the number of 1s in any prefix of the string never differ by more than two.
6. $\{0^l 1^m 0^n \mid n \leq l \text{ or } n \leq m,\ l, m, n \in \mathbb{N}\}$
7. $\{0^{\lfloor \sqrt{n} \rfloor} \mid n \in \mathbb{N}\}$
8. The set of all strings $x$ (over $\{0, 1\}$) such that, in at least one substring of $x$, there are four more 1s than 0s.
9. The set of all strings over $\{0, 1\}^*$ that have the same number of occurrences of the substring 01 as of the substring 10. (For instance, we have $101 \in L$ and $1010 \notin L$.)
10. $\{0^i 1^j \mid \gcd(i, j) = 1\}$ (that is, $i$ and $j$ are relatively prime)

Exercise 3.18 Let $L$ be $\{0^n 1^n \mid n \in \mathbb{N}\}$, our familiar nonregular language. Give two different proofs that the complement of $L$ (with respect to $\{0, 1\}^*$) is not regular.

Exercise 3.19 Let $\Sigma$ be composed of all two-component vectors with entries of 0 and 1; that is, $\Sigma$ has four characters in it: $\binom{0}{0}$, $\binom{0}{1}$, $\binom{1}{0}$, and $\binom{1}{1}$. Decide whether each of the following languages over $\Sigma^*$ is regular:

1. The set of all strings such that the "top row" is the reverse of the "bottom row." For instance, we have (0) (1) E L and (0) ( (0) (0) (1)(l)() L.
2. The set of all strings such that the "top row" is the complement of the "bottom row" (that is, where the top row has a 1, the bottom row has a 0 and vice versa).
3. The set of all strings such that the "top row" has the same number of 1s as the "bottom row."

Exercise 3.20 Let $\Sigma$ be composed of all three-component vectors with entries of 0 and 1; thus $\Sigma$ has eight characters in it. Decide whether each of the following languages over $\Sigma^*$ is regular:
1. The set of all strings such that the sum of the "first row" and "second row" equals the "third row," where each row is read left-to-right as an unsigned binary integer.
2. The set of all strings such that the product of the "first row" and "second row" equals the "third row," where each row is read left-to-right as an unsigned binary integer.

Exercise 3.21 Recall that Roman numerals are written by stringing together symbols from the alphabet $\Sigma = \{I, V, X, L, C, D, M\}$, always using the largest symbol that will fit next, with one exception: the last "digit" is obtained by subtraction from the previous one, so that 4 is IV, 9 is IX, 40 is XL, 90 is XC, 400 is CD, and 900 is CM. For example, the number 4999 is written MMMMCMXCIX while the number 1678 is written MDCLXXVIII. Is the set of Roman numerals regular?

Exercise 3.22 Let $L$ be the language over $\{0, 1, +, \cdot\}$ that consists of all legal (nonempty) regular expressions written without parentheses and without Kleene closure (the symbol $\cdot$ stands for concatenation). Is $L$ regular?

Exercise 3.23* Given a string $x$ over the alphabet $\{a, b, c\}$, define $\|x\|$ to be the value of the string according to the evaluation procedure defined in Exercise 3.12. Is the language $\{xy \mid \|x\| = \|y\|\}$ regular?

Exercise 3.24* A unitary language is a nonempty regular language that is accepted by a deterministic finite automaton with a single accepting state. Prove that, if $L$ is a regular language, then it is unitary if and only if, whenever strings $u$, $uv$, and $w$ belong to $L$, then so does string $wv$.

Exercise 3.25 Prove or disprove each of the following assertions.
1. If $L^*$ is regular, then $L$ is regular.
2. If $L = L_1 L_2$ is regular and $L_2$ is finite, then $L_1$ is regular.

3. If $L = L_1 + L_2$ is regular and $L_2$ is finite, then $L_1$ is regular.
4. If $L = L_1/L_2$ is regular and $L_2$ is regular, then $L_1$ is regular.

Exercise 3.26 Let $L$ be a language and define the language $\mathrm{SUB}(L) = \{x \mid \exists w \in L,\ x \text{ is a subsequence of } w\}$. In words, SUB(L) is the set of all subsequences of strings of $L$. Prove that, if $L$ is regular, then so is SUB(L).

Exercise 3.27 Let $L$ be a language and define the language $\mathrm{CIRC}(L) = \{w \mid w = xy \text{ and } yx \in L\}$. If $L$ is regular, does it follow that CIRC(L) is also regular?

Exercise 3.28 Let $L$ be a language and define the language $\mathrm{NPR}(L) = \{x \in L \mid x = yz \text{ and } z \neq \varepsilon \Rightarrow y \notin L\}$; that is, NPR(L) is composed of exactly those strings of $L$ that are prefix-free (the proper prefixes of which are not also in $L$). Prove that, if $L$ is regular, then so is NPR(L).

Exercise 3.29 Let $L$ be a language and define the language $\mathrm{PAL}(L) = \{x \mid xx^R \in L\}$, where $x^R$ is the reverse of string $x$; that is, PAL(L) is composed of the first half of whatever palindromes happen to belong to $L$. Prove that, if $L$ is regular, then so is PAL(L).

Exercise 3.30* Let $L$ be any regular language and define the language $\mathrm{FL}(L) = \{xz \mid \exists y,\ |x| = |y| = |z| \text{ and } xyz \in L\}$; that is, FL(L) is composed of the first and last thirds of strings of $L$ that happen to have length $3k$ for some $k$. Is FL(L) always regular?

Exercise 3.31* Let $L$ be a language and define the language $\mathrm{FRAC}(i, j)(L)$ to be the set of strings $x$ such that there exist strings $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_j$ with $x_1 \ldots x_{i-1} x x_{i+1} \ldots x_j \in L$ and $|x_1| = \cdots = |x_{i-1}| = |x_{i+1}| = \cdots = |x_j| = |x|$. That is, FRAC(i, j)(L) is composed of the $i$th of $j$ pieces of equal length of strings of $L$ that happen to have length divisible by $j$. In particular, FRAC(1, 2)(L) is made of the first halves of even-length strings of $L$ and FRAC(3, 4)(L) is the language used in Example 3.6. Prove that, if $L$ is regular, then so is FRAC(i, j)(L).

Exercise 3.32* Let $L$ be a language and define the language $f(L) = \{x \mid \exists y, z,\ |y| = 2|x| = 4|z| \text{ and } xyxz \in L\}$. Prove that, if $L$ is regular, then so is $f(L)$.

Exercise 3.33** Prove that the language SUB(L) (see Exercise 3.26) is regular for any choice of language $L$; in particular, $L$ need not be regular. Hint: observe that the set of subsequences of a fixed string is finite and thus regular, so that the set of subsequences of a finite collection of strings is also finite and regular. Let $S$ be any set of strings. We say that a string $x$ is a minimal element of $S$ if $x$ has no proper subsequence in $S$. Let $M(L)$ be the set of minimal elements of the complement of SUB(L). Prove that $M(L)$ is finite by showing that no element of $M(L)$ is a subsequence of any other element of $M(L)$ and that any set of strings with that property must be finite. Conclude that the complement of SUB(L) is finite.

3.7 Bibliography

The first published discussion of finite-state machines was that of McCulloch and Pitts [1943], who presented a version of neural nets. Kleene [1956] formalized the notion of a finite automaton and also introduced regular expressions, proving the equivalence of the two models (Theorem 3.3 and Section 3.3.3). At about the same time, three independent authors, Huffman [1954], Mealy [1955], and Moore [1956], also discussed the finite-state model at some length, all from an applied point of view: all were working on the problem of designing switching circuits with feedback loops, or sequential machines, and proposed various design and minimization methods. The nondeterministic finite automaton was introduced by Rabin and Scott [1959], who proved its equivalence to the deterministic version (Theorem 3.1). Regular expressions were further developed by Brzozowski [1962, 1964]. The pumping lemma (Theorem 3.4) is due to Bar-Hillel et al. [1961], who also investigated several closure operations for regular languages. Closure under quotient (Theorem 3.5) was shown by Ginsburg and Spanier [1963]. Several of these results use a grammatical formalism instead of regular expressions or automata; this formalism was created in a celebrated paper by Chomsky [1956]. Exercises 3.31 and 3.32 are examples of proportional removal operations; Seiferas and McNaughton [1976] characterized which operations of this type preserve regularity. The interested reader should consult the classic text of Hopcroft and Ullman [1979] for a lucid and detailed presentation of formal languages and their relation to automata; the texts of Harrison [1978] and Salomaa [1973] provide additional coverage.

in particular. we also need a reasonable charging policy for it. When analyzing an algorithm. this style of analysis is well suited to its purpose. The vague model of computation
93
. with the type of questions that typically arise with such models as well as with the methodologies that we use to answer such questions. While somewhat sloppy. It also has the advantage of providing results that remain independent of the specific environment under which the algorithm is to be run.CHAPTER 4
Universal Models of Computation
Now that we have familiarized ourselves with a simple model of computation, we can move on to the main topic of this text: models of computation that have power equivalent to that of an idealized general-purpose computer or, equivalently, models of computation that can be used to characterize problem-solving by humans and machines. Since we shall use these models to determine what can and cannot be computed in both the absence and the presence of resource bounds (such as bounds on the running time of a computation), we need to establish more than just the model itself.

When analyzing an algorithm, we typically assume some vague model of computing related to a general-purpose computer in which most simple operations take constant time, even though many of these operations would, in fact, require more than constant time when given arbitrarily large operands. Implicit in the analysis (in spite of the fact that this analysis is normally carried out in asymptotic terms) is the assumption that every quantity fits within one word of memory and that all data fit within the addressable memory. This style of analysis serves its purpose, since, with a few exceptions (such as public-key cryptography, where very large numbers are commonplace), the implicit assumption holds in most practical applications. The model assumed by the analysis fits any modern computer and fails to fit¹ only very unusual machines or hardware, such as massively parallel machines, quantum computers, optical computers, and DNA computers (the last three of which remain for now in the laboratory or on the drawing board). When laying the foundations of a theory, however, it pays to be more careful, if for no other purpose than to justify our claims that the exact choice of computational model is mostly irrelevant. In discussing the choice of computational model, we have to address three separate questions: (i) How is the input (and output) represented? (ii) How does the computational model compute? and (iii) What is the cost (in time and space) of a computation in the model? We take up each of these questions in turn.

¹ In fact, the model does not really fail to fit; rather, it needs simple and fairly obvious adaptations: for instance, parallel and optical computers have several computing units rather than one and quantum computers work with quantum bits, each of which can store more than one bit of information. Indeed, all analyses done to date for these unusual machines have been done using the conventional model of computation, with the required alterations.

4.1 Encoding Instances

Any instance of a problem can be described by a string of characters over some finite alphabet. As an example, consider the satisfiability problem. Recall that an instance of this problem is a Boolean expression consisting of $k$ clauses over $n$ variables, written in conjunctive normal form. Such an instance can be encoded clause by clause by listing, for each clause, which literals appear in it. The literals themselves can be encoded by assigning each variable a distinct number from $0$ to $n - 1$ and by preceding that number by a bit indicating whether the variable is complemented or not. Different literals can thus have codes of different lengths, so that we need a symbol to separate literals within a clause (say a comma). Similarly, we need a symbol to separate clauses in our encoding (say a number sign). For example, the instance

$(x_0 \lor \bar{x}_2) \land (\bar{x}_1 \lor x_2 \lor x_3) \land (\bar{x}_0 \lor \bar{x}_3)$

would be encoded as

00,110#11,010,011#10,111
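The first encoding can be sketched in a few lines of Python. This is only an illustration: the function names and the representation of a literal as a pair (variable number, complemented?) are assumptions made for the sketch, not notation from the text.

```python
def encode_literal(var, complemented):
    """Complement bit followed by the variable number in binary (no padding)."""
    return ("1" if complemented else "0") + format(var, "b")


def encode_cnf_variable_length(clauses):
    """Literals are separated by commas, clauses by number signs."""
    return "#".join(
        ",".join(encode_literal(var, comp) for var, comp in clause)
        for clause in clauses
    )


# The sample instance; each literal is (variable number, complemented?).
SAMPLE = [
    [(0, False), (2, True)],
    [(1, True), (2, False), (3, False)],
    [(0, True), (3, True)],
]

print(encode_cnf_variable_length(SAMPLE))  # 00,110#11,010,011#10,111
```

Because the variable numbers are written without padding, literal codes vary in length, which is exactly why the comma is needed.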

Alternatively, we can eliminate the need for separators between literals by using a fixed-length code for the variables (of length $\lceil \log_2 n \rceil$ bits), still preceded by a bit indicating complementation. Now, however, we need to know the code length for each variable or, equivalently, the number of variables; we can write this as the first item in the code, followed by a separator, then followed by the clauses. Our sample instance would then yield the code

100#000110#101010011#100111

The lengths of the first and of the second encodings must remain within a ratio of $\lceil \log_2 n \rceil$ of each other; in particular, one encoding can be converted to the other in time polynomial in the length of the code.

We could go one more step and make the encoding of each clause be of fixed length: simply let each clause be represented by a string of $n$ symbols, where each symbol can take one of three values, indicating that the corresponding variable does not appear in the clause, appears uncomplemented, or appears complemented. With a binary alphabet, we use two bits per symbol: "00" for an absent variable, "01" for an uncomplemented variable, and "10" for a complemented one. We now need to know either how many variables or how many clauses are present (the other quantity can easily be computed from the length of the input). Again we write this number first, separating it from the description of the clauses by some other symbol. Our sample instance (in which each clause uses $4 \cdot 2 = 8$ bits) is then encoded as

100#010010000010010110000010

This encoding always has length $\Theta(kn)$. When each clause includes almost every variable, it is more concise than the first two encodings, which then have length $\Theta(kn \log n)$, but the lengths of all three remain polynomially related. On the other hand, when each clause includes only a constant number of variables, the first two encodings have length $O(k \log n)$, so that the length of our last encoding need no longer be polynomially related to the length of the first two. Of the three encodings, the first two are reasonable, but the third is not, as it can become exponentially longer than the first two. We shall require of all our encodings that they be reasonable in that sense.
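The two fixed-length encodings just described can be sketched as follows. The function names and the pair representation of literals are assumptions of the sketch; the symbol codes "00", "01", and "10" are those given in the text.

```python
def code_width(n):
    """Bits per variable number: ceil(log2 n), with a minimum of one bit."""
    return max(1, (n - 1).bit_length())


def encode_fixed_literals(clauses, n):
    """Second encoding: n in binary, then '#'-separated clauses of fixed-width literals."""
    width = code_width(n)
    body = "#".join(
        "".join(("1" if comp else "0") + format(var, "0{}b".format(width))
                for var, comp in clause)
        for clause in clauses
    )
    return format(n, "b") + "#" + body


def encode_fixed_clauses(clauses, n):
    """Third encoding: n in binary, '#', then two bits per variable per clause."""
    symbols = {None: "00", False: "01", True: "10"}  # absent, plain, complemented
    rows = []
    for clause in clauses:
        status = dict(clause)  # variable number -> complemented?
        rows.append("".join(symbols[status.get(var)] for var in range(n)))
    return format(n, "b") + "#" + "".join(rows)


# The sample instance over four variables; each literal is (variable, complemented?).
SAMPLE = [
    [(0, False), (2, True)],
    [(1, True), (2, False), (3, False)],
    [(0, True), (3, True)],
]

print(encode_fixed_literals(SAMPLE, 4))  # 100#000110#101010011#100111
print(encode_fixed_clauses(SAMPLE, 4))   # 100#010010000010010110000010
```

The third encoding spends two bits on every variable in every clause, whether or not the variable appears, which is the source of its $\Theta(kn)$ length.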

Of course, we really should compare encodings on the same alphabet, without using some arbitrary number of separators. Let us restrict ourselves to a binary alphabet, so that everything becomes a string of bits. Since our first representation uses four symbols and our third uses three, we shall use two bits per symbol in either case. Using our first representation and encoding "0" as "00," the comma as "01," the number sign as "10," and "1" as "11," our sample instance becomes

000001111100101111010011000100111110110001111111

The length of the encoding grew by a factor of two, the length of the codes chosen for the symbols. In general, the choice of any fixed alphabet to represent instances does not affect the length of the encoding by more than a constant factor, as long as the alphabet has at least two symbols.

More difficult issues are raised when encoding complex structures, such as a graph. Given an undirected graph, $G = (V, E)$, we face an enormous choice of possible encodings, with potentially very different lengths. Consider encoding the graph as an adjacency matrix: we need to indicate the number of vertices (using $\Theta(\log |V|)$ bits) and then, for each vertex, write a list of the matrix entries. Since each matrix entry is simply a bit, the total length of the encoding is always $\Theta(|V|^2)$. Now consider encoding the graph by using adjacency lists. Once again, we need to indicate the number of vertices; then, for each vertex, we list the vertices (if any) present in the adjacency lists, separating adjacency lists by some special symbol. The overall encoding looks very much like that used for satisfiability; its length is $O(|V| + |E| \log |V|)$. Finally, consider encoding the graph as a list of edges. Using a fixed-length code for each vertex (so that the code must begin by an indication of the number of vertices), we simply write a collection of pairs, without any separator. Such a code uses $O(|E| \log |V|)$ bits. While the lengths of the first two encodings (adjacency matrix and adjacency lists) are polynomially related, the last encoding could be far more concise on an extremely sparse graph. For instance, if the graph has only a constant number of edges, then the last encoding has length $\Theta(\log |V|)$, while the second has length $\Theta(|V|)$, which is exponentially larger. Fortunately, the anomaly arises only for uninteresting graphs (graphs that have far fewer than $|V|$ edges). Moreover, we can encode any graph by breaking the list of vertices into two sublists, one containing all isolated vertices and the other containing all vertices of degree one or higher. The list of isolated vertices is given by a single number (its size), while the connected vertices are identified individually. The result is an encoding that mixes the two styles just discussed and remains reasonable under all graph densities.

Finally, depending on the problem and the chosen encodings, not every bit string represents a valid instance of the problem. While an encoding in which every string is meaningful might be more elegant, it is certainly not required. All that we need is the ability to differentiate (as efficiently as possible) between a string encoding a valid instance and a meaningless string.
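To make the comparison of the three graph encodings concrete, the following sketch computes their lengths for a graph with a single edge. It is a rough count under stated assumptions: each separator is charged as one unit, the number of vertices is written in plain binary, and all function names are hypothetical.

```python
def vertex_code_width(num_vertices):
    """Bits in a fixed-length vertex code: ceil(log2 |V|), at least one."""
    return max(1, (num_vertices - 1).bit_length())


def header_length(num_vertices):
    """Bits needed to write |V| itself in binary."""
    return num_vertices.bit_length()


def adjacency_matrix_length(num_vertices, edges):
    """One bit per matrix entry, whatever the number of edges."""
    return header_length(num_vertices) + num_vertices * num_vertices


def adjacency_lists_length(num_vertices, edges):
    """One separator per vertex, plus two list entries per undirected edge."""
    return (header_length(num_vertices) + num_vertices
            + 2 * len(edges) * vertex_code_width(num_vertices))


def edge_list_length(num_vertices, edges):
    """Two fixed-length vertex codes per edge, no separators."""
    return header_length(num_vertices) + 2 * len(edges) * vertex_code_width(num_vertices)


ONE_EDGE = [(0, 1)]
for v in (16, 1024, 1 << 20):
    print(v,
          adjacency_matrix_length(v, ONE_EDGE),
          adjacency_lists_length(v, ONE_EDGE),
          edge_list_length(v, ONE_EDGE))
```

With the number of edges held constant, the edge list grows only logarithmically in the number of vertices, while the adjacency lists grow linearly and the matrix quadratically.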

For instance, in our first and second encodings for Boolean formulae in conjunctive normal form, only strings of a certain form encode instances: in our first encoding, a comma and a number sign cannot be adjacent, while in our second encoding, the number of bits between any two number signs must be a multiple of a given constant. With almost any encoding, making this distinction is easy. In fact, the problem of distinguishing valid instances from meaningless input resides, not in the encoding, but in the assumptions made about valid instances. For instance, a graph problem, all instances of which are planar graphs and are encoded according to one of the schemes discussed earlier, requires us to differentiate efficiently between planar graphs (valid instances) and nonplanar graphs (meaningless inputs); as mentioned in Section 2.4, this decision can be made in linear time and thus efficiently. On the other hand, a graph problem, all instances of which are Hamiltonian graphs given in the same format, requires us to distinguish between Hamiltonian graphs and other graphs, something for which only exponential-time algorithms have been developed to date. Yet the same graphs, if given in a format where the vertices are listed in the order in which they appear in a Hamiltonian circuit, make for a reasonable input description because we can reject any input graph not given in this specific format, whether or not the graph is actually Hamiltonian.

4.2 Choosing a Model of Computation

Of significantly greater concern to us than the encoding is the choice of a model of computation. In this section, we discuss two models of computation, establish that they have equivalent power in terms of absolute computability (without resource bounds), and finally show that, as for encodings, they are polynomially related in terms of their effect on running time and space, so that the choice of a model (as long as it is reasonable) is immaterial while we are concerned with the boundary between tractable and intractable problems. We shall examine only two models, but our development is applicable to any other reasonable model.

4.2.1 Issues of Computability

Before we can ask questions of complexity, such as "Can the same problem be solved in polynomial time on all reasonable models of computation?" we must briefly address the more fundamental question of computability, to wit "What kind of problem can be solved on a given model of computation?"

We have seen that most problems are unsolvable, so it should not come as a surprise that among these are some truly basic and superficially simple problems. The classical example of an unsolvable problem is the Halting Problem: "Does there exist an algorithm which, given two bit strings, the first representing a program and the second representing data, determines whether or not the program run on the data will stop?" This is obviously the most fundamental problem in computer science: it is a simplification of "Does the program return the correct answer?" Yet a very simple contradiction argument shows that no such algorithm can exist.

Suppose that we had such an algorithm and let P be a program for it (P itself is, of course, just a string of bits). P returns as answer either true (the argument program does stop when run on the argument data) or false (the argument program does not stop when run on the argument data). Then consider Figure 4.1. Procedure Q takes a single bit string as argument. If the program represented by this bit string stops when run on itself (i.e., with its own description as input), then Q enters an infinite loop; otherwise Q stops. Now consider what happens when we run Q on itself: Q stops if and only if P(Q, Q) returns false, which happens if and only if Q does not stop when run on itself, a contradiction. Similarly, Q enters an infinite loop if and only if P(Q, Q) returns true, which happens if and only if Q stops when run on itself, also a contradiction. Since our construction from the hypotheses is perfectly legitimate, our assumption that P exists must be false. Hence the halting problem is unsolvable (in our world of programs and bit strings; however, the same argument carries over to any other general model of computation).

    function P (x, y: bitstring): boolean;
    begin
      ...
    end;

    procedure Q (x: bitstring);
    begin
      if not P(x, x) then goto 99;
      1: goto 1;
      99:
    end;

Figure 4.1  The unsolvability of the halting problem.

Exercise 4.1 This proof of the unsolvability of the halting problem is really a proof by diagonalization, based on the fact that we can encode and thus enumerate all programs. Recast the proof so as to bring the diagonalization to the surface.
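The construction of Figure 4.1 can be mirrored in Python as a small sketch. It departs from the text in one labeled respect: programs are passed around as Python function objects rather than as bit strings. The names make_q and claims_nothing_halts are hypothetical, and the candidate decider is deliberately a trivial one, since no correct decider exists.

```python
def make_q(p):
    """Build procedure Q of Figure 4.1 from a claimed halting decider p(program, data)."""
    def q(x):
        # If p claims that x stops when run on itself, loop forever;
        # otherwise stop immediately.
        while p(x, x):
            pass
        return None
    return q


def claims_nothing_halts(program, data):
    """A candidate decider that always answers 'does not stop'."""
    return False


q = make_q(claims_nothing_halts)
result = q(q)  # returns at once, so q does stop when run on itself
print(claims_nothing_halts(q, q))  # False: the decider is wrong about q run on q
```

A decider that answered true on (q, q) would be wrong in the opposite way, since q would then loop forever on itself; that case is not executed here for the obvious reason.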

The existence of unsolvable problems in certain models of computation (or logic or mathematics) led in the 1930s to a very careful study of computability, starting with the design of universal models of computation. Not, of course, that there is any way to prove that a model of computation is universal (just defining the word "universal" in this context is a major challenge): what logicians meant by this was a model capable of carrying out any algorithmic process. Over a dozen very different such models were designed, some taking inspiration from mathematics, some from logic, some from psychology; more have been added since, in particular many inspired by computer science. The key result is that all such models have been proved equivalent from a computability standpoint: what one can compute, all others can. In that sense, these models are truly universal.

4.2.2 The Turing Machine

Perhaps the most convincing model, and the standard model in computer science, is the Turing machine. The British logician Alan Turing designed it to mimic the problem-solving mechanism of a scientist. The idealized scientist sits at a desk with an unbounded supply of paper, pencils, and erasers and thinks; in the process of thinking the scientist will jot down some notes, look up some previous notes, possibly altering some entries. Decisions are made on the basis of the material present in the notes (but only a fixed portion of it, say a page, since no more can be confined to the scientist's fixed-size memory) and of the scientist's current mental state. Since the brain encloses a finite volume and thought processes are ultimately discrete, there are only a finite number of distinct mental states.

A Turing machine (see Figure 4.2) is composed of: (i) an unbounded tape (say magnetic tape) divided into squares, each of which can store one symbol from a fixed tape alphabet (this mimics the supply of paper); (ii) a read/write head that scans one square at a time and is moved left or right by one square at each step (this mimics the pencils and erasers and the consulting, altering, and writing of notes); and (iii) a finite-state control (this mimics the brain). The machine is started in a fixed initial state with the head on the first square of the input string and the rest of the tape blank: the scientist is getting ready to read the description of the problem. The machine stops on entering a final state with the head on the first square of the output string and the rest of the tape blank: the scientist has solved the problem, discarded any notes made in the process, and kept only the sheets describing the solution.

Figure 4.2  The organization of a Turing machine (an unbounded tape divided into squares).

At any given step, the finite-state control, on the basis of the current state and the current contents of the tape square under the head, decides which symbol to write on that square, in which direction to move the head, and which state to enter next. Thus a Turing machine is much like a finite automaton equipped with a tape. An instruction in the finite-state control is a five-tuple

$\delta(q_i, a) = (q_j, b, L/R)$

Like the state transition of a finite automaton, the choice of transition is dictated by the current state $q_i$ and the current input symbol $a$ (but now the current input symbol is the symbol stored on the tape square under the head). Part of the transition is to move to a new state $q_j$, but, in addition to a new state, the instruction also specifies the symbol $b$ to be written in the tape square under the head and whether the head is to move left (L) or right (R) by one square. A Turing machine program is a set of such instructions; the instructions of a Turing machine are not written in a sequence, since the next instruction to follow is determined entirely by the current state and the symbol under the head. Thus a Turing machine program is much like a program in a logic language such as Prolog. There is no sequence inherent in the list of instructions; pattern-matching is used instead to determine which instruction is to be executed next. Like a finite automaton, a Turing machine may be deterministic (for each combination of current state and current input symbol, there is at most one applicable five-tuple) or nondeterministic, with the same convention: a nondeterministic machine accepts its input if there is any way for it to do so.

In the rest of this section, we shall deal with the deterministic variety and thus shall take "Turing machine" to mean "deterministic Turing machine." We shall return to the nondeterministic version when considering Turing machines for decision problems and shall show that it can be simulated by a deterministic version, so that, with Turing machines as with finite automata, nondeterminism does not add any computational power.

The Turing machine model makes perfect sense but hardly resembles a modern computer. Yet writing programs (i.e., designing the finite-state control) for Turing machines is not as hard as it seems. Consider the problem of incrementing an unsigned integer in binary representation: the machine is started in its initial state, with its head immediately to the left of the number on the tape; it must stop in the final state with its head immediately to the left of the incremented number on the tape. (In order to distinguish data from blank tape, we must assume the existence of a third symbol, the blank symbol, _.) The machine first scans the input to the right until it encounters a blank, at which time its head is sitting at the right of the number. Then it moves to the left, changing the tape as necessary; it keeps track of whether or not there is a running carry in its finite state (two possibilities, necessitating two states). Each bit seen will be changed according to the current state and will also dictate the next state to enter. The resulting program is shown in both diagrammatic and tabular form in Figure 4.3. In the diagram, each state is represented as a circle and each step as an arrow labeled by the current symbol, the new symbol, and the direction of head movement. For instance, an arc from state i to state j labeled x/y, L indicates that the machine, when in state i and reading symbol x, must change x to y, move its head one square to the left, and enter state j.

Exercise 4.2 Design Turing machines for the following problems:
1. Decrementing an unsigned integer (decrementing zero leaves zero; verify that your machine does not leave a leading zero on the tape).
2. Multiplying an unsigned integer by three (you may want to use an additional symbol during the computation).
3. Adding two unsigned integers (assume that the two integers are written consecutively, separated only by an additional symbol); this last task requires a much larger control than the first two.

The great advantage of the Turing machine is its simplicity and uniformity. Since there is only one type of instruction, there is no question as to appropriate choices for time and space complexity measures: the time taken by a Turing machine is simply the number of steps taken by the computation and the space used by a Turing machine is simply the total number of distinct tape squares scanned during the computation.

The great disadvantage of the Turing machine, of course, is that it requires much time to carry out elementary operations that a modern computer can execute in one instruction. Elementary arithmetic, simple tests (e.g., for parity), and especially access to a stored quantity all require large amounts of time on a Turing machine. These are really problems of scale: while incrementing a number on a Turing machine requires, as Figure 4.3 illustrates, time proportional to the length of the number's binary representation, the same is true of a modern computer when working with very large numbers: we would need an unbounded-precision arithmetic package. Similarly, whereas accessing an arbitrary stored quantity cannot be done in constant time with a Turing machine, the same is again true of a modern computer: only those locations within the machine's address space can be accessed in (essentially) constant time.

(a) in diagrammatic form  [diagram not reproduced]

(b) in tabular form

    Current  Symbol  Next   Symbol   Head
    State    Read    State  Written  Motion  Comments
    q0       0       q0     0        R       Scan past right end of integer
    q0       1       q0     1        R
    q0       _       q1     _        L       Place head over rightmost bit
    q1       1       q1     0        L       Propagate carry left
    q1       0       q2     1        L       End of carry propagation
    q1       _       q2     1        L
    q2       0       q2     0        L       Scan past left end of integer
    q2       1       q2     1        L
    q2       _       halt   _        R       Place head over leftmost bit

Figure 4.3  A Turing machine for incrementing an unsigned integer.
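A machine such as the one in Figure 4.3 can be run on a short simulator, which also reports the two complexity measures defined above (steps taken and distinct squares scanned). The sketch below is illustrative only: the transition table is a reading of the tabular form of the figure, the head is assumed to start over the leftmost bit (as the final comment of the table suggests), and the names run, INCREMENT, and BLANK are hypothetical.

```python
BLANK = "_"

# (current state, symbol read) -> (next state, symbol written, head motion)
INCREMENT = {
    ("q0", "0"): ("q0", "0", "R"),      # scan past right end of integer
    ("q0", "1"): ("q0", "1", "R"),
    ("q0", BLANK): ("q1", BLANK, "L"),  # place head over rightmost bit
    ("q1", "1"): ("q1", "0", "L"),      # propagate carry left
    ("q1", "0"): ("q2", "1", "L"),      # end of carry propagation
    ("q1", BLANK): ("q2", "1", "L"),
    ("q2", "0"): ("q2", "0", "L"),      # scan past left end of integer
    ("q2", "1"): ("q2", "1", "L"),
    ("q2", BLANK): ("halt", BLANK, "R"),
}


def run(program, tape_input, start="q0", final="halt"):
    """Run a deterministic machine; return (tape contents, steps, squares scanned)."""
    tape = dict(enumerate(tape_input))  # sparse tape: square index -> symbol
    state, head, steps = start, 0, 0
    scanned = set()
    while state != final:
        scanned.add(head)
        symbol = tape.get(head, BLANK)
        state, written, motion = program[(state, symbol)]
        tape[head] = written
        head += 1 if motion == "R" else -1
        steps += 1
    used = [i for i, s in tape.items() if s != BLANK]
    output = "".join(tape[i] for i in range(min(used), max(used) + 1)) if used else ""
    return output, steps, len(scanned)


print(run(INCREMENT, "1011"))  # ('1100', 10, 6)
print(run(INCREMENT, "111"))   # ('1000', 9, 5)
```

The step count grows linearly with the length of the input, in line with the remark that incrementing takes time proportional to the length of the binary representation.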

4.2.3 Multitape Turing Machines

The abstraction of the Turing machine is appealing, but there is no compelling choice for the details of its specification. In particular, there is no reason why the machine should be equipped with a single tape. Even the most disorganized mathematician is likely to keep drafts, reprints, and various notes, if not in neatly organized files, at least in separate piles on the floor. In order to endow our Turing machine model with multiple tapes, it is enough to replicate the tape and head structure of our one-tape model. A k-tape Turing machine will be equipped with k read/write heads, one per tape, and will have transitions given by (3k + 2)-tuples of the form

$\delta(q_i, a_1, a_2, \ldots, a_k) = (q_j, b_1, L/R, b_2, L/R, \ldots, b_k, L/R)$

where the $a_i$ are the characters read (one per tape, under that tape's head), the $b_i$ are the characters written (again one per tape), and the L/R entries tell the machine how to move (independently) each of its heads. Clearly, a k-tape machine is as powerful as our standard model: just set k to 1 (or just use one of the tapes and ignore the others). The question is whether adding k - 1 tapes adds any power to the model, or at least enables it to solve certain problems more efficiently. The answer to the former is no, as we shall shortly prove, while the answer to the latter is yes, as the reader is invited to verify.

Exercise 4.3 Verify that a two-tape Turing machine can recognize the language of palindromes over {0, 1} in time linear in the size of the input, while a one-tape Turing machine appears to require time quadratic in the size of the input.

In fact, the quadratic increase in time evidenced in the example of the language of palindromes is a worst-case increase.

Theorem 4.1 A k-tape Turing machine can be simulated by a standard one-tape machine at a cost of (at most) a quadratic increase in running time.

The basic idea is the use of the alphabet symbols of the one-tape machine to encode a "vertical slice" through the k tapes of the k-tape machine, that is, to encode the contents of tape square i on each of the k tapes into a single character. However, that idea alone does not suffice: we also need to encode the positions of the k heads, since they move independently and thus need not all be at the same tape index. We can do this by adding a single bit to the description of the content of each tape square on each of the k tapes: the bit is set to 1 if the head sits on this tape square and to 0 otherwise. The concept of encoding a vertical slice through the k tapes still works; we just have a somewhat larger set of possibilities: $(\Sigma \times \{0, 1\})^k$ instead of just $\Sigma^k$.
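The vertical-slice idea can be sketched as a data-structure transformation: k tapes with k head positions become one list whose entries are k-tuples of (symbol, marker) pairs. The names to_tracks and alphabet_size, and the use of "_" for the blank, are assumptions made for this illustration.

```python
def to_tracks(tapes, heads, blank="_"):
    """Fold k tapes into one tape; square i holds k (symbol, head marker) pairs."""
    length = max(max(len(tape) for tape in tapes), max(heads) + 1)
    single = []
    for i in range(length):
        square = tuple(
            (tape[i] if i < len(tape) else blank, 1 if head == i else 0)
            for tape, head in zip(tapes, heads)
        )
        single.append(square)
    return single


def alphabet_size(s, k):
    """Number of distinct slice characters: (2s)^k for s tape symbols and k tracks."""
    return (2 * s) ** k


# Two tapes: the first head is on square 0, the second on square 1.
for square in to_tracks(["01", "1"], [0, 1]):
    print(square)
print(alphabet_size(3, 2))  # 36
```

Each entry of the resulting list is one character of the one-tape machine's enlarged alphabet, drawn from $(\Sigma \times \{0, 1\})^k$.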

In effect, we have replaced a multitape machine by a one-tape, "multitrack" machine. Figure 4.4 illustrates the idea. There remains one last problem: in order to "collect" the k characters under the k heads of the multitape machine, the one-tape machine will have to scan several of its own squares; we need to know where to scan and when to stop scanning in order to retain some reasonable efficiency. Thus our one-tape machine will have to maintain some basic information to help it make this scan. Perhaps the simplest form is an indication of how many of the k heads being simulated are to the right of the current position of the head of the one-tape machine, an indication that can be encoded into the finite state of the simulating machine (thereby increasing the number of states of the one-tape machine by a factor of k).

Figure 4.4  Simulating k tapes with a single k-track tape.

Proof. Let $M_k$, for some k larger than 1, be a k-tape Turing machine; we design a one-tape Turing machine $M$ that simulates $M_k$. As discussed earlier, the alphabet of $M$ is large enough to encode in a single character the k characters under the k heads of $M_k$ as well as each of the k bits denoting, for each of the k tapes, whether or not a head of $M_k$ sits on the current square. The finite control of $M$ stores the current state of $M_k$ along with the number of heads of $M_k$ sitting to the right of the current position of the head of $M$; it also stores the characters under the heads of $M_k$ as it collects them. Thus if $M_k$ has $q$ states and a tape alphabet of $s$ characters, $M$ has $q \cdot k \cdot (s + 1)^k$ states (the $(s + 1)$ term accounts for tape symbols not yet collected) and a tape alphabet of $(2s)^k$ characters (the $(2s)$ term accounts for the extra marker needed at each square to denote the positions of the k heads).

To simulate one move of $M_k$, our new machine $M$ makes a left-to-right sweep of its tape, from the leftmost head position of $M_k$ to its rightmost head position, followed by a right-to-left sweep. On the left-to-right sweep, $M$ records in its finite control the content of each tape square of $M_k$ under a head of $M_k$, updating the record every time it scans a vertical slice with one or more head markers and decreasing its count (also stored in its finite control) of markers to the right of the current position. When this count reaches zero, $M$ resets it to k and starts a right-to-left scan. Since it has recorded the k characters under the heads of $M_k$ as well as the state of $M_k$, $M$ can now simulate the correctly chosen transition of $M_k$. Thus in its right-to-left sweep, $M$ updates each character under a head of $M_k$ and "moves" that head (that is, it changes that square's marker bit to 0 while setting the marker bit of the correct adjacent square to 1), again counting down from k the number of markers to the left of the current position. When the count reaches 0, $M$ resets it to k and reverses direction, now ready to simulate the next transition of $M_k$.

Since $M_k$ starts its computation with all of its heads aligned at index 1, the distance (in tape squares) from its leftmost head to its rightmost head after $i$ steps is at most $2i$ (with one head moving left at each step and one moving right at each step). Thus simulating step $i$ of $M_k$ is going to cost $M$ on the order of $4i$ steps ($2i$ steps per sweep), so that, if $M_k$ runs for a total of $n$ steps, then $M$ takes on the order of $\sum_{i=1}^{n} 4i = O(n^2)$ steps. Q.E.D.

In contrast to the time increase, note that $M$ uses exactly the same number of tape squares as $M_k$. However, its alphabet is significantly larger. Instead of $s$ symbols, it uses $(2s)^k$ symbols; in terms of bits, each character uses $k(1 + \log s)$ bits instead of $\log s$ bits, a constant-factor increase for each fixed value of k. To summarize, we have shown that one-tape and multitape Turing machines have equivalent computational power and, moreover, that a multitape Turing machine can be simulated by a one-tape Turing machine with at most a quadratic time penalty and constant-factor space penalty.

4.2.4 The Register Machine

The standard model of computation designed to mimic modern computers is the family of RAM (register machine) models. One of the many varieties of such machines is composed of a central processor and an unbounded number of registers, each of which can store an arbitrarily large integer; the processor carries out instructions (from a limited repertoire) on the registers. As is the case with the Turing machine, the program is not stored in the memory that holds the data.

An immediate consequence is that a RAM program cannot be self-modifying; another consequence is that any given RAM program can refer only to a fixed number of registers. The machine is started at the beginning of its program with the input data preloaded in its first few registers and all other registers set to zero; it stops upon reaching the halt instruction, with the answer stored in its first few registers. The simplest such machine includes only four instructions: increment, decrement, jump on zero (to some label), and halt. In this model, the program to increment an unsigned integer has two instructions (an increment and a halt) and takes two steps to execute, in marked contrast to the Turing machine designed for the same task. Figure 4.5 solves the third part of Exercise 4.2 for the RAM model. Again, compare its relative simplicity (five instructions, with a constant-time loop executed m times, where m is the number stored in register R1) with a Turing machine design for the same task. Of course, we should not hasten to conclude that RAMs are inherently more efficient than Turing machines. The mechanism of the Turing machine is simply better suited for certain string-oriented tasks than that of the RAM. Consider for instance the problem of concatenating two input words over {0, 1}: the Turing machine requires only one pass over the input to carry out the concatenation, but the RAM (on which a concatenation is basically a shift followed by an addition) requires a complex series of arithmetic operations.

            adds R0 and R1 and returns result in R0
            loop invariant: R0 + R1 is constant
    loop:   JumpOnZero R1,done      R0 + R1 in R0, 0 in R1
            Dec R1
            Inc R0
            JumpOnZero R2,loop      unconditional branch (R2 = 0)
    done:   Halt

Figure 4.5  A RAM program to add two unsigned integers.
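The program of Figure 4.5 can be executed on a minimal interpreter for the four-instruction machine. This is a sketch under stated assumptions: each instruction, including the halt, is charged one step (as in the two-step increment program mentioned above), decrementing a zero register is assumed to leave it at zero, and the names run_ram and ADD are hypothetical.

```python
def run_ram(program, registers):
    """program: list of (label, opcode, operands). Returns (registers, steps)."""
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    regs = dict(registers)
    pc, steps = 0, 0
    while True:
        _, op, args = program[pc]
        steps += 1  # unit cost for every instruction, Halt included
        if op == "Halt":
            return regs, steps
        if op == "Inc":
            regs[args[0]] = regs.get(args[0], 0) + 1
        elif op == "Dec":
            # Assumption: registers hold unsigned integers, so 0 stays 0.
            regs[args[0]] = max(0, regs.get(args[0], 0) - 1)
        elif op == "JumpOnZero":
            if regs.get(args[0], 0) == 0:
                pc = labels[args[1]]
                continue
        else:
            raise ValueError("unknown instruction: " + op)
        pc += 1


# The program of Figure 4.5: adds R0 and R1, leaving the result in R0.
ADD = [
    ("loop", "JumpOnZero", ("R1", "done")),
    (None, "Dec", ("R1",)),
    (None, "Inc", ("R0",)),
    (None, "JumpOnZero", ("R2", "loop")),  # unconditional branch, since R2 = 0
    ("done", "Halt", ()),
]

print(run_ram(ADD, {"R0": 5, "R1": 3}))  # ({'R0': 8, 'R1': 0}, 14)
```

With m in R1, the loop body of four instructions runs m times, followed by one final test and the halt, for 4m + 2 steps under this charging policy.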

we want the space measure not to exceed the time measure (to within a constant factor), since a program cannot use arbitrarily large amounts of space in one unit of time. (Such a relationship clearly holds for the Turing machine: in one step, the machine can use at most one new tape square.) In this light, it is instructive to examine briefly the consequences of possible charging policies. Assume that we assign unit cost to the first four instructions mentioned, even though this allows incrementing an arbitrarily large number in constant time. Since the increment instruction is the only one that may increase space consumption, and since it never increases space consumption by more than one bit, the space used by a RAM program grows no faster than the time used by it plus the size of the input:

$\mathrm{SPACE} = O(\text{Input size} + \mathrm{TIME})$

Let us now add register transfers at unit cost; this allows copying an arbitrary amount of data in constant time. A register copy may increase space consumption much faster than an increment instruction, but it cannot increase any number: at most, it can copy the largest number into every named register. Since the number of registers is fixed for a given program, register transfers do not contribute to the asymptotic increase in space consumption. Consequently, space consumption remains asymptotically bounded by time consumption. We now proceed to include addition and subtraction, once again at unit cost. Since the result of an addition is at most one bit longer than the longer of the two summands, any addition operation asymptotically increases storage consumption by one bit (asymptotic behavior is again invoked, since the first few additions may behave like register transfers). Once more the relationship between space and time is preserved.

Our machine is by now fairly realistic for numbers of moderate size, though impossibly efficient in dealing with arbitrarily large numbers. What happens if we now introduce unit cost multiplication and division? A product requires about as many bits as needed by the two multiplicands; in other words, by multiplying a number by itself, the storage requirements can double. This behavior leads to an exponential growth in storage requirements. (Think of a program that uses just two registers and squares whatever is in the first register as many times as indicated by the second. When started with n in the first register and m in the second, this program stops with $n^{2^m}$ in the first register and 0 in the second. The time used is m, but the storage is $2^m \log n$, assuming that all numbers are unsigned binary numbers.) Such behavior is very unrealistic in any model; rather than design some suitable charge for the operation, we shall simply use a RAM model without multiplication.
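The two-register squaring program from the parenthetical example can be sketched in Python to watch the storage grow. This is only an illustration under stated assumptions: the name `repeated_squaring` is hypothetical, Python's arbitrary-precision integers stand in for RAM registers, and each multiplication is counted as one time unit, which is precisely the charging policy the text goes on to reject.

```python
def repeated_squaring(n, m):
    """Square register 1 as many times as register 2 indicates.

    Returns (final value, multiplications performed, bits needed to store the result).
    Assumption: every multiplication is charged one unit of time.
    """
    r1, r2 = n, m
    multiplications = 0
    while r2 != 0:
        r1 = r1 * r1          # unit-cost multiplication can double the number of bits
        r2 -= 1
        multiplications += 1
    return r1, multiplications, r1.bit_length()


value, time_used, bits = repeated_squaring(3, 4)
print(value, time_used, bits)
```

With n = 3 and m = 4 the first register ends at 3^16 = 43046721, which needs 26 bits after only 4 multiplications, in line with the $2^m \log n$ estimate (16 times log2 3 is about 25.4).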

Our model remains unrealistic in one respect: its way of referencing storage. A RAM in which each register must be explicitly named has neither indexing capability nor indirection, two staples of modern computer architectures. In fact, incrementation of arbitrary integers in unit time and indirect references are compatible in the sense that the space used remains bounded by the sum of the input size and the time taken. However, the reader can verify (see Exercise 4.11) that the combination of register transfer, addition, and indirect reference allows the space used by a RAM program to grow quadratically with time:

$\mathrm{SPACE} = O(\mathrm{TIME}^2)$

At this point, we can go with the model described earlier or we can accept indexing but adopt a charging policy under which the time for a register transfer or an addition is proportional to the number of bits in the source operands. We choose to continue with our first model.

4.2.5 Translation Between Models

We are now ready to tackle the main question of this section: how does the choice of a model (and associated space and time measures) influence our assessment of problems? We need to show that each model can compute whatever function can be computed by the other, and then decide how, if at all, the complexity of a problem is affected by the choice of model. We prove below that our two models are equivalent in terms of computability and that the choice of model causes only a polynomial change in complexity measures. The proof consists simply of simulating one machine by the other (and vice versa) and noting the time and space requirements of the simulation. (The same construction, of course, establishes the equivalence of the two models from the point of view of computability.) While the proof is quite simple, it is also quite long; therefore, we sketch its general lines and illustrate only a few simulations in full detail.

In order to simulate a RAM on a Turing machine, some conventions must be established regarding the representation of the RAM's registers. A satisfactory solution uses an additional tape symbol (say a colon) as a separator and has the tape contain all registers at all times, ordered as a sequential list. In order to avoid ambiguities, let us assume that each integer in a RAM register is stored in binary representation without leading zeros; if the integer is, in fact, zero, then nothing at all is stored, as signaled by two consecutive colons on the tape.
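The tape layout just described can be illustrated with a short Python sketch. The helper names `encode_registers` and `decode_registers` are hypothetical, and a Python string stands in for the nonblank portion of the tape; the sketch assumes one colon between consecutive registers, so a zero register shows up as an empty field.

```python
def encode_registers(values):
    """Lay out register contents on one tape: binary without leading zeros,
    a colon between registers, and nothing at all for a register holding zero."""
    return ":".join(format(v, "b") if v > 0 else "" for v in values)


def decode_registers(tape):
    """Recover the register contents from the colon-separated tape string."""
    return [int(field, 2) if field else 0 for field in tape.split(":")]


tape = encode_registers([5, 0, 3])
print(tape)                    # two consecutive colons mark the zero register
print(decode_registers(tape))
```

Because no register is ever written with leading zeros, the decoding is unambiguous, and the tape length stays within a constant factor of the total number of bits held in the registers.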

The RAM program itself is translated into the finite-state control of the Turing machine. Thus each RAM instruction becomes a block of Turing machine states with appropriate transitions while the program becomes a collection of connected blocks. In order to allow blocks to be connected and to use only standard blocks, the position of the head must be an invariant at entrance to and exit from a block. In Figure 4.6, which depicts the blocks corresponding to the "jump on zero," the "increment," and the "decrement," the head sits on the leftmost nonblank square on the tape when a block is entered and when it is left. [2]

Consider the instruction JumpOnZero Ri, label (starting the numbering of the registers from R1). First the Turing machine scans over (i - 1) registers, using (i - 1) states to do so. After moving its head to the right of the colon separating the (i - 1)st register from the ith, the Turing machine can encounter a colon or a blank (in which case Ri contains zero) or a one (in which case Ri contains a strictly positive integer). In either case, the Turing machine repositions the head over the leftmost bit of R1 and makes a transition to the block of states that simulate the (properly chosen) instruction to execute. The simulation of the instruction Inc is somewhat more complicated. Again the Turing machine scans right, this time until it finds the rightmost bit of the ith register. It now increments the value of this register using the algorithm of Figure 4.3. However, if the propagation of the carry leads to an additional bit in the representation of the number and if the ith register is not the first, then the Turing machine uses three states to shift the contents of the first through (i - 1)st registers left by one position. The block for the instruction Dec is similar; notice, however, that a right shift is somewhat more complex than a left shift, due to the necessity of looking ahead one tape square.

Exercise 4.4 Design a Turing machine block to simulate the register transfer instruction.

From the figures as well as from the description, it should be clear that the block for an instruction dealing with register i differs from the block for an instruction dealing with register j, j ≠ i. Thus the number of different blocks used depends on the number of registers named in the RAM program: with k registers, we need up to 3k + 1 different Turing machine blocks.

[2] For the sake of convenience in the figure, we have adopted an additional convention regarding state transitions: a transition labeled with only a direction indicates that, on all symbols not already included in another transition from the same state, the machine merely moves its head in the direction indicated, recopying whatever symbol was read without change.
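The net effect of the Inc block on the tape can be mimicked on a Python string. This is a sketch under assumptions, not the machine of Figure 4.6: it assumes the colon-separated layout described earlier (binary without leading zeros, an empty field for zero), it ignores the state-by-state head movement, and `inc_register` is a hypothetical helper name.

```python
def inc_register(tape, i):
    """Increment register i (numbered from 1) on a colon-separated binary tape."""
    fields = tape.split(":")
    bits = list(fields[i - 1])
    pos = len(bits) - 1
    # Propagate the carry from the rightmost bit, turning trailing ones into zeros.
    while pos >= 0 and bits[pos] == "1":
        bits[pos] = "0"
        pos -= 1
    if pos >= 0:
        bits[pos] = "1"
    else:
        # The carry ran off the left end: the register needs one more bit.
        # On a real tape this is where registers 1 through i-1 are shifted left.
        bits.insert(0, "1")
    fields[i - 1] = "".join(bits)
    return ":".join(fields)


print(inc_register("101::11", 2))   # zero register becomes 1
print(inc_register("11:1", 1))      # carry adds a bit to the first register
```

The string version hides the shifting cost: on the tape, making room for the extra bit means moving every register to the left of register i by one square, which is why the time of the simulated instruction depends on the space in use.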

An important point to keep in mind is that blocks are not reused but copied as needed (they are not so much subroutines as in-line macros): each instruction in the RAM program gets translated into its own block. For example, the RAM program for addition illustrated in Figure 4.5 becomes the collection of blocks depicted in Figure 4.7. The reason for avoiding reuse is that each Turing machine block is used, not as a subroutine, but as a macro. In effect, we replace each instruction of the RAM program by a Turing machine block (a macro expansion) and the connectivity among the blocks describes the flow of the program.

[Figure 4.7: The Turing machine program produced from the RAM program of Figure 4.5.]

Our simulation is efficient in terms of space: the space used by the Turing machine is at most a constant multiple of the space used by the RAM, or

$\mathrm{SPACE}_{TM} = O(\mathrm{SPACE}_{RAM})$

In contrast, much time is spent in seeking the proper register on which to carry out the operation, in shifting blocks of tape up or down to keep all registers in a sequential list, and in returning the head to the left of the data at the end of a block. Nevertheless, the time spent by the Turing machine in simulating the jump and increment instructions does not exceed a constant multiple of the total amount of space used on the tape, that is, a constant multiple of the space used by the RAM program. Thus our most basic RAM model can be simulated on a Turing machine at a cost increase in time proportional to the space used by the RAM, or

$\mathrm{TIME}_{TM} = O(\mathrm{TIME}_{RAM} \cdot \mathrm{SPACE}_{RAM})$

Similarly, the time spent in simulating the register transfer instruction does not exceed a constant multiple of the square of the total amount of space used on the tape and uses no extra space.

Exercise 4.5 Design a block simulating RAM addition (assuming that such an instruction takes three register names, all three of which could refer to the same register). Verify that the time required for the simulation is, at worst, proportional to the square of the total amount of space used on the tape, which in turn is proportional to the space used by the registers.

By the previous exercise, then, RAM addition can be simulated on a Turing machine using space proportional to the space used by the RAM and time proportional to the square of the space used by the RAM. Subtraction is similar to addition, decrementing is similar to incrementing. Since the space used by a RAM is itself bounded by a constant multiple of the time used by the RAM program, it follows that any RAM program can be simulated on a Turing machine with at most a quadratic increase in time and a linear increase in space.

Simulating a Turing machine with a RAM requires representing the state of the machine as well as its tape contents and head position using only registers. The combination of the control state, the head position, and the tape contents completely describes the Turing machine at some step of execution; this snapshot of the machine is called an instantaneous description, or ID. A standard technique is to divide the tape into three parts: the square under the head, those to the left of the head, and those to the right. As the left and right portions of the tape are subject to the same handling, they must be encoded in the same way, with the result that we read the squares to the left of the head from left to right, but those to the right of the head from right to left. If the Turing machine has an alphabet of d characters, we use base d numbers to encode the tape pieces. Because blanks on either side of the tape in use could otherwise create problems, we assign the value of zero to the blank character. Now each of the three parts of the tape is encoded into a finite number and stored in a register, as illustrated in Figure 4.8. Each Turing machine state is translated to a group of one or more RAM instructions, with state changes corresponding to unconditional jumps (which can be accomplished with a conditional jump by testing an extra register set to zero for the specific purpose of forcing transfers). Moving from one transition to another of the Turing machine requires testing the register that stores the code of the symbol under the head, through repeated decrements and a jump on zero.

[Figure 4.8: Encoding the tape contents into registers (registers labeled R2, R1, R3).]

Moving the head is simulated in the RAM by altering the contents of the three registers maintaining the tape contents; thanks to our encoding, this operation reduces to dividing one register by d (to drop its last digit), multiplying another by d and adding the code of the symbol rewritten, and setting a third (the square under the head) to the digit dropped from the first register. Formally, in order to simulate the transition δ(q, a) = (q', b, R), we execute

    R1 <- b
    R2 <- d * R2 + R1
    R1 <- R3 mod d
    R3 <- R3 ÷ d

and, in order to simulate the transition δ(q, a) = (q', b, L), we execute

    R1 <- b
    R3 <- d * R3 + R1
    R1 <- R2 mod d
    R2 <- R2 ÷ d

where a mod b is the integer remainder of a by b, and a ÷ b is the integer quotient of a by b. Except for the division, all operations can be carried out in constant time by our RAM model (the multiplication by d can be done with a constant number of additions). The division itself requires more time, but can be accomplished (by building the quotient digit by digit) in time proportional to the square of the number of digits of the operand, or equivalently in time proportional to the square of the space used by the Turing machine. Thus we can write, much as before,

$\mathrm{SPACE}_{RAM} = \Theta(\mathrm{SPACE}_{TM})$
$\mathrm{TIME}_{RAM} = O(\mathrm{TIME}_{TM} \cdot \mathrm{SPACE}_{TM}^2)$

and so conclude that any Turing machine can be simulated on a RAM with at most a cubic increase in time and a linear increase in space.

We have thus added an element of support for the celebrated Church-Turing thesis: the Turing machine and the RAM (as well as any of a number of other models such as lambda calculus or partial recursive functions) are universal models of computation.
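The three-register encoding of the tape and the two head moves can be tried out in Python. This sketch rests on assumptions that should be read as such: R1 holds the code of the square under the head, R2 encodes the squares to its left and R3 those to its right (a reading of the register labels in Figure 4.8), symbol codes are the integers 0 to d - 1 with 0 for the blank, and the function names are hypothetical. Python's `//` and `%` play the roles of integer quotient and remainder.

```python
def encode_tape(left, head, right, d):
    """Encode a tape as (R1, R2, R3) in base d.

    left and right list the symbol codes as they appear on the tape, left to right.
    In both R2 and R3 the square adjacent to the head is the least significant digit,
    so blanks (code 0) far from the head act as leading zeros.
    """
    r2 = 0
    for s in left:                 # left part read from left to right
        r2 = r2 * d + s
    r3 = 0
    for s in reversed(right):      # right part read from right to left
        r3 = r3 * d + s
    return head, r2, r3


def move_right(r1, r2, r3, b, d):
    """Write b under the head, then move the head one square to the right."""
    r1 = b
    r2 = d * r2 + r1               # written square joins the left part
    r1 = r3 % d                    # new square under the head
    r3 = r3 // d                   # drop that digit from the right part
    return r1, r2, r3


def move_left(r1, r2, r3, b, d):
    """Write b under the head, then move the head one square to the left."""
    r1 = b
    r3 = d * r3 + r1               # written square joins the right part
    r1 = r2 % d
    r2 = r2 // d
    return r1, r2, r3


regs = encode_tape([1, 2], 1, [2, 1], d=3)
print(regs)
print(move_left(*regs, b=2, d=3))
print(move_right(*regs, b=2, d=3))
```

Moving past the end of the written tape needs no special case: the exhausted register is 0, so the remainder is 0 and the head simply reads a blank.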
4.3 Model Independence

Many variations of the RAM have been proposed: charged RAMs, where the cost of an operation is proportional to the length of its operands; bounded RAMs, where the size of numbers stored in any register is bounded by a constant; and, as discussed earlier, indexed RAMs, where a fixed number of CPU registers can also be used as index registers for memory addressing. Many variations of the Turing machine have also been proposed, such as random-access Turing machines, which can access an arbitrary stored quantity in constant time, and multitape Turing machines, equipped with several tapes and heads (some of which may be read-only or write-only). Several variants are explored in the exercises at the end of this chapter. An important variant is the off-line Turing machine, used mainly in connection with space complexity. This machine uses three tapes, each equipped with its own head. One tape is the input tape, equipped with a read-only head; another is the output tape, equipped with a write-only head that can move only to the right; the third tape is the work (or scratch) tape, equipped with a normal read/write head. Space consumption is measured on the work tape only, thereby allowing sublinear space measures by charging neither for input space nor for output space. Yet the complexity of programs run on any one of these machines remains polynomially related to that of the same programs simulated on any other (unless the machine is oversimplified, as in a RAM limited to incrementing). This polynomial relatedness justifies our earlier contention that the exact choice of model is irrelevant with respect to intractability: choosing a different model will neither render an intractable problem tractable nor achieve the converse. In the following, we use Turing machines when we need a formal model, but otherwise we continue our earlier practice of using an ill-defined model similar to a modern computer.

In any reasonable model of computation, space requirements cannot grow asymptotically faster than time requirements, i.e.,

$\mathrm{SPACE} = O(\mathrm{TIME})$    (4.1)

Moreover, given fixed space bounds, no model can expend arbitrary amounts of time in computation and still halt. This is easiest to see in a Turing machine. Assume that, on an input of size n, our Turing machine, with an alphabet of d symbols and a finite control of s states, uses f(n) tape squares. Then there are $d^{f(n)}$ possible tape contents, f(n) possible head positions, and s possible states, so that the total number of configurations is $s \cdot f(n) \cdot d^{f(n)}$, which is $O(c^{f(n)})$ for a suitable choice of c. During the computation, our machine never found itself twice in exactly the same instantaneous description (same tape contents, same head position, and same state), or else it would have entered an infinite loop. Thus the following relation holds for Turing machines (and for all other reasonable models of computation, due to polynomial relatedness):

$\mathrm{TIME} = O(c^{\mathrm{SPACE}})$, for some constant c    (4.2)
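The counting argument behind relation (4.2) can be checked numerically. The sketch below assumes nothing beyond the formula in the text, except for one choice that is mine rather than the text's: it takes c = 2sd as a convenient constant, which works because f ≤ 2^f and s ≤ s^f for f ≥ 1. The function names are hypothetical.

```python
def configurations(s, d, f):
    """Number of distinct instantaneous descriptions: s states, f head positions,
    d**f possible contents for f tape squares."""
    return s * f * d ** f


def exponential_bound(s, d, f):
    """c**f for the (assumed) choice c = 2*s*d."""
    return (2 * s * d) ** f


# A halting machine cannot repeat a configuration, so its running time is at most
# the number of configurations, which in turn stays below c**f.
s, d = 5, 3
for f in (1, 2, 10, 40):
    print(f, configurations(s, d, f) <= exponential_bound(s, d, f))
```

Any larger constant would do as well; the point is only that the bound is a single exponential in the space used.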

4.4 Turing Machines as Acceptors and Enumerators

We presented Turing machines as general computing engines. In order to view them as language acceptors (like our finite automata), we need to adopt some additional conventions. We shall assume that the string to be tested is on the tape and that the Turing machine accepts the string (decides it is in the language) if it stops with just a "yes" (some preagreed symbol) on the tape and rejects the string if it stops with just a "no" on the tape. Of course, a Turing machine might not always stop; as a language acceptor, though, it must stop on every string, since we do not get a decision otherwise. However, a lesser level of information can be obtained if we consider a machine that can list all strings in the language but cannot always decide membership. Such a machine is able to answer "yes": if it can list all strings in the language, it can just watch its own output and stop with "yes" as soon as the desired string appears in the list. However, it is not always able to answer "no": while it never gives a wrong answer, it might fail to stop when fed a string not in the language. Of course, failure to stop is not a condition we can detect, since we never know if the machine might not stop in just one more step. Thus an enumerating Turing machine, informally, is one that lists all strings in the language. Note that the listing is generally in arbitrary order; if it were in some total ordering, we would then be able to verify that a string does not belong to the language by observing the output and waiting until either the desired string is produced or some string that follows the desired string in the total ordering is produced.

While nondeterminism in general Turing machines is difficult to use (if the different decisions can lead to a large collection of different outputs, what has the machine computed?), nondeterminism in machines limited to function as language acceptors can be defined just as for finite automata: if there is any way for the Turing machine to accept its input, it will do so. All happens as if the nondeterministic machine, whenever faced with a choice, automatically (and at no cost) chose the correct next step; indeed, this is an alternate way of defining nondeterminism. As the choice cannot in practice be made without a great deal of additional information concerning the alternatives, another model of nondeterminism uses the "rabbit" analogy, in which a nondeterministic Turing machine is viewed as a purely deterministic machine that is also a prolific breeder. Whenever faced with a choice for the next step, the machine creates one replica of itself for each possible choice and sends the replicas to explore the choices. As soon as one of its progeny identifies the instance as a "yes" instance, the whole machine stops and accepts the instance. On the other hand, the machine cannot answer "no" until all of its descendants have answered "no", a determination that requires counting. The asymmetry here resides in the ability of the machine to perform a very large logical "or" at no cost, since a logical "or" necessitates only the detection of a single "yes" answer, whereas a similar logical "and" appears to require a very large amount of time due to the need for an exhaustive assessment of the situation.
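The way an enumerating machine yields "yes" answers but no "no" answers, described earlier in this section, can be sketched with a Python generator. Everything named here is an assumption made for illustration: `enumerator` is a toy listing of the strings a^(k·k), and `semi_decide` takes an optional `budget` only so that the demonstration terminates; a genuine enumerator-based acceptor has no such budget and simply runs forever on strings outside the language.

```python
from itertools import count


def enumerator():
    """Toy enumerator (hypothetical): lists the strings a^(k*k) for k = 0, 1, 2, ..."""
    for k in count():
        yield "a" * (k * k)


def semi_decide(w, budget=None):
    """Watch the enumerator's output and answer True as soon as w appears.

    Returns None (meaning "no answer yet") once `budget` strings have been seen
    without finding w. It never returns False: a missing string might still be listed later.
    """
    for seen, s in enumerate(enumerator(), start=1):
        if s == w:
            return True
        if budget is not None and seen >= budget:
            return None


print(semi_decide("a" * 9))            # a member: found after finitely many outputs
print(semi_decide("aa", budget=100))   # a non-member: the budget runs out, no verdict
```

This toy listing happens to be ordered by length, so here a "no" could in fact be inferred once longer strings appear; that is the total-ordering remark in the text, and the sketch deliberately does not rely on it.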

In fact, whatever language can be decided by a nondeterministic Turing machine can also be accepted by a deterministic Turing machine.

Theorem 4.2 Any language accepted by a nondeterministic Turing machine can be accepted by a deterministic Turing machine.

The proof is simply a simulation of the nondeterministic machine. Since the machine to be simulated need not always halt, we must ensure that the simulation halts whenever the machine to be simulated halts, including cases when the nondeterministic machine has an accepting path in a tree of computations that includes nonterminating paths. We do this by conducting a breadth-first search of the tree of possible computations of the nondeterministic machine.

Proof. We can simplify the construction by using a three-tape deterministic Turing machine; we know that such a machine can be simulated with a one-tape machine. Let Mn be the nondeterministic machine and Md be the new three-tape deterministic machine. At each step, Mn has at most some number k of possible choices, since its finite-state control contains only a finite number of five-tuples. Thus, as Mn goes through its computation, each step it takes can be represented by a number between 1 and k. A sequence of such numbers defines a computation of Mn. (Strictly speaking, not all such sequences define a valid computation of Mn, but we can easily check whether or not a sequence does define one and restrict our attention to those sequences that do.) Naturally, we do not know how long the accepting sequence is, but we do know that one exists for each string in the language.

Our new machine Md will use the second tape to enumerate all possible sequences of moves for Mn, beginning with the short sequences and increasing the length of the sequences after exhausting all shorter ones. Using the sequence on the second tape as a guide, Md will now proceed to simulate Mn on the third tape, using the first tape for the input. If the simulation results in Mn's entering an accepting configuration (halting state and proper contents of tape), then our machine Md accepts its input; otherwise it moves on to the next sequence of digits. It is clear from this description that Md will accept exactly what Mn accepts. If Md finds that all sequences of digits of some given length are illegal sequences for Mn, then it halts and rejects its input. Again, it is clear that whatever is rejected by Md would have been rejected by Mn. If Mn halts and rejects, then every one of its computation paths halts and rejects, so that Mn has some longest halting path, say of length n, and thus has no legal move beyond the nth step on any path. Under the same input, Md must also halt and reject after examining all computations of up to n steps, so that Md rejects exactly the same strings as Mn. Q.E.D.
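The construction in the proof can be mirrored in Python by abstracting the nondeterministic machine as a step function. This is a sketch under assumptions, not the three-tape machine itself: `step(config, choice)` returns the next configuration or `None` when the choice is illegal, `is_accepting(config)` recognizes a halted accepting configuration, and the toy machine (a nondeterministic "skip or take" guesser for subset sum) is invented for the example. All names are hypothetical.

```python
from itertools import product


def deterministic_simulation(initial, step, is_accepting, k):
    """Try all choice sequences over {1, ..., k} in order of increasing length.

    Accept as soon as some legal sequence ends in an accepting configuration;
    reject once every sequence of some length is illegal.
    """
    length = 0
    while True:
        any_legal = False
        for sequence in product(range(1, k + 1), repeat=length):
            config = initial                  # replay from the start, as Md does
            for choice in sequence:
                config = step(config, choice)
                if config is None:            # illegal sequence: abandon it
                    break
            else:
                any_legal = True
                if is_accepting(config):
                    return True
        if not any_legal:
            return False
        length += 1


def accepts_subset_sum(items, target):
    """Toy nondeterministic acceptor: choice 1 skips the next item, choice 2 takes it."""
    def step(config, choice):
        index, remaining = config
        if index == len(items):               # halted: no legal move
            return None
        taken = items[index] if choice == 2 else 0
        return index + 1, remaining - taken

    def is_accepting(config):
        index, remaining = config
        return index == len(items) and remaining == 0

    return deterministic_simulation((0, target), step, is_accepting, k=2)


print(accepts_subset_sum([3, 5, 7], 12))
print(accepts_subset_sum([3, 5, 7], 4))
```

As in the proof, the search goes level by level, so an accepting path is found even if other paths never terminate; if the simulated machine has infinite paths and no accepting one, the loop does not terminate either.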

4.5 Exercises

Exercise 4.6 Prove that allowing a Turing machine to leave its head stationary during some transitions does not increase its computational power.

Exercise 4.7 Prove that, for each Turing machine that halts under all inputs, there is an equivalent Turing machine (that computes the same function) that never moves its head more than one character to the left of its starting position.

Exercise 4.8 In contrast to the previous two exercises, verify that a Turing machine that, at each transition, can move its head to the right or leave it in place but cannot move it to the left, is not equivalent to our standard model. Is it equivalent to our finite automaton model, or is it more powerful?

Exercise 4.9* Prove that a Turing machine that can write each tape square at most once during its computation is equivalent to our standard version. (Hint: this new model will use far more space than the standard model.)

Exercise 4.10 A two-dimensional Turing machine is equipped with a two-dimensional tape rather than a one-dimensional tape. The two-dimensional tape is an unbounded grid of tape squares over which the machine's head can move in four directions: left, right, up, and down. Define such a machine formally, and then show how to simulate it on a conventional Turing machine.

Exercise 4.11 Verify that a RAM that includes addition and register transfer in its basic instruction set and that can reference arbitrary registers through indirection on named registers can use space at a rate that is quadratic in the running time.

Exercise 4.12 Devise a charging policy for the RAM described in the previous exercise that will prevent the consumption of space at a rate higher than the consumption of time.

Exercise 4.13 Verify that a RAM need not have an unbounded number of registers. Use prime encoding to show that a three-register RAM can simulate a k-register RAM for any fixed k > 3.

Exercise 4.14 Define a RAM model where registers store values in unary code; then show how to simulate the conventional RAM on such a model.

Exercise 4.15 Use the results of the previous two exercises to show that a two-register RAM where all numbers are written in unary can simulate an arbitrary RAM.

Exercise 4.16* Define a new Turing machine model as follows. The machine is equipped with three tapes: a read-only input tape (with a head that can be moved left or right or left in place at will), a write-only output tape (where the head only moves right or stays in place), and a work tape. The work tape differs from the usual version (of an off-line Turing machine) in that the machine cannot write on it but can only place "pebbles" (identical markers) on it, up to three at any given time. On reading a square of the work tape, the machine can distinguish between an empty square and one with a pebble, and can remove the pebble or leave it in place.
A move of this machine is similar to a move of the conventional off-line Turing machine. It is of the form δ(q, a, b) = (q', L/R/-, c, L/R/-, d/-, R/-), where a is the character under the head on the input tape; b and c are the contents (before and after the move, respectively) of the work tape square under the head (either nothing or a pebble); and d is the character written on the output tape (which may be absent, as the machine need not write something at each move), while the three L/R/- (only R/- in the case of the output tape) denote the movements for the three heads.
Show that this machine model is universal. Perhaps the simplest way to do so is to use the result of Exercise 4.15 and show that the three-pebble machine can simulate the two-register RAM. An alternative is to begin by using a five-pebble machine (otherwise identical to the model described here), show how to use it to simulate a conventional off-line Turing machine, then complete the proof by simulating the five-pebble machine on the three-pebble machine by using prime encoding.

Exercise 4.17 Consider enhancing the finite automaton model with a form of storage. Specifically, we shall add a queue and allow the finite automaton to remove and read the character at the head of the queue as well as to add a character to the tail of the queue. The character read or added can be chosen from the queue alphabet (which would typically include the input alphabet and some additional symbols) or it can be the empty string (if the queue is empty or if we do not want to add a character to the queue).

Thus the transition function of our enhanced finite automaton now maps a triple (state, input character, queue character) to a pair (state, queue character). Show that this machine model is universal.

Exercise 4.18 Repeat the previous exercise, but now add two stacks rather than one queue; thus the transition function now maps a quadruple (state, input character, first stack character, second stack character) to a triple (state, first stack character, second stack character).

Exercise 4.19 (Refer to the previous two exercises.) Devise suitable measures of time and space for the enhanced finite-automaton models (with a queue and with two stacks). Verify that your measures respect the basic relationships between time and space and that the translation costs obey the polynomial (for time) and linear (for space) relationships discussed in the text.

Exercise 4.20 A Post system is a collection of rules for manipulating natural numbers, together with a set of conventions on how this collection is to be used. Each rule is an equation of the form

ax + b → cx + d

where a, b, c, and d are natural numbers (possibly 0) and x is a variable over the natural numbers. Given a natural number n, a rule ax + b → cx + d can be applied to n if we have n = ax0 + b for some x0, in which case applying the rule yields the new natural number cx0 + d. For instance, the rule 2x + 5 → x + 9 can be applied to 11, since 11 can be written as 2 · 3 + 5; applying the rule yields the new number 3 + 9 = 12.
Since a Post system contains some arbitrary number of rules, it may well be that several rules can apply to the same natural number, yielding a set of new natural numbers. In turn, these numbers can be transformed through rules to yield more numbers, and so on. Thus a Post system can be viewed as computing a map $f: \mathbb{N} \to 2^{\mathbb{N}}$, where the subset produced contains all natural numbers that can be derived from the given argument by using zero or more applications of the rules.
To view a Post system as a computing engine for partial functions mapping N to N, we need to impose additional conventions. While any number of conventions will work, perhaps the simplest is to order the rules and require that the first applicable rule be used. In that way, the starting number is transformed in just one new number, which, in turn, is transformed into just one new number, and so on.

Some combinations of rules and initial numbers will then give rise to infinite series of applications of rules (some rule or other always applies to the current number), while others will terminate. At termination, the current number (to which no rule applies) is taken to be the output of the computation. Under this convention, the Post system computes a partial function.
1. Give a type of rule that, once used, will always remain applicable.
2. Give a system that always stops, yet does something useful.
3. Give a system that will transform $2^n 3^m$ into $2^{m+n}$ and stop (although it may not stop on inputs of different form), thereby implementing addition through prime encoding.
4. Give a system that will transform $2^n 3^m$ into $2^{mn}$ and stop (although it may not stop on inputs of different form), thereby implementing multiplication through prime encoding.

Exercise 4.21* (Refer to the previous exercise.) Use the idea of prime encoding to prove that our version of Post systems is a universal model of computation. The crux of the problem is how to handle tests for zero: this is where the additive term in the rules comes into play. You may want to use the two-register RAM of Exercise 4.15.

4.6 Bibliography

Models of computation were first proposed in the 1930s in the context of computability theory; see Machtey and Young [1978] and Hopcroft and Ullman [1979]. More recent proposals include a variety of RAM models; most interesting are those of Aho, Hopcroft, and Ullman [1974] and Schönhage [1980]. Our RAM model is a simplified version derived from the computability model of Shepherdson and Sturgis [1963]. A thorough discussion of machine models and their simulation is given by van Emde Boas [1990]; the reader will note that, although more efficient simulations than ours have been developed, all existing simulations between reasonable models still require a supralinear increase in time complexity, so that our development of model-independent classes remains unaffected. Time and space as complexity measures were established early; the aforementioned references all discuss such measures and how the choice of a model affects them.

CHAPTER 5
Computability Theory

Computability can be studied with any of the many universal models of computation. However, it is best studied with mathematical tools and thus best based on the most mathematical of the universal models of computation, the partial recursive functions. We introduce partial recursive functions by starting with the simpler primitive recursive functions. We then build up to the partial recursive functions and recursively enumerable (r.e.) sets and make the connection between r.e. sets and Turing machines. Finally, we use partial recursive functions to prove two of the fundamental results in computability theory: Rice's theorem and the recursion (or fixed-point) theorem.

Throughout this chapter, we limit our alphabet to one character, a; thus any string we consider is from $\{a\}^*$. Working with some richer alphabet would not gain us any further insight, yet would involve more details and cases. Working with a one-symbol alphabet is equivalent to working with natural numbers represented in base 1. Thus, in the following, instead of using the strings ε, a, aa, ..., $a^k$, we often use the numbers 0, 1, 2, ..., k; similarly, instead of writing ya for inductions, we often write n + 1, where we have n = |y|.

One difficulty that we encounter in studying computability theory is the tangled relationship between mathematical functions that are computable and the programs that compute them. A partial recursive function is a computing tool and thus a form of program. However, we identify partial recursive functions with the mathematical (partial) functions that they embody and thus also speak of a partial recursive function as a mathematical function that can be computed through a partial recursive implementation.

computed through an infinite number of different partial recursive functions (a behavior we would certainly expect in any programming language, since we can always pad an existing program with useless statements that do not affect the result of the computation), so that the correspondence is not one-to-one. Moving back and forth between the two universes is often the key to proving results in computability theory; we must continuously be aware of the type of "function" under discussion.

5.1 Primitive Recursive Functions

Primitive recursive functions are built from a small collection of base functions through two simple mechanisms: one a type of generalized function composition and the other a "primitive recursion," that is, a limited type of recursive (inductive) definition. In spite of the limited scope of primitive recursive functions, most of the functions that we normally encounter are, in fact, primitive recursive; indeed, it is not easy to define a total function that is not primitive recursive.

5.1.1 Defining Primitive Recursive Functions

We define primitive recursive functions in a constructive manner, by giving base functions and construction schemes that can produce new functions from known ones.

Definition 5.1 The following functions, called the base functions, are primitive recursive:
* Zero: $\mathbb{N} \to \mathbb{N}$ always returns zero, regardless of the value of its argument.
* Succ: $\mathbb{N} \to \mathbb{N}$ adds 1 to the value of its argument.
* $P_i^k$: $\mathbb{N}^k \to \mathbb{N}$ returns the $i$th of its $k$ arguments; this is really a countably infinite family of functions, one for each pair $1 \le i \le k \in \mathbb{N}$.

(Note that $P_1^1(x)$ is just the identity function.) We call these functions primitive recursive simply because we have no doubt of their being easily computable. The functions we have thus defined are formal mathematical functions. We claim that each can easily be computed through a program; therefore we shall identify them with their implementations. Hence the term "primitive recursive function" can denote either a mathematical function or a program for that function. We may think of our base functions as the
fundamental statements in a functional programming language and thus think of them as unique. Semantically, we interpret $P_i^k$ to return its $i$th argument without having evaluated the other $k - 1$ arguments at all, a convention that will turn out to be very useful.

Our choice of base functions is naturally somewhat arbitrary, but it is motivated by two factors: the need for basic arithmetic and the need to handle functions with several arguments. Our first two base functions give us a foundation for natural numbers; all we now need to create arbitrary natural numbers is some way to compose the functions. However, we want to go beyond simple composition and we need some type of logical test. Thus we define two mechanisms through which we can combine primitive recursive functions to produce more primitive recursive functions: a type of generalized composition and a type of recursion. The need for the former is evident. The latter gives us a testing capability (base case vs. recursive case) as well as a standard programming tool. However, we severely limit the form that this type of recursion can take to ensure that the result is easily computable.

Definition 5.2 The following construction schemes are primitive recursive:
* Substitution: Let $g$ be a function of $m$ arguments and $h_1, h_2, \ldots, h_m$ be functions of $n$ arguments each; then the function $f$ of $n$ arguments is obtained from $g$ and the $h_i$s by substitution as follows:

$$f(x_1, \ldots, x_n) = g(h_1(x_1, \ldots, x_n), \ldots, h_m(x_1, \ldots, x_n))$$

* Primitive Recursion: Let $g$ be a function of $n - 1$ arguments and $h$ a function of $n + 1$ arguments; then the function $f$ of $n$ arguments is obtained from $g$ and $h$ by primitive recursion as follows:

$$f(0, x_2, \ldots, x_n) = g(x_2, \ldots, x_n)$$
$$f(i + 1, x_2, \ldots, x_n) = h(i, f(i, x_2, \ldots, x_n), x_2, \ldots, x_n)$$

(We used $0$ and $i + 1$ rather than Zero and Succ($i$): the $0$ and the $i + 1$ denote a pattern-matching process in the use of the rules, not applications of the base functions Zero and Succ.) This definition of primitive recursion makes sense only for $n > 1$. If we have $n = 1$, then $g$ is a function of zero arguments, in other words a constant, and the definition then becomes:
* Let $x$ be a constant and $h$ a function of two arguments; then the function $f$ of one argument is obtained from $x$ and $h$ by primitive recursion as follows: $f(0) = x$ and $f(i + 1) = h(i, f(i))$.

Note again that, if a function is derived from easily computable functions by substitution or primitive recursion, it is itself easily computable: it is an easy matter in most programming languages to write code modules that take functions as arguments and return a new function obtained through substitution or primitive recursion. Figure 5.1 gives a programming framework (in Lisp) for each of these two constructions.

    (defun f (g &rest fns)
      "Defines f from g and the h's (grouped into the list fns) through substitution"
      #'(lambda (&rest args)
          (apply g (mapcar #'(lambda (h) (apply h args)) fns))))

    (a) the Lisp code for substitution

    (defun f (g h)
      "Defines f from the base case g and the recursive step h through primitive recursion"
      (labels ((rec (&rest args)
                 (if (zerop (car args))
                     (apply g (cdr args))
                     ;; h receives i, then f(i, x2, ..., xn), then x2, ..., xn
                     (apply h (1- (car args))
                            (apply #'rec (1- (car args)) (cdr args))
                            (cdr args)))))
        #'rec))

    (b) the Lisp code for primitive recursion

Figure 5.1 A programming framework for the primitive recursive construction schemes.

We are now in a position to define formally a primitive recursive function; we do this for the programming object before commenting on the difference between it and the mathematical object.

Definition 5.3 A function (program) is primitive recursive if it is one of the base functions or can be obtained from these base functions through a finite number of applications of substitution and primitive recursion.

The definition reflects the syntactic view of the primitive recursive definition mechanism. A mathematical primitive recursive function is then simply a function that can be implemented with a primitive recursive program; of course, it may also be implemented with a program that uses more powerful construction schemes.
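The base functions and the two construction schemes can also be sketched in Python. This is only an illustration under stated assumptions: the names `zero`, `succ`, `proj`, `substitution`, and `primitive_recursion` are chosen here and do not come from the text, and Python evaluates every argument eagerly, so the convention that a projection does not evaluate its unused arguments is not modeled.

```python
from typing import Callable

Fn = Callable[..., int]

def zero(x: int) -> int:
    """Base function Zero: ignores its argument and returns 0."""
    return 0

def succ(x: int) -> int:
    """Base function Succ: adds 1 to its argument."""
    return x + 1

def proj(i: int, k: int) -> Fn:
    """Base function P_i^k: returns the i-th (1-based) of its k arguments."""
    def p(*args: int) -> int:
        assert len(args) == k, "P_i^k expects exactly k arguments"
        return args[i - 1]
    return p

def substitution(g: Fn, *hs: Fn) -> Fn:
    """f(x1..xn) = g(h1(x1..xn), ..., hm(x1..xn))."""
    def f(*xs: int) -> int:
        return g(*(h(*xs) for h in hs))
    return f

def primitive_recursion(g: Fn, h: Fn) -> Fn:
    """f(0, xs) = g(xs); f(i+1, xs) = h(i, f(i, xs), xs).

    The recursion is unrolled into a loop on the first argument,
    which computes the same values bottom-up.
    """
    def f(i: int, *xs: int) -> int:
        acc = g(*xs)
        for j in range(i):
            acc = h(j, acc, *xs)
        return acc
    return f

# Illustration: apply Succ i times to x, written only with the schemes.
# shift(0, x) = P_1^1(x); shift(i+1, x) = Succ(P_2^3(i, shift(i, x), x))
shift = primitive_recursion(proj(1, 1), substitution(succ, proj(2, 3)))

# The case n = 1: g is a constant (a function of zero arguments).
# double(0) = 0; double(i+1) = Succ(Succ(P_2^2(i, double(i))))
double = primitive_recursion(
    lambda: 0, substitution(succ, substitution(succ, proj(2, 2)))
)
```

With these definitions, `shift(3, 4)` evaluates to 7 and `double(4)` to 8; both functions are built from nothing but the base functions and the two schemes.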

Definition 5.4 A (mathematical) function is primitive recursive if it can be defined through a primitive recursive construction.

Equivalently, we can define the (mathematical) primitive recursive functions to be the smallest family of functions that includes the base functions and is closed under substitution and primitive recursion.

Let us begin our study of primitive recursive functions by showing that the simple function of one argument, dec, which subtracts 1 from its argument (unless, of course, the argument is already 0, in which case it is returned unchanged), is primitive recursive. We define it as

$$\mathrm{dec}(0) = 0$$
$$\mathrm{dec}(i + 1) = P_1^2(i, \mathrm{dec}(i))$$

Note the syntax of the inductive step: we did not just use $\mathrm{dec}(i + 1) = i$ but formally listed all arguments and picked the desired one. This definition is a program for the mathematical function dec in the computing model of primitive recursive functions.

Let us now prove that the concatenation functions are primitive recursive. For that purpose we return to our interpretation of arguments as strings over $\{a\}^*$. The concatenation functions simply take their arguments and concatenate them into a single string; symbolically, we want

$$\mathrm{con}_n(x_1, x_2, \ldots, x_n) = x_1 x_2 \cdots x_n$$

If we know that both $\mathrm{con}_2$ and $\mathrm{con}_n$ are primitive recursive, we can then define the new function $\mathrm{con}_{n+1}$ in a primitive recursive manner as follows:

$$\mathrm{con}_{n+1}(x_1, \ldots, x_{n+1}) = \mathrm{con}_2\bigl(\mathrm{con}_n(P_1^{n+1}(x_1, \ldots, x_{n+1}), \ldots, P_n^{n+1}(x_1, \ldots, x_{n+1})),\; P_{n+1}^{n+1}(x_1, \ldots, x_{n+1})\bigr)$$

Proving that $\mathrm{con}_2$ is primitive recursive is a bit harder because it would seem that the primitive recursion takes place on the "wrong" argument: we need recursion on the second argument, not the first. We get around this problem by first defining the new function $\mathrm{con}'(x_1, x_2) = x_2 x_1$, and then using it to define $\mathrm{con}_2$. We define $\mathrm{con}'$ as follows:

$$\mathrm{con}'(\varepsilon, x) = P_1^1(x)$$
$$\mathrm{con}'(ya, x) = \mathrm{Succ}(P_2^3(y, \mathrm{con}'(y, x), x))$$

Now we can use substitution to define $\mathrm{con}_2(x, y) = \mathrm{con}'(P_2^2(x, y), P_1^2(x, y))$.
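The formal definitions of dec, $\mathrm{con}'$, and $\mathrm{con}_2$ can be run as programs. The sketch below is a minimal illustration, not the text's own code: strings over $\{a\}$ are represented by their lengths, and the helper names (`proj`, `substitution`, `primitive_recursion`) are assumptions made for this example.

```python
def succ(x):
    return x + 1

def proj(i, k):
    # P_i^k: pick the i-th (1-based) of k arguments
    return lambda *args: args[i - 1]

def substitution(g, *hs):
    # f(xs) = g(h1(xs), ..., hm(xs))
    return lambda *xs: g(*(h(*xs) for h in hs))

def primitive_recursion(g, h):
    # f(0, xs) = g(xs); f(i+1, xs) = h(i, f(i, xs), xs)
    def f(i, *xs):
        acc = g(*xs)
        for j in range(i):
            acc = h(j, acc, *xs)
        return acc
    return f

# dec(0) = 0; dec(i+1) = P_1^2(i, dec(i))
dec = primitive_recursion(lambda: 0, proj(1, 2))

# con'(eps, x) = P_1^1(x); con'(ya, x) = Succ(P_2^3(y, con'(y, x), x))
con_prime = primitive_recursion(proj(1, 1), substitution(succ, proj(2, 3)))

# con_2(x, y) = con'(P_2^2(x, y), P_1^2(x, y)): the recursion now runs
# on the second argument of con_2.
con2 = substitution(con_prime, proj(2, 2), proj(1, 2))
```

Since a string of $k$ copies of $a$ stands for the number $k$, `con2(2, 3)` returns 5, the length of $aa$ followed by $aaa$.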

Defining addition is simpler, since we can take immediate advantage of the known properties of addition to shift the recursion onto the first argument and write

$$\mathrm{add}(0, x) = P_1^1(x)$$
$$\mathrm{add}(i + 1, x) = \mathrm{Succ}(P_2^3(i, \mathrm{add}(i, x), x))$$

These very formal definitions are useful to reassure ourselves that the functions are indeed primitive recursive. For the most part, however, we tend to avoid the pedantic use of the $P_i^j$ functions. For instance, we would generally write $\mathrm{con}'(i + 1, x) = \mathrm{Succ}(\mathrm{con}'(i, x))$ rather than the formally correct $\mathrm{con}'(i + 1, x) = \mathrm{Succ}(P_2^3(i, \mathrm{con}'(i, x), x))$.

Exercise 5.1 Before you allow yourself the same liberties, write completely formal definitions of the following functions:
1. the level function lev($x$), which returns 0 if $x$ equals 0 and returns 1 otherwise;
2. its complement is_zero($x$);
3. the function of two arguments minus($x$, $y$), which returns $x - y$ (or 0 whenever $y > x$);
4. the function of two arguments mult($x$, $y$), which returns the product of $x$ and $y$; and,
5. the "guard" function $x \# y$, which returns 0 if $x$ equals 0 and returns $y$ otherwise (verify that it can be defined so as to avoid evaluating $y$ whenever $x$ equals 0).

Equipped with these new functions, we are now able to verify that a given (mathematical) primitive recursive function can be implemented with a large variety of primitive recursive programs. Take, for instance, the simplest primitive recursive function, Zero. The following are just a few (relatively speaking: there is already an infinity of different programs in these few lines) simple primitive recursive programs that all implement this same function:
* Zero($x$)
* minus($x$, $x$)
* dec(Succ(Zero($x$))), which can be expanded to use $k$ consecutive Succ preceded by $k$ consecutive dec, for any $k > 0$
* for any primitive recursive function $f$ of one argument, Zero($f(x)$)
* for any primitive recursive function $f$ of one argument, dec(lev($f(x)$))

The reader can easily add a dozen other programs or families of programs that all return zero on any argument and verify that the same can be done for the other base functions. Thus any built-up function has an infinite number of different programs, simply because we can replace any use of the base functions by any one of the equivalent programs that implement these base functions.

Our trick with the permutation of arguments in defining $\mathrm{con}_2$ from $\mathrm{con}'$ shows that we can move the recursion from the first argument to any chosen argument without affecting closure within the primitive recursive functions. However, it does not yet allow us to do more complex recursion, such as the "course of values" recursion suggested by the definition

$$f(0, x) = g(x)$$
$$f(i + 1, x) = h(i, x, \langle i + 1, f(0, x), f(1, x), \ldots, f(i, x) \rangle_{i+2}) \qquad (5.1)$$

Yet, if the functions $g$ and $h$ are primitive recursive, then $f$ as just defined is also primitive recursive (although the definition we gave is not, of course, entirely primitive recursive). What we need is to show that

$$p(i, x) = \langle i + 1, f(0, x), \ldots, f(i, x) \rangle_{i+2}$$

is primitive recursive whenever $g$ and $h$ are primitive recursive, since the rest of the construction is primitive recursive. Now $p(0, x)$ is just $\langle 1, g(x) \rangle$, which is primitive recursive, since $g$ and pairing are both primitive recursive. The recursive step is a bit longer; note that each of the values $f(0, x), \ldots, f(i, x)$ that appear in it is a component of $p(i, x)$ and so can be extracted from $p(i, x)$ by projection:

$$\begin{aligned}
p(i + 1, x) &= \langle i + 2, f(0, x), \ldots, f(i, x), f(i + 1, x) \rangle_{i+3} \\
&= \langle i + 2, f(0, x), \ldots, f(i, x), h(i, x, \langle i + 1, f(0, x), \ldots, f(i, x) \rangle_{i+2}) \rangle_{i+3} \\
&= \langle i + 2, f(0, x), \ldots, f(i, x), h(i, x, p(i, x)) \rangle_{i+3}
\end{aligned}$$
and now we are done, since this last definition is a valid use of primitive recursion.

Exercise 5.2 Present a completely formal primitive recursive definition of $f$, using projection functions as necessary.

We need to establish some other definitional mechanisms in order to make it easier to "program" with primitive recursive functions. For instance, it would be helpful to have a way to define functions by cases. For that, we first need to define an "if . . . then . . . else . . ." construction, for which, in turn, we need the notion of a predicate. In mathematics, a predicate on some universe $S$ is simply a subset of $S$ (the predicate is true on the members of the subset, false elsewhere). To identify membership in such a subset, mathematics uses a characteristic function, which takes the value 1 on the members of the subset, 0 elsewhere. In our universe, if given some predicate $P$ of $n$ variables, we define its characteristic function as follows:

$$c_P(x_1, \ldots, x_n) = \begin{cases} 1 & \text{if } (x_1, \ldots, x_n) \in P \\ 0 & \text{if } (x_1, \ldots, x_n) \notin P \end{cases}$$

We say that a predicate is primitive recursive if and only if its characteristic function can be defined in a primitive recursive manner.

Lemma 5.1 If $P$ and $Q$ are primitive recursive predicates, so are their negation, logical or, and logical and.

Proof

$$c_{\mathrm{not}\,P}(x_1, \ldots, x_n) = \mathrm{is\_zero}(c_P(x_1, \ldots, x_n))$$
$$c_{P\,\mathrm{or}\,Q}(x_1, \ldots, x_n) = \mathrm{lev}(\mathrm{con}_2(c_P(x_1, \ldots, x_n), c_Q(x_1, \ldots, x_n)))$$
$$c_{P\,\mathrm{and}\,Q}(x_1, \ldots, x_n) = \mathrm{dec}(\mathrm{con}_2(c_P(x_1, \ldots, x_n), c_Q(x_1, \ldots, x_n)))$$

Q.E.D.

Exercise 5.3 Verify that definition by cases is primitive recursive. That is, given primitive recursive functions $g$ and $h$ and primitive recursive predicate $P$, the new function $f$ defined by

$$f(x_1, \ldots, x_n) = \begin{cases} g(x_1, \ldots, x_n) & \text{if } P(x_1, \ldots, x_n) \\ h(x_1, \ldots, x_n) & \text{otherwise} \end{cases}$$

is also primitive recursive. (We can easily generalize this definition to multiple disjoint predicates defining multiple cases.) Further verify that this definition can be made so as to avoid evaluation of the function(s) specified for the case(s) ruled out by the predicate.
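The three identities in the proof of Lemma 5.1 can be checked mechanically. The following sketch assumes that $\mathrm{con}_2$ on unary strings is ordinary addition of lengths; `lev`, `is_zero`, and `dec` are written directly in Python rather than derived formally, and the two sample predicates (`c_even`, `c_small`) are hypothetical, chosen only to exercise the truth tables.

```python
def lev(x):
    # level function: 0 on 0, 1 elsewhere
    return 0 if x == 0 else 1

def is_zero(x):
    # complement of lev
    return 1 - lev(x)

def dec(x):
    # subtract 1, but leave 0 unchanged
    return x - 1 if x > 0 else 0

def c_not(c_p):
    # characteristic function of "not P"
    return lambda *xs: is_zero(c_p(*xs))

def c_or(c_p, c_q):
    # lev(con_2(c_P, c_Q)); con_2 on unary strings adds lengths
    return lambda *xs: lev(c_p(*xs) + c_q(*xs))

def c_and(c_p, c_q):
    # dec(con_2(c_P, c_Q)): the sum is 2 exactly when both hold
    return lambda *xs: dec(c_p(*xs) + c_q(*xs))

# Two hypothetical sample predicates, given by characteristic functions.
c_even = lambda x: 1 if x % 2 == 0 else 0
c_small = lambda x: 1 if x < 10 else 0

even_or_small = c_or(c_even, c_small)
even_and_small = c_and(c_even, c_small)
odd = c_not(c_even)
```

The conjunction works because the sum of two characteristic values is 0, 1, or 2, and dec maps these to 0, 0, and 1 respectively.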

Somewhat more interesting is to show that, if $P$ is a primitive recursive predicate, so are the two bounded quantifiers

$$\exists y \le x \, [P(y, z_1, \ldots, z_n)]$$

which is true if and only if there exists some number $y \le x$ such that $P(y, z_1, \ldots, z_n)$ is true, and

$$\forall y \le x \, [P(y, z_1, \ldots, z_n)]$$

which is true if and only if $P(y, z_1, \ldots, z_n)$ holds for all initial values $y \le x$.

Exercise 5.4 Verify that the primitive recursive functions are closed under the bounded quantifiers. Use primitive recursion to sweep all values $y \le x$ and logical connectives to construct the answer.

Equipped with these construction mechanisms, we can develop our inventory of primitive recursive functions; indeed, most functions with which we are familiar are primitive recursive.

Exercise 5.5 Using the various constructors of the last few exercises, prove that the following predicates and functions are primitive recursive:
* $f(x, z_1, \ldots, z_n) = \min y \le x \, [P(y, z_1, \ldots, z_n)]$ returns the smallest $y$ no larger than $x$ such that the predicate $P$ is true; if no such $y$ exists, the function returns $x + 1$.
* $x \le y$, true if and only if $x$ is no larger than $y$.
* $x \mid y$, true if and only if $x$ divides $y$ exactly.
* is_prime($x$), true if and only if $x$ is prime.
* prime($x$) returns the $x$th prime.

We should by now have justified our claim that most familiar functions are primitive recursive. Indeed, we have not yet seen any function that is not primitive recursive, although the existence of such functions can be easily established by using diagonalization, as we now proceed to do. Our definition scheme for the primitive recursive functions (viewed as programs) shows that they can be enumerated: we can easily enumerate the base functions and all other programs are built through some finite number of applications of the construction schemes, so that we can enumerate them all.

Exercise 5.6 Verify this assertion. Use pairing functions and assign a unique code to each type of base function and each construction scheme. For instance, we can assign the code 0 to the base function Zero, the code 1
to the base function Succ, and the code 2 to the family $\{P_i^j\}$, encoding a specific function $P_i^j$ as $\langle 2, i, j \rangle_3$. Then we can assign code 3 to substitution and code 4 to primitive recursion and thus encode a specific application of substitution

$$f(x_1, \ldots, x_n) = g(h_1(x_1, \ldots, x_n), \ldots, h_m(x_1, \ldots, x_n))$$

where function $g$ has code $c_g$ and function $h_i$ has code $c_i$ for each $i$, by

$$\langle 3, m, c_g, c_1, \ldots, c_m \rangle_{m+3}$$

Encoding a specific application of primitive recursion is done in a similar way. When getting a code $c$, we can start taking it apart. We first look at $\Pi_1(c)$, which must be a number between 0 and 4 in order for $c$ to be the code of a primitive recursive function; if it is between 0 and 2, we have a base function, otherwise we have a construction scheme. If $\Pi_1(c)$ equals 3, we know that the outermost construction is a substitution and can obtain the number of arguments ($m$ in our definition) as $\Pi_1(\Pi_2(c))$, the code for the composing function ($g$ in our definition) as $\Pi_1(\Pi_2(\Pi_2(c)))$, and so forth. Further decoding thus recovers the complete definition of the function encoded by $c$ whenever $c$ is a valid code. Now we can enumerate all (definitions of) primitive recursive functions by looking at each successive natural number, deciding whether or not it is a valid code, and, if so, printing the definition of the corresponding primitive recursive function. This enumeration lists all possible definitions of primitive recursive functions, so that the same mathematical function will appear infinitely often in the enumeration (as we saw for the mathematical function that returns zero for any value of its argument).

Thus we can enumerate the (programs implementing the) primitive recursive functions. We now use diagonalization to construct a new function that cannot be in the enumeration (and thus cannot be primitive recursive) but is easily computable because it is defined through a program. Let the primitive recursive functions in our enumeration be named $f_0$, $f_1$, $f_2$, etc.; we define the new function $g$ with $g(k) = \mathrm{Succ}(f_k(k))$. This function provides effective diagonalization since it differs from $f_k$ at least in the value it returns on argument $k$; thus $g$ is clearly not primitive recursive. However, it is also clear that $g$ is easily computable once the enumeration scheme is known, since each of the $f_i$s is itself easily computable. We conclude that there exist computable functions that are not primitive recursive.
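The diagonal construction itself is only a few lines of code once an enumeration is available. The sketch below does not decode the codes of Exercise 5.6; instead it uses a hypothetical stand-in enumeration `f(k)` (here the function $x \mapsto kx + k$) purely to show why $g$ must differ from every listed function.

```python
def f(k):
    """Hypothetical stand-in for the k-th function of an enumeration.

    A real enumeration would decode k into a primitive recursive
    definition; any total, computable listing serves for this sketch.
    """
    return lambda x: k * x + k

def g(k):
    # g(k) = Succ(f_k(k)): differs from f_k at argument k
    return f(k)(k) + 1

def first_disagreement(k):
    """Return an argument on which g and f_k disagree (namely k itself)."""
    return k if g(k) != f(k)(k) else None
```

Whatever enumeration replaces the stand-in, `g(k)` and `f(k)(k)` differ by exactly 1, so `g` cannot coincide with any `f(k)`.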

5.1.2 Ackermann's Function and the Grzegorczyk¹ Hierarchy

¹ Grzegorczyk is pronounced (approximately) g'zhuh-gore-chick.

It remains to identify a specific computable function that is not primitive recursive, something that diagonalization cannot do. We now proceed to define such a function and prove that it grows too fast to be primitive recursive. Let us define the following family of functions:
* the first function iterates the successor:

$$f_1(0, x) = x$$
$$f_1(i + 1, x) = \mathrm{Succ}(f_1(i, x))$$

* in general, the $(n + 1)$st function (for $n \ge 1$) is defined in terms of the $n$th function:

$$f_{n+1}(0, x) = f_n(x, x)$$
$$f_{n+1}(i + 1, x) = f_n(f_{n+1}(i, x), x)$$

In essence, Succ acts like a one-argument $f_0$ and forms the basis for this family. Thus $f_0(x)$ is just $x + 1$, $f_1(x, y)$ is just $x + y$, $f_2(x, y)$ is just $(x + 2) \cdot y$, and $f_3(x, y)$, although rather complex, grows as $y^{x+3}$.

Exercise 5.7 Verify that each $f_i$ is a primitive recursive function.

Consider the new function $F(x) = f_x(x, x)$, with $F(0) = 1$. It is perfectly well defined and easily computable through a simple (if highly recursive) program, but we claim that it cannot be primitive recursive. To prove this claim, we proceed in two steps: we prove first that every primitive recursive function is bounded by some $f_i$, and then that $F$ grows faster than any $f_i$. (We ignore the "details" of the number of arguments of each function. We could fake the number of arguments by adding dummy ones that get ignored or by repeating the same argument as needed or by pairing all arguments into a single argument.) The second part is essentially trivial, since $F$ has been built for that purpose: it is enough to observe that $f_{i+1}$ grows faster than $f_i$. The first part is more challenging; we use induction on the number of applications of construction schemes (composition or primitive recursion) used in the definition of primitive recursive functions. The base case requires a proof that $f_1$ grows as fast as any of the base functions (Zero, Succ, and $P_i^j$). The inductive step requires a proof that, if $h$ is defined through one application of either substitution or primitive recursion from
some other primitive recursive functions $g_i$s, each of which is bounded by $f_k$, then $h$ is itself bounded by some $f_l$. Basically, the $f_i$ functions have that bounding property because $f_{i+1}$ is defined from $f_i$ by primitive recursion without "wasting any power" in the definition, i.e., without losing any opportunity to make $f_{i+1}$ grow. To define $f_{n+1}(i + 1, x)$, we used the two arguments allowable in the recursion, namely, $x$ and the recursive call $f_{n+1}(i, x)$, and we fed these two arguments to what we knew by inductive hypothesis to be the fastest-growing primitive recursive function defined so far, namely $f_n$. The details of the proof are now mechanical.

$F$ is one of the many ways of defining Ackermann's function (also called Péter's or Ackermann-Péter's function). We can also give a single recursive definition of a similar version of Ackermann's function if we allow multiple, rather than primitive recursion:

$$A(0, n) = \mathrm{Succ}(n)$$
$$A(m + 1, 0) = A(m, 1)$$
$$A(m + 1, n + 1) = A(m, A(m + 1, n))$$

The third statement (the general case) uses double, nested recursion. Then $A(n, n)$ behaves much as our $F(n)$ (although its exact values differ, its growth rate is the same). From our previous results, we conclude that primitive recursive functions are not closed under this type of construction scheme.

An interesting aspect of the difference between primitive and generalized recursion can be brought to light graphically: consider defining a function of two arguments $f(i, j)$ through recursion and mentally prepare a table of all values of $f(i, j)$, one row for each value of $i$ and one column for each value of $j$. In computing the value of $f(i, j)$, a primitive recursive scheme allows only the use of previous rows, but there is no reason why we should not also be able to use previous columns in the current row. Moreover, the primitive recursive scheme forces the use of values on previous rows in a monotonic order: the computation must proceed from one row to the previous and cannot later use a value from an "out-of-order" row. Again, there is no reason why we should not be able to use previously computed values (prior rows and columns) in any order, something that nested recursion does.

Thus not every function is primitive recursive; moreover, primitive recursive functions can grow only so fast. Our family of functions $f_i$ includes functions that grow extremely fast (basically, $f_1$ acts much like addition, $f_2$ like multiplication, $f_3$ like exponentiation, $f_4$ like a tower of exponents, and so on), yet not fast enough, since $F$ grows much faster yet.
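Both definitions translate directly into short programs, which makes the growth easy to observe on very small arguments. The sketch below is an illustration with names chosen here (`f`, `F`, `A`); it keeps to tiny inputs, since the values and the recursion depth explode almost immediately.

```python
def f(n, i, x):
    """The family f_n(i, x) for n >= 1.

    f_1(0, x) = x;              f_1(i+1, x) = Succ(f_1(i, x))
    f_{n+1}(0, x) = f_n(x, x);  f_{n+1}(i+1, x) = f_n(f_{n+1}(i, x), x)
    The recursion on i is unrolled into a loop.
    """
    if n == 1:
        return i + x            # i applications of Succ to x
    acc = f(n - 1, x, x)        # base case of the recursion on i
    for _ in range(i):
        acc = f(n - 1, acc, x)  # recursive step
    return acc

def F(x):
    """The capping function F(x) = f_x(x, x), with F(0) = 1."""
    return 1 if x == 0 else f(x, x, x)

def A(m, n):
    """The doubly recursive definition A(m, n) quoted above."""
    if m == 0:
        return n + 1
    if n == 0:
        return A(m - 1, 1)
    return A(m - 1, A(m, n - 1))
```

For instance, `f(2, 3, 5)` is $(3 + 2) \cdot 5 = 25$, `F(2)` is 8, and `F(3)` is already 483; `F(4)` is far beyond practical reach. In the same way `A(2, 3)` is 9 and `A(3, 3)` is 61, while `A(4, 2)` already has thousands of digits.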

Note also that we have claimed that primitive recursive functions are very easy to compute, which may be doubtful in the case of, say, $f_{1000}(x)$. Yet again, $F(x)$ would be much harder to compute, even though we can certainly write a very concise program to compute it.

As we defined it, Ackermann's function is an example of a completion. We have an infinite family of functions $\{f_i \mid i \in \mathbb{N}\}$ and we "cap" it (complete it, but "capping" also connotes the fact that the completion grows faster than any function in the family) by Ackermann's function, which behaves on each successive argument like the next larger function in the family.

An amusing exercise is to resume the process of construction once we have Ackermann's function, $F$. That is, we proceed to define a new family of functions $\{g_i\}$ exactly as we defined the family $\{f_i\}$, except that, where we used Succ as our base function before, we now use $F$:
* $g_1(0, x) = x$ and $g_1(i + 1, x) = F(g_1(i, x))$;
* in general, $g_{n+1}(0, x) = g_n(x, x)$ and $g_{n+1}(i + 1, x) = g_n(g_{n+1}(i, x), x)$.

Now $F$ acts like a one-argument $g_0$; all successive $g_i$s grow increasingly faster, of course. We can once again repeat our capping definition and define $G(x) = g_x(x, x)$, with $G(0) = 1$. The new function $G$ is now a type of super-Ackermann's function: it is to Ackermann's function what Ackermann's function is to the Succ function and thus grows mind-bogglingly fast! Yet we can repeat the process and define a new family $\{h_i\}$ based on the function $G$, and then cap it with a new function $H$; indeed, we can repeat this process ad infinitum to obtain an infinite collection of infinite families of functions, each capped with its own one-argument function.

Now we can consider the family of functions $\{\mathrm{Succ}, F, G, H, \ldots\}$, call them $\{\phi_0, \phi_1, \phi_2, \phi_3, \ldots\}$, and cap that family by $\Phi(x) = \phi_x(x)$. Thus $\Phi(0)$ is just $\mathrm{Succ}(0) = 1$, while $\Phi(1)$ is $F(1) = f_1(1, 1) = 2$, and $\Phi(2)$ is $G(2)$, which entirely defies description. You can verify quickly that $G(2)$ is $g_1(g_1(g_1(2, 2), 2), 2) = g_1(g_1(F(8), 2), 2)$, which is $g_1(F(F(F(\ldots F(F(2)) \ldots))), 2)$ with $F(8)$ nestings (and then the last call to $g_1$ iterates $F$ again for a number of nestings equal to the value of $F(F(F(\ldots F(F(2)) \ldots)))$ with $F(8)$ nestings)!

If you are not yet tired and still believe that such incredibly fast-growing functions and incredibly large numbers can exist, we can continue: make $\Phi$ the basis for a whole new process of generation, as Succ was first used. After generating again an infinite family of infinite families, we can again cap the whole construction with, say, $\Psi$. Then, of course, we can repeat the process, obtaining another two levels of families capped with, say, $\Xi$. But observe that we are now in the process of generating a brand new infinite family at a brand new level, namely the family $\{\Phi, \Psi, \Xi, \ldots\}$, so we can cap that family in turn and, well, you get the idea; this process can continue forever and create higher and higher levels of completion. The resulting rich hierarchy is
known as the Grzegorczyk hierarchy. Note that, no matter how fast any of these functions grows, it is always computable, at least in theory. Certainly, we can write a fairly concise but very highly recursive computer program that will compute the value of any of these functions on any argument. (For any but the most trivial functions in this hierarchy, it will take all the semiconductor memory ever produced and several trillions of years just to compute the value on argument 2, but it is theoretically doable.) Rather astoundingly, after this dazzling hierarchy, we shall see in Section 5.6 that there exist functions (the so-called "busy beaver" functions) that grow very much faster than any function in the Grzegorczyk hierarchy; so fast, in fact, that they are provably uncomputable . . . food for thought.

5.2 Partial Recursive Functions

Since we are interested in characterizing computable functions (those that can be computed by, say, a Turing machine) and since primitive recursive functions, although computable, do not account for all computable functions, we may be tempted to add some new scheme for constructing functions and thus enlarge our set of functions beyond the primitive recursive ones. However, we would do well to consider what we have so far learned and done.

As we have seen, as soon as we enumerate total functions (be they primitive recursive or of some other type), we can use this enumeration to build a new function by diagonalization; this function will be total and computable but, by construction, will not appear in the enumeration. It follows that, in order to account for all computable functions, we must make room for partial functions, that is, functions that are not defined for every input argument. This makes sense in terms of computing as well: not all programs terminate under all inputs; under certain inputs they may enter an infinite loop and thus never return a value. Yet, of course, whatever a program computes is, by definition, computable!

When working with partial functions, we need to be careful about what we mean by using various construction schemes (such as substitution, primitive recursion, definition by cases, etc.) and predicates (such as equality). We say that two partial functions are equal whenever they are defined on exactly the same arguments and, for those arguments, return the same values. When a new partial function is built from existing partial functions, the new function will be defined only on arguments on which all functions used in the construction are defined. In particular, if some
partial function $\phi$ is defined by recursion and diverges (is undefined) at $(y, x_1, \ldots, x_n)$, then it also diverges at $(z, x_1, \ldots, x_n)$ for all $z \ge y$. If $\phi(x)$ converges, we write $\phi(x)\downarrow$; if it diverges, we write $\phi(x)\uparrow$.

We are now ready to introduce our third formal scheme for constructing computable functions. Unlike our previous two schemes, this one can construct partial functions even out of total ones. This new scheme is most often called $\mu$-recursion, although it is defined formally as an unbounded search for a minimum. That is, the new function is defined as the smallest value for some argument of a given function to cause that given function to return 0. (The choice of a test for zero is arbitrary: any other recursive predicate on the value returned by the function would do equally well. Indeed, converting from one recursive predicate to another is no problem.)

Definition 5.5 The following construction scheme is partial recursive:
* Minimization or $\mu$-Recursion: If $\psi$ is some (partial) function of $n + 1$ arguments, then $\phi$, a (partial) function of $n$ arguments, is obtained from $\psi$ by minimization if
  * $\phi(x_1, \ldots, x_n)$ is defined if and only if there exists some $m \in \mathbb{N}$ such that, for all $p$, $0 \le p \le m$, $\psi(p, x_1, \ldots, x_n)$ is defined and $\psi(m, x_1, \ldots, x_n)$ equals 0; and,
  * whenever $\phi(x_1, \ldots, x_n)$ is defined, then $\phi(x_1, \ldots, x_n)$ equals $q$, where $q$ is the least such $m$.

We then write $\phi(x_1, \ldots, x_n) = \mu y \, [\psi(y, x_1, \ldots, x_n) = 0]$.

Like our previous construction schemes, this one is easily computable: there is no difficulty in writing a short program that will cycle through increasingly larger values of $y$ and evaluate $\psi$ for each, looking for a value of 0. Figure 5.2 gives a programming framework (in Lisp) for this construction.

    (defun phi (psi)
      "Defines phi from psi through mu-recursion"
      #'(lambda (&rest args)
          (labels ((f (i)
                     ;; try i = 0, 1, 2, ... until psi returns 0
                     (if (zerop (apply psi i args))
                         i
                         (f (1+ i)))))
            (f 0))))

Figure 5.2 A programming framework for $\mu$-recursion.

Unlike our previous schemes, however, this one, even when
3
Relationships among classes of functions. say {f. primitive cD recursion.6 A partialrecursive function is either one of the three base functions (Zero.
started with a total *l. partial recursive functions are enumerable: we can extend the encoding scheme of Exercise 5. . In consequence. may not define values of 4 for each combination of arguments. Thus the total functions cannot be enumerated. total recursive functions cannot be enumerated. We shall see a proof later in this chapter but for now content ourselves with remarking that such an enumeration would apparently require the ability to decide whether or not an arbitrary partial recursive function is total-that is. is subject to diagonalization and thus incomplete. since we can always define the new total function g(n) = fn (n) + 1 that does not appear in the enumeration. Definition 5. ). Exercise 5. and. and Li-recursion. Unlike partial recursive functions. or {P/I) or a function constructed from these base functions through a finite number of applications of substitution.136
Computability Theory
all partialfunctions
Figure 5.8 We remarked earlier that any attempted enumeration of total functions... Figure 5. Why does this line of reasoning not apply directly to the recursive functions? c1
. something we have noted cannot be done. whether or not the program halts under all inputs.3 illustrates the relationships among the various classes of functions (from N to N ) discussed so far-from the uncountable set of all partial functions down to the enumerable set of primitive recursive functions. Whenever an m does not exist. f2. . fittingly. our simple program diverges: it loops through increasingly large ys and never stops.6 to include Ai-recursion. Succ. we shall call it a total recursive function or simply a recursive function. If the function also happens to be total. the value of 0 is undefined.
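The minimization scheme of Definition 5.5 can also be sketched in Python. This is only an illustration under stated assumptions: the names `mu`, `ceil_sqrt`, and `minus` are hypothetical, ordinary Python functions stand in for (partial) recursive functions, and divergence shows up as a loop that never returns.

```python
from typing import Callable

def mu(psi: Callable[..., int]) -> Callable[..., int]:
    """Return phi with phi(x1..xn) = least y such that psi(y, x1..xn) == 0.

    If no such y exists, phi loops forever, mirroring the divergence
    of a partial recursive function.
    """
    def phi(*args: int) -> int:
        y = 0
        while psi(y, *args) != 0:   # unbounded search for a zero of psi
            y += 1
        return y
    return phi

# Square root rounded up: psi(y, x) is 0 exactly when y*y >= x.
# psi is total and a zero always exists, so ceil_sqrt is total.
ceil_sqrt = mu(lambda y, x: 0 if y * y >= x else 1)

# Subtraction on the naturals: psi(d, x, y) is 0 exactly when y + d == x.
# psi is total, yet minus(x, y) diverges whenever y > x.
minus = mu(lambda d, x, y: 0 if y + d == x else 1)
```

The second example shows the point made in the text: a total $\psi$ can yield a partial $\phi$, so `minus(3, 7)` should never be called in a test.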

5.3 Arithmetization: Encoding a Turing Machine

We claim that partial recursive functions characterize exactly the same set of computable functions as do Turing machine or RAM computations. The proof is not particularly hard. Basically, as in our simulation of RAMs by Turing machines and of Turing machines by RAMs, we need to "simulate" a Turing machine or RAM with a partial recursive function. The other direction is trivial and already informally proved by our observation that each construction scheme is easily computable. However, our simulation this time introduces a new element: whereas we had simulated a Turing machine by constructing an equivalent RAM and thus had established a correspondence between the set of all Turing machines and the set of all RAMs, we shall now demonstrate that any Turing machine can be simulated by a single partial recursive function. This function takes as arguments a description of the Turing machine and of the arguments that would be fed to the machine; it returns the value that the Turing machine would return for these arguments. Thus one result of this endeavor will be the production of a code for the Turing machine or RAM at hand. This encoding in many ways resembles the codes for primitive recursive functions of Exercise 5.6, although it goes beyond a static description of a function to a complete description of the functioning of a Turing machine. This encoding is often called arithmetization or Gödel numbering, since Gödel first demonstrated the uses of such encodings in his work on the completeness and consistency of logical systems. A more important result is the construction of a universal function: the one partial recursive function we shall build can simulate any Turing machine and thus can carry out any computation whatsoever. Whereas our models to date have all been turnkey machines built to compute just one function, this function is the equivalent of a stored-program computer.

We choose to encode a Turing machine; encoding a RAM is similar, with a few more details since the RAM model is somewhat more complex than the Turing machine model. Since we know that deterministic Turing machines and nondeterministic Turing machines are equivalent, we choose the simplest version of deterministic Turing machines to encode. We consider only deterministic Turing machines with a unique halt state (a state with no transition out of it) and with fully specified transitions out of all other states; furthermore, our deterministic Turing machines will have a tape alphabet of one character plus the blank, $\Sigma = \{c, \_\}$. Again, the choice of a one-character alphabet does not limit what the machine can compute, although, of course, it may make the computation extremely inefficient.

Since we are concerned for now with computability, not complexity, a one-character alphabet is perfectly suitable. We number the states so that the start state comes first and the halt state last. We assume that our deterministic Turing machine is started in state 1, with its head positioned on the first square of the input. When it reaches the halt state, the output is the string that starts at the square under the head and continues to the first blank on the right.

In order to encode a Turing machine, we need to describe its finite-state control. (Its current tape contents, head position, and control state are not part of the description of the Turing machine itself but are part of the description of a step in the computation carried out by the Turing machine on a particular argument.) Since every state except the halt state has fully specified transitions, there will be two transitions for each state: one for $c$ and one for $\_$. If the Turing machine has the two entries $\delta(q_i, c) = (q_j, c', L/R)$ and $\delta(q_i, \_) = (q_k, c'', L/R)$, where $c'$ and $c''$ are alphabet characters, we code this pair of transitions as

$$D_i = \langle \langle j, c', L/R \rangle_3, \langle k, c'', L/R \rangle_3 \rangle$$

In order to use the pairing functions, we assign numerical codes to the alphabet characters, say 0 to $\_$ and 1 to $c$, as well as to the L/R directions, say 0 to L and 1 to R. Now we encode the entire transition table for a machine of $n + 1$ states (where the $(n + 1)$st state is the halt state) as

$$D = \langle n, \langle D_1, \ldots, D_n \rangle_n \rangle$$

Naturally, this encoding, while injective, is not surjective: most natural numbers are not valid codes. This is not a problem: we simply consider every natural number that is not a valid code as corresponding to the totally undefined function (e.g., a Turing machine that loops forever in a couple of states). However, we do need a predicate to recognize a valid code; in order to build such a predicate, we define a series of useful primitive recursive predicates and functions, beginning with self-explanatory decoding functions:

* $\text{nbr\_states}(x) = \text{Succ}(\Pi_1(x))$
* $\text{table}(x) = \Pi_2(x)$
* $\text{trans}(x, i) = \Pi_i^{\Pi_1(x)}(\text{table}(x))$
* $\text{triple}(x, i, 1) = \Pi_1(\text{trans}(x, i))$
* $\text{triple}(x, i, 0) = \Pi_2(\text{trans}(x, i))$

All are clearly primitive recursive. In view of our definitions of the $\Pi$ functions, these various functions are well defined for any $x$, although what they recover from values of $x$ that do not correspond to encodings cannot be characterized. Our predicates will thus define expectations for valid encodings in terms of these various decoding functions. Define the helper predicates $\text{is\_move}(x) = [x = 0] \lor [x = 1]$, $\text{is\_char}(x) = [x = 0] \lor [x = 1]$, and $\text{is\_bounded}(i, n) = [1 \le i \le n]$, all clearly primitive recursive. Now define the predicate

$$\text{is\_triple}(z, n) = \text{is\_bounded}(\Pi_1^3(z), \text{Succ}(n)) \land \text{is\_char}(\Pi_2^3(z)) \land \text{is\_move}(\Pi_3^3(z))$$

which checks that an argument $z$ represents a valid triple in a machine with $n + 1$ states by verifying that the next state, new character, and head move are all well defined. Using this predicate, we can build one that checks that a state is well defined, i.e., that a member of the pairing in the second part of $D$ encodes valid transitions, as follows:

$$\text{is\_trans}(z, n) = \text{is\_triple}(\Pi_1(z), n) \land \text{is\_triple}(\Pi_2(z), n)$$

Now we need to check that the entire transition table is properly encoded; we do this with a recursive definition that allows us to sweep through the table:

$$\text{is\_table}(y, 0, n) = 1$$
$$\text{is\_table}(y, i + 1, n) = \text{is\_trans}(\Pi_1(y), n) \land \text{is\_table}(\Pi_2(y), i, n)$$

This predicate needs to be called with the proper initial values, so we finally define the main predicate, which tests whether or not some number $x$ is a valid encoding of a Turing machine, as follows:

$$\text{is\_TM}(x) = \text{is\_table}(\text{table}(x), \Pi_1(x), \text{nbr\_states}(x))$$

Now, in order to "execute" a Turing machine program on some input, we need to describe the tape contents, the head position, and the current control state. We can encode the tape contents and head position together by dividing the tape into three sections: from the leftmost nonblank character to just before the head position, the square under the head position, and from just after the head position to the rightmost nonblank character. Unfortunately, we run into a nasty technical problem at this juncture: the alphabet we are using for the partial recursive functions has only one symbol ($a$), so that numbers are written in unary, but the alphabet used on the tape of the Turing machine has two symbols ($\_$ and $c$), so that the code for the left- or right-hand side of the tape is a binary code. (Even though both the input and the output written on the Turing machine tape are expressed in unary, just a string of $c$s, a configuration of the tape during execution is a mixed string of blanks and $c$s and thus must be encoded as a binary string.) We need conversions in both directions in order to move between the coded representation of the tape used in the simulation and the single characters manipulated by the Turing machine. Thus we make a quick digression to define conversion functions. (Technically, we would also need to redefine partial recursive functions from scratch to work on an alphabet of several characters. However, only Succ and primitive recursion need to be redefined: Succ becomes an Append that can append any of the characters to its argument string and the recursive step in primitive recursion now depends on the last character in the string. Since these redefinitions are self-explanatory, we use them below without further comments.)

If we are given a string of $n$ $c$s (as might be left on the tape as the output of the Turing machine), its value considered as a binary number is easily computed as follows (using string representation for the binary number, but integer representation for the unary number):

$$\text{b\_to\_u}(\varepsilon) = 0$$
$$\text{b\_to\_u}(x\_) = \text{Succ}(\text{double}(\text{b\_to\_u}(x)))$$
$$\text{b\_to\_u}(xc) = \text{Succ}(\text{double}(\text{b\_to\_u}(x)))$$

where $\text{double}(x)$ is defined as $\text{mult}(x, 2)$. (Only the length of the input string is considered: blanks in the input string are treated just like $c$s. Since we need only use the function when given strings without blanks, this treatment causes no problem.)

The converse is harder: given a number $n$ in unary, we must produce the string of $c$s and blanks that will denote the same number encoded in binary, a function we need to translate back and forth between codes and strings during the simulation. We again use number representation for the unary number and string representation for the binary number:

$$\text{u\_to\_b}(0) = \varepsilon$$
$$\text{u\_to\_b}(n + 1) = \text{ripple}(\text{u\_to\_b}(n))$$

where the function ripple adds a carry to a binary-coded number, rippling the carry through the number as necessary:

$$\text{ripple}(\varepsilon) = c$$
$$\text{ripple}(x\_) = \text{con}_2(x, c)$$
$$\text{ripple}(xc) = \text{con}_2(\text{ripple}(x), \_)$$
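The conversion functions just defined can be tried out directly. The following Python sketch is an illustration only: it assumes tape strings are ordinary Python strings over the two characters `c` and `_`, with the least significant digit last, and it follows the definitions in the text literally, including the treatment of blanks in `b_to_u`.

```python
def ripple(s: str) -> str:
    """Add 1 to a binary number written with 'c' (one) and '_' (zero),
    least significant digit last, rippling the carry leftward."""
    if s == "":
        return "c"
    if s[-1] == "_":
        return s[:-1] + "c"          # no carry: just flip the last digit
    return ripple(s[:-1]) + "_"      # last digit is c: write _ and carry

def u_to_b(n: int) -> str:
    """Unary to binary: apply ripple n times, starting from the empty string."""
    s = ""
    for _ in range(n):
        s = ripple(s)
    return s

def b_to_u(s: str) -> int:
    """Binary to unary as defined in the text: every character, blank or
    not, contributes Succ(double(...)), so only the length of s matters.
    The function is therefore meant for strings without blanks."""
    value = 0
    for _ch in s:
        value = 2 * value + 1
    return value
```

For example, `u_to_b(5)` yields `c_c` (binary 101), and `b_to_u("ccc")` yields 7, the value of three ones read as a binary number.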

Now we can return to the question of encoding the tape contents. If we denote the three parts just mentioned (left of the head, under the head, and right of the head) with $u$, $v$, and $w$, we encode the tape and head position as

$$\langle \text{b\_to\_u}(u), v, \text{b\_to\_u}(w^R) \rangle_3$$

Thus the left- and right-hand side portions are considered as numbers written in binary, with the right-hand side read right-to-left, so that both parts always have $c$ as their most significant digit; the symbol under the head is simply given its coded value (0 for blank and 1 for $c$). Initially, if the input to the partial function is the number $n$, then the tape contents will be encoded as

$$\text{tape}(n) = \langle 0, \text{lev}(n), \text{b\_to\_u}(\text{dec}(n)) \rangle_3$$

where we used lev for the value of the symbol under the head in order to give it value 0 if the symbol is a blank (the input value is 0 or the empty string) and a value of 1 otherwise.

Let us now define functions that allow us to describe one transition of the Turing machine. Call them $\text{next\_state}(x, t, q)$ and $\text{next\_tape}(x, t, q)$, where $x$ is the Turing machine code, $t$ the tape code, and $q$ the current state. The next state is easy to specify:

$$\text{next\_state}(x, t, q) = \begin{cases} \Pi_1^3(\text{triple}(x, q, \Pi_2^3(t))) & q < \text{nbr\_states}(x) \\ q & \text{otherwise} \end{cases}$$

The function for the next tape contents is similar but must take into account the head motion; thus, if $q$ is well defined and not the halt state, and if its head motion at this step, $\Pi_3^3(\text{triple}(x, q, \Pi_2^3(t)))$, equals L, then we set

$$\text{next\_tape}(x, t, q) = \langle \text{div2}(\Pi_1^3(t)), \text{odd}(\Pi_1^3(t)), \text{add}(\text{double}(\Pi_3^3(t)), \Pi_2^3(\text{triple}(x, q, \Pi_2^3(t)))) \rangle_3$$

and if $\Pi_3^3(\text{triple}(x, q, \Pi_2^3(t)))$ equals R, then we set

$$\text{next\_tape}(x, t, q) = \langle \text{add}(\text{double}(\Pi_1^3(t)), \Pi_2^3(\text{triple}(x, q, \Pi_2^3(t)))), \text{odd}(\Pi_3^3(t)), \text{div2}(\Pi_3^3(t)) \rangle_3$$

and finally, if $q$ is the halt state or is not well defined, we simply set

$$\text{next\_tape}(x, t, q) = t$$

These definitions made use of rather self-explanatory helper functions; we define them here for completeness:

* $\text{odd}(0) = 0$ and $\text{odd}(n + 1) = \text{is\_zero}(\text{odd}(n))$.
* $\text{div2}(0) = 0$ and $\text{div2}(n + 1) = \begin{cases} \text{Succ}(\text{div2}(n)) & \text{odd}(n) \\ \text{div2}(n) & \text{otherwise} \end{cases}$

Now we are ready to consider the execution of one complete step of a Turing machine:

$$\text{next\_id}(\langle x, t, q \rangle_3) = \langle x, \text{next\_tape}(x, t, q), \text{next\_state}(x, t, q) \rangle_3$$

and generalize this process to $i$ steps:

$$\text{step}(\langle x, t, q \rangle_3, 0) = \langle x, t, q \rangle_3$$
$$\text{step}(\langle x, t, q \rangle_3, i + 1) = \text{next\_id}(\text{step}(\langle x, t, q \rangle_3, i))$$

All of these functions are primitive recursive. Now we define the crucial function, which is not primitive recursive; indeed, it is not even total:

$$\text{stop}(\langle x, y \rangle) = \mu i[\Pi_3^3(\text{step}(\langle x, \text{tape}(y), 1 \rangle_3, i)) = \text{nbr\_states}(x)]$$

This function simply seeks the smallest number of steps that the Turing machine coded by $x$, started in state 1 (the start state) with $y$ as argument, needs to reach the halting state (indexed $\text{nbr\_states}(x)$). If the Turing machine coded by $x$ halts on input $t = \text{tape}(y)$, then the function stop returns a value. If the Turing machine coded by $x$ does not halt on input $t$, then $\text{stop}(\langle x, y \rangle)$ is not defined. Finally, if $x$ does not code a Turing machine, there is not much we can say about stop.

Now consider running our Turing machine $x$ on input $y$ for $\text{stop}(\langle x, y \rangle)$ steps and returning the result; we get the function

$$\theta(x, y) = \Pi_2^3(\text{step}(\langle x, \text{tape}(y), 1 \rangle_3, \text{stop}(\langle x, y \rangle)))$$

As defined, $\theta(x, y)$ is the paired triple describing the tape contents (or is undefined if the machine does not stop). But we have stated that the output of the Turing machine is considered to be the string starting at the position under the head and stopping before the first blank. Thus we write

$$\text{out}(x, y) = \begin{cases} 0 & \text{if } \Pi_2^3(\theta(x, y)) = 0 \\ \text{add}(\text{double}(\text{b\_to\_u}(\text{strip}(\text{u\_to\_b}(\Pi_3^3(\theta(x, y)))))), \Pi_2^3(\theta(x, y))) & \text{otherwise} \end{cases}$$

where the auxiliary function strip changes the value of the current string on the right of the head to include only the first contiguous block of $c$s and is defined as

$$\text{strip}(\varepsilon) = \varepsilon$$
$$\text{strip}(\_x) = \varepsilon$$
$$\text{strip}(cx) = \text{con}_2(c, \text{strip}(x))$$

Our only remaining problem is that, if $x$ does not code a Turing machine, the result of $\text{out}(x, y)$ is unpredictable and meaningless. Let $x_0$ be the index of a simple two-state Turing machine that loops in the start state for any input and never enters the halt state. We define

$$\phi_{\text{univ}}(x, y) = \begin{cases} \text{out}(x, y) & \text{is\_TM}(x) \\ \text{out}(x_0, y) & \text{otherwise} \end{cases}$$

so that, if $x$ does not code a Turing machine, the function is completely undefined. (An interesting side effect of this definition is that every code is now considered legal: basically, we have chosen to decode indices that do not meet our encoding format by producing for them a Turing machine that implements the totally undefined function.) The property of our definition by cases (that the function given for the case ruled out is not evaluated) now assumes critical importance: otherwise our new function would always be undefined!

This function $\phi_{\text{univ}}$ is quite remarkable. Notice first that it is defined with a single use of $\mu$-recursion; everything else in its definition is primitive recursive. Yet $\phi_{\text{univ}}(x, y)$ returns the output of the Turing machine coded by $x$ when run on input $y$; that is, it is a universal function. Since it is partial recursive, it is computable and there is a universal Turing machine that actually computes it. In other words, there is a single code $i$ such that $\phi_i(x, y)$ computes $\phi_x(y)$, the output of the Turing machine coded by $x$ when run on input $y$. Since this machine is universal, asking a question about it is as hard as asking a question about all of the Turing machines; for instance, deciding whether this specific machine halts under some input is as hard as deciding whether any arbitrary Turing machine halts under some input.

Universal Turing machines are fundamental in that they answer what could have been a devastating criticism of our theory of computability so far. Up to now, every Turing machine or RAM we saw was a "special-purpose" machine: it computed only the function for which it was programmed. The universal Turing machine, on the other hand, is a general-purpose computer: it takes as input a program (the code of a Turing machine) and data (the argument) and proceeds to execute the program on the data. Every reasonable model of computation that claims to be as powerful as Turing machines or RAMs must have a specific machine with that property.

Finally, note that we can easily compose two Turing machine programs; that is, we can feed the output of one machine to the next machine and regard the entire two-phase computation as a single computation. To do so, we simply take the codes for the two machines, say

$$x = \langle m, \langle D_1, \ldots, D_m \rangle_m \rangle$$
$$y = \langle n, \langle E_1, \ldots, E_n \rangle_n \rangle$$

and produce the new code

$$z = \langle \text{add}(m, n), \langle D_1, \ldots, D_m, E'_1, \ldots, E'_n \rangle_{m+n} \rangle$$

where, if we start with $E_j = \langle \langle j, c, L/R \rangle_3, \langle k, d, L/R \rangle_3 \rangle$, we then obtain $E'_j = \langle \langle \text{add}(j, m), c, L/R \rangle_3, \langle \text{add}(k, m), d, L/R \rangle_3 \rangle$. The new machine is legal; it has $m + n + 1$ states, one less than the number of states of the two machines taken separately, because we have effectively merged the halt state of the first machine with the start state of the second. Thus if either individual machine fails to halt, the compound machine does not halt either. If the first machine halts, what it leaves on the tape is used by the second machine as its input, so that the compound machine correctly computes the composition of the functions computed by the two machines. This composition function, moreover, is primitive recursive!

Exercise 5.9 Verify this last claim.

5.4 Programming Systems

In this section, we abstract and formalize the lessons learned in the arithmetization of Turing machines. A programming system, $\{\phi_i \mid i \in \mathbb{N}\}$, is an enumeration of all partial recursive functions; it is another synonym for a Gödel numbering. We can let the index set range over all of $\mathbb{N}$, even though we have discussed earlier the fact that most encoding schemes are not surjective (that is, they leave room for values that do not correspond to valid encodings), precisely because we can tell the difference between a legal encoding and an illegal one (through our is_TM predicate in the Turing machine model, for example). As we have seen, we can decide to "decode" an illegal code into a program that computes the totally undefined function; alternately, we could use the legality-checking predicate to re-index an enumeration and thus enumerate only legal codes, indexed directly by $\mathbb{N}$.

We say that a programming system is universal if it includes a universal function, that is, if there is an index $i$ such that, for all $x$ and $y$, we have $\phi_i(\langle x, y \rangle) = \phi_x(y)$. We write $\phi_{\text{univ}}$ for this $\phi_i$. We say that a programming system is acceptable if it is universal and also includes a total recursive function $c()$ that effects the composition of functions, i.e., that yields $\phi_{c(\langle x, y \rangle)} = \phi_x \circ \phi_y$ for all $x$ and $y$. We saw in the previous section that our arithmetization of Turing machines produced an acceptable programming system. A programming system can be viewed as an indexed collection of all programs writable in a given programming language; thus the system $\{\phi_i\}$ could correspond to all Lisp programs and the system $\{\psi_j\}$ to all C programs. (Since the Lisp programs can be indexed in different ways, we would have several different programming systems for the set of all Lisp programs.) Any reasonable programming language (that allows us to enumerate all possible programs) is an acceptable programming system, because we can use it to write an interpreter for the language itself.

In programming we can easily take an already defined function (subroutine) of several arguments and hold some of its arguments to fixed constants to define a new function of fewer arguments. We prove that this capability is a characteristic of any acceptable programming system and ask you to show that it can be regarded as a defining characteristic of acceptable programming systems.

Theorem 5.1 Let $\{\phi_i \mid i \in \mathbb{N}\}$ be an acceptable programming system. Then there is a total recursive function $s$ such that, for all $i$, all $m \ge 1$, all $n \ge 1$, and all $x_1, \ldots, x_m, y_1, \ldots, y_n$, we have

$$\phi_{s(i, m, x_1, \ldots, x_m)}(y_1, \ldots, y_n) = \phi_i(x_1, \ldots, x_m, y_1, \ldots, y_n)$$

This theorem is generally called the s-m-n theorem and $s$ is called an s-m-n function. After looking at the proof, you may want to try to prove the converse (an easier task), namely that a programming system with a total recursive s-m-n function (s-1-1 suffices) is acceptable. The proof of our theorem is surprisingly tricky.
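Before turning to the proof, it may help to see what an s-m-n function does in a familiar setting. The Python sketch below is an analogy, not the construction used in the proof: it assumes a toy programming system in which a "program" is a Python source string defining a function `main`, and the names `universal`, `s`, and `weighted` are hypothetical. The point it illustrates is that `s` works on program text only: it returns a new program and never runs the old one, which is why it can be total.

```python
def universal(program: str, *args: int) -> int:
    """Run a toy 'program' (Python source defining main) on the given arguments."""
    env: dict = {}
    exec(program, env)
    return env["main"](*args)

def s(program: str, *xs: int) -> str:
    """Return the text of a new program with the first arguments fixed to xs.

    Only text is manipulated here; the original program is not executed,
    so s terminates even if the program diverges on some inputs.
    """
    return (
        f"_inner_source = {program!r}\n"
        f"_fixed = {xs!r}\n"
        "def main(*ys):\n"
        "    env = {}\n"
        "    exec(_inner_source, env)\n"
        "    return env['main'](*_fixed, *ys)\n"
    )

# A three-argument program and two specializations of it.
weighted = "def main(a, b, c):\n    return a + 10 * b + 100 * c\n"
fixed_two = s(weighted, 1, 2)          # a = 1 and b = 2 are now constants
fixed_twice = s(s(weighted, 1), 2)     # the same effect, one argument at a time
```

Running `universal(fixed_two, 3)` gives the same result as `universal(weighted, 1, 2, 3)`, which is the equation of Theorem 5.1 with $m = 2$ and $n = 1$.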

Proof. Since we have defined our programming systems to be listings of functions of just one argument, we should really have written

$$\phi_{s(\langle i, m, \langle x_1, \ldots, x_m \rangle_m \rangle_3)}(\langle y_1, \ldots, y_n \rangle_n) = \phi_i(\langle x_1, \ldots, x_m, y_1, \ldots, y_n \rangle_{m+n})$$

Write $x = \langle x_1, \ldots, x_m \rangle_m$ and $y = \langle y_1, \ldots, y_n \rangle_n$. Now note that the following function is primitive recursive (an easy exercise):

$$\text{Con}(\langle m, \langle x_1, \ldots, x_m \rangle_m, \langle y_1, \ldots, y_n \rangle_n \rangle_3) = \langle x_1, \ldots, x_m, y_1, \ldots, y_n \rangle_{m+n}$$

Since Con is primitive recursive, there is some index $k$ with $\phi_k = \text{Con}$. The desired s-m-n function can be implicitly defined by

$$\phi_{s(\langle i, m, x \rangle_3)}(y) = \phi_i(\text{Con}(\langle m, x, y \rangle_3))$$

Now we need to show how to get a construction for $s$, that is, how to bring it out of the subscript. We use our composition function $c$ (there is one in any acceptable programming system) and define functions that manipulate indices so as to produce pairing functions. Define $f(y) = \langle \varepsilon, y \rangle$ and $g(\langle x, y \rangle) = \langle \text{Succ}(x), y \rangle$, and let $i_f$ be an index with $\phi_{i_f} = f$ and $i_g$ an index with $\phi_{i_g} = g$. Now define $h(\varepsilon) = i_f$ and $h(xa) = c(\langle i_g, h(x) \rangle)$ for all $x$. Use induction to verify that we have $\phi_{h(x)}(y) = \langle x, y \rangle$. Thus we can write

$$\phi_{h(x)} \circ \phi_{h(y)}(z) = \phi_{h(x)}(\langle y, z \rangle) = \langle x, \langle y, z \rangle \rangle = \langle x, y, z \rangle_3$$

We are finally ready to define $s$ as

$$s(\langle i, m, x \rangle_3) = c(\langle i, c(\langle k, c(\langle h(m), h(x) \rangle) \rangle) \rangle)$$

We now have

$$\phi_{s(\langle i, m, x \rangle_3)}(y) = \phi_i \circ \phi_k \circ \phi_{h(m)} \circ \phi_{h(x)}(y) = \phi_i \circ \phi_k(\langle m, x, y \rangle_3) = \phi_i(\text{Con}(\langle m, x, y \rangle_3))$$

as desired. Q.E.D.

(We shall omit the use of pairing from now on in order to simplify notation.) If $c$ is primitive recursive (which is not necessary in an arbitrary acceptable programming system but was true for the one we derived for Turing machines), then $s$ is primitive recursive as well.

As a simple example of the use of s-m-n functions (we shall see many more in the next section), let us prove this important theorem:

Theorem 5.2 If $\{\phi_i\}$ is a universal programming system and $\{\psi_j\}$ is a programming system with a recursive s-1-1 function, then there is a recursive function $t$ that translates the $\{\phi_i\}$ system into the $\{\psi_j\}$ system, that is, that ensures $\phi_i = \psi_{t(i)}$ for all $i$.

Proof. Let $\phi_{\text{univ}}$ be the universal function for the $\{\phi_i\}$ system. Since the $\{\psi_j\}$ system contains all partial recursive functions, it contains $\phi_{\text{univ}}$; thus there is some $k$ with $\psi_k = \phi_{\text{univ}}$. (But note that $\psi_k$ is not necessarily universal for the $\{\psi_j\}$ system!) Define $t(i) = s(k, i)$; then we have

$$\psi_{t(i)}(x) = \psi_{s(k, i)}(x) = \psi_k(i, x) = \phi_{\text{univ}}(i, x) = \phi_i(x)$$

as desired. Q.E.D.

In particular, any two acceptable programming systems can be translated into each other. By using a stronger result (Theorem 5.7, the recursion theorem), we could show that any two acceptable programming systems are in fact isomorphic, that is, there exists a total recursive bijection between the two. In effect, there is only one acceptable programming system! It is worth noting, however, that these translations ensure only that the input/output behavior of any program in the $\{\phi_i\}$ system can be reproduced by a program in the $\{\psi_j\}$ system; individual characteristics of programs, such as length, running time, and so on, are not preserved by the translation. In effect the translations are between the mathematical functions implemented by the programs of the respective programming systems, not between the programs themselves.

Exercise 5.10* Prove that, in any acceptable programming system $\{\phi_i\}$, there is a total recursive function step such that, for all $x$ and $i$:
* there is an $m_x$ such that $\text{step}(i, x, m) \ne 0$ holds for all $m \ge m_x$ if and only if $\phi_i(x)$ converges; and,
* if $\text{step}(i, x, m)$ does not equal 0, then we have $\text{step}(i, x, m) = \text{Succ}(\phi_i(x))$.
(The successor function is used to shift all results up by one in order to avoid a result of 0, which we use as a flag to denote failure.) The step function that we constructed in the arithmetization of Turing machines is a version of this function; our new formulation is a little less awkward, as it avoids tape encoding and decoding. (Hint: the simplest solution is to translate the step function used in the arithmetization of Turing machines; since both systems are acceptable, we have translations back and forth between the two.)

5.5 Recursive and R.E. Sets

We define notions of recursive and recursively enumerable (r.e.) sets. Intuitively, a set is recursive if it can be decided and r.e. if it can be enumerated.

Definition 5.7 A set is recursive if its characteristic function is a recursive function; a set is r.e. if it is the empty set or the range of a recursive function.

The recursive function (call it $f$) that defines the r.e. set is an enumerator for the r.e. set, since the list $\{f(0), f(1), f(2), \ldots\}$ contains all elements of the set and no other elements. We make some elementary observations about recursive and r.e. sets.

Proposition 5.1
1. If a set is recursive, so is its complement.
2. If a set is recursive, it is also r.e.
3. If a set and its complement are both r.e., then they are both recursive.

Proof.
1. Clearly, if $c_S$ is the characteristic function of $S$ and is recursive, then $\text{is\_zero}(c_S)$ is the characteristic function of $\bar{S}$ and is also recursive.
2. Given the recursive characteristic function $c_S$ of a nonempty recursive set (the empty set is r.e. by definition), we construct a new total recursive function $f$ whose range is $S$. Let $y$ be some arbitrary element of $S$ and define

$$f(x) = \begin{cases} x & c_S(x) \\ y & \text{otherwise} \end{cases}$$

3. If either the set or its complement is empty, they are clearly both recursive. Otherwise, let $f$ be a function whose range is $S$ and $g$ be a function whose range is the complement of $S$. If asked whether some string $x$ belongs to $S$, we simply enumerate both $S$ and its complement, looking for $x$. As soon as $x$ turns up in one of the two enumerations (and it must eventually, within finite time, since the two enumerations together enumerate all of $\Sigma^*$), we are done. Formally, we write

$$c_S(x) = \begin{cases} 1 & f(\mu y[f(y) = x \text{ or } g(y) = x]) = x \\ 0 & \text{otherwise} \end{cases}$$

Q.E.D.
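The third part of the proof is, in effect, a short program. A minimal Python sketch follows, under the assumption that the two enumerators are given as total Python functions on the naturals; the names `decide`, `evens`, and `odds` are hypothetical.

```python
from typing import Callable

def decide(f: Callable[[int], int], g: Callable[[int], int], x: int) -> bool:
    """Decide membership of x in S, where f enumerates S and g its complement.

    The loop is the unbounded search mu y [f(y) = x or g(y) = x]; it always
    terminates because x must appear in one of the two enumerations.
    """
    y = 0
    while True:
        if f(y) == x:
            return True      # x turned up in the enumeration of S
        if g(y) == x:
            return False     # x turned up in the enumeration of the complement
        y += 1

# Example: S is the set of even numbers.
def evens(y: int) -> int:
    return 2 * y

def odds(y: int) -> int:
    return 2 * y + 1
```

With these enumerators, `decide(evens, odds, 10)` returns True and `decide(evens, odds, 7)` returns False. If `g` did not enumerate the whole complement, the search could run forever, which is exactly why an r.e. set alone need not be recursive.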

The following result is less intuitive and harder to prove but very useful.

Theorem 5.3 A set is r.e. if and only if it is the range of a partial recursive function and if and only if it is the domain of a partial recursive function.

Proof. The theorem really states that three definitions of r.e. sets are equivalent: our original definition and the two definitions given here. The simplest way to prove such an equivalence is to prove a circular chain of implications: we shall prove that (i) an r.e. set (as originally defined) is the range of a partial recursive function; (ii) the range of a partial recursive function is the domain of some (other) partial recursive function; and, (iii) the domain of a partial recursive function is either empty or the range of some (other) total recursive function.

By definition, every nonempty r.e. set is the range of a total, and thus also of a partial, recursive function. The empty set itself is the range of the totally undefined function. Thus our first implication is proved.

For the second part, we use the step function defined in Exercise 5.10 to define the partial recursive function:

$$\theta(x, y) = \mu z[\text{step}(x, \Pi_1(z), \Pi_2(z)) = \text{Succ}(y)]$$

This definition uses dovetailing: $\theta$ computes $\phi_x$ on all possible arguments $\Pi_1(z)$ for all possible numbers of steps $\Pi_2(z)$ until the result is $y$. Effectively, our $\theta$ function converges whenever $y$ is in the range of $\phi_x$ and diverges otherwise. Since $\theta$ is partial recursive, there is some index $k$ with $\phi_k = \theta$; now define $g(x) = s(k, x)$ by using an s-m-n construction. Observe that $\phi_{g(x)}(y) = \theta(x, y)$ converges if and only if $y$ is in the range of $\phi_x$, so that the range of $\phi_x$ equals the domain of $\phi_{g(x)}$.

For the third part, we use a similar but slightly more complex construction. We need to ensure that the function we construct is total and enumerates the (nonempty) domain of the given function $\phi_x$. In order to meet these requirements, we define a new function through primitive recursion. The base case of the function will return some arbitrary element of the domain of $\phi_x$, while the recursive step will either return a newly discovered element of the domain or return again what was last returned. The basis is defined as follows:

$$f(x, 0) = \Pi_1(\mu z[\text{step}(x, \Pi_1(z), \Pi_2(z)) \ne 0])$$

This construction dovetails $\phi_x$ on all possible arguments $\Pi_1(z)$ for all possible steps $\Pi_2(z)$ until an argument is found on which $\phi_x$ converges, at which point it returns that argument. It must terminate because we know that $\phi_x$ has nonempty domain. This is the base case: the first argument found by dovetailing on which $\phi_x$ converges. Now define the recursive step as follows:

$$f(x, y + 1) = \begin{cases} f(x, y) & \text{step}(x, \Pi_1(\text{Succ}(y)), \Pi_2(\text{Succ}(y))) = 0 \\ \Pi_1(\text{Succ}(y)) & \text{step}(x, \Pi_1(\text{Succ}(y)), \Pi_2(\text{Succ}(y))) \ne 0 \end{cases}$$

On larger second arguments $y$, $f$ either recurses with a smaller second argument or finds a value $\Pi_1(\text{Succ}(y))$ on which $\phi_x$ converges in at most $\Pi_2(\text{Succ}(y))$ steps. Thus the recursion serves to extend the dovetailing to larger and larger possible arguments and larger and larger numbers of steps beyond those used in the base case. In consequence every element in the domain of $\phi_x$ is produced by $f$ at some point. Since $f$ is recursive, there exists some index $j$ with $\phi_j = f$; use the s-m-n construction to define $h(x) = s(j, x)$. Now $\phi_{h(x)}(y) = f(x, y)$ is an enumeration function for the domain of $\phi_x$. Q.E.D.

Of particular interest to us is the halting set (sometimes called the diagonal set),

$$K = \{x \mid \phi_x(x)\downarrow\}$$

that is, the set of functions that are defined "on the diagonal." $K$ is the canonical nonrecursive r.e. set. That it is r.e. is easily seen, since we can just run (using dovetailing between the number of steps and the value of $x$) each $\phi_x$ and print the values of $x$ for which we have found a value for $\phi_x(x)$. That it is nonrecursive is an immediate consequence of the unsolvability of the halting problem. We can also recouch the argument in recursion-theoretic notation as follows. Assume that $K$ is recursive and let $c_K$ be its characteristic function. Define the new function

$$g(x) = \begin{cases} 0 & c_K(x) = 0 \\ \text{undefined} & c_K(x) = 1 \end{cases}$$

We claim that $g(x)$ cannot be partial recursive; otherwise there would be some index $i$ with $g = \phi_i$, and we would have $\phi_i(i) = g(i) = 0$ if and only if $g(i) = c_K(i) = 0$ if and only if $\phi_i(i)\uparrow$, a contradiction. Thus $c_K$ is not partial recursive and $K$ is not a recursive set. From earlier results, it follows that $\bar{K} = \Sigma^* - K$ is not even r.e., since otherwise both it and $K$ would be r.e. and thus both would be recursive. In proving results about sets, we often use reductions from $K$.
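The dovetailing used in the proof of Theorem 5.3 (and in the argument that $K$ is r.e.) can be made concrete. The Python sketch below rests on several assumptions that are not in the text: a "program" is a Python generator function that yields once per computation step and returns its result, `unpair` inverts the Cantor pairing function (the text's pairing functions may differ), and the names `step`, `enumerate_domain`, and `halts_on_evens` are illustrative.

```python
import math
from itertools import islice
from typing import Callable, Generator, Iterator, Tuple

Program = Callable[[int], Generator[None, None, int]]

def unpair(z: int) -> Tuple[int, int]:
    """Inverse of the Cantor pairing function; every pair (a, b) has a code z."""
    w = (math.isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b

def step(prog: Program, x: int, m: int) -> int:
    """Run prog on x for at most m steps: result + 1 if it halted, else 0."""
    run = prog(x)
    for _ in range(m):
        try:
            next(run)
        except StopIteration as done:
            return done.value + 1
    return 0

def enumerate_domain(prog: Program) -> Iterator[int]:
    """Yield f(0), f(1), ... as in the proof; assumes a nonempty domain."""
    z = 0
    while step(prog, *unpair(z)) == 0:       # base case: unbounded search
        z += 1
    last = unpair(z)[0]
    yield last
    y = 0
    while True:                              # recursive step of the proof
        arg, steps = unpair(y + 1)
        if step(prog, arg, steps) != 0:
            last = arg                       # newly confirmed domain element
        yield last
        y += 1

def halts_on_evens(x: int) -> Generator[None, None, int]:
    """Toy program: loops forever on odd inputs, halts after x steps on even ones."""
    while x % 2 == 1:
        yield
    for _ in range(x):
        yield
    return x * x

first = list(islice(enumerate_domain(halts_on_evens), 200))
```

Every value in `first` is an even number, and each even number eventually appears, even though the program diverges on every odd input: no single run is ever allowed more than a bounded number of steps.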

Example 5.1 Consider the set $T = \{x \mid \phi_x \text{ is total}\}$. To prove that $T$ is not recursive, it suffices to show that, if it were recursive, then so would $K$. Consider some arbitrary $x$ and define the function $\theta(x, y) = y + \text{Zero}(\phi_{\text{univ}}(x, x))$. Since this function is partial recursive, there must be some index $i$ with $\phi_i(x, y) = \theta(x, y)$. Now use the s-m-n theorem (in its s-1-1 version) to get the new index $j = s(i, x)$ and consider the function $\phi_j(y)$. If $x$ is in $K$, then $\phi_j(y)$ is the identity function $\phi_j(y) = y$, and thus total, so that $j$ is in $T$. On the other hand, if $x$ is not in $K$, then $\phi_j$ is the totally undefined function and thus $j$ is not in $T$. Hence membership of $x$ in $K$ is equivalent to membership of $j$ in $T$. Since $K$ is not recursive, neither is $T$. (In fact, $T$ is not even r.e., something we shall shortly prove.)

Definition 5.8 A reduction from set $A$ to set $B$ is a recursive function $f$ such that $x$ belongs to $A$ if and only if $f(x)$ belongs to $B$.

(This particular type of reduction is called a many-one reduction, to emphasize the fact that it is carried out by a function and that this function need not be injective or bijective.) The purpose of a reduction from $A$ to $B$ is to show that $B$ is at least as hard to solve or decide as $A$. In effect, what a reduction shows is that, if we knew how to solve $B$, we could use that knowledge to solve $A$, as is illustrated in Figure 5.4. If we have a "magic blackbox" to solve $B$ (say, to decide membership in $B$), then the figure illustrates how we could construct a new blackbox to solve $A$. The new box simply transforms its input, $x$, into one that will be correctly interpreted by the blackbox for $B$, namely $f(x)$, and then asks the blackbox for $B$ to solve the instance $f(x)$, using its answer as is.

Figure 5.4 A many-one reduction from A to B.

Example 5.2 Consider the set $S(y, z) = \{x \mid \phi_x(y) = z\}$. To prove that $S(y, z)$ is not recursive, we again use a reduction from $K$. We define the new partial recursive function $\theta(x, y) = z + \text{Zero}(\phi_{\text{univ}}(x, x))$; since this is a valid partial recursive function, it has an index, say $\theta = \phi_i$. Now we use the s-m-n theorem to obtain $j = s(i, x)$. Observe that, if $x$ is in $K$, then $\phi_j(y)$ is the

This being the case. F These two examples of reductions of K to nonrecursive sets share one obvious feature and one subtle feature. since the totally undefined function is precisely what the reduction produces whenever x is not in K and so must not be in the target set in order for the reduction to work.? Let us return to the set T of the total functions. since that would require verifying that the function is defined on each of an infinity of arguments. or in a set of constant functions. our 0 would already be in a set of total functions. S(y. we can use dovetailing. Hence we have x E K X j c S(y. namely K. The obvious feature is that both use a function that carries out the computation of Zero(ox (x)) in order to force the function to be totally undefined whenever x is not in K and to ignore the effect of this computation (by reducing it to 0) when x is in K. such as the set NT = {x I 3y. y) will be totally undefined and thus not in S. y) is in S. since the complement of a nonrecursive set must be nonrecursive. Ox(y) T) of nontotal functions. we cannot verify that a function is total. z). and thus. set. then bj is the totally undefined function. Example 5. we
.e.: to enumerate it. if x0 is not in K.152
Computability Theory
constant function z. So we reduce YK T. if x is not in K. then we could enumerate K. all such reductions. if we to could enumerate T. the desired reduction. or in a set of functions that return 0 for at least one argument. z). generally in the simplest possible way. we can reduce K to NT. to compute ox(Y) and check whether the result (if any) equals z. Thus proofs of nonrecursiveness by reduction from K can always be made to a set that does not include the totally undefined function. y) that includes within it Zero(Ouniv(x.3 Earlier we used a simple reduction from K to T. and thus j is not in S(y. whenever x0 is in K (and the term Zero(Ouni. say from K to a set S. Suppose now that we have to reduce K to a set that does contain the totally undefined function. 0(y) = 0(xo. (Intuitively. 0(x. j is in S(y. y) is defined so that.(xX x)) disappears entirely). z) is clearly re. giving us half of the reduction. in particular. We have claimed that this set is not re. with no additional terms. In addition. although we can enumerate the partial recursive functions. Instead of reducing K to NT. Unlike T. and so on. the function 0(y) = 0(xo. printing all x for which the computation terminated and returned a z. This feature is critical. with the same effect. that is. The more subtle feature is that the sets to which K is reduced do not contain the totally undefined function. On the other hand. we show that. z). x))-which ensures that. look much the same: all define a new function O(x. on all x and number of steps. which does not contain the totally undefined function.) We know of at least one non-r. (For instance. that is.) So how do we prove that a set is not even re.
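The blackbox construction of Definition 5.8 and Figure 5.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `decide_via_reduction`, `is_multiple_of_four`, and `double` are hypothetical, and the two toy sets are both recursive (no real decider for $K$ exists, which is the point of the examples above).

```python
from typing import Callable

Decider = Callable[[int], bool]
Reduction = Callable[[int], int]


def decide_via_reduction(f: Reduction, decide_b: Decider) -> Decider:
    """Build a decider for A from a reduction f (A to B) and a decider for B."""
    def decide_a(x: int) -> bool:
        # Transform the instance x into f(x), then use B's answer as is.
        return decide_b(f(x))
    return decide_a


# Toy instance, chosen only for illustration:
# A = even numbers, B = multiples of 4, f(x) = 2x,
# so that x is in A if and only if f(x) is in B.
def is_multiple_of_four(n: int) -> bool:
    return n % 4 == 0


def double(x: int) -> int:
    return 2 * x


is_even = decide_via_reduction(double, is_multiple_of_four)
```

Read in the other direction, the same wrapper explains the proofs by reduction: if no decider for $A$ can exist, then no `decide_b` can exist either.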

Example 5.3 Earlier we used a simple reduction from $K$ to $T$, that is, we produced a total recursive function $f$ with $x \in K \iff f(x) \in T$. What we need now is another total recursive function $g$ with $x \in \overline{K} \iff g(x) \in T$, or, equivalently, $x \in K \iff g(x) \notin T$. Unfortunately, we cannot just complement our definition; we cannot just define

$$\theta(x, y) = \begin{cases} 1 & x \notin K \\ \text{undefined} & \text{otherwise} \end{cases}$$

because $x \notin K$ can be "discovered" only by leaving the computation undefined. However, recalling the step function of Exercise 5.10, we can define

$$\theta(x, y) = \begin{cases} 1 & \mathrm{step}(x, x, y) = 0 \\ \text{undefined} & \text{otherwise} \end{cases}$$

and this is a perfectly fine partial function. As such, there is an index $i$ with $\phi_i = \theta$; by using the s-m-n theorem, we conclude that there is a total recursive function $g$ with $\phi_{g(x)}(y) = \theta(x, y)$. Now note that, if $x$ is not in $K$, then $\phi_{g(x)}$ is just the constant function 1, since $\phi_x(x)$ never converges for any $y$ steps; in particular, $\phi_{g(x)}$ is total and thus $g(x)$ is in $T$. Conversely, if $x$ is in $K$, then $\phi_x(x)$ converges and thus must converge after some number $y_0$ of steps; but then $\phi_{g(x)}(y)$ is undefined for $y_0$ and for all larger arguments and thus is not total, that is, $g(x)$ is not in $T$. Putting both parts together, we conclude that our total recursive function has the property $x \in K \iff g(x) \notin T$, as desired. Since $\overline{K}$ is not r.e., neither is $T$; otherwise, we could enumerate members of $\overline{K}$ by first computing $g(x)$ and then asking whether $g(x)$ is in $T$.

Again, such reductions, say from $\overline{K}$ to some set $S$, are entirely stereotyped; all feature a definition of the type

$$\theta(x, y) = \begin{cases} f(y) & \mathrm{step}(x, x, y) = 0 \\ g(y) & \text{otherwise} \end{cases}$$

Typically, $g(y)$ is the totally undefined function and $f(y)$ is of the type that belongs to $S$. Then, if $x_0$ is not in $K$, the function $\theta(x_0, y)$ is exactly $f(y)$ and thus of the type characterized by $S$; whereas, if $x_0$ is in $K$, the function $\theta(x_0, y)$ is undefined for almost all values of $y$, which typically will ensure that it does not belong to $S$. (If, in fact, $S$ contains functions that are mostly or totally undefined, then we can use the simpler reduction featuring $\phi_{\mathrm{univ}}(x, x)$.)
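The step-bounded definition of $\theta$ in Example 5.3 can be mimicked with a toy model. In this sketch, which is an assumption for illustration and not the formal programming system of the text, a "program" is a Python generator function that yields once per computation step and then returns its result; "undefined" is represented by `None`, whereas a real partial function would simply fail to return.

```python
from typing import Callable, Generator, Optional

# Hypothetical model of a program: one `yield` per step, then a returned value.
Program = Callable[[int], Generator[None, None, int]]


def step(program: Program, x: int, y: int) -> int:
    """Return 0 if program(x) has not halted within y steps;
    otherwise return 1 + its result (positive exactly on convergence)."""
    run = program(x)
    for _ in range(y):
        try:
            next(run)
        except StopIteration as halt:
            return halt.value + 1
    return 0


def theta(program: Program, x: int, y: int) -> Optional[int]:
    """1 while the computation has not converged within y steps, else undefined (None)."""
    return 1 if step(program, x, y) == 0 else None


def halts_after_three(x: int) -> Generator[None, None, int]:
    for _ in range(3):
        yield
    return x


def loops_forever(x: int) -> Generator[None, None, int]:
    while True:
        yield
```

For a program that halts, `theta` is defined only on an initial segment of the arguments, hence not total; for one that never halts, it is the constant function 1 on every argument tried.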

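Table 5.1 summarizes the four reduction styles (two each from $K$ and from $\overline{K}$). These are the "standard" reductions; certain sets may require somewhat more complex constructions or a bit more ingenuity.

Table 5.1 The standard reductions from $K$ and from $\overline{K}$.

(a) reductions from $K$ to $S$

* If $S$ does not contain the totally undefined function, then let
  $\theta(x, y) = \theta(y) + \mathrm{Zero}(\phi_{\mathrm{univ}}(x, x))$
  where $\theta(y)$ is chosen to belong to $S$.
* If $S$ does contain the totally undefined function, then reduce to $\overline{S}$ instead.

(b) reductions from $\overline{K}$ to $S$

* If $S$ does not contain the totally undefined function, then let
  $\theta(x, y) = \theta(y)$ if $\mathrm{step}(x, x, y) = 0$, and $\theta(x, y) = \psi(y)$ otherwise,
  where $\theta(y)$ is chosen to belong to $S$ and $\psi(y)$ is chosen to complement $\theta(y)$ so as to form a function that does not belong to $S$; $\psi(y)$ can often be chosen to be the totally undefined function.
* If $S$ does contain the totally undefined function, then let
  $\theta(x, y) = \theta(y) + \mathrm{Zero}(\phi_{\mathrm{univ}}(x, x))$
  where $\theta(y)$ is chosen not to belong to $S$.

Example 5.4 Consider the set $S = \{x \mid \exists y\,[\phi_x(y)\downarrow \wedge\ \forall z,\ \phi_x(z) \neq 2\phi_x(y)]\}$; in words, this is the set of all functions that cannot everywhere double whatever output they can produce. This set is clearly not recursive; we prove that it is not even r.e. by reducing $\overline{K}$ to it. Since $S$ does not contain the totally undefined function (any function in it must produce at least one value that it cannot double), our suggested reduction is

$$\theta(x, y) = \begin{cases} \theta(y) & \mathrm{step}(x, x, y) = 0 \\ \psi(y) & \text{otherwise} \end{cases}$$

where $\theta$ is chosen to belong to $S$ and $\psi$ is chosen to complement $\theta$ so as to form a function that does not belong to $S$. We can choose the constant function 1 for $\theta$: since this function can produce only 1, it cannot double it and thus belongs to $S$. But then our function $\psi$ must produce all powers of two, since, whenever $x$ is in $K$, our $\theta$ function will produce 1 for all $y < y_0$. It takes a bit of thought to realize that we can set $\psi(y) = \Pi_1(y)$ to solve this problem.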

5.6 Rice's Theorem and the Recursion Theorem

In our various reductions from $K$, we have used much the same mechanism every time; this similarity points to the fact that a much more general result should obtain, something that captures the fairly universal nature of the reductions. A crucial factor in all of these reductions is the fact that the sets are defined by a mathematical property, not a property of programs. In other words, if some partial recursive function $\phi_i$ belongs to the set and some other partial recursive function $\phi_j$ has the same input/output behavior (that is, the two functions are defined on the same arguments and return the same values whenever defined), then this other function $\phi_j$ is also in the set. This factor is crucial because all of our reductions work by constructing a new partial recursive function that (typically) either is totally undefined (and thus not in the set) or has the same input/output behavior as some function known to be in the set (and thus is assumed to be in the set). Formalizing this insight leads to the fundamental result known as Rice's theorem:

Theorem 5.4 Let $\mathcal{C}$ be any class of partial recursive functions defined by their input/output behavior; then the set $P_{\mathcal{C}} = \{x \mid \phi_x \in \mathcal{C}\}$ is recursive if and only if it is trivial, that is, if and only if it is either the empty set or its complement.

In other words, any nontrivial input/output property of programs is undecidable! In spite of its sweeping scope, this result should not be too surprising: if we cannot even decide whether or not a program halts, we are in a bad position to decide whether or not it exhibits a certain input/output behavior. The proof makes it clear that failure to decide halting implies failure to decide anything else about input/output behavior.

Proof. The empty set and its complement are trivially recursive. So now let us assume that $P_{\mathcal{C}}$ is neither the empty set nor its complement. In particular, $\mathcal{C}$ itself contains at least one partial recursive function (call it $\psi$) and yet does not contain all partial recursive functions. Without loss of generality, let us assume that $\mathcal{C}$ does not contain the totally undefined function. Define the function $\theta(x, y) = \psi(y) + \mathrm{Zero}(\phi_{\mathrm{univ}}(x, x))$; since this is a primitive recursive definition, there is an index $i$ with $\phi_i(x, y) = \theta(x, y)$. We use the s-m-n theorem to obtain $j = s(i, x)$, so that we get the partial recursive function $\phi_j(y) = \psi(y) + \mathrm{Zero}(\phi_{\mathrm{univ}}(x, x))$. Note that, if $x$ is in $K$, then $\phi_j$ equals $\psi$ and thus $j$ is in $P_{\mathcal{C}}$, whereas, if $x$ is not in $K$, then $\phi_j$ is the totally undefined function and thus $j$ is not in $P_{\mathcal{C}}$. Hence we have $j \in P_{\mathcal{C}} \iff x \in K$, the desired reduction. Therefore $P_{\mathcal{C}}$ is not recursive. Q.E.D.

Note again that Rice's theorem is limited to input/output behavior: it is about classes of mathematical functions, not about classes of programs. In examining the proof, we note that our conclusion relies on the statement that, since $\phi_j$ equals $\psi$ when $x$ is in $K$, $\phi_j$ belongs to the same class as $\psi$. That is, because the two partial recursive functions $\phi_j$ and $\psi$ implement the same input/output mapping (the same mathematical function), they must share the property defining the class. In contrast, if the class were defined by some program-specific predicate, such as limited length of code, then we could not conclude that $\phi_j$ must belong to the same class as $\psi$: the code for $\phi_j$ is longer than the code for $\psi$ (since it includes the code for $\psi$ as well as the code for $\phi_{\mathrm{univ}}$) and thus could exceed the length limit which $\psi$ meets. Thus any time we ask a question about programs such that two programs that have identical input/output behavior may nevertheless give rise to different answers to our question, Rice's theorem becomes inapplicable. Of course, many such questions remain undecidable, but their undecidability has to be proved by other means.

Following are some examples of sets that fall under Rice's theorem:

* The set of all programs that halt under infinitely many inputs.
* The set of all programs that never halt under any input.
* The set of all pairs of programs such that the two programs in a pair compute the same function.

In contrast, the set $\{x \mid x \text{ is the shortest code for the function } \phi_x\}$ distinguishes between programs that have identical input/output behavior and thus does not fall under Rice's theorem. Yet this set is also nonrecursive, which we now proceed to prove for a somewhat restricted subset.
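The function built in the proof of Theorem 5.4 can be pictured as a wrapper: first run the computation of $x$ on $x$ and discard its result (the role played by the Zero term), then behave like $\psi$. The sketch below is a loose Python analogy under stated assumptions; `build_phi_j`, `halting_computation`, and `successor` are hypothetical names, and Python callables stand in for indices of partial recursive functions.

```python
from typing import Callable


def build_phi_j(psi: Callable[[int], int],
                run_x_on_x: Callable[[], object]) -> Callable[[int], int]:
    """Return a function that equals psi if run_x_on_x() halts,
    and is totally undefined (never returns) otherwise."""
    def phi_j(y: int) -> int:
        run_x_on_x()    # stands in for Zero(phi_univ(x, x)); may never return
        return psi(y)   # reached only when the computation above halted
    return phi_j


# Toy stand-ins: a computation that halts, and a member psi of the class.
def halting_computation() -> int:
    return sum(range(10))


def successor(y: int) -> int:
    return y + 1


phi_j = build_phi_j(successor, halting_computation)
```

When the wrapped computation halts, `phi_j` has exactly the input/output behavior of `psi`, although its code is longer; this is why the argument applies to classes of functions and not to properties such as code length.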

Theorem 5.5 The length of the shortest program that prints $n$ and halts is not computable.

Proof. Our proof proceeds by contradiction. Assume there is a function, call it $f$, that can compute this length; that is, $f(n)$ returns the length of the shortest program that prints $n$ and halts. Then, for fixed $m$, define the new, constant-valued function $g(x)$ as follows:

$$g(x) = \mu i\,[f(i) \geq m]$$

If $f$ is recursive, then so is $g$, because there are infinitely many programs that print $n$ and then halt (just pad the program with useless instructions) and so the minimization must terminate. Now $g(x)$, in English, returns a natural number $i$ such that no program of length less than $m$ prints $i$ and halts. What can we say about the length of a program for $g$? If we code $m$ in binary (different from what we have done for a while, but not affecting computability), then we can state that the length of a program for $g$ need not exceed some constant plus $\log_2 m$. The constant takes into account the fixed-length code for $f$ (which does not depend on $m$) and the fixed-length code for the minimization loop. The $\log_2 m$ takes into account the fact that $g$ must test the value of $f(i)$ against $m$, which requires that information about $m$ be hard-coded into the program. Thus for large $m$, the length of $g$ is certainly less than $m$; let $m_0$ be such a value of $m$. But then, for this $m_0$, $g$ prints the smallest integer $i$ such that no program of length less than $m_0$ can print $i$, yet $g$ has length less than $m_0$ itself, a contradiction. Hence $f$ cannot be recursive. Q.E.D.

Our $g$ is a formalization of the famous Berry's paradox, which can be phrased as: "Let $k$ be the least natural number that cannot be denoted in English with fewer than a thousand characters." This statement has fewer than a thousand characters and denotes $k$. Berry's paradox provides the basis for the theory of Algorithmic Information Theory, built by Gregory Chaitin. Because it includes both self-reference and an explicit resource bound (length), Berry's paradox is stronger than the equally famous liar's paradox, which can be phrased as: "This sentence is false" [2] and which can be seen as equivalent to the halting problem and thus the basis for the theory of computability.

[2] The liar's paradox is attributed to the Cretan Epimenides, who is reported to have said, "All Cretans are liars." This original version of the liar's paradox is not a true paradox, since it is consistent with the explanation that there is a Cretan (not Epimenides, who also reported that he had slept for 40 years) who is not a liar. For a true paradox, Epimenides should have simply said "I always lie." The version we use, a true paradox, is attributed to Eubulides (6th century B.C.), a student of Euclid.

We can turn the argument upside down and conclude that we cannot decide, for each fixed $n$, what is the largest value that can be printed by a program of length $n$ that starts with an empty tape and halts after printing that value. This problem is a variation of the famous busy beaver problem, which asks how many steps a program of length $n$ with no input can run before halting.
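Returning to the proof of Theorem 5.5, the length comparison can be written out explicitly. The constant $c$ and the numeric values below are hypothetical, chosen only to show that a suitable $m_0$ exists once $c$ is fixed; $g_m$ denotes the program for $g$ built with the hard-coded value $m$.

```latex
% c: assumed bound on the code for f plus the minimization loop (independent of m).
\[
  |g_m| \;\le\; c + \lceil \log_2 m \rceil .
\]
% Illustrative numbers only: take c = 1000 and m_0 = 2^{10} = 1024.
\[
  c + \lceil \log_2 m_0 \rceil \;=\; 1000 + 10 \;=\; 1010 \;<\; 1024 \;=\; m_0 ,
\]
% so g_{m_0} has length less than m_0, yet it prints a number that, by the
% definition of g, no program of length less than m_0 can print.
```

Since $\log_2 m$ grows more slowly than $m$, such an $m_0$ exists whatever the value of the constant.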

Our busy beaver problem should be compared to the Grzegorczyk hierarchy of Section 5.1: the busy beaver function (for each $n$, print the largest number that a program of length $n$ can compute on an empty input) grows so fast that it is uncomputable!

There exists a version of Rice's theorem for r.e. sets, that is, an exact characterization of r.e. sets that can be used to prove that some sets are r.e. and others are not. Unfortunately, this characterization (known as the Rice-Shapiro theorem) is rather complex, especially when compared to the extremely simple characterization of Rice's theorem. In consequence, we do not state it here but leave the reader to explore it in Exercise 5.25.

We conclude with a quick look at the recursion theorem, a fundamental result used in establishing the correctness of definitions based on general recursion, as well as those based on fixed points (such as denotational semantics for programming languages). Recall that no set of total functions can be immune to diagonalization, but that we defined the partial recursive functions specifically to overcome the self-reference problem. Because partial recursive functions are immune to the dangers of self-reference, we can use self-reference to build new results. Thus the recursion theorem can be viewed as a very general mechanism for defining functions in terms of themselves.

Theorem 5.6 For every total recursive function $f$, there is an index $i$ (depending on $f$) with $\phi_i = \phi_{f(i)}$.

In other words, $i$ is a fixed point for $f$ within the given programming system. Superficially, this result is counterintuitive: among other things, it states that we cannot write a program that consistently alters any given program so as to change its input/output behavior.

Proof. The basic idea in the proof is to run $\phi_x(x)$ and use its result (if any) as an index within the programming system to define a new function. Thus we define the function $\theta(x, y) = \phi_{\mathrm{univ}}(\phi_{\mathrm{univ}}(x, x), y)$. Since this is a partial recursive function, we can use the standard s-m-n construction to conclude that there is a total recursive function $g$ with $\phi_{g(x)}(y) = \theta(x, y)$. Now consider the total recursive function $f \cdot g$. There is some index $m$ with $\phi_m = f \cdot g$; set $i = g(m)$. Now, since $\phi_m$ is total, we have $\phi_m(m)\downarrow$ and also

$$\phi_i(y) = \phi_{g(m)}(y) = \theta(m, y) = \phi_{\phi_m(m)}(y) = \phi_{f(g(m))}(y) = \phi_{f(i)}(y)$$

as desired. Q.E.D.
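One familiar consequence of the recursion theorem is the existence of a program that prints its own text. The following is a minimal Python sketch of such a self-reproducing program, given as an illustration in an ordinary language and not in the formal programming system of the text; the names `TEMPLATE`, `QUINE_SOURCE`, and `run_and_capture` are assumptions introduced here.

```python
import contextlib
import io

# A two-line program whose output is its own source text. The template
# contains a %r slot that is filled with the template's own quoted form.
TEMPLATE = 's = %r\nprint(s %% s)'
QUINE_SOURCE = TEMPLATE % TEMPLATE


def run_and_capture(source: str) -> str:
    """Execute a program text and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    # The captured output equals the program text (plus print's newline).
    print(run_and_capture(QUINE_SOURCE) == QUINE_SOURCE + "\n")
```

The trick mirrors the proof: the program carries a description of itself (the string `s`) and applies that description to itself, just as $g$ is applied to an index $m$ of a function built from $g$.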

A simple application of the recursion theorem is to show that there exists a program that, under any input, outputs exactly itself; in our terms, there is an index $n$ with $\phi_n(x) = n$ for all $x$. Define the function $\theta(x, y) = x$, then use the s-m-n construction to get a function $f$ with $\phi_{f(x)}(y) = x$ for all $x$. Now apply the recursion theorem to obtain $n$, the fixed point of $f$. (You might want to write such a program in Lisp.)

Another simple application is a different proof of Rice's theorem. Let $\mathcal{C}$ be a nontrivial class of partial recursive functions, and let $j \in P_{\mathcal{C}}$ and $k \notin P_{\mathcal{C}}$. Define the function

$$f(x) = \begin{cases} k & x \in P_{\mathcal{C}} \\ j & x \notin P_{\mathcal{C}} \end{cases}$$

Thus $f$ transforms the index of any program in $P_{\mathcal{C}}$ into $k$, the index of a program not in $P_{\mathcal{C}}$, and, conversely, transforms the index of any program not in $P_{\mathcal{C}}$ into $j$, the index of a program in $P_{\mathcal{C}}$. If $P_{\mathcal{C}}$ were recursive, then $f$ would be a total recursive function; but $f$ cannot have a fixed point $i$ with $\phi_{f(i)} = \phi_i$ (because, by construction, one of $i$ and $f(i)$ is inside $P_{\mathcal{C}}$ and the other outside, so that $\phi_i$ and $\phi_{f(i)}$ cannot be equal), thus contradicting the recursion theorem. Hence $P_{\mathcal{C}}$ is not recursive.

The only problem with the recursion theorem is that it is nonconstructive: it tells us that $f$ has a fixed point, but not how to compute that fixed point. However, this can easily be fixed by a few changes in the proof, so that we get the stronger version of the recursion theorem.

Theorem 5.7 There is a total recursive function $h$ such that, for all $x$, if $\phi_x$ is total, then we have $\phi_{h(x)} = \phi_{\phi_x(h(x))}$.

This time, the fixed point is computable for any given total function $f = \phi_x$ through the single function $h$.

Proof. Let $j$ be the index of a program computing the function $g$ defined in the proof of the recursion theorem. Let $c$ be the total recursive function for composition and define $h(x) = g(c(x, j))$. Straightforward substitution verifies that this $h$ works as desired. Q.E.D.

5.7 Degrees of Unsolvability

The many-one reductions used in proving sets to be nonrecursive or non-r.e. have interesting properties in their own right. Clearly, any set reduces to itself (through the identity function). Since, in an acceptable programming system, we have an effective composition function $c$, if set $A$ reduces to set $B$ through $f$ and set $B$ reduces to set $C$ through $g$, then set $A$ reduces to set $C$ through $c(f, g)$.

Thus reductions are reflexive and transitive and can be used to define an equivalence relation by symmetry: we say that sets $A$ and $B$ are equivalent if they reduce to each other. The classes of equivalence defined by this equivalence relation are known as many-one degrees of unsolvability, or just m-degrees.

Proposition 5.2 There is a unique m-degree that contains exactly the (nontrivial) recursive sets.

Proof. If set $A$ is recursive and set $B$ reduces to $A$ through $f$, then set $B$ is recursive, with characteristic function $c_B = c_A \cdot f$. Hence an m-degree that contains some recursive set $S$ must contain only recursive sets, since all sets in the degree must reduce to $S$ and thus are recursive. Finally, if $A$ and $B$ are two nontrivial recursive sets, we can always reduce one to the other. Pick two elements, $x \in B$ and $y \notin B$, then define $f$ to map any element of $A$ to $x$ and any element of $\overline{A}$ to $y$. This function $f$ is recursive, since $A$ is recursive, so that $A$ reduces to $B$ through $f$. Q.E.D.

The two trivial recursive sets are somewhat different: we cannot reduce a nontrivial recursive set to either $\mathbb{N}$ or the empty set, nor can we reduce one trivial set to the other. Indeed no other set can be reduced to the empty set and no other set can be reduced to $\mathbb{N}$, so that each of the two forms its own separate m-degree of unsolvability.

Proposition 5.3 An m-degree of unsolvability that contains an r.e. set contains only r.e. sets.

Proof. If $A$ is r.e. and $B$ reduces to $A$ through $f$, then, as we have seen before, $B$ is r.e. with domain function $\phi_B = \phi_A \cdot f$. Q.E.D.

We have seen that the diagonal set $K$ is in some sense characteristic of the nonrecursive sets; we formalize this intuition through the concept of completeness.

Definition 5.9 Let $\mathcal{C}$ be a collection of sets and $A$ some set in $\mathcal{C}$. We say that $A$ is many-one complete for $\mathcal{C}$ if every set in $\mathcal{C}$ many-one reduces to $A$.

Theorem 5.8 The diagonal set $K$ is many-one complete for the class of r.e. sets.

Proof. Let $A$ be any r.e. set with domain function $\phi_A$. Using standard s-m-n techniques, we can construct a recursive function $f$ obeying

$$\phi_{f(x)}(y) = y + \mathrm{Zero}(\phi_A(x)) = \begin{cases} y & x \in A \\ \text{undefined} & \text{otherwise} \end{cases}$$

Then $x$ belongs to $A$ if and only if $f(x)$ belongs to $K$, as desired. Q.E.D.
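The reduction used in the proof of Proposition 5.2, between two nontrivial recursive sets, is simple enough to sketch directly. The following Python fragment is an illustration only; the function name `reduce_recursive_sets` and the two toy sets (even numbers and perfect squares) are assumptions made here.

```python
import math
from typing import Callable


def reduce_recursive_sets(in_a: Callable[[int], bool],
                          x_in_b: int,
                          y_not_in_b: int) -> Callable[[int], int]:
    """Many-one reduction from A to B, given a decider for A,
    one element x of B and one element y outside B."""
    def f(n: int) -> int:
        # Members of A go to x (in B); nonmembers go to y (not in B).
        return x_in_b if in_a(n) else y_not_in_b
    return f


# Toy instance: A = even numbers, B = perfect squares.
def is_even(n: int) -> bool:
    return n % 2 == 0


def is_square(n: int) -> bool:
    return math.isqrt(n) ** 2 == n


f = reduce_recursive_sets(is_even, 4, 3)  # 4 is a square, 3 is not
```

The construction needs both an element inside $B$ and one outside it, which is exactly why it fails for the two trivial sets.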

We can recast our earlier observation about nontrivial recursive sets in terms of completeness.

Proposition 5.4 Any nontrivial recursive set is many-one complete for the class of recursive sets.

Since the class of nontrivial recursive sets is closed under complementation, any nontrivial recursive set many-one reduces to its complement. However, the same is not true of r.e. sets: for instance, $K$ does not reduce to its complement; otherwise $\overline{K}$ would be r.e.

In terms of m-degrees, then, we see that we have three distinct m-degrees for the recursive sets: the degree containing the empty set, the degree containing $\mathbb{N}$, and the degree containing all other recursive sets. Whenever a set in an m-degree reduces to a set in a second m-degree, we say that the first m-degree reduces to the second. This extension of the terminology is justified by the fact that each degree is an equivalence class under reduction. Thus we say that both our trivial recursive m-degrees reduce to the m-degree of nontrivial recursive sets. Figure 5.5 illustrates the simple lattice of the recursive m-degrees.

[Figure 5.5: The lattice of the recursive m-degrees. The degree of the nontrivial recursive sets lies above the two degrees $\{\emptyset\}$ and $\{\mathbb{N}\}$.]

What can we say about the nonrecursive r.e. degrees? We know that all reduce to the degree of $K$, because $K$ is many-one complete for the r.e. sets. However, we shall prove that not all nonrecursive r.e. sets belong to the degree of $K$, a result due to Post. We begin with two definitions.

Definition 5.10 A set $A$ is productive if there exists a total function $f$ such that, for each $i$ with $\mathrm{dom}\,\phi_i \subseteq A$, we have $f(i) \in A - \mathrm{dom}\,\phi_i$.

Thus $f(i)$ is a witness to the fact that $A$ is not r.e., since, for each candidate partial recursive function $\phi_i$, it shows that $A$ is not the domain of $\phi_i$. The set $\overline{K}$ is productive, with the trivial function $f_{\overline{K}}(i) = i$, because, if we have some function $\phi_i$ with $\mathrm{dom}\,\phi_i \subseteq \overline{K}$, then, by definition, $\phi_i(i)$ diverges and thus we have both $i \notin \mathrm{dom}\,\phi_i$ and $i \in \overline{K}$.

Definition 5.11 A set is creative if it is r.e. and its complement is productive.

For instance, $K$ is creative. Notice that an r.e. set is recursive if and only if its complement is r.e.; but if the complement is productive, then we have witnesses against its being r.e. and thus witnesses against the original set's being recursive.

Theorem 5.9 An r.e. set is many-one complete for the class of r.e. sets if and only if it is creative.

Proof. We begin with the "only if" part: assume that $C$ is many-one complete for r.e. sets. We need to show that $C$ is creative or, equivalently, that $\overline{C}$ is productive. Since $C$ is complete, $K$ reduces to $C$ through some function $f = \phi_m$. Now define the new function

$$\psi(x, y, z) = \phi_{\mathrm{univ}}(x, \phi_{\mathrm{univ}}(y, z)) = \phi_x(\phi_y(z))$$

By the s-m-n theorem, there exists a recursive function $g(x, y)$ with $\phi_{g(x,y)}(z) = \psi(x, y, z)$. We claim that the recursive function $h(x) = f(g(x, m))$ is a productive function for $\overline{C}$. Assume then that we have some function $\phi_i$ with $\mathrm{dom}\,\phi_i \subseteq \overline{C}$ and consider $h(i) = f(g(i, m))$; we want to show that $h(i)$ belongs to $\overline{C} - \mathrm{dom}\,\phi_i$. We have

$$f(g(i, m)) \in C \iff g(i, m) \in K \iff \phi_{g(i,m)}(g(i, m))\downarrow \iff \phi_i(\phi_m(g(i, m)))\downarrow \iff \phi_i(f(g(i, m)))\downarrow$$

It thus remains only to verify that $f(g(i, m))$ does not belong to $C$. But, if $f(g(i, m))$ were to belong to $C$, then (from the above) $\phi_i(f(g(i, m)))$ would converge and $f(g(i, m))$ would belong to $\mathrm{dom}\,\phi_i$, so that we would have $\mathrm{dom}\,\phi_i \not\subseteq \overline{C}$, a contradiction.

Now for the "if" part: let $C$ be a creative r.e. set with productive function $f$, and let $B$ be an r.e. set with domain function $\phi_B$. We need to show that $B$ many-one reduces to $C$. Define the new function $\psi(x, y, z)$ to be totally undefined if $y$ is not in $B$ (by invoking $\mathrm{Zero}(\phi_B(y))$) and to be otherwise defined only for $z = f(x)$. By the s-m-n theorem, there exists a recursive function $g(x, y)$ with $\psi(x, y, z) = \phi_{g(x,y)}(z)$ and, by the recursion theorem, there exists a fixed point $x_y$ with $\phi_{x_y}(z) = \phi_{g(x_y, y)}(z)$. By the extended recursion theorem, this fixed point can be computed for each $y$ by some recursive function $e(y) = x_y$. Thus we have

$$\mathrm{dom}\,\phi_{e(y)} = \mathrm{dom}\,\phi_{g(e(y), y)} = \begin{cases} \{f(e(y))\} & y \in B \\ \emptyset & \text{otherwise} \end{cases}$$

But $\overline{C}$ is productive, so that $\mathrm{dom}\,\phi_{e(y)} \subseteq \overline{C}$ implies $f(e(y)) \in \overline{C} - \mathrm{dom}\,\phi_{e(y)}$. Hence, if $y$ belongs to $B$, then the domain of $\phi_{e(y)}$ is $\{f(e(y))\}$, in which case $f(e(y))$ cannot be a member of $\overline{C} - \mathrm{dom}\,\phi_{e(y)}$, so that $\mathrm{dom}\,\phi_{e(y)}$ is not a subset of $\overline{C}$ and $f(e(y))$ must be a member of $C$. Conversely, if $y$ does not belong to $B$, then $\mathrm{dom}\,\phi_{e(y)}$ is empty and thus a subset of $\overline{C}$, so that $f(e(y))$ belongs to $\overline{C}$. Hence we have reduced $B$ to $C$ through $f \cdot e$. Q.E.D.

Therefore, in order to show that there exist r.e. sets of different m-degrees, we need only show that there exist noncreative r.e. sets.

Definition 5.12 A simple set is an r.e. set such that its complement is infinite but does not contain any infinite r.e. subset.

By Exercise 5.28, a simple set cannot be creative.

Theorem 5.10 There exists a simple set.

Proof. We want a set $S$ which, for each $x$ such that $\phi_x$ has infinite domain, contains an element of that domain, thereby preventing it from being a subset of $\overline{S}$. We also want to ensure that $\overline{S}$ is infinite by "leaving out" of $S$ enough elements. Define the partial recursive function $\psi$ as follows:

$$\psi(x) = \Pi_1\big(\mu y\,[\Pi_1(y) > 2x \text{ and } \mathrm{step}(x, \Pi_1(y), \Pi_2(y)) \neq 0]\big)$$

Now let $S$ be the range of $\psi$; we claim that $S$ is simple. It is clearly r.e., since it is the range of a partial recursive function. When $\psi(x)$ converges, it is larger than $2x$ by definition, so that $S$ contains at most half of the members of any initial interval of $\mathbb{N}$; thus $\overline{S}$ is infinite. Now let $\mathrm{dom}\,\phi_x$ be any infinite r.e. set; because the domain is infinite, there is a smallest $y$ such that we have $\Pi_1(y) > 2x$, $\Pi_1(y) \in \mathrm{dom}\,\phi_x$, and $\mathrm{step}(x, \Pi_1(y), \Pi_2(y)) \neq 0$. Then $\psi(x)$ is $\Pi_1(y)$ for that value of $y$, so that $\Pi_1(y)$ belongs to $S$ and the domain of $\phi_x$ is not a subset of $\overline{S}$. Q.E.D.

Since a simple set is not creative, it cannot be many-one complete for the r.e. sets. Since $K$ is many-one complete for the r.e. sets, it cannot be many-one reduced to a simple set and thus cannot belong to the same m-degree. Hence there are at least two different m-degrees among the nonrecursive r.e. sets. In fact, there are infinitely many m-degrees between the degree of nontrivial recursive sets and the degree of $K$, with infinitely many pairs of incomparable degrees, but the proofs of such results lie beyond the scope of this text. We content ourselves with observing that our Infinite Hotel story provides us with an easy proof of the following result.

Theorem 5.11 Any two m-degrees have a least upper-bound.
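The Infinite Hotel interleaving behind this result places one set on the even numbers and the other on the odd numbers. The sketch below illustrates that idea on toy recursive sets; every name in it (`join`, `reduce_join`, the sets $A$, $B$, $D$ and the reductions `f`, `g`) is an assumption chosen for illustration, with $A$ the even numbers, $B$ the multiples of 3, and $D$ the multiples of 6.

```python
from typing import Callable

Decider = Callable[[int], bool]
Reduction = Callable[[int], int]


def join(in_a: Decider, in_b: Decider) -> Decider:
    """Membership in C = {2x | x in A} union {2x + 1 | x in B}."""
    def in_c(n: int) -> bool:
        return in_a(n // 2) if n % 2 == 0 else in_b((n - 1) // 2)
    return in_c


def reduce_join(f: Reduction, g: Reduction) -> Reduction:
    """Given reductions f (A to D) and g (B to D), reduce C to D."""
    def h(n: int) -> int:
        return f(n // 2) if n % 2 == 0 else g((n - 1) // 2)
    return h


# Toy instance: A = evens, B = multiples of 3, D = multiples of 6.
def in_a(n: int) -> bool:
    return n % 2 == 0


def in_b(n: int) -> bool:
    return n % 3 == 0


def in_d(n: int) -> bool:
    return n % 6 == 0


def f(x: int) -> int:   # x is even iff 3x is a multiple of 6
    return 3 * x


def g(x: int) -> int:   # x is a multiple of 3 iff 2x is a multiple of 6
    return 2 * x


in_c = join(in_a, in_b)
h = reduce_join(f, g)
```

The maps $x \mapsto 2x$ and $x \mapsto 2x + 1$ reduce $A$ and $B$ to $C$, while `h` shows that $C$ reduces to any set to which both $A$ and $B$ reduce.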

In other words, given two m-degrees $\mathcal{A}$ and $\mathcal{B}$, there exists an m-degree $\mathcal{C}$ such that (i) $\mathcal{A}$ and $\mathcal{B}$ both reduce to $\mathcal{C}$ and (ii) if $\mathcal{A}$ and $\mathcal{B}$ both reduce to any other m-degree $\mathcal{D}$, then $\mathcal{C}$ reduces to $\mathcal{D}$.

Proof. Let $\mathcal{A}$ and $\mathcal{B}$ be our two m-degrees, and pick $A \in \mathcal{A}$ and $B \in \mathcal{B}$. Define the set $C$ by $C = \{2x \mid x \in A\} \cup \{2x + 1 \mid x \in B\}$, the trick used in the Infinite Hotel. Clearly both $A$ and $B$ many-one reduce to $C$. Thus both $\mathcal{A}$ and $\mathcal{B}$ reduce to $\mathcal{C}$, the m-degree containing, and defined by, $C$. Let $\mathcal{D}$ be some m-degree to which both $\mathcal{A}$ and $\mathcal{B}$ reduce. Pick some set $D \in \mathcal{D}$, and let $f$ be the reduction from $A$ to $D$ and $g$ be the reduction from $B$ to $D$. We reduce $C$ to $D$ by the simple mapping

$$h(x) = \begin{cases} f\!\left(\frac{x}{2}\right) & x \text{ is even} \\ g\!\left(\frac{x - 1}{2}\right) & x \text{ is odd} \end{cases}$$

Hence $\mathcal{C}$ reduces to $\mathcal{D}$. Q.E.D.

The m-degrees of unsolvability of the r.e. sets form an upper semilattice.

5.8 Exercises

Exercise 5.11 Prove that the following functions are primitive recursive by giving a formal construction.
1. The function $\exp(n, m)$ is the exponent of the $m$th prime in the prime power decomposition of $n$, where we consider the 0th prime to be 2. (For instance, we have $\exp(1960, 2) = 1$ because 1960 has a single factor of 5.)
2. The function $\max_{y \leq x}[g(y, z_1, \ldots, z_n)]$, where $g$ is primitive recursive, returns the largest value in $\{g(0, \ldots), g(1, \ldots), \ldots, g(x, \ldots)\}$.
3. The Fibonacci function $F(n)$ is defined by $F(0) = F(1) = 1$ and $F(n) = F(n - 1) + F(n - 2)$. (Hint: use the course-of-values recursion defined in Equation 5.1.)

Exercise 5.12 Verify that iteration is primitive recursive. A function $f$ is constructed from a function $g$ by iteration if we have $f(x, y) = g^x(y)$, where we assume $g^0(y) = y$.

Exercise 5.13 Verify that the function $f$ defined as follows:

$$f(0, x) = g(x), \qquad f(i + 1, x) = f(i, h(x))$$

is primitive recursive whenever $g$ and $h$ are.

Exercise 5.14 Write a program (in the language of your choice) to compute the values of Ackermann's function and tabulate the first few values, but be careful not to launch into a computation that will not terminate in your lifetime! Then write a program that could theoretically compute the values of a function at a much higher level in the Grzegorczyk hierarchy.

Exercise 5.15 Prove that the following three sets are not recursive by explicit reduction from the set $K$; do not use Rice's theorem.
1. $\{x \mid \phi_x \text{ is a constant function}\}$
2. $\{x \mid \phi_x \text{ is not the totally undefined function}\}$
3. $\{x \mid \text{there is } y \text{ with } \phi_x(y)\downarrow \text{ and such that } \phi_y \text{ is total}\}$

Exercise 5.16 For each of the following sets and its complement, classify them as recursive, nonrecursive but r.e., or non-r.e. You may use Rice's theorem to prove that a set is not recursive. To prove that a set is r.e., show that it is the range or domain of a partial recursive function. For the rest, use closure results or reductions.
1. $S(y) = \{x \mid y \text{ is in the range of } \phi_x\}$.
2. $\{x \mid \phi_x \text{ is injective}\}$.
3. The set of all primitive recursive programs.
4. The set of all (mathematical) primitive recursive functions.
5. The set of all partial recursive functions that grow at least as fast as $n^2$.
6. The set of all r.e. sets that contain at least three elements.
7. The set of all partial recursive functions with finite domain.
8. The three sets of Exercise 5.15.

Exercise 5.17 Prove formally that the Busy Beaver problem is undecidable. The busy beaver problem can be formalized as follows: compute, for each fixed $n$, the largest value that can be printed by a program of length $n$ (that halts after printing that value). This question is intuitively the converse of Theorem 5.5.

Exercise 5.18 Let $S$ be an r.e. set; prove that the sets $D = \bigcup_{x \in S} \mathrm{dom}\,\phi_x$ and $R = \bigcup_{x \in S} \mathrm{ran}\,\phi_x$ are both r.e.

Exercise 5.19 Let $K_t$ be the set $\{x \mid \exists y \leq t,\ \mathrm{step}(x, x, y) > 0\}$; that is, $K_t$ is the set of functions that converge on the diagonal in at most $t$ steps.
1. Prove that, for each fixed $t$, $K_t$ is recursive, and verify the equality $\bigcup_{t \in \mathbb{N}} K_t = K$.
2. Conclude that, if $S$ is an r.e. set, the set $\bigcap_{x \in S} \mathrm{dom}\,\phi_x$ need not be r.e.

nonrecursive but ne.e.25* This exercise develops the Rice-Shapiro theorem. Exercise 5. In essence. 2.) Exercise 5. that there is no recursive set C with A C C and B C C.166
Computability Theory Exercise 5.. a class of partial recursive functions is ne. . which characterizes re. or non-r. Define the sequence of partial recursive functions {~rt I by
_
dec(f(i.20 Prove that every infinite ne. Prove that both sets are nonrecursive but re.. Such a set would recursively separate A and B in the sense that it would draw a recursive boundary dividing the elements of A from those of B.22 Let S = {{i. set is recursive if and only if it has an injective. one that does not repeat any element). k}) to N.24 This exercise explores ways of defining partial functions that map finite subsets of N to N. Exercise 5.21 Prove that an infinite r. x) h~(x) = ~undefined x < SucC(FI(i)) otherwise
Verify that this sequence includes every function that maps a nonempty finite initial subset of N (i. monotonically increasing enumerating function. . Define the sequence of partial recursive functions {i } by * f (i. x))
i(x)
undefined i
x < Succ( 1 I(i)) and f (i. some set (0. Define the primitive recursive function
f(i.. (Hint: use the characteristic function of the putative C to derive a contradiction. 1.e. Is S recursive. x) = nSUccxflfi))(n 2 (i))
1. set has an injective enumerating function (that is. The key to extending Rice's theorem resides in finite input/output behaviors.23 Define the following two disjoint sets: A = {x IOx (x) = 0) and B = {x I x(x) = 1}. i.. ji l pl and Oj compute the same function). x) > 0 otherwise
Verify that this sequence includes every function that maps a finite subset of N to N.? Exercise 5.e. sets in much the same way as Rice's theorem characterizes recursive sets. Exercise 5.e. . if and only if each partial recursive function in the class is the extension of some finite input/output behavior
. each of which defines a recursive set. (the same proof naturally works for both) and that they are recursively inseparable.

in an r.e. set of such behaviors. Exercise 5.24 showed that the sequence $\{\pi_i\}$ captures all possible finite input/output behaviors; our formulation of the Rice-Shapiro theorem uses this sequence. Let $\mathcal{C}$ be any class of (mathematical) partial recursive functions. Then the set $\{x \mid \phi_x \in \mathcal{C}\}$ is r.e. if and only if there exists an r.e. set I with

$\phi_x \in \mathcal{C} \iff \exists i \in I,\ \pi_i \subseteq \phi_x$

(where $\pi_i \subseteq \phi_x$ indicates that $\phi_x$ behaves exactly like $\pi_i$ on all arguments on which $\pi_i$ is defined, and may behave in any way whatsoever on all other arguments).

Exercise 5.26 Use the recursion theorem to decide whether there are indices with the following properties:
1. The domain of $\phi_n$ is $\{n^2\}$.
2. The domain of $\phi_n$ is $N - \{n\}$.
3. The domain of $\phi_n$ is K and also contains n.

Exercise 5.27 Prove that the set $S(c) = \{x \mid c \notin \mathrm{dom}\,\phi_x\}$, where c is an arbitrary constant, is productive.

Exercise 5.28 Prove that every productive set has an infinite r.e. subset.

Exercise 5.29 Let S be a set; the cylindrification of S is the set $S \times N$. Prove the following results about cylinders:
1. A set and its cylindrification belong to the same m-degree.
2. If a set is simple, its cylindrification is not creative.

Exercise 5.30* Instead of using many-one reductions, we could have used one-one reductions, that is, reductions effected by an injective function. One-one reductions define one-degrees rather than m-degrees. Revisit all of our results concerning m-degrees and rephrase them for one-degrees. Recursive sets now get partitioned into finite sets of each size, infinite sets with finite complements, and infinite sets with infinite complements. Note also that our basic theorem about creative sets remains unchanged: an r.e. set is one-complete for the r.e. sets exactly when it is complete. Do a set and its cylindrification (see previous exercise) belong to the same one-degree?

5.9 Bibliography

Primitive recursive functions were defined in 1888 by the German mathematician Julius Wilhelm Richard Dedekind (1831-1916) in his attempt to

provide a constructive definition of the real numbers. Working along the same lines, Ackermann [1928] defined the function that bears his name. Gödel [1931] and Kleene [1936] used primitive recursive functions again, giving them a modern formalism. The course-of-values mechanism (Equation 5.1) was shown to be closed within the primitive recursive functions by Péter [1967], who used prime power encoding rather than pairing in her proof; she also showed (as did Grzegorczyk [1953]) that the bounded quantifiers and the bounded search scheme share the same property. Almost all of the results in this chapter were proved by Kleene [1952]. The first text on computability to pull together all of the threads developed in the first half of the twentieth century was that of Davis [1958]. Rogers [1967] wrote the classic, comprehensive text on the topic, now reissued by MIT Press in paperback format. A more modern treatment with much the same coverage is offered by Tourlakis [1984]. An encyclopedic treatment can be found in the two-volume work of Odifreddi [1989], while the text of Pippenger [1997] offers an advanced treatment. Readers looking for a strong introductory text should consult Cutland [1980], whose short paperback covers the same material as our chapter, but in more detail. In much of our treatment, we followed the concise approach of Machtey and Young [1978], whose perspective on computability, like ours, was strongly influenced by modern results in complexity. The text of Epstein and Carnielli [1989] relates computability theory to the foundations of mathematics and, through excerpts from the original articles of Hilbert, Gödel, Kleene, Post, Turing, and others, mixed with critical discussions, offers much insight into the development of the field of computability. Davis [1965] edited an entire volume of selected reprints from the pioneers of the 1930s, from Hilbert to Gödel, Church, Kleene, Turing, Post, and others. These articles are as relevant today as they were then and exemplify a clarity of thought and writing that has become too rare. Berry's paradox has been used by Chaitin [1990a, 1990b] in building his theory of algorithmic information theory, which grew from an original solution to the question of "what is a truly random string", to which he and the Russian mathematician Kolmogorov answered "any string which is its own shortest description." Chaitin maintains, at URL http://www.cs.auckland.ac.nz/CDMTCS/chaitin/, a Web site with much of his work on-line, along with tools useful in exploring some of the consequences of his results.

CHAPTER 6
Complexity Theory: Foundations

Problem-solving is generally focused on algorithms: given a problem and a general methodology, how can we best apply the methodology to solve the problem and what can we say about the resulting algorithm? We analyze each algorithm's space and time requirements; in the case of approximation algorithms, we also attempt to determine how close the solution produced is to the optimal solution. While this approach is certainly appropriate from a practical standpoint, it does not enable us to conclude that a given algorithm is "best" (or even that it is "good"), as we lack any reference point. In order to draw such conclusions, we need results about the complexity of the problem, not just about the complexity of a particular algorithm that solves it. The objective of complexity theory is to establish bounds on the behavior of the best possible algorithms for solving a given problem, whether or not such algorithms are known. Viewed from another angle, complexity theory is the natural extension of computability theory: now that we know what can and what cannot be computed in absolute terms, we move to ask what can and what cannot be computed within reasonable resource bounds (particularly time and space).

Characterizing the complexity of a problem appears to be a formidable task: since we cannot list all possible algorithms for solving a problem, much less analyze them, how can we derive a bound on the behavior of the best algorithm? This task has indeed proved to be so difficult that precise complexity bounds are known for only a few problems. Sorting is the best known example: any sorting algorithm that sorts N items by successive comparisons must perform at least $\lceil \log_2 N! \rceil \approx N \log_2 N$ comparisons in the worst case. Sorting algorithms that approach this bound exist: two

such are heapsort and mergesort, both of which require at most $O(N \log N)$ comparisons and thus are (within a constant factor) optimal in this respect. Hence the lower bound for sorting is very tight. However, note that even here the bound is not on the cost of any sorting algorithm but only on the cost of any comparison-based algorithm. Indeed, by using bit manipulations and address computations, it is possible to sort in $o(N \log N)$ time. In contrast, the best algorithms known for many of the problems in this book require exponential time in the worst case, yet current lower bounds for these problems are generally only linear, and thus trivial, since this is no more than the time required to read the input. Consequently, even such an apparently modest goal as the characterization of problems as tractable (solvable in polynomial time) or intractable (requiring exponential time) is beyond the reach of current methodologies.

While characterization of the absolute complexity of problems has proved difficult, characterization of their relative complexity has proved much more successful. In this approach, we attempt to show that one problem is harder than another, meaning that the best possible algorithm for solving the former requires more time, space, or other resources than the best possible algorithm for solving the latter, or that all members of a class of problems are of equal difficulty, at least with regard to their asymptotic behavior. We illustrate the basic idea informally in the next section, and then develop most of the fundamental results of the theory in the ensuing sections.

6.1 Reductions

6.1.1 Reducibility Among Problems

As we have seen in Chapter 5, reductions among sets provide an effective tool for studying computability. To study complexity, we need to define reductions among problems and to assess or bound the cost of each reduction. For example, we can solve the marriage problem by transforming it into a special instance of a network flow problem and solving that instance. Another example is the convex hull problem: a fairly simple transformation shows that we can reduce sorting to the computation of two-dimensional convex hulls in linear time. In this case, we can also devise a linear-time reduction in the other direction, from the convex hull problem to sorting (through the Graham scan algorithm), thereby enabling us to conclude that sorting a set of numbers and finding the convex hull of a set of points in two dimensions are computationally equivalent to within a linear-time additive term.
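The reduction from sorting to convex hulls mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the text's own construction: the input numbers are assumed to be distinct integers, each number x is mapped to the point (x, x^2) on a parabola, and the hull subroutine is assumed to return the hull vertices in counterclockwise order. The names `gift_wrap_hull` and `sort_via_hull` are hypothetical, and the quadratic-time gift-wrapping routine merely stands in for whatever hull algorithm is available.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, int]


def cross(o: Point, a: Point, b: Point) -> int:
    """Twice the signed area of triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def gift_wrap_hull(points: Sequence[Point]) -> List[Point]:
    """Stand-in hull subroutine (assumption): hull vertices, counterclockwise.

    Gift wrapping does no sorting of its own. It assumes distinct points with
    no three collinear, which holds for distinct points on a parabola.
    """
    if len(points) < 3:
        return list(points)
    start = min(points)  # leftmost point, found in linear time
    hull = [start]
    while True:
        current = hull[-1]
        candidate = points[0] if points[0] != current else points[1]
        for p in points:
            # Move to p whenever p lies to the right of current -> candidate.
            if p != current and cross(current, candidate, p) < 0:
                candidate = p
        if candidate == start:
            return hull
        hull.append(candidate)


def sort_via_hull(
    numbers: Sequence[int],
    hull_subroutine: Callable[[Sequence[Point]], List[Point]] = gift_wrap_hull,
) -> List[int]:
    """Sort distinct integers with one call to a convex hull subroutine."""
    if len(numbers) < 2:
        return list(numbers)
    # Linear-time transformation: every point (x, x^2) is a hull vertex.
    points = [(x, x * x) for x in numbers]
    hull = hull_subroutine(points)
    # Linear-time reinterpretation: walk the hull from its leftmost vertex.
    first = min(range(len(hull)), key=lambda i: hull[i][0])
    return [hull[(first + i) % len(hull)][0] for i in range(len(hull))]
```

All the work outside the hull call is linear in the number of items, which is what lets a lower bound for sorting carry over to the convex hull problem.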

As a more detailed example, consider the two problems Traveling Salesman and Hamiltonian Circuit. No solution that is guaranteed to run in polynomial time is known for either problem. However, were we to discover a polynomial-time solution for the traveling salesman problem, we could immediately construct a polynomial-time solution for the Hamiltonian circuit problem, as described in Figure 6.1. Given a particular graph of N vertices, we first transform this instance of the Hamiltonian circuit problem into an instance of the traveling salesman problem by associating each vertex of the graph with a city and by setting the distance between two cities to one if the corresponding vertices are connected by an edge in the original graph and to two otherwise. If our subroutine for Traveling Salesman returns an optimal tour of length N, then this tour uses only connections of unit length and thus corresponds to a Hamiltonian circuit. If the length of the optimal tour exceeds N, then no Hamiltonian circuit exists. Since the transformation can be done in $O(N^2)$ time, the overall running time of this algorithm for the Hamiltonian circuit problem is determined by the running time of our subroutine for the traveling salesman problem. We say that we have reduced the Hamiltonian circuit problem to the traveling salesman problem because the problem of determining a Hamiltonian circuit has been "reduced" to that of finding a suitably short tour.

    function Hamiltonian(G: graph; var circuit: list-of-edges): boolean;
    (* Returns a Hamiltonian circuit for G if one exists. *)
    var G': graph;
      function TSP(G: graph; var tour: list-of-edges): integer;
      (* Function returning an optimal tour and its length *)
    begin
      let N be the number of vertices in G;
      define G' to be K_N with edge costs given by
        cost(v_i, v_j) = 1 if (v_i, v_j) is an edge of G, and 2 otherwise;
      if TSP(G', circuit) = N then Hamiltonian := true
      else Hamiltonian := false
    end;

Figure 6.1 Reduction of Hamiltonian circuit to traveling salesman.
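For readers who wish to experiment, the reduction of Figure 6.1 can be written as a short Python sketch. The names `brute_force_tsp` and `hamiltonian` are hypothetical, and the exhaustive-search routine is only a stand-in (an assumption of this sketch) for the traveling salesman subroutine, so that the example runs on small graphs; vertices are assumed to be numbered 0 to n-1.

```python
from itertools import permutations
from typing import Callable, Iterable, List, Optional, Tuple

CostMatrix = List[List[int]]
TspSolver = Callable[[int, CostMatrix], Tuple[int, List[int]]]


def brute_force_tsp(n: int, cost: CostMatrix) -> Tuple[int, List[int]]:
    """Stand-in TSP subroutine: optimal tour length and tour on n >= 3 cities.

    Tries every tour that starts at city 0 (exponential time), which is
    acceptable only because it plays the role of an oracle here.
    """
    best_length: Optional[int] = None
    best_tour: List[int] = []
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        length = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if best_length is None or length < best_length:
            best_length, best_tour = length, list(tour)
    assert best_length is not None
    return best_length, best_tour


def hamiltonian(
    n: int,
    edges: Iterable[Tuple[int, int]],
    tsp: TspSolver = brute_force_tsp,
) -> Optional[List[int]]:
    """Return a Hamiltonian circuit as a list of vertices, or None."""
    if n < 3:
        return None  # a simple graph needs at least three vertices
    edge_set = {frozenset(e) for e in edges}
    # O(n^2) transformation: cost 1 for edges of G, cost 2 for non-edges.
    cost = [
        [0 if i == j else (1 if frozenset((i, j)) in edge_set else 2)
         for j in range(n)]
        for i in range(n)
    ]
    length, tour = tsp(n, cost)  # single call to the subroutine
    # A tour of length exactly n uses only unit-cost connections.
    return tour if length == n else None
```

The subroutine is called once and its answer is reused almost unchanged; only the comparison of the tour length with n is added.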

The terminology is somewhat unfortunate: reduction connotes diminution and thus it would seem that, by "reducing" a problem to another, the amount of work would be lessened. This is true only in the sense that a reduction enables us to solve a new problem with the same algorithm used for another problem.¹ The original problem has not really been simplified. In fact, the correct conclusion to draw from such a reduction is that the original problem is, if anything, easier than the one to which it is reduced. There might exist some entirely different solution method for the original problem, one which uses less time than the sum of the times taken to transform the problem, to run the subroutine, and to reinterpret the results. In our example, we may well believe that the Hamiltonian circuit problem is easier than the traveling salesman problem, since instances of the latter produced by our transformation have a very special form. We can conclude from this discussion that, if the traveling salesman problem is tractable, so is the Hamiltonian circuit problem and that, conversely, if the Hamiltonian circuit problem is intractable, then so is the traveling salesman problem.

¹ Reductions are common in mathematics. A mathematician was led into a room containing a table, a sink, and an empty bucket on the floor. He was asked to put a full bucket of water on the table. He quickly picked up the bucket, filled it with water, and placed it on the table. Some time later he was led into a room similarly equipped, except that this time the bucket was already full of water. When asked to perform the same task, he pondered the situation for a time, then carefully emptied the bucket in the sink, put it back on the floor, and announced, "I've reduced the task to an earlier problem."

Informally, then, a problem A reduces to another problem B, written as $A \le_t B$ (the t to be explained shortly), if a solution for B can be used to construct a solution for A. Our earlier example used a rather restricted form of reduction, as the subroutine was called only once and its answer was adopted with only minor modification. Indeed, had we stated both problems as decision problems, with the calling sequence of TSP consisting of a graph, G, and a bound, k, on the length of the desired tour, and with only a boolean value returned, then Hamiltonian could adopt the result returned by TSP without any modification whatsoever. A more complex reduction may make several calls to the subroutine with varied parameter values. We present a simple example of such a reduction.

An instance of the Partition problem is given by a set of objects and an integer-valued size associated with each object. The question is: "Can the set be partitioned into two subsets, such that the sum of the sizes of the elements in one subset is equal to the sum of the sizes of the elements in the other subset?" (As phrased, Partition is a decision problem.) An instance of the Smallest Subsets problem also comprises a set of objects with associated sizes; in addition, it includes a positive size bound B. The question is: "How many different subsets are there such that the sum of the elements of each

subset is no larger than B?" (As phrased, Smallest Subsets is an enumeration problem.)

We reduce the partition problem to the smallest subsets problem. Obviously, an instance of the partition problem admits a solution only if the sum of all the sizes is an even number. Thus we start by computing this sum. If it is odd, we immediately answer "no" and stop; otherwise, we use the procedure for solving Smallest Subsets with the same set of elements and the same sizes as in the partition problem. Let T denote half the sum of all the sizes. On the first call, we set B = T; the procedure returns some number. We then call the procedure again, using a bound of B = T - 1. The difference between the two answers is the number of subsets with the property that the sum of the sizes of the elements in the subsets equals T. If this difference is zero, we answer "no"; otherwise, we answer "yes."

Exercise 6.1 Reduce Exact Cover by Two-Sets to the general matching problem. An instance of the first problem is composed of a set containing an even number of elements, say 2N, and a collection of subsets of the set, each of which contains exactly two elements. The question is: "Does there exist a subcollection of N subsets that covers the set?" An instance of general matching is an undirected graph and the objective is to select the largest subset of edges such that no two selected edges share a vertex. Since the general matching problem is solvable in polynomial time and since your reduction should run in very low polynomial time, it follows that Exact Cover by Two-Sets is also solvable in polynomial time.

Exercise 6.2 The decision version of the smallest subsets problem, known as K-th Largest Subset, is: "Given a set of objects with associated sizes and given integers B and K, are there at least K distinct subsets for which the sum of the sizes of the elements is less than or equal to B?" Show how to reduce the partition problem to K-th Largest Subset. Excluding the work done inside the procedure that solves instances of the K-th largest subset problem, how much work is done by your reduction? (This is a reduction that requires a large number of subroutine calls: use binary search to determine the number of subsets for which the sum of the sizes of the elements is less than or equal to half the total sum.)

If we have both $A \le_t B$ and $B \le_t A$, then, to within the coarseness dictated by the type of the reduction, the two problems may be considered to be of equivalent complexity, which we denote by $A \equiv_t B$. The $\le_t$ relation is automatically reflexive and it is only reasonable, in view of our aims, to require it to be transitive, i.e., to choose only reductions with that property. Thus we can treat $\le_t$ as a partial order and compare problems in terms of their complexity. However, this tool is entirely one-sided: in order to show

that problem A is strictly more difficult than problem B, we need to prove that, while problem B reduces to problem A, problem A cannot be reduced to problem B, a type of proof that is likely to require other results.

Which reduction to use depends on the type of comparison that we intend to make: the finer the comparison, the more restricted the reduction. We can restrict the resources available to the reduction. For instance, if we wish only to distinguish tractable from intractable, we want $A \le_t B$ to imply that A is tractable if B is. Thus the only restriction that need be placed on the reduction t is that it require no more than polynomial time. In particular, since polynomials are closed under composition (that is, given polynomials p() and q(), there exists a polynomial r() satisfying p(q(x)) = r(x) for all x), the new procedure for A can call upon the procedure for B a polynomial number of times. On the other hand, if we wish to distinguish between problems requiring cubic time and those requiring only quadratic time, then we want $A \le_t B$ to imply that A is solvable in quadratic time if B is. The reduction t must then run in quadratic time overall so that, in particular, it can make only a constant number of calls to the procedure for B. Thus the resources allotted to the reduction depend directly on the complexity classes to be compared.

We can also choose a specific type of reduction: the chosen type establishes a minimum degree of similarity between the two problems. For instance, our first example of reduction implies a strong similarity between the Hamiltonian circuit problem and a restricted class of traveling salesman problems, particularly when both are viewed as decision problems, whereas our second example indicates only a much looser connection between the partition problem and a restricted class of smallest subsets problems. While a very large number of reduction types have been proposed, we shall distinguish only two types of reductions: (i) Turing (or algorithmic) reductions, which apply to any type of problem; and (ii) many-one reductions (also called transformations), which apply only to decision problems. These are the same many-one reductions that we used among sets in computability: we can use them for decision problems because a decision problem may be viewed as a set, namely the set of all "yes" instances of the problem.

Definition 6.1
* A problem A Turing reduces to a problem B, denoted $A \le_T B$, if there exists an algorithm for solving A that uses an oracle (i.e., a putative solution algorithm) for B.
* A decision problem A many-one reduces to a decision problem B, denoted $A \le_m B$, if there exists a mapping, $f: \Sigma^* \to \Sigma^*$, such that

"yes" instances of A are mapped onto "yes" instances of B and "no" instances of A, as well as meaningless strings, are mapped onto "no" instances of B.

Turing reductions embody the "subroutine" scheme in its full generality, while many-one reductions are much stricter and apply only to decision problems. Viewing the mapping f in terms of a program, we note that f may not call on B in performing the translation. All it can do is transform the instance of A into an instance of B, make one call to the oracle for B, and adopt the oracle's answer for its own; it cannot even complement that answer. The principle of a many-one reduction is illustrated in Figure 6.2. The term "many-one" comes from the fact that the function f is not necessarily injective and thus may map many instances of A onto one instance of B. If the map is injective, then we speak of a one-one reduction; if the map is also surjective, then the two problems are isomorphic under the chosen reduction.

Figure 6.2 A many-one reduction from problem A to problem B.

Exercise 6.3 Verify that $A \le_m B$ implies $A \le_T B$; also verify that both Turing and many-one reductions (in the absence of resource bounds) are transitive.

We shall further qualify the reduction by the allowable amount of resources used in the reduction; thus we speak of a "polynomial-time transformation" or of a "logarithmic-space Turing reduction." Our reduction from the Hamiltonian circuit problem to the traveling salesman problem in its decision version is a polynomial-time transformation, whereas our reduction from the partition problem to the smallest subsets problem is a polynomial-time Turing reduction.
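The Turing reduction from Partition to Smallest Subsets described earlier in this section makes two calls to the subroutine and combines their answers, something a many-one reduction is not allowed to do. A minimal Python sketch follows. The names `count_small_subsets` and `partition` are hypothetical, the exhaustive counter is only a stand-in (an assumption of this sketch) for the Smallest Subsets procedure, and sizes are assumed to be positive integers; unlike the problem statement, the stand-in also accepts a bound of zero.

```python
from itertools import combinations
from typing import Callable, Sequence


def count_small_subsets(sizes: Sequence[int], bound: int) -> int:
    """Stand-in Smallest Subsets procedure: count subsets with total <= bound.

    Subsets are taken over positions, so equal sizes on different objects
    count as different objects; the empty subset is included.
    """
    n = len(sizes)
    return sum(
        1
        for r in range(n + 1)
        for subset in combinations(range(n), r)
        if sum(sizes[i] for i in subset) <= bound
    )


def partition(
    sizes: Sequence[int],
    oracle: Callable[[Sequence[int], int], int] = count_small_subsets,
) -> bool:
    """Decide Partition with two calls to a Smallest Subsets oracle."""
    total = sum(sizes)
    if total % 2 == 1:
        return False  # an odd total can never be split evenly
    half = total // 2
    at_most_half = oracle(sizes, half)       # first call: B = T
    below_half = oracle(sizes, half - 1)     # second call: B = T - 1
    # The difference counts the subsets whose sizes sum to exactly T.
    return at_most_half - below_half > 0
```

Because the final answer is computed from the difference of two oracle answers, this procedure is a Turing reduction rather than a transformation.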

6.1.2 Reductions and Complexity Classes

Complexity classes are characterized simply by one or more resource bounds: informally, a complexity class for some model of computation is the set of all decision problems solvable on this model under some given resource bounds. (We shall formally define a number of complexity classes in Section 6.2.) Each complexity class thus includes all classes of strictly smaller complexity; for instance, the class of intractable problems contains the class of tractable problems. A complexity class, therefore, differs from an equivalence class in that it includes problems of widely varying difficulty, whereas all problems in an equivalence class are of similar difficulty. This distinction is very similar to that made in Section 2.3 between the $O()$ and $\Theta()$ notations. For instance, searching an ordered array can be done in logarithmic time on a RAM, but we can also assert that it is solvable in polynomial time or even in exponential time; thus the class of problems solvable in exponential time on a RAM includes searching an ordered array along with much harder problems, such as deciding whether an arbitrarily quantified Boolean formula is a tautology.

In order to characterize the hardest problems in a class, we return to the notion of complete problems, first introduced in Definition 5.9.

Definition 6.2 Given a class of problems, $\mathcal{C}$, and a type of reduction, t, a problem A is complete for $\mathcal{C}$ (or simply $\mathcal{C}$-complete) under t if: (i) A belongs to $\mathcal{C}$, and (ii) every problem in $\mathcal{C}$ reduces to A under t.

Writing the second condition formally, we obtain $\forall B \in \mathcal{C},\ B \le_t A$, which shows graphically that, in some sense, A is the hardest problem in $\mathcal{C}$. Any complete problem must reduce to any other. Thus the set of all complete problems for $\mathcal{C}$ forms an equivalence class under t, which is intuitively satisfying, since each complete problem is supposed to be the hardest in $\mathcal{C}$ and thus no particular complete problem could be harder than any other. Requiring a problem to be complete for a class is typically a very stringent condition, so we do not expect every class to have complete problems; yet, as we shall see in this chapter and the next chapter, complete problems for certain classes are surprisingly common.

If complete problems exist for a complexity class under a suitable type of reduction (one that uses fewer resources than are available in the class), then they characterize the boundaries of a complexity class in the following ways: (i) if any one of the complete problems can be solved efficiently, then all of the problems in the class can be solved efficiently; and (ii) if a new problem can be shown to be strictly more difficult than some complete

problem, then the new problem cannot be a member of the complexity class. We formalize these ideas by introducing additional terminology.

Definition 6.3 Given a class of problems, $\mathcal{C}$, a reduction, t, and a problem, A, complete for $\mathcal{C}$ under t, a problem B is hard for $\mathcal{C}$ (or $\mathcal{C}$-hard) under t if we have $A \le_t B$; and a problem B is easy for $\mathcal{C}$ (or $\mathcal{C}$-easy) under t if we have $B \le_t A$. A problem that is both $\mathcal{C}$-hard and $\mathcal{C}$-easy is termed $\mathcal{C}$-equivalent.

Exercise 6.4 Any problem complete for $\mathcal{C}$ under some reduction t is thus automatically $\mathcal{C}$-equivalent; however, the converse need not be true: explain why.

In consequence, completeness can be used to characterize a problem's complexity. Completeness in some classes is strong evidence of a problem's difficulty; in other classes, it is a proof of the problem's intractability. For instance, consider the class of all problems solvable in polynomial space. (The chosen model is irrelevant, since we have seen that translations among models cause only constant-factor increases in space, which cannot affect the polynomial bound.) This class includes many problems (traveling salesman, satisfiability, partition, etc.) for which the current best solutions require exponential time, i.e., currently intractable problems. It has not been shown that any of the problems in the class truly requires exponential time; however, completeness in this class (with respect to polynomial-time reductions) may safely be taken as strong evidence of intractability. Now consider the class of all problems solvable in exponential time. (Again the model is irrelevant: translations among models cause at most polynomial increases in time, which cannot affect the exponential bound.) This class is known to contain provably intractable problems, that is, problems that cannot be solved in polynomial time. Thus completeness in this class (with respect to polynomial-time reductions again) constitutes a proof of intractability: if any complete problem were solvable efficiently, then all problems in the class would also be solvable efficiently, contradicting the existence of provably intractable problems.

All of these considerations demonstrate the importance of reductions in the analysis of problem complexity. In the following sections, we set up an appropriate formalism for the definition of complexity classes, consider the very important complexity class known as NP, and explore various ramifications and consequences of the theory. To know in advance that the problem at hand is probably or provably intractable will not obviate the need for solving the problem, but it will indicate which approaches are likely to fail (seeking optimal solutions) and which are likely to succeed (seeking

approximate solutions). Even when we have lowered our sights (from finding optimal solutions to finding approximate solutions), the theory may make some important contributions. As we shall see in Chapter 8, it is sometimes possible to show that "good" approximations are just as hard to find as optimal solutions.

6.2 Classes of Complexity

Now that we have established models of computation as well as time and space complexity measures on these models, we can turn our attention to complexity classes. We defined such classes informally in the previous section: given some fixed model of computation and some positive-valued function f(n), we associate with it a family of problems, namely all of the problems that can be solved on the given computational model in time (or space) bounded by f(n). We need to formalize this definition in a useful way. To this end, we need to identify and remedy the shortcomings of the definition, as well as to develop tools that will allow us to distinguish among various classes. We already have one tool that will enable us to set up a partial order of complexity classes, namely the time and space relationships described by Equations 4.1 and 4.2. We also need a tool that can separate complexity classes, a tool with which we can prove that some problem is strictly harder than some other. This tool takes the form of hierarchy theorems and translational lemmata.² Complexity theory is built from these two tools; as a result, a typical situation in complexity theory is a pair of relations of the type $\mathcal{A} \subseteq \mathcal{B} \subseteq \mathcal{C}$ and $\mathcal{A} \subset \mathcal{C}$. The first relations can be compared to springs and the second to a rigid rod, as illustrated in Figure 6.3. Classes $\mathcal{A}$ and $\mathcal{C}$ are securely separated by the rod, while class $\mathcal{B}$ is suspended between the two on springs and thus can sit anywhere between the two, not excluding equality with one or the other (by flattening the corresponding spring).

² More general definitions exist. Abstract complexity theory is based on the partial recursive functions of Chapter 5 and on measures of resource use, called complexity functions, that are defined on all convergent computations. Even with such a minimal structure, it becomes possible to prove versions of the hierarchy theorems and translational lemmata given in this section.

In Section 6.2.1, we use as our model of computation a deterministic Turing machine with a single read/write tape; we do add a read-only input tape and a write-only output tape to obtain the standard off-line model (see Section 4.3) as needed when considering sublinear space bounds. The results we obtain are thus particular to one model of computation, although

it will be easily seen that identical results can be proved for any reasonable model. In Section 6.2.2, we discuss how we can define classes of complexity that remain unaffected by the choice of model or by any (finite) number of translations between models.

Figure 6.3 A typical situation in complexity theory (relations shown: $A \subseteq B$, $B \subseteq C$, and $A \subset C$).

6.2.1 Hierarchy Theorems

We can list at least four objections to our informal definition of complexity classes:
1. Since no requirement whatsoever is placed on the resource bound, f(n), it allows the definition of absurd families (such as the set of all problems solvable in $O(e^{-n} n)$ time or $O(n |\sin n|)$ space).
2. Since there are uncountably many possible resource bounds, we get a correspondingly uncountable number of classes, far more than we can possibly be interested in, as we have only a countable number of solvable problems.
3. Since two resource bounds may differ only infinitesimally (say by $10^{-10}$ and only for one value of n), in which case they probably define the same family of problems, our definition is likely to be ill-formed.
4. Model-independence, which adds an uncertainty factor (linear for space, polynomial for time), is certain to erase any distinction among many possible families.

A final source of complication is that the size of the output itself may dictate the complexity of a problem, a common occurrence in enumeration problems. In part because of this last problem, but mainly because of convenience, most of complexity theory is built around (though not limited to) decision problems. Decision problems, as we have seen, may be viewed as sets of

"yes" instances; asking for an answer to an instance is then equivalent to asking whether the instance belongs to the set. In the next few sections, we shall limit our discussion to decision problems and thus now offer some justification for this choice.

First, decision problems are often important problems in their own right. Examples include the satisfiability of Boolean formulae, the truth of logic propositions, the membership of strings in a language, the planarity of graphs, or the existence of safe schedules for resource allocation. Secondly, any optimization problem can be turned into a decision problem through the simple expedient of setting a bound upon the value of the objective function. In such a case, the optimization problem is surely no easier than the decision problem. In fact, a solution to the optimization problem provides an immediate solution to the decision problem: it is enough to compare the value of the objective function for the optimal solution with the prescribed bound. In other words, the decision version reduces to the original optimization problem.³ Hence, any intractability results that we derive about decision versions of optimization problems immediately carry over to the original optimization versions, yet we may hope that decision versions are more easily analyzed and reduced to each other. Finally, we shall see later that optimization problems typically reduce to their decision versions, so that optimization and decision versions are of equivalent complexity.

Let us now return to our task of defining complexity classes. The hierarchy theorems that establish the existence of distinct classes of complexity are classic applications of diagonal construction. However, the fact that we are dealing with resource bounds introduces a number of minor complications, many of which must be handled through small technical results. We begin by restricting the possible resource bounds to those that can be computed with reasonable effort.

Definition 6.4 A function, f(n), is time-constructible if there exists a Turing machine⁴ such that (i) when started with any string of length n on its tape, it runs for at most f(n) steps before stopping; and (ii) for each value of n, there exists at least one string of length n which causes the machine to run for exactly f(n) steps. If, in addition, the machine runs

³ In fact, it is conceivable that the decision version would not reduce to the original optimization version. For instance, if the objective function were exceedingly difficult to compute and the optimal solution might be recognized by purely structural features (independent from the objective function), then the optimization version could avoid computing the objective function altogether while the decision version would require such computation. However, we know of no natural problem of that type.
⁴ The choice of a Turing machine, rather than a RAM or other model, is for convenience only; it does not otherwise affect the definition.

for exactly f(n) steps on every string of length n, then f(n) is said to be fully time-constructible. Space-constructible and fully space-constructible functions are similarly defined.

Any constant, polynomial, or exponential function is both time- and space-constructible (see Exercise 6.15); the functions $\lceil \log n \rceil$ and $\lceil \sqrt{n} \rceil$ are fully space-constructible, but clearly not time-constructible.

Exercise 6.5 Prove that any space-constructible function that is nowhere smaller than n is also fully space-constructible.

Obviously, there exist at most countably many fully time- or space-constructible functions, so that limiting ourselves to such resource bounds answers our second objection. Our first objection also disappears, for the most part, since nontrivial time-constructible functions must be $\Omega(n)$ and since corresponding space-constructible functions are characterized as follows.

Theorem 6.1 If f(n) is space-constructible and nonconstant, then it is $\Omega(\log \log n)$.

Our third objection, concerning infinitesimally close resource bounds, must be addressed by considering the most fundamental question: given resource bounds f(n) and g(n), with $g(n) \geq f(n)$ for all n (so that any problem solvable within bound f(n) is also solvable within bound g(n)), under which conditions will there exist a problem solvable within bound g(n) but not within bound f(n)? We can begin to answer by noting that, whenever f is $\Theta(g)$, the two functions must denote the same class, a result known as linear speed-up.

Lemma 6.1 Let f and g be two functions from $\mathbb{N}$ to $\mathbb{N}$ such that f(n) is $\Theta(g(n))$. Then any problem solvable in g(n) time (respectively space) is also solvable in f(n) time (respectively space).

In both cases the proof consists of a simple simulation based upon a change in alphabet. By encoding suitably large (but finite) groups of symbols into a single character drawn from a larger alphabet, we can reduce the storage as well as the running time by any given constant factor. Hence given two resource bounds, one dominated by the other, the two corresponding classes can be distinct only if one bound grows asymptotically faster than the other.

However, the reader would be justified in questioning this result. The change in alphabet works for Turing machines and for some other models of computation but is hardly fair (surely the cost of a single step on a Turing machine should increase with the size of the alphabet, since

each step requires a matching on the character stored in a tape square) and does not carry over to all models of computation. Fortunately for us, model-independence will make the whole point moot, by forcing us to ignore not just constant factors but any polynomial factors. Therefore, we use the speed-up theorem to help us in proving the hierarchy theorems (its use simplifies the proofs), but we do not claim it to be a characteristic of computing models.

In summary, we have restricted resource bounds to be time- or space-constructible and will further require minimal gaps between the resource bounds in order to preserve model-independence; together, these considerations answer all four of our objections.

Before we can prove the hierarchy theorems, we need to clear up one last, small technical problem. We shall look at all Turing machines and focus on those that run within some time or space bound. However, among Turing machines that run within a certain space bound, there exist machines that may never terminate on some inputs (a simple infinite loop uses constant space and runs forever).

Lemma 6.2 If M is a Turing machine that runs within space bound f(n) (where f(n) is everywhere as large as $\lceil \log n \rceil$), then there exists a Turing machine M' that obeys the same space (but not necessarily time) bound, accepts the same strings, and halts under all inputs.

In order to see that this lemma holds, it is enough to recall Equation 4.2: a machine running in f(n) space cannot run for more than $c^{f(n)}$ steps (for some constant c > 1) without entering an infinite loop. Thus we can run a simulation of M that halts when M halts or when $c^{f(n)}$ steps have been taken, whichever comes first. The counter takes O(f(n)) space in addition to the space required by the simulated machine, so that the total space needed is $\Theta(f(n))$; by Lemma 6.1, this bound can be taken to be exactly f(n).

Theorem 6.2 [Hierarchy Theorem for Deterministic Space] Let f(n) and g(n) be fully space-constructible functions as large as $\lceil \log n \rceil$ everywhere. If we have

$$\liminf_{n \to \infty} \frac{f(n)}{g(n)} = 0$$

then there exists a function computable in space bounded by g(n) but not in space bounded by f(n).⁵

⁵ The notation inf stands for the infimum, which is the largest lower bound on the ratio; formally, we have $\inf h(n) = \min\{h(i) \mid i = n, n+1, n+2, \ldots\}$. In the theorem, it is used only as protection against ratios that may not converge at infinity.
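The counting argument behind Lemma 6.2 can be illustrated with a small sketch. It is only a toy under stated assumptions: the function names, the representation of a configuration as an opaque value, and the `n + 2` factor for input-head positions (input plus two end markers) are all hypothetical choices, not taken from the text. The idea shown is that a deterministic machine confined to a finite set of configurations either halts within as many steps as there are configurations or repeats one and loops forever.

```python
from typing import Callable, Hashable, Optional

def configuration_bound(states: int, symbols: int, n: int, space: int) -> int:
    """Upper bound on the distinct configurations of an off-line machine:
    control state x input-head position x work-head position x work-tape contents.
    The n + 2 input-head positions (two end markers) are an assumption."""
    return states * (n + 2) * space * symbols ** space

def halts_within_bound(step: Callable[[Hashable], Optional[Hashable]],
                       start: Hashable, bound: int) -> bool:
    """Run a deterministic step function for at most `bound` steps.

    `step` returns the next configuration, or None when the machine halts.
    If `bound` is at least the number of reachable configurations, a run
    that has not halted after `bound` steps must have repeated a
    configuration, so it never halts."""
    config = start
    for _ in range(bound):
        config = step(config)
        if config is None:
            return True
    return False

# Toy machines over the six configurations 0..5 (hypothetical examples).
halting_step = lambda c: None if c == 5 else c + 1   # halts from configuration 5
looping_step = lambda c: (c + 1) % 6                 # cycles forever
```

When the space bound is at least logarithmic in n, the factor for the input-head position is absorbed into the exponential term, which is one way to read the requirement that f(n) be everywhere as large as $\lceil \log n \rceil$.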

The proof, like our proof of the unsolvability of the halting problem, uses diagonalization; however, we now use diagonalization purely constructively, to build a new function with certain properties. Basically, what the construction does is to (attempt to) simulate in turn each Turing machine on the diagonal, using at most g(n) space in the process. Of these Turing machines, only a few are of real interest: those that run in f(n) space. Ideally, we would like to enumerate only those Turing machines, simulate them on the diagonal, obtain a result, and alter it, thereby defining, in proper diagonalizing manner, a new function (not a Turing machine, this time, but a list of input/output pairs, i.e., a mathematical function) that could not be in the list and thus would take more than f(n) space to compute, and yet that could be computed by our own diagonalizing simulation, which is guaranteed to run in g(n) space. Since we cannot recursively distinguish Turing machines that run in f(n) space from those that require more space, we (attempt to) simulate every machine and otherwise proceed as outlined earlier. Clearly, altering diagonal elements corresponding to machines that take more than f(n) space to run does not affect our construction. Figure 6.4(a) illustrates the process.

However, we cannot guarantee that our simulation will succeed on every diagonal element for every program that runs in f(n) space; failure may occur for one of three reasons:
* Our simulation incurs some space overhead (typically a multiplicative constant), which may cause it to run out of space: for small values of n, $c \cdot f(n)$ could exceed g(n).
* Our guarantee about f(n) and g(n) is only asymptotic: for a finite range (up to some unknown constant N), f(n) may exceed g(n), so that our simulation will run out of space and fail.
* The Turing machine we simulate does not converge on the diagonal: even if we can simulate this machine in g(n) space, our simulation cannot return a value.

The first two cases are clearly two instances of the same problem; both are cured asymptotically. If we fail to simulate machine $M_i$, we know that there exists some other machine, say $M_j$, that has exactly the same behavior but can be chosen with an arbitrarily larger index (we can pad $M_i$ with as many unreachable control states as necessary) and thus can be chosen to exceed the unknown constant N, so that our simulation of $M_j(j)$ will not run out of space. If $M_j(j)$ stops, then our simulation returns a value that can be altered, thereby ensuring that we define a mathematical function different from that implemented by $M_j$ (and thus also different from that implemented by $M_i$). Figure 6.4(b) illustrates these points. Three f(n) machines (joined on the left) compute the same function; the simulation of the first on its diagonal argument fails to converge, the simulation of the second on its

(different) diagonal argument runs out of space, but the simulation of the third succeeds and enables us to ensure that our new mathematical function is distinct from that implemented by these three machines.

Figure 6.4 A graphical view of the proof of the hierarchy theorems: (a) simulating all machines (f(n) machines and other machines; simulation succeeds or fails); (b) failing to complete some simulations (simulation succeeds, runs out of space, or fails to converge).

The third case has entirely distinct causes and remedies. It is resolved by appealing to Lemma 6.2: if $M_i(i)$ is undefined, then there exists some other machine $M_k$ which always stops and which agrees with $M_i$ wherever the latter stops. Again, if the index k is not large enough, we can increase it to arbitrarily large values by padding.

Proof We construct an off-line Turing machine $\hat{M}$ that runs in space bounded by g and always differs in at least one input from any machine that runs in space f. Given input x, $\hat{M}$ first marks $g(|x|)$ squares on its

work tape, which it can do since g is fully space-constructible. The marks will enable the machine to use exactly g(n) space on an input of size n. Now, given input x, $\hat{M}$ attempts to simulate $M_x$ run on input x; if machine $M_x$ runs within space $f(|x|)$, then machine $\hat{M}$ will run in space at most $a \cdot f(|x|)$ for some constant a. (The details of the simulation are left to the reader.) If $\hat{M}$ encounters an unmarked square during the simulation, it immediately quits and prints 0; if $\hat{M}$ successfully completes the simulation, it adds 1 to the value produced by $M_x$. Of course, $\hat{M}$ may fail to stop, because $M_x$ fails to stop when run on input x.

Now let $M_i$ be a machine that runs in space bounded by f(n). Then there exists a functionally equivalent Turing machine with an encoding, say j, large enough that we have $a \cdot f(|j|) < g(|j|)$. This encoding always exists because, by hypothesis, g(n) grows asymptotically faster than f(n). Moreover, we can assume that $M_i$ halts under all inputs (if not, then we can apply Lemma 6.2 to obtain such a machine and again increase the index as needed). Then $\hat{M}$ has enough space to simulate $M_j$, so that, on input j, $\hat{M}$ produces an output different from that produced by $M_j$, and thus also by $M_i$. Hence the function computed by $\hat{M}$ within space g(n) is not computable by any machine within space f(n), from which our conclusion follows. Q.E.D.

The diagonalization is very similar to that used in the classical argument of Cantor. Indices in one dimension represent machines while indices in the other represent inputs; we look along the diagonal, where we determine whether the xth machine halts when run on the xth input, and produce a new machine $\hat{M}$ that differs from each of the enumerated machines along the diagonal. The only subtle point is that we cannot do that everywhere along the diagonal, because the simulation may run out of space or fail to converge. This problem is of no consequence, however, since, for each Turing machine, there are an infinity of Turing machines with larger codes that have the same behavior. Thus, if $\hat{M}$ fails to output something different from what $M_i$ would output under the same input (because neither machine stops or because $\hat{M}$ cannot complete its simulation and thus outputs 0, which just happens to be what $M_i$ produces under the same input), then there exists an $M_j$ equivalent to $M_i$ such that $\hat{M}$ successfully simulates $M_j(j)$. Thus, on input j, $\hat{M}$ outputs something different from what $M_j$, and thus also $M_i$, outputs.

The situation is somewhat more complex for time bounds, simply because ensuring that our Turing machine simulation does not exceed given time bounds requires significant additional work. However, the difference between the space and the time results may reflect only our inability to

prove a stronger result for time rather than some fundamental difference in resolution between space and time hierarchies.

Theorem 6.3 [Hierarchy Theorem for Deterministic Time] Let f(n) and g(n) be fully time-constructible functions. If we have

$$\liminf_{n \to \infty} \frac{f(n)}{g(n)} = 0$$

then there exists a function computable in time bounded by $g(n) \lceil \log g(n) \rceil$ but not in time bounded by f(n).

Proof The proof follows closely that of the hierarchy theorem for space, although our machines are now one-tape Turing machines. While our construction proceeds as if we wanted to prove that the constructed function is computable in time bounded by g(n), the overhead associated with the bookkeeping forces the larger bound. We construct a Turing machine that carries out two separate tasks (on separate pieces of the tape) in an interleaved manner: (i) a simulation identical in spirit to that used in the previous proof and (ii) a counting task that checks that the first task has not used more than g(n) steps. The second task uses the Turing machine implicit in the full time-constructibility of g(n) and just runs it (simulates it) until it stops. Each task run separately takes only g(n) steps (the first only if stopped when needed), but the interleaving of the two adds a factor of log g(n): we intercalate the portion of the tape devoted to counting immediately to the left of the current head position in the simulation, thereby allowing us to carry out the counting (decrementing the counter by one for each simulated step) without alteration and the simulation with only the penalty of shifting the tape portion devoted to counting by one position for each change of head position in the simulation. An initialization step sets up a counter with value g(n). Since the counting task uses at most log g(n) tape squares, the penalty for the interleaving is exactly log g(n) steps for each step in the simulation, up to the cut-off of g(n) simulation steps. Q.E.D.

Our formulation of the hierarchy theorem for deterministic time is slightly different from the original version, which was phrased for multitape Turing machines. That formulation placed the logarithmic factor in the ratio rather than the class definition, stating that, if the ratio $f \log f / g$ goes to zero in the limit, then there exists a function computable in g(n) time but not in f(n) time. Either formulation suffices for our purposes of establishing machine-independent classes of complexity, but ours is slightly easier to prove in the context of single-tape Turing machines.
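The diagonal construction with a step cut-off can be sketched in a few lines. This is a toy illustration only, under loudly stated assumptions: "machines" are modeled as Python generator functions that yield once per simulated step and return an integer output, inputs are integers with the bound evaluated at the input itself, and the names `run_clocked`, `diagonal`, `slow_parity`, and `looper` are hypothetical. Nothing here reproduces the tape-level bookkeeping of the proof; it only shows a countdown counter initialized to g(n) and the "add 1 on the diagonal, print 0 on failure" rule.

```python
from typing import Callable, Generator, List, Optional

# Toy model (an assumption): a machine yields once per step, then returns an int.
Machine = Callable[[int], Generator[None, None, int]]

def run_clocked(machine: Machine, x: int, budget: int) -> Optional[int]:
    """Simulate `machine` on x for at most `budget` steps; None on cut-off."""
    counter = budget                  # counter set up with value g(n)
    run = machine(x)
    while True:
        try:
            next(run)                 # one simulated step
        except StopIteration as stop:
            return stop.value         # machine halted within the budget
        counter -= 1                  # decrement once per simulated step
        if counter < 0:
            return None               # budget exceeded: give up

def diagonal(machines: List[Machine], g: Callable[[int], int]) -> Callable[[int], int]:
    """Build the diagonal function: on input x, simulate machine x on x."""
    def m_hat(x: int) -> int:
        if x >= len(machines):
            return 0
        out = run_clocked(machines[x], x, g(x))
        return 0 if out is None else out + 1   # alter the diagonal value
    return m_hat

def slow_parity(x: int) -> Generator[None, None, int]:
    for _ in range(3 * x + 1):        # takes 3x + 1 steps
        yield
    return x % 2

def looper(x: int) -> Generator[None, None, int]:
    while True:                       # never converges
        yield

# Repeated entries play the role of padded copies with larger indices.
machines = [slow_parity, looper, slow_parity, slow_parity, slow_parity, slow_parity]
m_hat = diagonal(machines, lambda n: n * n)
```

In this toy run the simulation of `slow_parity` fails at index 2, where 7 steps exceed the budget of 4, so the diagonal value there happens to agree with the machine's output. At index 4, where 13 steps fit within the budget of 16, the simulation succeeds and the diagonal value differs, which mirrors the role of equivalent machines with larger indices in the argument.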

6. For instance.2 Classes of Complexity These two hierarchy theorems tell us that a rich complexity hierarchy exists for each model of computation and each charging policy. we often adopt the same convention for space classes for the sake of simplicity and uniformity. but equally valid (with the obvious changes) for time. hence the lowest class of time complexity of any interest includes all sets recognizable in linear time. according to Theorem 6.3. We can now return to our fourth objection: since we do not want to be tied to a specific model of computation. this statement applies only to time bounds: space bounds need only be invariant under multiplication by arbitrary constants. with g2(n) everywhere larger than log n and f (n) everywhere larger than n. (Strictly speaking. the time bounds nk and nk+1 define distinct classes on a fixed model of computation. the bounds used for defining classes of time complexity should be invariant under polynomial mappings. Lemma 6. However. Model-independence
. Further structure is implied by the following translationallemma. However. If every function computable in gI(n) space is computable in g2 (n) space. how can we make our time and space classes model-independent? The answer is very simple but has drastic consequences: since a change in computational model can cause a polynomial change in time complexity.) In consequence.3 Let f (n). (n). classes that are distinct on a fixed model of computation will be merged in our model-independent theory. verifying only that the definition of each is well-founded but not discussing the relationships among these classes. given here for space. as each model of computation induces an infinite time hierarchy and a yet finer space hierarchy. the polynomial increase in time observed in our translations between Turing machines and RAMs means that the bounds nk and nk+1 are indistinguishable in a model-independent theory. We now briefly present some of the main model-independent classes of time and space complexity. and g2 (n) be fully space-constructible functions.
187
6. Deterministic Complexity Classes Almost every nontrivial problem requires at least linear time. the result is a very rich hierarchy indeed. g. since the input must be read. then every function computable in g. (f (n)) space is computable in g 2 (f (n)) space.2
Model-Independent Complexity Classes
We have achieved our goal of defining valid complexity classes on a fixed model of computation.2.

now dictates that such sets be grouped with all other sets recognizable in polynomial time. We need all of these sets because the polynomial cost of translation from one model to another can transform linear time into polynomial time or, in general, a polynomial of degree k into one of degree k'. We need no more sets because, as defined, our class is invariant under polynomial translation costs (since the polynomial of a polynomial is just another polynomial). Thus the lowest model-independent class of time complexity is the class of all sets recognizable in polynomial time, which we denote P; it is also the class of tractable (decision) problems. The next higher complexity class will include problems of superpolynomial complexity and thus include intractable problems. While the hierarchy theorems allow us to define a large number of such classes, we mention only two of them: the classes of all sets recognizable in two varieties of exponential time, which we denote E and Exp.

Definition 6.5
1. P is the class of all sets recognizable in polynomial time. Formally, a set S is in P if and only if there exists a Turing machine M such that M, run on x, stops after $O(|x|^{O(1)})$ steps and returns "yes" if x belongs to S, "no" otherwise.
2. E is the class of all sets recognizable in simple exponential time. Formally, a set S is in E if and only if there exists a Turing machine M such that M, run on x, stops after $O(2^{O(|x|)})$ steps and returns "yes" if x belongs to S, "no" otherwise.
3. Exp is the class of all sets recognizable in exponential time. Formally, a set S is in Exp if and only if there exists a Turing machine M such that M, run on x, stops after $O(2^{|x|^{O(1)}})$ steps and returns "yes" if x belongs to S, "no" otherwise.

The difference between our two exponential classes is substantial, yet, from our point of view, both classes contain problems that take far too much time to solve and so both characterize sets that include problems beyond practical applications. Although E is a fairly natural class to define, it presents some difficulties: in particular, it is not closed under polynomial transformations, which will be our main tool in classifying problems. Thus we use mostly Exp rather than E.

Exercise 6.6 Verify that P and Exp are closed under polynomial-time transformations and that E is not.

Our hierarchy theorem for time implies that P is a proper subset of E and that E is a proper subset of Exp. Most of us have seen a large number of

problems (search and optimization problems, for the most part) solvable in polynomial time; the decision versions of these problems thus belong to P.

Since our translations among models incur only a linear penalty in storage, we could define a very complex hierarchy of model-independent space complexity classes. However, as we shall see, we lack the proper tools for classifying a set within a rich hierarchy; moreover, our main interest is in time complexity, where the hierarchy is relatively coarse due to the constraint of model-independence. Thus we content ourselves with space classes similar to the time classes; by analogy with P and Exp, we define the space complexity classes PSPACE and EXPSPACE. Again our hierarchy theorem for space implies that PSPACE is a proper subset of EXPSPACE. As we have required our models of computation not to consume space faster than time (Equation 4.1), we immediately have P $\subseteq$ PSPACE and Exp $\subseteq$ EXPSPACE. Moreover, since we have seen that an algorithm running in f(n) space can take at most $c^{f(n)}$ steps (Equation 4.2), we also have PSPACE $\subseteq$ Exp.

Most of the game problems discussed earlier in this text are solvable in polynomial space (and thus exponential time) through backtracking, since each game lasts only a polynomial number of moves. To find in EXPSPACE a problem that does not appear to be in Exp, we have to define a very complex problem indeed. A suitable candidate is a version of the game of Peek. Peek is a two-person game played by sliding perforated plates into one of two positions within a rack until one player succeeds in aligning a column of perforations so that one can "peek" through the entire stack of plates. Figure 6.5 illustrates this idea. In its standard version, each player can move only certain plates but can see the position of the plates manipulated by the other player. This game is clearly in Exp because a standard game-graph search will solve it. In our modified version, the players cannot see each other's moves, so that a player's strategy no longer depends on the opponent's moves but consists simply in a predetermined sequence of moves. As a result, instead of checking at each node of the game graph that, for each move of the second player, the first player has a winning choice, we must now check that the same fixed choice for the ith move of the first player is a winning move regardless of the moves made so far by the second player. In terms of algorithms, it appears that we have to check, for every possible sequence of moves for the first player, whether that sequence is a winning strategy, by checking every possible corresponding path in the game graph. We can certainly do that in exponential space by generating and storing the game graph and then repeatedly traversing it, for each possible strategy of the first player, along all possible paths dictated by that strategy and by the possible moves of the second player. However, this approach takes doubly

exponential time, and it is not clear that (simply) exponential time can suffice.

Figure 6.5 A sample configuration of the game of Peek: (a) a box with eight movable plates, six slid in and two slid out; (b) a typical plate.

We also know of many problems that can be solved in sublinear space; hence we should also pay attention to possible space complexity classes below those already defined. While we often speak of algorithms running in constant additional storage (such as search in an array), all of these actually use at least logarithmic storage in our theoretical model. For instance, binary search maintains three indices into the input array; in order to address any of the n array positions, each index must have at least $\lceil \log_2 n \rceil$ bits. Thus we can begin our definition of sublinear space classes with the class of all sets recognizable in logarithmic space, which we denote by L. Using the same reasoning as for other classes, we see that L is a proper subset

of PSPACE and also that it is a (not necessarily proper) subset of P. From O(log n), we can increase the space resources to $O(\log^2 n)$. Theorem 6.2 tells us that the resulting class, call it $L^2$, is a proper superset of L but remains a proper subset of PSPACE. Since translation among models increases space only by a constant factor, both L and this new class are model-independent. On the other hand, our relation between time and space allows an algorithm using $O(\log^2 n)$ space to run for $O(c^{\log^2 n}) = O(n^{a \log n})$ steps (for some constant a > 0), which is not polynomial in n. Thus we cannot assert that $L^2$ is a subset of P; indeed, the nature of the relationship between $L^2$ and P remains unknown. A similar derivation shows that each higher exponent, $k \geq 2$, defines a new, distinct class, $L^k$. Since each class is characterized by a polynomial function of log n, it is natural to define a new class, PolyL, as the class of all sets recognizable in space bounded by some polynomial function of log n; since the exponent of the logarithm can be increased indefinitely, the hierarchy theorem for space implies PolyL $\subset$ PSPACE.

Definition 6.6
1. L is the class of all sets recognizable in logarithmic space. Formally, a set S is in L if and only if there exists an off-line Turing machine M such that M, run on x, stops having used $O(\log |x|)$ squares on its work tape and returns "yes" if x belongs to S, "no" otherwise. $L^k$ is defined similarly, by replacing log n with $\log^k n$.
2. PolyL is the class of all sets recognizable in space bounded by some polynomial function of log n. Formally, a set S is in PolyL if and only if there exists an off-line Turing machine M such that M, run on x, stops having used $O(\log^{O(1)} |x|)$ squares on its work tape and returns "yes" if x belongs to S, "no" otherwise.

The reader will have no trouble identifying a number of problems solvable in logarithmic or polylogarithmic space. In order to identify tractable problems that do not appear to be thus solvable, we must identify problems for which all of our solutions require a linear or polynomial amount of extra storage. Examples of such include strong connectivity and biconnectivity as well as the matching problem: all are solvable in polynomial time, but all appear to require linear extra space.

Exercise 6.7 Verify that PolyL is closed under logarithmic-space transformations. To verify that the same is true of each $L^k$, refer to Exercise 6.11.

We now have a hierarchy of well-defined, model-independent time and space complexity classes, which form the partial order described in Figure 6.6. Since each class in the hierarchy contains all lower classes, classifying a problem means finding the lowest class that contains the

Complexity Theory: Foundations

[Figure 6.6: A hierarchy of space and time complexity classes. Recoverable labels: EXPSPACE, EXP, P, L.]

problem, which, unless the problem belongs to L, also involves proving that the problem does not belong to a lower class. Unfortunately, the latter task appears very difficult indeed, even when restricted to the question of tractability. For most of the problems that we have seen so far, no polynomial-time algorithms are known, nor do we have proofs that they require superpolynomial time. In this respect, the decision versions seem no easier to deal with than the optimization versions.

However, the decision versions of a large fraction of our difficult problems share one interesting property: if an instance of the problem has answer "yes," then, given a solution structure (an example of a certificate), the correctness of the answer is easily verified in (low) polynomial time. For instance, verifying that a formula in conjunctive normal form is indeed satisfiable is easily done in linear time given the satisfying truth assignment. Similarly, verifying that an instance of the traveling salesman problem admits a tour no longer than a given bound is easily done in linear time given the order in which the cities are to be visited. Answering a decision problem in the affirmative is most likely done constructively, i.e., by identifying a solution structure; for the problems in question, then, we can easily verify that the answer is correct.
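As a rough illustration of how cheap certificate checking can be, the following Python sketch verifies a truth assignment against a formula in conjunctive normal form in a single pass over the clauses. The integer encoding of literals (`+v` for a variable, `-v` for its complement) and the function name are assumptions made for this sketch; they do not come from the text.

```python
from typing import Dict, List

# Assumed encoding: a literal is a nonzero integer, +v for variable v
# and -v for its complement; a clause is a list of literals.
Clause = List[int]


def verify_cnf_certificate(clauses: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Return True if the assignment makes at least one literal true in every clause."""
    for clause in clauses:
        # A literal is true when the variable's value matches the literal's sign.
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return False
    return True


# Hypothetical three-clause formula: (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[1, -2], [2, 3], [-1, -3]]
good = {1: True, 2: True, 3: False}
bad = {1: True, 2: False, 3: True}
print(verify_cnf_certificate(formula, good))
print(verify_cnf_certificate(formula, bad))
```

The checker touches each literal once, so its running time is linear in the size of the formula. It says nothing about how the assignment was found, which is the point of the certificate view.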

Not all hard problems share this property. For instance, an answer of "yes" to the game of Peek (meaning that the first player has a winning strategy), while conceptually easy to verify (all we need is the game tree with the winning move identified on each branch at each level), is very expensive to verify: after all, the winning strategy presumably does not admit a succinct description and thus requires exponential time just to read, let alone verify. Other problems do not appear to have any useful certificate at all. For instance, the problem "Is the largest clique present in the input graph of size k?" has a "yes" answer only if a clique of size k is present in the input graph and no larger clique can be found; a certificate can be useful for the first part (by obviating the need for a search) but not, it would seem, for the second.

The lack of symmetry between "yes" and "no" answers may at first be troubling, but the reader should keep in mind that the notion of a certificate is certainly not an algorithmic one. A certificate is something that we chance upon or are given by an oracle: it exists but may not be derivable efficiently. Hence the asymmetry is simply that of chance: in order to answer "yes," it suffices to be lucky (to find one satisfactory solution), but in order to answer "no," we must be thorough and check all possible structures for failure.

Certificates and Nondeterminism

Classes of complexity based on the use of certificates, that is, defined by a bound placed on the time or space required to verify a given certificate, correspond to nondeterministic classes. Before explaining why certificates and nondeterminism are equivalent, let us briefly define some classes of complexity by using the certificate paradigm. Succinct and easily verifiable certificates of correctness for "yes" instances are characteristic of the class of decision problems known as NP.

Definition 6.7 A decision problem belongs to NP if there exists a Turing machine $T$ and a polynomial $p()$ such that an instance $x$ of the problem is a "yes" instance if and only if there exists a string $c_x$ (the certificate) of length not exceeding $p(|x|)$ such that $T$, run with $x$ and $c_x$ as inputs, returns "yes" in no more than $p(|x|)$ steps.

(For convenience, we shall assume that the certificate is written to the left of the initial head position.) The certificate is succinct, since its length is polynomially bounded, and easily verified, since this can be done in polynomial time. (The requirement that the certificate be succinct is, strictly speaking, redundant: since the Turing machine runs for at most $p(|x|)$ steps, it can look at no more than $p(|x|)$ tape squares, so that at most $p(|x|)$ characters of the certificate are meaningful in the computation.) While each

distinct "yes" instance may well have a distinct certificate, the certificate-checking Turing machine $T$ and its polynomial time bound $p()$ are unique for the problem. Thus a "no" instance of a problem in NP simply does not have a certificate easily verifiable by any Turing machine that meets the requirements for the "yes" instances; in contrast, for a problem not in NP, there does not even exist such a Turing machine.

Exercise 6.8 Verify that NP is closed under polynomial-time transformations.

It is easily seen that P is a subset of NP. For any problem in P, there exists a Turing machine which, when started with $x$ as input, returns "yes" or "no" within polynomial time. In particular, this Turing machine, when given a "yes" instance and an arbitrary (since it will not be used) certificate, returns "yes" within polynomial time. A somewhat more elaborate result is the following.

Theorem 6.4 NP is a subset of Exp.

Proof. Exponential time allows a solution by exhaustive search of any problem in NP as follows. Given a problem in NP (that is, given a problem, its certificate-checking Turing machine, and its polynomial bound), we enumerate all possible certificates, feeding each in turn to the certificate-checking Turing machine, until either the machine answers "yes" or we have exhausted all possible certificates. The key to the proof is that all possible certificates are succinct, so that they exist "only" in exponential number. Specifically, if the tape alphabet of the Turing machine has $d$ symbols (including the blank) and the polynomial bound is described by $p()$, then an instance $x$ has a total of $d^{p(|x|)}$ distinct certificates, each of length $p(|x|)$. Generating them all requires time proportional to $p(|x|) \cdot d^{p(|x|)}$, as does checking them all (since each can be checked in no more than $p(|x|)$ time). Since $p(|x|) \cdot d^{p(|x|)}$ is bounded by $2^{q(|x|)}$ for a suitable choice of polynomial $q$, any problem in NP has a solution algorithm requiring at most exponential time. Q.E.D.

Each potential certificate defines a separate computation of the underlying deterministic machine; the power of the NP machine lies in being able to guess which computation path to choose.

Thus we have $P \subseteq NP \subseteq Exp$, where at least one of the two containments is proper, since we have $P \subset Exp$; both containments are conjectured to be proper, although no one has been able to prove or disprove this conjecture. While proving that NP is contained in Exp was simple, no equivalent result is known for E. We know that E and NP are distinct classes, simply because

the latter is closed under polynomial transformations while the former is not. However, the two classes could be incomparable or one could be contained in the other, and any of these three outcomes is consistent with our state of knowledge.

The class NP is particularly important in complexity theory. The main reason is simply that almost all of the hard, yet "reasonable," problems encountered in practice fall (when in their decision version) in this class. By "reasonable" we mean that, although hard to solve, these problems admit concise solutions (the certificate is essentially a solution to the search version), solutions which, moreover, are easy to verify; a concise solution would not be very useful if we could not verify it in less than exponential time. [5] Another reason, of more importance to theoreticians than to practitioners, is that it embodies an older and still unresolved question about the power of nondeterminism.

[5] A dozen years ago, a chess magazine ran a small article about some unnamed group at M.I.T. that had allegedly run a chess-solving routine on some machine for several years and finally obtained the solution: White has a forced win (not unexpected), and the opening move should be Pawn to Queen's Rook Four (a never-used opening that any chess player would scorn; it had just the right touch of bizarreness). The article was a hoax, of course. Yet, even if it had been true, who would have trusted it?

The acronym NP stands for "nondeterministic polynomial (time)"; the class was first characterized in terms of nondeterministic machines, rather than in terms of certificates. In that context, a decision problem is deemed to belong to NP if there exists a nondeterministic Turing machine that recognizes the "yes" instances of the problem in polynomial time. Thus we use the convention that the charges in time and space (or any other resource) incurred by a nondeterministic machine are just those charges that a deterministic machine would have incurred along the least expensive accepting path.

This definition is equivalent to ours. First, let us verify that any problem, the "yes" instances of which have succinct certificates, also has a nondeterministic recognizer. Whenever our machine reads a tape square where the certificate has been stored, the nondeterministic machine is faced with a choice of steps (one for each possible character on the tape) and chooses the proper one, effectively guessing the corresponding character of the certificate. Otherwise, the two machines are identical. Since our certificate-checking machine takes polynomial time to verify the certificate, the nondeterministic machine also requires no more than polynomial time to accept the instance. (This idea of guessing the certificate is yet another possible characterization of nondeterminism.) Conversely, if a decision problem is recognized in polynomial time by a nondeterministic machine, then it has a certificate-checking machine and each of its "yes" instances

has a succinct certificate. The certificate is just the sequence of moves made by the nondeterministic Turing machine in its accepting computation (a sequence that we know to be bounded in length by a polynomial function of the size of the instance) and the certificate-checking machine just verifies that such sequences are legal for the given nondeterministic machine. In the light of this equivalence, the proof of Theorem 6.4 takes on a new meaning: the exponential-time solution is just an exhaustive exploration of all the computation paths of the nondeterministic machine.

In a sense, nondeterminism appears as an artifact to deal with existential quantifiers at no cost to the algorithm; in turn, the source of asymmetry is the lack of a similar artifact [6] to deal with universal quantifiers.

[6] Naturally, such an artifact has been defined: an alternating Turing machine has both "or" and "and" states in which existential and universal quantifiers are handled at no cost.

Nondeterminism is a general tool: we have already applied it to finite automata as well as to Turing machines and we just applied it to resource-bounded computation. Thus we can consider nondeterministic versions of the complexity classes defined earlier; in fact, hierarchy theorems similar to Theorems 6.2 and 6.3 hold for the nondeterministic classes. (Their proofs, however, are rather more technical, which is why we shall omit them.) Moreover, the search technique used in the proof of Theorem 6.4 can be used for any nondeterministic time class, so that we have

$\mathrm{DTIME}(f(n)) \subseteq \mathrm{NTIME}(f(n)) \subseteq \mathrm{DTIME}(c^{f(n)})$

where we added a one-letter prefix to reinforce the distinction between deterministic and nondeterministic classes. Nondeterministic space classes can also be defined similarly. However, of the two relations for time, one translates without change,

$\mathrm{DSPACE}(f(n)) \subseteq \mathrm{NSPACE}(f(n))$

whereas the other can be tightened considerably: going from a nondeterministic machine to a deterministic one, instead of causing an exponential increase (as for time), causes only a quadratic one.

Theorem 6.5 [Savitch] Let $f(n)$ be any fully space-constructible bound at least as large as $\log n$ everywhere; then we have $\mathrm{NSPACE}(f) \subseteq \mathrm{DSPACE}(f^2)$.

Proof. What makes this result nonobvious is the fact that a machine running in $\mathrm{NSPACE}(f)$ could run for $O(2^f)$ steps, making choices all along

the way, which appears to leave room for a superexponential number of possible configurations. In fact, the number of possible configurations for such a machine is limited to $O(2^f)$, since it cannot exceed the total number of tape configurations times a constant factor.

We show that a deterministic Turing machine running in $\mathrm{DSPACE}(f^2(n))$ can simulate a nondeterministic Turing machine running in $\mathrm{NSPACE}(f(n))$. The simulation involves verifying, for each accepting configuration, whether this configuration can be reached from the initial one. Each configuration requires $O(f(n))$ storage space and only one accepting configuration need be kept on tape at any given time, although all $O(2^{f(n)})$ potential accepting configurations may have to be checked eventually. We can generate successive accepting configurations, for example, by generating all possible configurations at the final time step and eliminating those that do not meet the conditions for acceptance. If accepting configuration $I_a$ can be reached from initial configuration $I_0$, it can be reached in at most $O(2^{f(n)})$ steps. This number may seem too large to check, but we can use a divide-and-conquer technique to bring it under control. To check whether $I_a$ can be reached from $I_0$ in at most $2^k$ steps, we check whether there exists some intermediate configuration $I_i$ such that it can be reached from $I_0$ in at most $2^{k-1}$ steps and $I_a$ can be reached from $I_i$ in at most $2^{k-1}$ steps. Figure 6.7 illustrates this idea. The effective result, for each accepting configuration, is a tree of configurations with $O(2^{f(n)})$ leaves and height equal to $f(n)$. Intermediate configurations (such as $I_i$) must be generated and remembered, but, with a depth-first traversal of the tree, we need only store $\Theta(f(n))$ of them, one for every node (at every level) along the current exploration path from the root. Thus the total space required for checking one tree is $\Theta(f^2(n))$. Each accepting configuration is checked in turn, so we need only store the previous accepting configuration as we move from one tree search to the next; hence the space needed for the entire procedure is $\Theta(f^2(n))$. By Lemma 6.1, we can reduce $O(f^2(n))$ to a strict bound of $f^2(n)$, thereby proving our theorem. Q.E.D.

The simulation used in the proof is clearly extremely inefficient in terms of time: it will run the same reachability computations over and over, whereas a time-efficient algorithm would store the result of each and look them up rather than recompute them. But avoiding any storage (so as to save on space) is precisely the goal in this simulation, whereas time is of no import.

Savitch's theorem implies NPSPACE = PSPACE (and also NEXPSPACE = EXPSPACE), an encouragingly simple situation after the complex hierarchies of time complexity classes. On the other hand, while we have $L \subseteq NL \subseteq L^2$, both inclusions are conjectured to be proper; in fact, the polylogarithmic

    for each accepting ID I_a do
      if reachable(I_0, I_a, f(n))
        then print "yes" and stop
    print "no"

    function reachable(I_1, I_2, k)
    /* returns true whenever ID I_2 is reachable from ID I_1 in at most 2^k steps */
      reachable = false
      if k = 0
        then reachable = transition(I_1, I_2)
        else for all ID I while not reachable do
               if reachable(I_1, I, k-1)
                 then if reachable(I, I_2, k-1)
                        then reachable = true

    function transition(I_1, I_2)
    /* returns true whenever ID I_2 is reachable from ID I_1 in at most one step */

Figure 6.7 The divide-and-conquer construction used in the proof of Savitch's theorem.

space hierarchy is defined by the four relationships:

$L^k \subseteq L^{k+1}$, $\quad NL^k \subseteq NL^{k+1}$, $\quad L^k \subseteq NL^k$, $\quad NL^k \subseteq L^{2k}$

Fortunately, Savitch's theorem has the same consequence for PolyL as it does for PSPACE and for higher space complexity classes: none of these classes differs from its nondeterministic counterpart. These results offer a particularly simple way of proving membership of a problem in PSPACE or PolyL, as we need only prove that a certificate for a "yes" instance can be checked in that much space, a much simpler task than designing a deterministic algorithm that solves the problem.

Example 6.1 Consider the problem of Function Generation. Given a finite set $S$, a collection of functions, $\{f_1, f_2, \ldots, f_n\}$, from $S$ to $S$, and a target function $g$, can $g$ be expressed as a composition of the functions in the collection? To prove that this problem belongs to PSPACE, we prove that it belongs to NPSPACE. If $g$ can be generated through composition, it can be

generated through a composition of the form

$g = f_{i_1} \circ f_{i_2} \circ \cdots \circ f_{i_k}$

for some value of $k$. We can require that each successively generated function, that is, each function $g_j$ defined by

$g_j = f_{i_1} \circ f_{i_2} \circ \cdots \circ f_{i_j}$

for $j < k$, be distinct from all previously generated functions. (If there was a repetition, we could omit all intermediate compositions and obtain a shorter derivation for $g$.) There are at most $|S|^{|S|}$ distinct functions from $S$ to $S$, which sets a bound on the length $k$ of the certificate (this is obviously not a succinct certificate!); we can count to this value in polynomial space. Now our machine checks the certificate by constructing each intermediate composition $g_j$ in turn and comparing it to the target function $g$. Only the previous function $g_{j-1}$ is retained in storage, so that the extra storage is simply the room needed to store the description of three functions (the target function, the last function generated, and the newly generated function) and is thus polynomial. At the same time, the machine maintains a counter to count the number of intermediate compositions; if the counter exceeds $|S|^{|S|}$ before $g$ has been generated, the machine rejects the input. Hence the problem belongs to NPSPACE and thus, by Savitch's theorem, to PSPACE. Devising a deterministic algorithm for the problem that runs in polynomial space would be much more difficult.

We now have a rather large number of time and space complexity classes. Figure 6.8 illustrates them and those interrelationships that we have established or can easily derive, such as $NP \subseteq NPSPACE = PSPACE$ (following $P \subseteq PSPACE$). The one exception is the relationship $NL \subseteq P$; while we clearly have $NL \subseteq NP$ (for the same reasons that we have $L \subseteq P$), proving $NL \subseteq P$ is somewhat more difficult; we content ourselves for now with using the result. [7] The reader should beware of the temptation to conclude that problems in NL are solvable in polynomial time and $O(\log^2 n)$ space: our results imply only that they are solvable in polynomial time or $O(\log^2 n)$ space. In other words, given such a problem, there exists an algorithm that solves it in polynomial time (but may require polynomial space) and there exists another algorithm that solves it in $O(\log^2 n)$ space (but may not run in polynomial time).

[7] A simple way to prove this result is to use the completeness of Digraph Reachability for NL; since the problem of reachability in a directed graph is easily solved in linear time and space, the result follows. Since we have not defined NL-completeness nor proved this particular result, the reader may simply want to keep this approach in mind and use it after reading the next section and solving Exercise 7.37.
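The distinction between "polynomial time or small space" and "polynomial time and small space" can be made concrete on digraph reachability. The Python sketch below gives two hypothetical routines: a breadth-first search, which is fast but keeps a visited set of linear size, and a recursive midpoint search in the style of the construction in Figure 6.7, whose recursion depth is logarithmic but whose running time is not polynomial. The graph representation and all function names are assumptions of this sketch, and Python's own memory use is not a faithful model of Turing machine work space.

```python
from collections import deque
from typing import Dict, List

# Assumed representation: adjacency lists, with every vertex present as a key.
Graph = Dict[int, List[int]]


def reachable_bfs(g: Graph, s: int, t: int) -> bool:
    """Fast search: linear time, but the visited set takes linear space."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in g[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def reachable_within(g: Graph, u: int, v: int, k: int) -> bool:
    """True if v can be reached from u in at most 2**k steps."""
    if k == 0:
        return u == v or v in g[u]
    # Try every vertex w as the midpoint of the path; nothing is memoized,
    # so only the current chain of k midpoints is ever held at once.
    return any(
        reachable_within(g, u, w, k - 1) and reachable_within(g, w, v, k - 1)
        for w in g
    )


def reachable_small_space(g: Graph, s: int, t: int) -> bool:
    """Midpoint search with k chosen so that 2**k covers any simple path."""
    n = len(g)
    k = max(1, (n - 1).bit_length())
    return reachable_within(g, s, t, k)


graph = {0: [1], 1: [2], 2: [3], 3: [], 4: [0]}
print(reachable_bfs(graph, 0, 3), reachable_small_space(graph, 0, 3))
print(reachable_bfs(graph, 3, 0), reachable_small_space(graph, 3, 0))
```

The recursive version holds one midpoint per level over about $\log n$ levels, each midpoint needing about $\log n$ bits, which is where an $O(\log^2 n)$ space bound comes from. The price is a running time on the order of $n^{\log n}$ in the worst case.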

[Figure 6.8: A hierarchy of space and time complexity classes. Only the label EXPSPACE is recoverable.]

6.3 Complete Problems

Placing a problem at an appropriate level within the hierarchy cannot be done with the same tools that we used for building the hierarchy. In order to find the appropriate class, we must establish that the problem belongs to the class and that it does not belong to any lower class. The first part is usually done by devising an algorithm that solves the problem within the resource bounds characteristic of the class or, for nondeterministic classes, by demonstrating that "yes" instances possess certificates verifiable within these bounds. The second part needs a different methodology. The hierarchy

theorems cannot be used: although they separate classes by establishing the existence of problems that do not belong to a given class, they do not apply to a specific problem.

Fortunately, we already have a suitable tool: completeness and hardness. Consider for instance a problem that we have proved to belong to Exp, but for which we have been unable to devise any polynomial-time algorithm. In order to show that this problem does not belong to P, it suffices to show that it is complete for Exp under polynomial-time (Turing or many-one) reductions. Similarly, if we assume $P \neq NP$, we can show that a problem in NP does not also belong to P by proving that it is complete for NP under polynomial-time (Turing or many-one) reductions. In general, if we have two classes of complexity $\mathcal{C}_1$ and $\mathcal{C}_2$ with $\mathcal{C}_1 \subseteq \mathcal{C}_2$ and we want to show that some problem in $\mathcal{C}_2$ does not also belong to $\mathcal{C}_1$, it suffices to show that the problem is $\mathcal{C}_2$-complete under a reduction that leaves $\mathcal{C}_1$ unchanged (i.e., that does not enable us to solve problems outside of $\mathcal{C}_1$ within the resource bounds of $\mathcal{C}_1$). Given the same two classes and given a problem not known to belong to $\mathcal{C}_2$, we can prove that this problem does not belong to $\mathcal{C}_1$ by proving that it is $\mathcal{C}_2$-hard under the same reduction.

Exercise 6.9 Prove these two assertions.

Thus completeness and hardness offer simple mechanisms for proving that a problem does not belong to a class.

Not every class has complete problems under a given reduction. Our first step, then, is to establish the existence of complete problems for classes of interest under suitable reductions. In proving any problem to be complete for a class, we must begin by establishing that the problem does belong to the class. The second part of the proof will depend on our state of knowledge. In proving our first problem to be complete, we must show that every problem in the class reduces to the target problem, in what is often called a generic reduction. In proving a succeeding problem complete, we need only show that some known complete problem reduces to it: transitivity of reductions then implies that any problem in the class reduces to our problem, by combining the implicit reduction to the known complete problem and the reduction given in our proof. This difference is illustrated in Figure 6.9. Specific reductions are often much simpler than generic ones. Moreover, as we increase our catalog of known complete problems for a class, we increase our flexibility in developing new reductions: the more complete problems we know, the more likely we are to find one that is quite close to a new problem to be proved complete, thereby facilitating the development of a reduction.

[Figure 6.9: Generic versus specific reductions. Panels: (a) generic; (b) specific.]

In the rest of this section, we establish a first complete problem for a number of classes of interest, beginning with NP, the most useful of these classes; in Chapter 7 we develop a catalog of useful NP-complete problems.

6.3.1 NP-Completeness: Cook's Theorem

In our hierarchy of space and time classes, the class immediately below NP is P. In order to distinguish between the two classes, we must use a reduction that requires no more than polynomial time. Since decision problems all have the same simple answer set, requiring the reductions to be many-one (rather than Turing) is not likely to impose a great burden and promises a finer discrimination. Moreover, both P and NP are clearly closed under polynomial-time many-one reductions, whereas, because of the apparent asymmetry of NP, only P is as clearly closed under the Turing version. Thus we define NP-completeness through polynomial-time transformations. (Historically, polynomial-time Turing reductions were the first used, in Cook's seminal paper; Karp then used polynomial-time transformations in the paper that really put the meaning of NP-completeness in perspective. [8] Since then, polynomial-time transformations have been most common, although logarithmic-space transformations, a further restriction, have also been used.)

[8] This historical sequence explains why polynomial-time many-one and Turing reductions are sometimes called Karp and Cook reductions, respectively.

Cook proved in 1971 that Satisfiability is NP-complete. An instance of the problem is given by a collection of clauses; the question is whether these clauses can all be satisfied by a truth assignment, i.e., an assignment of the logical values true or false

to each variable. A clause is a logical disjunction (logical "or") of literals; a literal is either a variable or the logical complement of a variable.

Example 6.2 Here is a "yes" instance of Satisfiability: it is composed of five variables ($a$, $b$, $c$, $d$, and $e$) and four clauses: $\{a, c, e\}$, $\{\bar{b}\}$, $\{b, c, d, e\}$, and $\{\bar{d}, \bar{e}\}$. Using Boolean connectives, we can write it as the Boolean formula

$(a \lor c \lor e) \land (\bar{b}) \land (b \lor c \lor d \lor e) \land (\bar{d} \lor \bar{e})$

That it is a "yes" instance can be verified by evaluating the formula for the (satisfying) truth assignment

$a \leftarrow$ false, $\quad b \leftarrow$ false, $\quad c \leftarrow$ true, $\quad d \leftarrow$ false, $\quad e \leftarrow$ false

In contrast, here are a couple of "no" instances of Satisfiability. The first has one variable, $a$, and one clause, the empty clause; it clearly cannot be satisfied by any truth assignment to the variable $a$. The second has three variables ($a$, $b$, and $c$) and four clauses: $\{a\}$, $\{\bar{a}, b\}$, $\{\bar{a}, \bar{c}\}$, and $\{\bar{b}, c\}$. Satisfying the first clause requires that the variable $a$ be set to "true"; satisfying the second and third clauses then requires that $b$ be set to "true" and $c$ be set to "false." But then the fourth clause is not satisfied, so that there is no way to satisfy all four clauses at once.

Theorem 6.6 [Cook] Satisfiability is NP-complete.

Proof. The proof is long, but not complicated; moreover, it is quite instructive. Since there is clearly no hope of reducing all problems one by one, the proof proceeds by simulating the certificate-checking Turing machine associated with each problem in NP. Specifically, we show that, given an instance of a problem in NP (as represented by the instance $x$ and the certificate-checking Turing machine $T$ and associated polynomial bound $p()$), an instance of Satisfiability can be produced in polynomial time, which is satisfiable (a "yes" instance) if and only if the original instance is a "yes" instance. The certificate itself is unknown: all that need be shown is that it exists; hence it need not be specified in the simulation of the initial configuration. Since the instance is part of the certificate-checking Turing machine in its initial configuration (the instance is part of the input and thus written on the tape), it suffices to simulate a Turing machine through a series of clauses. As seen previously, simulating a Turing machine involves: (i) representing each of its configurations, as characterized by the contents of the tape, the position of the head, and the current state of the finite control (what is known as an instantaneous description, or ID), and

(ii) ensuring that all transitions between configurations are legal. All of this must be done with the sole means of Boolean variables and clauses.

Let us first address the description of the machine at a given stage of computation. Since the machine runs for at most $p(|x|)$ steps, all variables describing the machine will exist in $p(|x|) + 1$ copies. At step $i$, we need a variable describing the current control state; however, the Turing machine has, say, $s$ control states, which cannot be accounted for by a single Boolean variable. Thus we set up a total of $s \cdot (p(|x|) + 1)$ state variables, $q(i, j)$, $0 \le i \le p(|x|)$, $1 \le j \le s$. If $q(i, j)$ is true, then $T$ is in state $j$ at step $i$. Of course, for each $i$, exactly one of the $q(i, j)$ must be true. The following clauses ensure that $T$ is in at least some state at each step:

$\{q(i, 1), q(i, 2), \ldots, q(i, s)\}, \quad 0 \le i \le p(|x|)$

Now, at each step, and for each pair of states $(k, l)$, $1 \le k < l \le s$, if $T$ is in state $k$, then it cannot be in state $l$. This requirement can be translated as $q(i, k) \Rightarrow \overline{q(i, l)}$. Since $a \Rightarrow b$ is logically equivalent to $\bar{a} \lor b$, the following clauses will ensure that $T$ is in a unique state at each step:

$\{\overline{q(i, k)}, \overline{q(i, l)}\}, \quad 0 \le i \le p(|x|), \quad 1 \le k < l \le s$

Now we need to describe the tape contents and head position. Since the machine starts with its head on square 1 (arbitrarily numbered) and runs for at most $p(|x|)$ steps, it cannot scan any square to the left of $-p(|x|) + 1$ or to the right of $p(|x|) + 1$. Hence we need only consider the squares between $-p(|x|) + 1$ and $p(|x|) + 1$ when describing the tape contents and head position. For each step and each such square, we set up a variable describing the head position, $h(i, j)$, $0 \le i \le p(|x|)$, $-p(|x|) + 1 \le j \le p(|x|) + 1$. If $h(i, j)$ is true, then the head is on square $j$ at step $i$. As for the control state, we need clauses to ensure that the head scans some square at each step,

$\{h(i, -p(|x|) + 1), h(i, -p(|x|) + 2), \ldots, h(i, 0), h(i, 1), \ldots, h(i, p(|x|) + 1)\}, \quad 0 \le i \le p(|x|)$

and that it scans at most one square at each step,

$\{\overline{h(i, k)}, \overline{h(i, l)}\}, \quad 0 \le i \le p(|x|), \quad -p(|x|) + 1 \le k < l \le p(|x|) + 1$

The same principle applies for describing the tape contents, except that each square contains one of $d$ tape alphabet symbols, thereby necessitating $d$ variables for each possible square at each possible step. Hence we set up a total of $d \cdot (2p(|x|) + 1) \cdot (p(|x|) + 1)$ variables, $t(i, j, k)$, $0 \le i \le p(|x|)$, $-p(|x|) + 1 \le j \le p(|x|) + 1$, $1 \le k \le d$. Each square, at each step, must contain at least one symbol, which is ensured by the following clauses:

$\{t(i, j, 1), t(i, j, 2), \ldots, t(i, j, d)\}, \quad 0 \le i \le p(|x|), \quad -p(|x|) + 1 \le j \le p(|x|) + 1$

Each tape square, at each step, may contain at most one symbol, which is in turn ensured by the following clauses:

$\{\overline{t(i, j, k)}, \overline{t(i, j, l)}\}, \quad 0 \le i \le p(|x|), \quad -p(|x|) + 1 \le j \le p(|x|) + 1, \quad 1 \le k < l \le d$

The group of clauses so far is satisfiable if and only if the machine is in a unique, well-defined configuration at each step. Now we must describe the machine's initial and final configurations, and then enforce its transitions. Let 1 be the start state and $s$ the halt state; also let the first tape symbol be the blank and the last be a separator. In the initial configuration, squares $-p(|x|) + 1$ through $-1$ contain the certificate, square 0 contains the separator, squares 1 through $|x|$ contain the description of the instance, and squares $|x| + 1$ through $p(|x|) + 1$ are blank. The following clauses accomplish this initialization:

$\{h(0, 1)\}, \quad \{q(0, 1)\},$
$\{t(0, 0, d)\}, \{t(0, 1, x_1)\}, \{t(0, 2, x_2)\}, \ldots, \{t(0, |x|, x_{|x|})\},$
$\{t(0, j, 1)\}, \quad |x| + 1 \le j \le p(|x|) + 1$

where $x_i$ is the index of the $i$th symbol of string $x$. The halt state must be entered by the end of the computation (it could be entered earlier, but the machine can be artificially "padded" so as always to require exactly $p(|x|)$ time). At this time also, the tape must contain the code for yes, say symbol 2 in square 1, with the head on that square. These conditions are ensured by the following clauses:

$\{h(p(|x|), 1)\}, \quad \{q(p(|x|), s)\}, \quad \{t(p(|x|), 1, 2)\},$
$\{t(p(|x|), j, 1)\}, \quad -p(|x|) + 1 \le j \le p(|x|) + 1, \quad j \ne 1$

Now it just remains to ensure that transitions are legal. All that need be done is to set up a large number of logical implications, one for each possible transition, of the form "if $T$ is in state $q$ with head reading symbol $t$ in square $j$ at step $i$, then $T$ will be in state $q'$ at step $i + 1$, with the only tape square changed being square $j$." First, the following clauses ensure that, if $T$ is not scanning square $j$ at step $i$, then the contents of square $j$ will remain unchanged at step $i + 1$ (note that the implication $(a \land b) \Rightarrow c$ is translated into the disjunction $\bar{a} \lor \bar{b} \lor c$):

$\{h(i, j), \overline{t(i, j, k)}, t(i + 1, j, k)\}, \quad 0 \le i < p(|x|), \quad -p(|x|) + 1 \le j \le p(|x|) + 1, \quad 1 \le k \le d$

Secondly, the result of each transition (new state, new head position, and new tape symbol) is described by three clauses. These clauses are

$\{\overline{h(i, j)}, \overline{q(i, l)}, \overline{t(i, j, k)}, q(i + 1, l')\},$
$\{\overline{h(i, j)}, \overline{q(i, l)}, \overline{t(i, j, k)}, h(i + 1, j')\},$
$\{\overline{h(i, j)}, \overline{q(i, l)}, \overline{t(i, j, k)}, t(i + 1, j, k')\},$

for each quadruple $(i, j, k, l)$, $0 \le i < p(|x|)$, $-p(|x|) + 1 \le j \le p(|x|) + 1$, $1 \le k \le d$, and $1 \le l \le s$, such that $T$, when in state $l$ with its head on square $j$ reading symbol $k$, writes symbol $k'$ on square $j$, moves its head to adjacent square $j'$ (either $j + 1$ or $j - 1$), and enters state $l'$. (Note that the implication $(a \land b \land c) \Rightarrow d$ is translated into the disjunction $\bar{a} \lor \bar{b} \lor \bar{c} \lor d$.) Table 6.1 summarizes the variables and clauses used in the complete construction.

The length of each clause, with the single exception of clauses of the third type, is bounded by a constant (i.e., by a quantity that does not depend on $x$), while the total number of clauses produced is $O(p^3(|x|))$. Hence the size of the instance of Satisfiability constructed by the transformation is a polynomial function of the size of the original instance; it is easily verified that the construction can be carried out in polynomial time.

Our transformation mapped a string $x$ and a Turing machine $T$ with polynomial time bound $p()$ to a collection of $O(p^3(|x|))$ clauses, each of constant (or polynomial) length, over $O(p^2(|x|))$ variables. For each "yes" instance $x$, there exists a certificate $c_x$ such that $T$, started with $x$ and $c_x$ on its tape, stops after $p(|x|)$ steps, leaving symbol 2 ("yes") on its tape if and only if the collection of clauses produced by our generic transformation for this instance of the problem is satisfiable. The existence of a certificate for $x$ thus corresponds to the existence of a valid truth assignment for the variables describing the tape contents to the left of the head at step 0, such that this truth assignment is part of a satisfying truth assignment for the entire collection of clauses. Just as the certificate, $c_x$, is the only piece of information that the checker needs in order to answer the question deterministically in polynomial time, the contents of the tape to the left of the head is the only piece of information needed to reconstruct the complete truth assignment to all of our variables in deterministic polynomial time. Q.E.D.

Thus NP-complete problems exist. (We could have established this existence somewhat more simply by solving Exercise 6.25 for NP, but the "standard" complete problem described there, a bounded version of the halting problem, is not nearly as useful as Satisfiability.) Assuming $P \neq NP$, no

Table 6.1 Summary of the construction used in Cook's proof.

Variables

Name          Meaning                                        Number
q(i, l)       The Turing machine is in state l at time i     s·β
h(i, j)       The head is on tape square j at time i         β·γ
t(i, j, k)    Tape square j contains symbol k at time i      d·β·γ

Clauses (meaning of each group)

  The Turing machine is in at least one state at time i
  The Turing machine is in at most one state at time i
  The head sits on at least one tape square at time i
  The head sits on at most one tape square at time i
  Every tape square contains at least one symbol at time i
  Every tape square contains at most one symbol at time i
  Initial head position
  Initial state
  Initial tape contents
  Final head position
  Final state
  Final tape contents
  From time i to time i + 1 no change to tape squares not under the head
  Next state
  Next head position
  Next character in tape square

with α = p(|x|), β = p(|x|) + 1, and γ = 2p(|x|) + 1.
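The clause groups summarized in Table 6.1 rest on two standard encodings: an "exactly one" constraint (one long clause plus pairwise exclusions) and the translation of an implication (a ∧ b ∧ c) ⇒ d into a single disjunction. The following Python sketch illustrates both. It is only an illustration under assumptions of our own: variables are numbered by positive integers, a negative integer stands for a negated literal, and the helper names (`at_least_one`, `at_most_one`, `implication_to_clause`, `satisfies`, `all_assignments`) are hypothetical and not part of the construction in the text.

```python
from itertools import combinations, product

# Assumed encoding (ours, not the text's): a variable is a positive integer,
# a negated literal is the corresponding negative integer, and a clause is
# a list of literals.

def at_least_one(variables):
    """One clause stating that some variable in the group is true."""
    return [list(variables)]

def at_most_one(variables):
    """Pairwise clauses stating that no two variables are both true."""
    return [[-a, -b] for a, b in combinations(variables, 2)]

def implication_to_clause(premises, conclusion):
    """Translate (p1 and p2 and ...) => c into (not p1) or (not p2) or ... or c."""
    return [-p for p in premises] + [conclusion]

def satisfies(clause, assignment):
    """True if the assignment (dict: variable -> bool) satisfies the clause."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def all_assignments(variables):
    """Enumerate every truth assignment to the given variables."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

# Example: the machine is in exactly one of s = 4 states at a fixed time i;
# variables 1..4 stand for q(i, 1), ..., q(i, 4).
state_vars = [1, 2, 3, 4]
state_clauses = at_least_one(state_vars) + at_most_one(state_vars)
```

For s states, such a group contains 1 + s(s − 1)/2 clauses per time step, and a brute-force check over `all_assignments(state_vars)` confirms that the clauses hold exactly when one state variable is true.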

NP-complete problem may belong to P (recall that if one NP-complete problem is solvable in polynomial time, then so are all problems in NP, implying P = NP), so that the picture of NP and its neighborhood is as described in Figure 6.10.

Figure 6.10 The world of NP.

Are all problems in NP either tractable (solvable in polynomial time) or NP-complete? The question is obviously trivial if P equals NP, but it is of great interest otherwise, because a negative answer implies the existence of intractable problems that are not complete for NP and thus cannot be proved hard through reduction. Unfortunately, the answer is no: unless P equals NP, there must exist problems in NP − P that are not NP-complete (a result that we shall not prove). Candidates for membership in this intermediate category include Graph Isomorphism, which asks whether two given graphs are isomorphic, and Primality, which asks whether a given natural number is prime.

Now that we have an NP-complete problem, further NP-completeness proofs will be composed of two steps: (i) a proof of membership in NP, usually a trivial task; and (ii) a polynomial-time transformation from a known NP-complete problem. Cook first proved that Satisfiability and Subgraph Isomorphism (in which the question is whether a given graph contains a subgraph isomorphic to another given graph) are NP-complete. Karp then proved that another 21 problems of diverse nature (including such common problems as Hamiltonian Circuit, Set Cover, and Knapsack) are also NP-complete. By now, the list of NP-complete problems has grown to thousands, taken from all areas of computer science as well as some apparently unrelated areas (such as metallurgy, chemistry, physics, biology, finance, etc.). Among those problems are several of great importance to the business community, such as Integer Programming and its special cases, which have been studied for many years by researchers and practitioners in mathematics, operations research, and computer science. The fact that no polynomial-time algorithm has yet been designed for any of them, coupled with the sheer size and diversity of the equivalence class of NP-complete

problems, is considered strong evidence of the intractability of NP-complete problems. In other words, it is conjectured that P is a proper subset of NP, although this conjecture has resisted the most determined attempts at proof or disproof for the last twenty-five years. If we assume P ≠ NP, a proof of NP-completeness is effectively a proof of intractability, by which, as the reader will recall, we mean worst-case complexity larger than polynomial. (Many NP-complete problems have relatively few hard cases and large numbers of easy cases. For instance, a randomly generated graph almost certainly has a Hamiltonian circuit, even though, as we shall prove in Section 7.1, deciding whether an arbitrary graph has such a circuit is NP-complete. Similarly, a randomly generated graph is almost certainly not three-colorable, even though deciding whether an arbitrary graph is three-colorable is NP-complete.) Moreover, since the decision version of a problem Turing reduces to its optimization version and (obviously) to its complement (there is no asymmetry between "yes" and "no" instances under Turing reductions: we need only complement the answer), a single proof of NP-completeness immediately yields proofs of intractability for the various versions of the problem.

These considerations explain why the question "Is P equal to NP?" is the most important open problem in theoretical computer science: on its outcome depends the "fate" of a very large family of problems of considerable practical importance. A positive answer, however unlikely, would send all algorithm designers back to the drawing board; a negative answer would turn the thousands of proofs of NP-completeness into proofs of intractability, showing that, in fact, complexity theory has been remarkably successful at identifying intractable problems.

6.3.2 Space Completeness

The practical importance of the space complexity classes resides mostly in (i) the large number of PSPACE-hard problems and (ii) the difference between PolyL and P and its effect on the parallel complexity of problems. (Any problem within PolyL requires relatively little space to solve; if the problem is also tractable (in P), then it becomes a good candidate for the application of parallelism. Thus the class P ∩ PolyL is of particular interest in the study of parallel algorithms. We shall return to this topic in Section 9.4.)

Polynomial Space

With very few exceptions (see Section 6.3.3), all problems of any interest are solvable in polynomial space, yet we do not even know whether there are

problems solvable in polynomial space that are not solvable in polynomial time. The conjecture, of course, is P ≠ PSPACE, since, among other things, this inequality would follow immediately from a proof of P ≠ NP. If in fact the containments described by Figure 6.8 are all strict, as is conjectured, then a proof of PSPACE-hardness is the strongest evidence of intractability we can obtain, short of a direct proof of intractability. (Even if we had P = NP, we could still have P ≠ PSPACE, so that, while NP-complete problems would become tractable, PSPACE-complete problems would remain intractable.) Indeed, PSPACE-hard problems are not "reasonable" problems in terms of our earlier definition: their solutions are not easily verifiable. The interest of the class PSPACE thus derives from the potential it offers for strong evidence of intractability. Since a large number of problems, including a majority of two-person game problems, can be proved PSPACE-hard, this interest is justified. A further reason for studying PSPACE is that it also describes exactly the class of problems solvable in polynomial time through an interactive protocol between an all-powerful prover and a deterministic checker, an interaction in which the checker seeks to verify the truth of some statement with the help of questions that it can pose to the prover. We shall return to such protocols in Section 9.5.

A study of PSPACE must proceed much like a study of NP, that is, through the characterization of complete problems. Since we cannot separate P from PSPACE, the reductions used must be of the same type as those used within NP, using at most polynomial time. As in our study of NP, we must start by identifying a basic PSPACE-complete problem. A convenient problem, known as Quantified Boolean Formula (or QBF for short), is essentially an arbitrarily quantified version of SAT, our basic NP-complete problem. An instance of QBF is given by a well-formed Boolean formula where each variable is quantified, either universally or existentially; the question is whether the resulting proposition is true. (Note the difference between a predicate and a fully quantified Boolean formula: the predicate has unbound variables and so may be true for some variable values and false for others, whereas the fully quantified formula has no unbound variables and so has a unique truth value.)

Example 6.3 In its most general form, an instance of QBF can make use of any of the Boolean connectives and so can be quite complex. A fairly simple example of such instances is

$$\forall a\, \exists b\, \bigl((\bar{a} \land (\forall c\,(b \land c))) \lor (\exists d\, \forall e\,(d \Rightarrow (a \lor e)))\bigr)$$

You may want to spend some time convincing yourself that this is a "yes" instance; this is due to the second term in the disjunction, since

we can choose to set d to "false," thereby satisfying the implication by default. More typically, instances of QBF take some more restricted form where the quantifiers alone are responsible for the complexity. One such form is QSAT, the arbitrarily quantified version of Satisfiability, where the quantifiers are all up front and the quantified formula is in the form prescribed for Satisfiability. An instance of QSAT is

$$\forall a\, \exists b\, \forall c\, \exists d\, ((a \lor b) \land (a \lor c) \land (b \lor c \lor d))$$

This instance is a "no" instance: since both a and c are universally quantified, the expression should evaluate to "true" for any assignment of values to the two variables, yet it evaluates to "false" (due to the second conjunct) when both are set to "false."

Theorem 6.7 QBF is PSPACE-complete.

Proof. That QBF is in PSPACE is easily seen: we can just cycle through all possible truth assignments, verifying the truth value of the formula for each assignment. Only one truth assignment need be stored at any step, together with a counter of the number of assignments checked so far; this requires only polynomial space. Evaluating the formula for a given truth assignment is easily done in polynomial space. Thus the problem can be solved in polynomial space (albeit in exponential time).

The generic reduction from any problem in PSPACE to QBF is done through the simulation by a suitable instance of QBF of the given space-bounded Turing machine, used in exactly the same manner as in the proof of Cook's theorem. Let M be a polynomial space-bounded deterministic Turing machine that decides a problem in PSPACE; let p() be its polynomial bound, d its number of alphabet symbols, and s its number of states. We encode each instantaneous description of M with $d \cdot s \cdot p^2(n)$ variables: one variable for each combination of current state (s choices), current head position (p(n) choices), and current tape contents ($d \cdot p(n)$ choices). For some constant c, M may make up to $c^{p(n)}$ moves on inputs of size n. Since the number of moves is potentially exponential, we use our divide-and-conquer technique. We encode the transitions in exponential intervals. For each j, $0 \le j \le p(n) \log c$, we write a quantified Boolean formula $F_j(I_1, I_2)$ (where $I_1$ and $I_2$ are distinct sets of variables) that is true if and only if $I_1$ and $I_2$ represent valid instantaneous descriptions (IDs) of M and M can go in no more than $2^j$ steps from the ID described by $I_1$ to that described by $I_2$. Now, for input string x of length n, our quantified formula becomes

$$Q_x = \exists I_0\, \exists I_f\, [\mathrm{INITIAL}(I_0) \land \mathrm{FINAL}(I_f) \land F_{p(n) \log c}(I_0, I_f)]$$

where $I_0$ and $I_f$ are sets of existentially quantified variables, INITIAL($I_0$) asserts that $I_0$ represents the initial ID of M under input x, and FINAL($I_f$) asserts that $I_f$ represents an accepting ID of M. This formula obviously has the desired property that it is true if and only if M accepts x. Thus it remains to show how to construct the $F_j(I_1, I_2)$ for each j, which is easily done by recursion. When j equals 0, we assert that $I_1$ and $I_2$ represent valid IDs of M and that either they are the same ID (zero step) or M can go from the first to the second in one step; these assertions are encoded with the same technique as used for Cook's proof. The obvious induction step is

$$F_j(I_1, I_2) = \exists I\, [F_{j-1}(I_1, I) \land F_{j-1}(I, I_2)]$$

but that doubles the length of the formula at each step, thereby using more than polynomial space. (Used in this way, the divide-and-conquer approach does not help since all of the steps end up being coded anyhow.) An ingenious trick allows us to use only one copy of $F_{j-1}$, contriving to write it as a single "subroutine" in the formula. With two auxiliary collections of variables J and K, we set up a formula which asserts that $F_{j-1}(J, K)$ must be true when we have either $J = I_1$ and $K = I$ or $J = I$ and $K = I_2$:

$$F_j(I_1, I_2) = \exists I\, \forall J\, \forall K\, [((J = I_1 \land K = I) \lor (J = I \land K = I_2)) \Rightarrow F_{j-1}(J, K)]$$

We can code a variable in $F_j$ in time $O(j \cdot p(n) \cdot (\log j + \log p(n)))$, since each variable takes $O(\log j + \log p(n))$ space. Set $j = \log c \cdot p(n)$; then $Q_x$ can be written in $O(p^2(n) \cdot \log n)$ time, so that we have a polynomial-time reduction. Q.E.D.

As noted in Example 6.3, we can restrict QBF to Boolean formulae consisting of a conjunction of disjuncts: the arbitrarily quantified version of Satisfiability, QSAT. An instance of this simplified problem can be written

$$\forall x_1, x_2, \ldots, x_n,\ \exists y_1, y_2, \ldots, y_n,\ \ldots,\ \forall z_1, z_2, \ldots, z_n,\ P(x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n, \ldots, z_1, z_2, \ldots, z_n)$$

where P() is a collection of clauses. The key in the formulation of this problem is the arbitrary alternation of quantifiers because, whenever two identical quantifiers occur in a row, we can simply remove the second. In particular, when all quantifiers are existential, QSAT becomes our old acquaintance Satisfiability and thus belongs to NP.

Exercise 6.10 The proof of completeness for QBF uses only existential quantifiers; how then do the arbitrary quantifiers of QSAT arise?
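The membership half of the proof of Theorem 6.7 can be illustrated with a small recursive evaluator. This is a sketch under our own assumptions rather than the procedure of the text: formulas are represented as nested tuples, the names `evaluate` and `var` are hypothetical, and the two instances are transcribed from Example 6.3 and from the QSAT instance that follows it as we read them. The recursion keeps only one partial truth assignment per nesting level, so the space used grows with the depth of the formula while the running time may be exponential.

```python
# Assumed representation (ours, not the text's): a formula is a nested tuple
#   ("var", name) | ("not", f) | ("and", f, g) | ("or", f, g) |
#   ("implies", f, g) | ("forall", name, f) | ("exists", name, f)

def evaluate(formula, env=None):
    """Return the truth value of a fully quantified Boolean formula."""
    env = {} if env is None else env
    op = formula[0]
    if op == "var":
        return env[formula[1]]
    if op == "not":
        return not evaluate(formula[1], env)
    if op == "and":
        return evaluate(formula[1], env) and evaluate(formula[2], env)
    if op == "or":
        return evaluate(formula[1], env) or evaluate(formula[2], env)
    if op == "implies":
        return (not evaluate(formula[1], env)) or evaluate(formula[2], env)
    if op in ("forall", "exists"):
        name, body = formula[1], formula[2]
        # Try both values of the bound variable lazily, one after the other.
        results = (evaluate(body, {**env, name: value}) for value in (False, True))
        return all(results) if op == "forall" else any(results)
    raise ValueError(f"unknown connective: {op!r}")

def var(name):
    """Build a variable node (hypothetical helper)."""
    return ("var", name)

a, b, c, d, e = (var(x) for x in "abcde")

# Example 6.3: forall a exists b ((not a and forall c (b and c))
#                                 or exists d forall e (d => (a or e)))
left = ("and", ("not", a), ("forall", "c", ("and", b, c)))
right = ("exists", "d", ("forall", "e", ("implies", d, ("or", a, e))))
example_6_3 = ("forall", "a", ("exists", "b", ("or", left, right)))

# QSAT instance: forall a exists b forall c exists d
#   ((a or b) and (a or c) and (b or c or d))
matrix = ("and", ("and", ("or", a, b), ("or", a, c)), ("or", ("or", b, c), d))
qsat_instance = ("forall", "a", ("exists", "b", ("forall", "c", ("exists", "d", matrix))))

print(evaluate(example_6_3))    # True: choosing d = False satisfies the implication
print(evaluate(qsat_instance))  # False: a = c = False falsifies (a or c)
```

Copying the environment at each quantifier keeps the sketch short; a version that mutates a single assignment in place would follow the space argument of the proof more literally.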

Many other PSPACE-complete problems have been identified, including problems from logic, formal languages, and automata theory and, more interestingly for us, from the area of two-person games. Asking whether the first player in a game has a winning strategy is tantamount to asking a question of the form "Is it true that there is a move for Player 1 such that, for any move of Player 2, there is a move for Player 1 such that, for any move of Player 2, ..., such that for any move of Player 2, the resulting position is a win for Player 1?" This question is in the form of a quantified Boolean formula, where each variable is quantified and the quantifiers alternate. It comes as no surprise, then, that deciding whether the first player has a winning strategy is PSPACE-hard for many games. Examples include generalizations (to arbitrarily large boards or arbitrary graphs) of Hex, Checkers, Chess, Gomoku, and Go. To be PSPACE-complete, the problem must also be in PSPACE, which puts a basic requirement on the game: it cannot last for more than a polynomial number of moves. Thus, while generalized Hex, Gomoku, and Instant Insanity are PSPACE-complete, generalized Chess, Checkers, and Go are not, unless special termination rules are adopted to ensure games of polynomial length. (In fact, without the special termination rules, all three of these games have been proved Exp-complete, that is, intractable.)

Polylogarithmic Space

The classes of logarithmic space complexity are of great theoretical interest, but our reason for discussing them is their importance in the study of parallelism. We have seen that both L and NL are subsets of P, but it is believed that $L^2$ and all larger classes up to and including PolyL are incomparable with P and with NP. One of the most interesting questions is whether, in fact, L equals P: the equivalent, one level lower in the complexity hierarchy, of the question "Is P equal to PSPACE?" Since the two classes cannot be separated, we resort to our familiar method of identifying complete problems within the larger class; thus we seek P-complete problems. However, since L is such a restricted class, we need a reduction with tighter resource bounds, so as to be closed within L. The solution is a logspace transformation: a many-one reduction that uses only logarithmic space on the off-line Turing machine model. Despite their very restricted nature, logspace transformations have the crucial property of reductions: they are transitive.

Exercise 6.11* Verify that logspace transformations are transitive. (The difficulty is that the output produced by the first machine cannot be considered to be part of the work tape of the compound machine, since

that might require more than logarithmic space. The way to get around this difficulty is to trade time for space, effectively recomputing as needed to obtain small pieces of the output.) Further show that, if set A belongs to any of the complexity classes mentioned so far and set B (logspace) reduces to set A, then set B also belongs to that complexity class.

An immediate consequence of these properties is that, if a problem is P-complete and also belongs to $L^k$, then P is a subset of $L^k$, and thus of PolyL. In particular, if a P-complete problem belongs to L, then we have L = P. [9] Notice that most of our proofs of NP-completeness, including the proof of Cook's theorem, involve logspace reductions; we never stored any part of the output under construction, only a constant number of counters and indices. An interesting consequence is that, if any logspace-complete problem for NP belongs to $L^k$, then NP itself is a subset of $L^k$, a situation judged very unlikely.

[9] The same tool could be, and is, used for attempting to separate NL from L by identifying NL-complete problems. For practical purposes, though, whether a problem belongs to L or to NL makes little difference: in either case, the problem is tractable and requires very little space. Thus we concentrate on the distinction between P and the logarithmic space classes.

The first P-complete problem, Path System Accessibility (PSA), was identified by Cook. An instance of PSA is composed of a finite set, V, of vertices, a subset S ⊆ V of starting vertices, a subset T ⊆ V of terminal vertices, and a relation R ⊆ V × V × V. A vertex v ∈ V is deemed accessible if it belongs to S or if there exist accessible vertices x and y such that (x, y, v) ∈ R. The question is whether T contains any accessible vertices.

Example 6.4 Here is a simple "yes" instance of PSA: V = {a, b, c, d, e}, S = {a}, T = {e}, and R = {(a, a, b), (a, b, c), (a, b, d), (c, d, e)}. By applying the first triple and noting that a is accessible, we conclude that b is also accessible; by using the second and third triples with our newly acquired knowledge that both a and b are accessible, we conclude that c and d are also accessible; and by using the fourth triple, we now conclude that e, a member of the target set, is accessible. Another "yes" instance is given by V = {a, b, c, d, e, f, g, h}, S = {a, b}, T = {g, h}, and R = {(a, b, c), (a, b, f), (c, c, d), (d, e, f), (d, e, g), (d, f, g), (e, f, h)}. Note that not every triple is involved in a successful derivation of accessibility for a target element and that some elements (including some target elements) may remain inaccessible (here e and h).

Theorem 6.8 PSA is P-complete.

Proof. That PSA is in P is obvious: a simple iterative algorithm (cycle through all possible triples to identify newly accessible vertices, adding

them to the set when they are found, until a complete pass through all currently accessible vertices fails to produce any addition) constructs the set of all accessible vertices in polynomial time.

Given an arbitrary problem in P, we must reduce it in deterministic logspace to a PSA problem. Let M be a deterministic Turing machine that solves our arbitrary problem in time bounded by some polynomial p(). If M terminates in less than $p(|x|)$ time on input x, we allow the final configuration to repeat, so that all computations take exactly $p(|x|)$ steps, going through $p(|x|) + 1$ IDs. We can count to $p(|x|)$ in logarithmic space, since $\log p(|x|)$ is $O(\log |x|)$.

We shall construct V to include all possible IDs of M in a $p(|x|)$-time computation. However, we cannot provide complete instantaneous descriptions, because there is an exponential number of them. Instead the vertices of V will correspond to five-tuples, (t, i, c, h, s), of which there are $O(p^3(|x|))$, which describe a step number, t; the contents of a tape square at that step, (i, c), where i designates the square and c the character stored there; the head position at that step, h; and the control state at that step, s. We call these abbreviated instantaneous descriptions short IDs. The initial state is described by $2p(|x|) + 1$ short IDs, one for each tape square:

  (0, i, 1, 1, 1), for $-p(|x|) + 1 \le i \le 0$,
  (0, 1, $x_1$, 1, 1), (0, 2, $x_2$, 1, 1), ..., (0, |x|, $x_{|x|}$, 1, 1),
  (0, i, 1, 1, 1), for $|x| + 1 \le i \le p(|x|) + 1$,

using the same conventions as in our proof of Cook's theorem. These $2p(|x|) + 1$ IDs together form subset S, while further IDs can be made accessible through the relation. If the machine, when in state s and reading symbol c, moves to state s', replacing symbol c with symbol c' and moving its head to the right, then our relation includes, for each value of t, $0 \le t < p(|x|)$, and for each head position, $-p(|x|) \le h \le p(|x|)$, the following triples of short IDs:

  ((t, h, c, h, s), (t, h, c, h, s), (t + 1, h, c', h + 1, s'))

indicating the changed contents of the tape square under the head at step t, and

  ((t, h, c, h, s), (t, i, j, h, s), (t + 1, i, j, h + 1, s'))

for each possible combination of tape square, i ≠ h, and tape symbol, j, indicating that the contents of tape squares not under the head at step t remain unchanged at step t + 1.

In an accepting computation, the machine, after exactly $p(|x|)$ steps, is in its halt state, with the head on tape square 1, and with the tape empty except for square 1, which contains symbol 2. In the PSA problem, we shall require that all of the corresponding $2p(|x|) + 1$ five-tuples be accessible. Since the problem asks only that some vertex in T be accessible, we cannot simply place these five-tuples in T. Instead, we set up a fixed collection of additional five-tuples, which we organize, through the relation R, as a binary tree with the $2p(|x|) + 1$ five-tuples as leaves. The root of this tree is the single element of T; it becomes accessible if and only if every one of the $2p(|x|) + 1$ five-tuples describing the final accepting configuration is accessible. (This last construction is dictated by our convention of acceptance by the Turing machine. Had we instead defined acceptance as reaching a distinct accepting state at step $p(|x|)$, regardless of tape contents, this part of the construction could have been avoided.) The key behind our entire construction is that the relation R exactly mimics the deterministic machine: for each step that the machine takes, the relation allows us to make another $2p(|x|) + 1$ five-tuples accessible, which together describe the new configuration. Since we can count to $p(|x|)$ in logarithmic space, the entire construction can be done in logarithmic space (through multiply nested loops, one for all steps, one for all tape squares, one for all head positions, one for all alphabet symbols, and one for all control states). Q.E.D.

Our proof shows that PSA remains P-complete even when the set T of target elements contains a single element. Similarly, we can restrict the set S of initially accessible elements to contain exactly one vertex. Given an arbitrary instance V, R ⊆ V × V × V, S ⊆ V, and T ⊆ V of PSA, we add one new element a to the set, add one triple (a, a, v) to R for each element v ∈ S, and make the new set of initially accessible elements consist of the single element a. Thus we can state that PSA remains P-complete even when restricted to a single starting element and a single target element.

Rather than set up a class of P-complete problems, we could also attempt to separate P from PolyL by setting up a class of problems logspace-complete for PolyL. However, such problems do not exist for PolyL! This rather shocking result, coming as it does after a string of complete problems for various classes, is very easy to verify. Any problem complete for PolyL would have to belong to $L^k$ for some k. Since the problem is complete, all problems in PolyL reduce to it in logarithmic space. It then follows from Exercise 6.11 that $L^k$ equals PolyL, which contradicts our result that $L^k$ is a proper subset of $L^{k+1}$. An immediate consequence of this result is that

PolyL is distinct from P and NP: both P and NP contain logspace-complete problems, while PolyL does not.

Exercise 6.12 We can view P as the infinite union $P = \bigcup_{k \in \mathbb{N}} \mathrm{TIME}(n^k)$. Why then does the argument that we used for PolyL not apply to P? We can make a similar remark about PSPACE and ask the same question.

6.3.3 Provably Intractable Problems

As the preceding sections evidence abundantly, very few problems have been proved intractable. In good part, this is due to the fact that all proofs of intractability to date rely on the following simple result.

Theorem 6.9 If a complexity class $\mathcal{C}$ contains intractable problems and problem A is $\mathcal{C}$-hard, then A is intractable.

Exercise 6.13 Prove this result.

What complexity classes contain intractable problems? We have argued, without proof, that Exp and EXPSPACE are two such classes. If such is indeed the case, then a problem can be proved intractable by proving that it is hard for Exp or for EXPSPACE.

Exercise 6.14 Use the hierarchy theorems to prove that both Exp and EXPSPACE contain intractable problems.

The specific result obtained (i.e., the exact "flavor" of intractability that is proved) is that, given any algorithm solving the problem, there are infinitely many instances on which the running time of the algorithm is bounded below by an exponential function of the instance size.

The trouble is that, under our usual assumptions regarding the time and space complexity hierarchies, this style of proof cannot lead to a proof of intractability of problems in NP or in PSPACE. For instance, proving that a problem in NP is hard for Exp would imply NP = PSPACE = Exp, which would be a big surprise. Even more strongly, a problem in PSPACE cannot be hard for EXPSPACE, since this would imply PSPACE = EXPSPACE, thereby contradicting Theorem 6.2.

Proving hardness for Exp or EXPSPACE is done in the usual way, using polynomial-time transformations. However, exponential time or space allow such a variety of problems to be solved that there is no all-purpose, basic complete problem for either class. Many of the published intractability proofs use a generic transformation rather than a reduction from a known hard problem; many of these generic transformations, while not particularly

difficult, are fairly intricate. In consequence, we do not present any proof but content ourselves with a few observations.

The first "natural" problems (as opposed to the artificial problems that the proofs of the hierarchy theorems construct by diagonalization) to be proved intractable came from formal language theory; these were quickly followed by problems in logic and algebra. Not all of these intractable problems belong to Exp or even to EXPSPACE; in fact, a famous problem due to Meyer (decidability of a logic theory called "weak monadic second-order theory of successor") is so complex that it is not even elementary, that is, it cannot be solved by any algorithm running in time bounded by

$$2^{2^{\cdot^{\cdot^{\cdot^{2^{n}}}}}}$$

for any fixed stack of 2s in the exponent! (Which goes to show that even intractability is a relative concept!)

Many two-person games have been proved intractable (usually Exp-complete), including generalizations to arbitrary large boards of familiar games such as Chess, Checkers, and Go (without rules to prevent cycling, since such rules, as mentioned earlier, cause these problems to fall within PSPACE), as well as somewhat ad hoc games such as Peek, described in the previous section. Stockmeyer and Chandra proposed Peek as a basic Exp-complete game, which could be reduced to a number of other games without too much difficulty. Peek is nothing but a disguised version of a game on Boolean formulae: the players take turns modifying truth assignments according to certain rules until one succeeds in producing a satisfying truth assignment. Such "satisfiability" games are the natural extension to games of our two basic complete problems, SAT and QSAT. Proving that such games are Exp-complete remains a daunting task, however, since each proof is still pretty much ad hoc; we shall not present any. The complexity of Peek derives mostly from the fact that it permits exponentially long games. Indeed, if a polynomial-time bound is placed on all plays (declaring all cut-off games to be draws), then the decision problem becomes PSPACE-complete: not much of a gain from a practical perspective, of course. (This apparent need for more than polynomial space is characteristic of all provably intractable problems; were it otherwise, the problems would be in PSPACE and thus not provably intractable, at least, not at this time.)

To a large extent, we can view most Exp-complete and NExp-complete problems as P-complete or NP-complete problems, the instances of which are specified in an exceptionally concise manner; that is, in a sense, Exp is "succinct" P and NExp is succinct NP. A simple example is the question of inequivalence of regular expressions: given regular expressions $E_1$ and

$E_2$, is it true that they denote different regular languages? This problem is in NP if the regular expressions cannot use Kleene closure (the so-called star-free regular expressions).

Proposition 6.1 Star-Free Regular Expression Inequivalence is in NP.

Proof. The checker is given a guess of a string that is denoted by the first expression but not by the second. It constructs in linear time an expression tree for each regular expression (internal nodes denote unions or concatenations, while leaves are labeled with ∅, ε, or alphabet symbols as needed). It then traverses each tree in postorder and records which prefixes of the guessed string (if any) can be represented by each subtree. When done, it has either found a way to represent the entire string or verified that such cannot be done. If the guessed string has length n, it has n + 1 prefixes (counting itself), so that the time needed is $O(n \cdot \max\{|E_1|, |E_2|\})$, where $|E_j|$ denotes the length of expression $E_j$. Now note that a regular expression that does not use Kleene closure cannot denote strings longer than itself. Indeed, the basic expressions all denote strings of length 1 or less, union does not increase the length of strings, and concatenation only sums lengths at the price of an extra symbol. Thus n cannot exceed $\max\{|E_1|, |E_2|\}$ and the verification takes polynomial time. Q.E.D.

In fact, as we shall see in Section 7.1, the problem is NP-complete. Now, however, consider the same problem when Kleene closure is allowed. The closure allows us to denote arbitrarily long strings with one expression; thus it is now possible that the shortest string that is denoted by the first expression but not by the second has superpolynomial length. If such is the case, our checking mechanism will take superpolynomial time; indeed, the problem is then PSPACE-complete, even if one of the two expressions is simply Σ*. In fact, if we allow both Kleene closure and intersection, then the inequivalence problem is complete (with respect to polynomial-time reductions) for Exp, and if we allow both Kleene closure and exponential notation (that is, we allow ourselves to write $E^k$ for $E \cdot E \cdots E$ with k terms), then it is complete for EXPSPACE.

6.4 Exercises

Exercise 6.15 Verify that constant, polynomial, and exponential functions are all time- and space-constructible.

Exercise 6.16 Prove Lemma 6.1. Pad the input so that every set recognizable in $g_1(f(n))$ space becomes, in its padded version, recognizable in $g_1(n)$

16. the hierarchy theorem for space. there exists some x in the domain of f with f (x) = y and lxl -.) Exercise 6.17* Use the translational lemma. This result has no bearing on whether P is a proper subset of PSPACE.37) in O(log 2 n) space. then construct a machine to recognize the original set in g 2 (f (n)) space by simulating the machine that recognizes the padded version in g2 (n) space. and Savitch's theorem to build as detailed a hierarchy as possible for nondeterministic space. thus use the same technique as in proving the translational lemma-see Exercise 6. for every value y in the range of f. (Hint: this statement can be viewed as a special case of a translational lemma for time. for all x in the domain of f.20* Prove that P = NP implies Exp = NExp. polynomially computable function.19* A function f is honest if and only if. halts after at most p(lxl) steps and returns f (x).p(IyI) for some fixed polynomial po. (Hint: trade time for space by resorting to recomputing values rather than storing them. Exercise 6. as long as that same output is produced "honestly" for at least one input. (DSPACE(n) is the class of sets recognizable in linear space. What would be the consequences of such a result? Exercise 6. (Hint: you will need to use dovetailing in one proof.) Exercise 6. since DSPACE(n) is itself a proper subset of PSPACE. A function f is polynomially computable if and only if there exists a deterministic Turing machine M and a polynomial p() such that. M.)
.23 Devise a deterministic algorithm to solve the DigraphReachability problem (see Exercise 7. thereby proving that Satisfiability is NP-complete with respect to logspace reductions. Also note that an honest function is allowed to produce arbitrarily small output on some inputs.) Most NP-complete problems can be shown to be NP-complete under logspace reductions.220
Complexity Theory: Foundations space and thus also in g2 (n) space. a machine-independent class.22 (Refer to the previous exercise. started with x on its tape. Exercise 6.21 Verify that the reduction used in the proof of Cook's theorem can be altered (if needed) so as to use only logarithmic space.18* Use the hierarchy theorems and what you know of the space and time hierarchies to prove P :A DSPACE(n). Prove that a set is in NP if and only if it is the range of an honest.) Exercise 6. Suppose that someone proved that some NP-complete problem cannot be logspacecomplete for NP. Exercise 6.

(Contrast with the result of Exercise 7.51.)

Exercise 6.24* A function, $f\colon \mathbb{N} \to \mathbb{N}$, is said to be subexponential if, for any positive constant $\varepsilon$, $f$ is $O(2^{n^{\varepsilon}})$ but $2^{n^{\varepsilon}}$ is not $O(f)$. Define the complexity class SUBEXP by

$$\mathrm{SUBEXP} = \bigcup \{\mathrm{TIME}(f) \mid f \text{ is subexponential}\}$$

(This definition is applicable both to deterministic and nondeterministic time.) Investigate the class SUBEXP: Can you separate it from P, NP, or Exp? How does it relate to POLYL and PSPACE? What would the properties of SUBEXP-complete problems be? What if a problem complete for some other class of interest is shown to belong to SUBEXP?

Exercise 6.25 Let $\mathcal{C}$ denote a complexity class and $M$ a Turing machine within the class (i.e., a machine running within the resource bounds of the class). Prove that, under the obvious reductions, the set $\{(M, x) \mid M \in \mathcal{C} \text{ and } M \text{ accepts } x\}$ (a bounded version of the halting problem) is complete for the class $\mathcal{C}$, where $\mathcal{C}$ can be any of NL, P, NP, or PSPACE. We saw that the halting set is complete for the recursive sets; it is intuitively satisfying to observe that the appropriately bounded version of the same problem is also complete for the corresponding subset of the recursive sets. Classes with this property are called syntactic classes of complexity.

Exercise 6.26 The following are flawed proofs purporting to settle the issue of P versus NP. Point out the flaw in each proof.

First Proof. We prove P = NP nonconstructively by showing that, for each polynomial-time nondeterministic Turing machine, there must exist an equivalent polynomial-time deterministic Turing machine. By definition, the nondeterministic machine applies a choice function at each step in order to determine which of several possible next moves it will make. If there is a suitable move, it will choose it; otherwise the choice is irrelevant. In the former case, although we do not know which is the correct next move, this move does exist; thus there exists a deterministic machine that correctly simulates the nondeterministic machine at this step. In the latter case, the deterministic machine can simulate the nondeterministic machine by using any arbitrary move from the nondeterministic machine's choice of moves. By merging these steps, we get a deterministic machine that correctly simulates the nondeterministic machine, although we do not know how to construct it.

Second Proof. We prove P $\neq$ NP by showing that any two NP-complete problems are isomorphic. (The next paragraph is perfectly correct; it gives some necessary background.)

It is easy to see that, if all NP-complete problems are isomorphic, then we must have P $\neq$ NP. (Recall that two problems are isomorphic if there exists a bijective polynomial-time transformation from one problem to the other.) Such is the case because P contains finite sets, which cannot be isomorphic to the infinite sets which make up NP-complete problems. Yet, if P were equal to NP, then all problems in NP would be NP-complete (because all problems in P are trivially P-complete under polynomial-time reductions) and hence isomorphic, a contradiction.

We shall appeal to a standard result from algebra, known as the Schroeder-Bernstein theorem: given two infinite sets $A$ and $B$ with injective (one-to-one) functions $f\colon A \to B$ and $g\colon B \to A$, there exists a bijection (one-to-one correspondence) between $A$ and $B$. From the Schroeder-Bernstein theorem, we need only demonstrate that, given any two NP-complete problems, there exists a one-to-one (as opposed to many-one) mapping from one to the other. We know that there exists a many-one mapping from one to the other, by definition of completeness; we need only make this mapping one-to-one. This we do simply by padding: as we enumerate the instances of the first problem, we transform them according to our reduction scheme but follow the binary string describing the transformed instance by a separator and by a binary string describing the "instance number", i.e., the sequence number of the original instance in the enumeration. This padding ensures that no two instances of the first problem get mapped to the same instance of the second problem and thus yields the desired injective map.

Third Proof. We prove P $\neq$ NP by contradiction. Assume P = NP. Then Satisfiability is in P and thus, for some $k$, is in $\mathrm{TIME}(n^k)$. But every problem in NP reduces to Satisfiability in polynomial time, so that every problem in NP is also in $\mathrm{TIME}(n^k)$. Therefore NP is a subset of $\mathrm{TIME}(n^k)$ and hence so is P. But the hierarchy theorem for time tells us that there exists a problem in $\mathrm{TIME}(n^{k+1})$ (and thus also in P) that is not in $\mathrm{TIME}(n^k)$, a contradiction.

Exercise 6.27 We do not know whether, for decision problems within NP, Turing reductions are more powerful than many-one reductions, although such is the case for decision problems in some larger classes. Verify that a proof that Turing reductions are more powerful than many-one reductions within NP implies that P is a proper subset of NP.

Exercise 6.28 Define a truth-table reduction to be a Turing reduction in which (i) the oracle is limited to answering "yes" or "no" and (ii) every call to the oracle is completely specified before the first call is made (so that the calls do not depend on the result of previous calls). A conjunctive polynomial-time reduction between decision problems is a truth-table reduction that runs in polynomial time and that produces a "yes" instance exactly when the oracle has answered "yes" to every query. Prove that NP is closed under conjunctive polynomial-time reductions.

6.5 Bibliography

The Turing award lecture of Stephen Cook [1983] provides an excellent overview of the development and substance of complexity theory. The texts of Machtey and Young [1978] and Hopcroft and Ullman [1979] cover the fundamentals of computability theory as well as of abstract complexity theory. Papadimitriou and Steiglitz [1982] and Sommerhalder and van Westrhenen [1988] each devote several chapters to models of computation, complexity measures, and NP-completeness, while Hopcroft and Ullman [1979] provide a more detailed treatment of the theoretical foundations. Garey and Johnson [1979] wrote the classic text on NP-completeness and related subjects; in addition to a lucid presentation of the topics, their text contains a categorized and annotated list of over 300 known NP-hard problems. New developments are covered regularly by D. S. Johnson in "The NP-Completeness Column: An Ongoing Guide," which appears in the Journal of Algorithms and is written in the same style as the text of Garey and Johnson. The more recent text of Papadimitriou [1994] offers a modern and somewhat more advanced perspective on the field; it is the ideal text to pursue a study of complexity theory beyond the coverage offered here. In a more theoretical flavor, the conference notes of Hartmanis [1978] provide a good introduction to some of the issues surrounding the question of P vs. NP. Stockmeyer [1987] gives a thorough survey of computational complexity, while Shmoys and Tardos [1995] present a more recent survey from the perspective of discrete mathematics. The monograph of Wagner and Wechsung [1986] provides, in a very terse manner, a wealth of results in computability and complexity theory. The two-volume monograph of Balcázar, Díaz, and Gabarró [1988, 1990] offers a self-contained and comprehensive discussion of the more theoretical aspects of complexity theory. Johnson [1990] discusses the current state of knowledge regarding all of the complexity classes defined here and many, many more, while Seiferas [1990] presents a review of machine-independent complexity theory.

Time and space as complexity measures were established early; the aforementioned references all discuss such measures and how the choice of a model affects them. The notion of abstract complexity measures is due to Blum [1967]. The concept of reducibility has long been established in computability theory; Garey and Johnson mention early uses of reductions in the context of algorithms. The seminal article in complexity theory is that of Hartmanis and Stearns [1965], in which Theorem 6.2 appears. Theorem 6.1 was proved by Hartmanis, Lewis, and Stearns [1965], Theorem 6.3 is due to Hartmanis [1968], and Lemma 6.3 was proved by Ruby and Fischer [1965]. The analogs of Theorems 6.2 and 6.3 for nondeterministic machines were proved by Cook [1973] and Seiferas et al. [1973], then further refined, for which see Seiferas [1977] and Seiferas et al. [1978]. The fundamental result about nondeterministic space, Theorem 6.5, appears in Savitch [1970]. The proof that NL is a subset of P is due to Cook [1974].

Cook's theorem (and the first definition of the class NP) appears in Cook [1971a]; soon thereafter, Karp [1972] published the paper that really put NP-completeness in perspective, with a list of over 20 important NP-complete problems. The idea of NP and of NP-complete problems had been "in the air": Edmonds [1963] and Cobham [1965] had proposed very similar concepts, while Levin [1973] independently derived Cook's result. We follow Garey and Johnson's lead in our presentation of Cook's theorem, although our definition of NP owes more to Papadimitriou and Steiglitz. (The k-th Heaviest Subset problem used as an example for Turing reductions is adapted from their text.) The first P-complete problem, PSA, was given by Cook [1970]; Jones and Laaser [1976] present a large number of P-complete problems, while Jones [1975] and Jones et al. [1976] do the same for NL-complete problems. An exhaustive reference on the subject of P-complete problems is the text of Greenlaw et al. [1995]. Stockmeyer and Meyer [1973] prove that QBF is PSPACE-complete; in the same paper, they also provide Exp-complete problems. Stockmeyer and Chandra [1979] investigate two-person games and provide a family of basic Exp-complete games, including the game of Peek. Intractable problems in formal languages, logic, and algebra are discussed in the texts of Hopcroft and Ullman and of Garey and Johnson. Viewing problems complete for higher complexity classes as succinct versions of problems at lower levels of the hierarchy was proposed by, among others, Galperin and Wigderson [1983] and studied in detail by Balcázar et al. [1992]. The proof of intractability of the weak monadic second order theory of successor is due to Meyer [1975]. That POLYL cannot have complete problems was first observed by Book [1976].

CHAPTER 7
Proving Problems Hard

In this chapter, we address the question of how to prove problems hard. As the reader should expect by now, too many of the problems we face in computing are in fact hard, when they are solvable at all. While proving a problem to be hard will not make it disappear, it will prevent us from wasting time in searching for an exact algorithm. Moreover, the same techniques can then be applied again to investigate the hardness of approximation, a task we take up in the next chapter.

We begin by completeness proofs under many-one reductions. While such tight reductions are not necessary (the more general Turing reductions would suffice to prove hardness), they provide the most information and are rarely harder to derive than Turing reductions. In Section 7.1, we present a dozen detailed proofs of NP-completeness. Such proofs are the most useful for the reader: optimization problems appear in most application settings, from planning truck routes for a chain of stores or reducing bottlenecks in a local network to controlling a robot or placing explosives and seismographs to maximize the information gathered for mineral exploration. We continue in Section 7.2 with four proofs of P-completeness. These results apply only to the setting of massively parallel computing (spectacular speed-ups are very unlikely for P-hard problems), but parallelism is growing more commonplace and the reductions themselves are of independent interest, due to the stronger resource restrictions. In Section 7.3, we show how completeness results translate to hardness and easiness results for the optimization and search versions of the same problems, touch upon the use of Turing reductions in place of many-one reductions, and explore briefly the consequences of the collapse that a proof of P = NP would cause.

7.1 Some Important NP-Complete Problems

We have repeatedly stated that the importance of the class NP is due to the large number of common problems that belong to it and, in particular, to the large number of NP-complete problems. In this section, we give a sampling of such problems, both as an initial catalog and as examples of specific reductions. The very reason for which Satisfiability was chosen as the target of our generic transformation, its great flexibility, proves somewhat of a liability in specific reductions, which work in the other direction. Indeed, the more restricted a problem (the more rigid its structure and that of its solutions), the easier it is to reduce it to another problem: we do not have to worry about the effect of the chosen transformation on complex, unforeseen structures. Thus we start by proving that a severely restricted form of the problem, known as Three-Satisfiability (3SAT for short), is also NP-complete. In reading the proof, recall that a proof of NP-completeness by specific reduction is composed of two parts: (i) a proof of membership in NP and (ii) a polynomial-time transformation from the known NP-complete problem to the problem at hand, such that the transformed instance admits a solution if and only if the original instance does.

Theorem 7.1 3SAT is NP-complete. (3SAT has the same description as the original satisfiability problem, except that each clause is restricted to exactly three literals, no two of which are derived from the same variable.)

Proof. That 3SAT is in NP is an immediate consequence of the membership of SAT in NP, since the additional check on the form of the input is easily carried out in polynomial time. Thus we need only exhibit a polynomial-time transformation from SAT to 3SAT in order to complete our proof.

We set up one or more clauses in the 3SAT problem for each clause in the SAT problem; similarly, for each variable in the SAT problem, we set up a corresponding variable in the 3SAT problem. For convenience and as a reminder of the correspondence, we name these variables of 3SAT by the same names used in the SAT instance; thus we use $x$, $y$, $z$, and so forth, in both the original and the transformed instances. The reader should keep in mind, however, that the variables of 3SAT, in spite of their names, are completely different from the variables of SAT. We intend a correspondence but cannot assume it; we have to verify that such a correspondence indeed exists.

Consider an arbitrary clause in SAT: three cases can arise. In the first case, the clause has exactly three literals, derived from three different variables, in which case an identical clause is used in the 3SAT problem.

In the second case, the clause has two or fewer literals. Such a clause can be "padded" by introducing new, redundant variables and transforming the one clause into two or four new clauses as follows. If clause $c$ has only one literal, say $c = \{\tilde{x}\}$ (where the symbol $\tilde{\ }$ indicates that the variable may or may not be complemented), we introduce two new variables, call them $z_{c1}$ and $z_{c2}$, and transform $c$ into four clauses:

$$\{\tilde{x}, z_{c1}, z_{c2}\} \quad \{\tilde{x}, \bar{z}_{c1}, z_{c2}\} \quad \{\tilde{x}, z_{c1}, \bar{z}_{c2}\} \quad \{\tilde{x}, \bar{z}_{c1}, \bar{z}_{c2}\}$$

(If the clause has only two literals, we introduce only one new variable and transform the one clause into two clauses in the obvious manner.) The new variables are clearly redundant, that is, their truth value in no way affects the satisfiability of the new clauses.

In the third case, the clause has more than three literals. Such a clause must be partitioned into a number of clauses of three literals each. The idea here is to place each literal in a separate clause and "link" all clauses by means of additional variables. (The first two and the last two literals are kept together to form the first and the last links in the chain.) Let clause $c$ have $k$ literals, say $\{\tilde{x}_1, \ldots, \tilde{x}_k\}$. Then we introduce $(k - 3)$ new variables, $z_{c1}, \ldots, z_{c(k-3)}$, and transform $c$ into $(k - 2)$ clauses:

$$\{\tilde{x}_1, \tilde{x}_2, z_{c1}\} \quad \{\bar{z}_{c1}, \tilde{x}_3, z_{c2}\} \quad \{\bar{z}_{c2}, \tilde{x}_4, z_{c3}\} \quad \cdots \quad \{\bar{z}_{c(k-3)}, \tilde{x}_{k-1}, \tilde{x}_k\}$$

This collection of new clauses is equivalent to the single original clause. To see this, first note that a satisfying truth assignment for the original clause (i.e., one that sets at least one of the literals $\tilde{x}_i$ to true) can be extended to a satisfying truth assignment for the collection of transformed clauses by a suitable choice of truth values for the extra variables. Specifically, assume that $\tilde{x}_i$ was set to true; then we set all $z_{cj}$, $j \ge i - 1$, to false and all $z_{cl}$, $l < i - 1$, to true. Then each of the transformed clauses has a true $z$ literal in it, except the clause that has the literal $\tilde{x}_i$; thus all transformed clauses are satisfied. Conversely, assume that all literals in the original clause are set to false; then the collection of transformed clauses reduces to

$$\{z_{c1}\} \quad \{\bar{z}_{c1}, z_{c2}\} \quad \{\bar{z}_{c2}, z_{c3}\} \quad \cdots \quad \{\bar{z}_{c(k-4)}, z_{c(k-3)}\} \quad \{\bar{z}_{c(k-3)}\}$$

But this collection is a falsehood, as no truth assignment to the $z$ variables can satisfy it. (Each two-literal clause is an implication; together these implications form a chain that resolves to the single implication $z_{c1} \Rightarrow z_{c(k-3)}$. Combining this implication with the first clause by using modus ponens, we are left with two clauses, $\{z_{c(k-3)}\}$ and $\{\bar{z}_{c(k-3)}\}$, a contradiction.)
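The three cases of the transformation can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the integer encoding of literals (`v` for a variable, `-v` for its complement), the function names `to_3sat` and `satisfiable`, and the numbering of fresh variables are hypothetical choices, not part of the text. The sketch also removes duplicate literals and drops clauses containing a variable together with its complement, and it assumes that no clause is empty. The brute-force `satisfiable` is included only to check small examples.

```python
from itertools import product

def to_3sat(clauses, num_vars):
    """Transform a CNF formula into an equisatisfiable 3SAT instance.

    Literals are nonzero integers: v is variable v, -v its complement.
    Returns (new_clauses, new_num_vars).
    """
    out = []
    next_var = num_vars  # fresh variables are numbered num_vars + 1, ...

    def fresh():
        nonlocal next_var
        next_var += 1
        return next_var

    for clause in clauses:
        lits = list(dict.fromkeys(clause))      # remove duplicate literals
        if any(-l in lits for l in lits):       # x and not-x together: always true, drop
            continue
        k = len(lits)
        if k == 0:
            raise ValueError("empty clause (assumed not to occur)")
        if k == 3:                              # first case: copy the clause
            out.append(lits)
        elif k == 1:                            # second case: two new variables, four clauses
            x, z1, z2 = lits[0], fresh(), fresh()
            out += [[x, z1, z2], [x, -z1, z2], [x, z1, -z2], [x, -z1, -z2]]
        elif k == 2:                            # second case: one new variable, two clauses
            z = fresh()
            out += [lits + [z], lits + [-z]]
        else:                                   # third case: chain of k - 2 clauses
            z = [fresh() for _ in range(k - 3)]
            out.append([lits[0], lits[1], z[0]])
            for j in range(1, k - 3):
                out.append([-z[j - 1], lits[j + 1], z[j]])
            out.append([-z[-1], lits[-2], lits[-1]])
    return out, next_var

def satisfiable(clauses, num_vars):
    """Exhaustive check, exponential in num_vars; for small examples only."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

For instance, calling `to_3sat([[1], [-1, 2], [1, 2, 3, -4, 5]], 5)` exercises the one-literal, two-literal, and chained cases in turn, and `satisfiable` can then be applied to both the original and the transformed instance to compare their answers.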

Hence the instance of the 3SAT problem resulting from our transformation admits a solution if and only if the original instance of the SAT problem did. It remains only to verify that the transformation can be done in polynomial time. In fact, it can be done in linear time, scanning clauses one by one, removing duplicate literals, and either dropping, copying, or transforming each clause as appropriate (transforming a clause produces a collection of clauses, the total size of which is bounded by a constant multiple of the size of the original clause). Q.E.D.

In light of this proof, we can refine our description of a typical proof of NP-completeness, obtaining the structure described in Table 7.1.

Table 7.1 The structure of a proof of NP-completeness.

* Part 1: Prove that the problem at hand is in NP.
* Part 2: Reduce a known NP-complete problem to the problem at hand.
  - Define the reduction: how is a typical instance of the known NP-complete problem mapped to an instance of the problem at hand?
  - Prove that the reduction maps a "yes" instance of the NP-complete problem to a "yes" instance of the problem at hand.
  - Prove that the reduction maps a "no" instance of the NP-complete problem to a "no" instance of the problem at hand. This part is normally done through the contrapositive: given a transformed instance that is a "yes" instance of the problem at hand, prove that it had to be mapped from a "yes" instance of the NP-complete problem.
  - Verify that the reduction can be carried out in polynomial time.

3SAT is quite possibly the most important NP-complete problem from a theoretical standpoint, since far more published reductions start from 3SAT than from any other NP-complete problem. However, we are not quite satisfied yet: 3SAT is ill-suited for reduction to problems presenting either inherent symmetry (there is no symmetry in 3SAT) or an extremely rigid solution structure (3SAT admits solutions that satisfy one, two, or three literals per clause, which is quite a lot of variability). Hence we proceed by proving that two even more restricted versions of Satisfiability are also NP-complete:

* One-in-Three-3SAT (1in3SAT) has the same description as 3SAT, except that a satisfying truth assignment must set exactly one literal to true in each clause.
* Not-All-Equal 3SAT (NAE3SAT) has the same description as 3SAT, except that a satisfying truth assignment may not set all three literals of any clause to true. This constraint results in a symmetric problem: the complement of a satisfying truth assignment is a satisfying truth assignment.

A further restriction of these two problems leads to their positive versions: a positive instance of a problem in the Satisfiability family is one in which no variable appears in complemented form. While Positive 3SAT is a trivial problem to solve, we shall prove that both Positive 1in3SAT and Positive NAE3SAT are NP-complete. With these problems as our departure points, we shall then prove that the following eight problems, each selected for its importance as a problem or for a particular feature demonstrated in the reduction, are NP-complete:

* Maximum Cut (MxC): Given an undirected graph and a positive integer bound, can the vertices be partitioned into two subsets such that the number of edges with endpoints in both subsets is no smaller than the given bound?
* Graph Three-Colorability (G3C): Given an undirected graph, can it be colored with three colors?
* Partition: Given a set of elements, each with a positive integer size, and such that the sum of all element sizes is an even number, can the set be partitioned into two subsets such that the sum of the sizes of the elements of one subset is equal to that of the other subset?
* Hamiltonian Circuit (HC): Given an undirected graph, does it have a Hamiltonian circuit?
* Exact Cover by Three-Sets (X3C): Given a set with $3n$ elements for some natural number $n$ and a collection of subsets of the set, each of which contains exactly three elements, do there exist in the collection $n$ subsets that together cover the set?
* Vertex Cover (VC): Given an undirected graph of $n$ vertices and a positive integer bound $k$, is there a subset of at most $k$ vertices that covers all edges (i.e., such that each edge has at least one endpoint in the chosen subset)?
* Star-Free Regular Expression Inequivalence (SF-REI): Given two regular expressions, neither one of which uses Kleene closure, is it the case that they denote different languages? (Put differently, does there exist a string $x$ that belongs to the language denoted by one expression but not to the language denoted by the other?)
* Art Gallery (AG): Given a simple polygon, $P$, of $n$ vertices and a positive integer bound, $B \le n$, can at most $B$ "guards" be placed at vertices of the polygon in such a way that every point in the interior of the polygon is visible to at least one guard? (A simple polygon is one with a well-defined interior; a point is visible to a guard if and only if the line segment joining the two does not intersect any edge of the polygon.)

Notice that MxC, G3C, and Partition are symmetric in nature (the three colors or the two subsets can be relabeled at will), while HC and X3C are quite rigid (the number of edges or subsets in the solution is fixed, and so is their relationship), and SF-REI, VC, and AG are neither (in the case of VC, for instance, the covering subset is distinguished from its complement, hence no symmetry, and each edge can be covered by one or two vertices, hence no rigidity). These observations suggest reducing NAE3SAT to the first three problems, 1in3SAT to the next two, and 3SAT to the last three. In fact, with the exception of Partition, we proceed exactly in this fashion. Our scheme of reductions is illustrated in Figure 7.1.

Figure 7.1 The scheme of reductions for our basic proofs of NP-completeness. (Diagram nodes: 3SAT, NAE3SAT, 1in3SAT, Positive 1in3SAT, SF-REI, VC, AG, MxC, G3C, HC, X3C, Partition.)

Theorem 7.2 One-in-Three 3SAT is NP-complete.

Proof. Since 3SAT is in NP, so is 1in3SAT: we need to make only one additional check per clause, to verify that exactly one literal is set to true. Our transformation takes a 3SAT instance and adds variables to give more flexibility in interpreting assignments. Specifically, for each original clause, say $\{\tilde{x}_{i1}, \tilde{x}_{i2}, \tilde{x}_{i3}\}$, we introduce four new variables, $a_i$, $b_i$, $c_i$, and $d_i$, and produce three clauses, in which the first and last original literals appear complemented:

$$\{\overline{\tilde{x}_{i1}}, a_i, b_i\} \quad \{b_i, \tilde{x}_{i2}, c_i\} \quad \{c_i, d_i, \overline{\tilde{x}_{i3}}\}$$

Hence our transformation takes an instance with $n$ variables and $k$ clauses and produces an instance with $n + 4k$ variables and $3k$ clauses; it is easily carried out in polynomial time.
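The construction just described is short enough to sketch in Python. As before, this is an illustration under assumptions rather than a definitive implementation: literals are encoded as nonzero integers (`v` for a variable, `-v` for its complement), the new variables $a_i, b_i, c_i, d_i$ are simply given the next four unused numbers, and the names `to_1in3sat`, `value`, `sat3`, and `sat_1in3` are hypothetical. The two exhaustive checkers are exponential and serve only to experiment with small instances.

```python
from itertools import product

def to_1in3sat(clauses, num_vars):
    """Map each 3SAT clause {x1, x2, x3} to the three clauses
    {not x1, a, b}, {b, x2, c}, {c, d, not x3} with fresh a, b, c, d."""
    out = []
    n = num_vars
    for x1, x2, x3 in clauses:
        a, b, c, d = n + 1, n + 2, n + 3, n + 4   # four new variables per clause
        n += 4
        out += [[-x1, a, b], [b, x2, c], [c, d, -x3]]
    return out, n

def value(lit, bits):
    """Truth value of a literal under an assignment given as a tuple of booleans."""
    return bits[abs(lit) - 1] == (lit > 0)

def sat3(clauses, num_vars):
    """Ordinary satisfiability: at least one true literal per clause."""
    return any(all(any(value(l, bits) for l in c) for c in clauses)
               for bits in product([False, True], repeat=num_vars))

def sat_1in3(clauses, num_vars):
    """One-in-three satisfiability: exactly one true literal per clause."""
    return any(all(sum(value(l, bits) for l in c) == 1 for c in clauses)
               for bits in product([False, True], repeat=num_vars))
```

A quick experiment is to transform a single clause such as `[[1, 2, 3]]` and enumerate, for each of the eight assignments to the three original variables, whether some choice of the four new variables makes exactly one literal true in each of the three new clauses.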

We claim that the transformed instance admits a solution if and only if the original instance does. Assume that the transformed instance admits a solution, in which exactly one literal per clause is set to true. Observe that such a solution cannot exist if all three original literals are set to false. (The first and last literal are complemented and hence true in the transformed instance; this sets all four additional variables to false, so that the middle clause is not satisfied.) Hence at least one of the original literals must be true and thus any solution to the transformed instance corresponds to a solution to the original instance. Conversely, assume that the original instance admits a solution. If the middle literal is set to true in such a solution, then we set $b_i$ and $c_i$ to false in the transformed instance and let $a_i = \tilde{x}_{i1}$ and $d_i = \tilde{x}_{i3}$. If the middle literal is false but the other two are true, we may set $a_i$ and $c_i$ to true and $b_i$ and $d_i$ to false. Finally, if only the first literal is true (the case for the last literal is symmetric), we set $b_i$ to true and $a_i$, $c_i$, and $d_i$ to false. In all cases, a solution for the original instance implies the existence of a solution for the transformed instance. Q.E.D.

Theorem 7.3 Not-All-Equal 3SAT is NP-complete.

Proof. Since 3SAT is in NP, so is NAE3SAT: we need to make only one additional check per clause, to verify that at least one literal was set to false.

Since the complement of a satisfying truth assignment is also a satisfying truth assignment, we cannot distinguish true from false for each variable; in effect, a solution to NAE3SAT is not so much a truth assignment as it is a partition of the variables. Yet we must make a distinction between true and false in the original problem, 3SAT; this requirement leads us to encode truth values. Specifically, for each variable, $x$, in 3SAT, we set up two variables in NAE3SAT, say $x'$ and $x''$; assigning a value of true to $x$ will correspond to assigning different truth values to the two variables $x'$ and $x''$. Now we just write a Boolean formula that describes, in terms of the new variables, the conditions under which each original clause is satisfied. For example, the clause $\{x, \bar{y}, z\}$ gives rise to the formula

$$(x' \wedge \bar{x}'') \vee (\bar{x}' \wedge x'') \vee (y' \wedge y'') \vee (\bar{y}' \wedge \bar{y}'') \vee (z' \wedge \bar{z}'') \vee (\bar{z}' \wedge z'')$$

Since we need a formula in conjunctive form, we use distributivity and expand the disjunctive form given above; in doing so, a number of terms cancel out (because they include the disjunction of a variable and its complement) and we are left with the eight clauses

$$\{x', x'', y', \bar{y}'', z', z''\} \quad \{x', x'', y', \bar{y}'', \bar{z}', \bar{z}''\} \quad \{x', x'', \bar{y}', y'', z', z''\} \quad \{x', x'', \bar{y}', y'', \bar{z}', \bar{z}''\}$$
$$\{\bar{x}', \bar{x}'', y', \bar{y}'', z', z''\} \quad \{\bar{x}', \bar{x}'', y', \bar{y}'', \bar{z}', \bar{z}''\} \quad \{\bar{x}', \bar{x}'', \bar{y}', y'', z', z''\} \quad \{\bar{x}', \bar{x}'', \bar{y}', y'', \bar{z}', \bar{z}''\}$$
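The expansion from disjunctive to conjunctive form is easy to get wrong by hand, so it may help to check it exhaustively over the 64 assignments to the six new variables. The following Python sketch does so for the example clause $\{x, \bar{y}, z\}$; the function names and the argument order (`x1`, `x2` standing for $x'$, $x''$, and so on) are illustrative assumptions only.

```python
from itertools import product

def dnf(x1, x2, y1, y2, z1, z2):
    """Disjunctive form for the clause {x, not y, z}: x is true when x1 != x2,
    the literal "not y" is true when y1 == y2, and z is true when z1 != z2."""
    return (x1 != x2) or (y1 == y2) or (z1 != z2)

def eight_clauses(x1, x2, y1, y2, z1, z2):
    """Conjunction of the eight six-literal clauses obtained by distributivity."""
    x_parts = [x1 or x2, (not x1) or (not x2)]    # both hold exactly when x1 != x2
    y_parts = [y1 or (not y2), (not y1) or y2]    # both hold exactly when y1 == y2
    z_parts = [z1 or z2, (not z1) or (not z2)]    # both hold exactly when z1 != z2
    # One clause per choice of one part for each of x, y, z: 2 * 2 * 2 = 8 clauses.
    return all(a or b or c for a, b, c in product(x_parts, y_parts, z_parts))

# The two forms agree on every assignment to the six variables.
agree = all(dnf(*bits) == eight_clauses(*bits)
            for bits in product([False, True], repeat=6))
```

The same loop can be used to confirm the symmetry that motivates the encoding: complementing all six variables at once leaves the value of the formula unchanged.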

Proving Problems Hard

It only remains to transform these six-literal clauses into three-literal clauses, using the same mechanism as in Theorem 7.1. The eight clauses of six literals become transformed into thirty-two clauses of three literals; three additional variables are needed for each of the eight clauses, so that a total of twenty-four additional variables are required for each original clause. This completes the transformation. An instance of 3SAT with n variables and k clauses gives rise to an instance of NAE3SAT with 32k clauses and 2n + 24k variables. The transformation is easily accomplished in polynomial time. The construction guarantees that a solution to the transformed instance implies the existence of a solution to the original instance. An exhaustive examination of the seven possible satisfying assignments for an original clause shows that there always exists an assignment for the additional variables in the transformed clauses that ensures that each transformed clause has one true and one false literal. Q.E.D.

NAE3SAT may be viewed as a partition problem. Given n variables and a collection of clauses over these variables, how can the variables be partitioned into two sets so that each clause includes a variable assigned to each set? In this view, it becomes clear that a satisfying truth assignment for the problem is one which, in each clause, assigns "true" to one literal, "false" to another, and "don't care" to a third. Now we have three varieties of 3SAT for use in reductions. Three more are the subject of the following exercise.

Exercise 7.1
1. Prove that Monotone 3SAT is NP-complete; an instance of this problem is similar to one of 3SAT, except that the three literals in a clause must be all complemented or all uncomplemented.
2. Prove that Positive 1in3SAT is NP-complete (use a transformation from 1in3SAT).
3. Repeat for Positive NAE3SAT.
4. Prove that Maximum Two-Satisfiability (Max2SAT) is NP-complete. An instance of Max2SAT is composed of a collection of clauses with two literals each and a positive integer bound k no larger than the number of clauses. The question is "Does there exist a truth assignment such that at least k of the clauses are satisfied?"

While many reductions start with a problem "similar" to the target problem, the satisfiability problems provide convenient, all-purpose starting points. In other words, our advice to the reader is: first attempt to identify a close cousin of your problem (using your knowledge and reference lists such as those appearing in the text of Garey and Johnson or in Johnson's
column in the Journal of Algorithms), but do not spend excessive time in your search: if your search fails, use one of the six versions of 3SAT described earlier. Of course, you must start by establishing that the problem does belong to NP; such proofs of membership can be divided into three steps, as summarized in Table 7.2. The following proofs will show some of the approaches to transforming satisfiability problems into graph and set problems.

7.1 Some Important NP-Complete Problems

Table 7.2 The three steps in proving membership in NP.
* Assess the size of the input instance in terms of natural parameters.
* Define a certificate and the checking procedure for it.
* Analyze the running time of the checking procedure, using the same natural parameters, then verify that this time is polynomial in the input size.

Theorem 7.4 Maximum Cut is NP-complete.

Proof. Membership in NP is easily established. Given a candidate partition, we can examine each edge in turn and determine whether it is cut; keeping a running count of the number of cut edges, we finish by comparing this count with the given bound. The size of the input is the size of the graph, O(|E| log |V|), plus the bound, O(log B); scanning the edges, determining whether each is cut, and counting the cut ones takes O(|E|) time, which is clearly polynomial in the input size.

We transform NAE3SAT into MxC. Since the vertices must be partitioned into two subsets, we can use the partitioning for ensuring a legal truth assignment. This suggests using one edge connecting two vertices (the "true" and the "false" vertices) for each variable; we shall ensure that any solution cuts each such edge. Of the three literals in each clause, we want one or two to be set to true and the other(s) to false; this suggests using a triangle for each clause, since a triangle can only be cut with two vertices on one side and one on the other (provided, of course, that we ensure that each triangle be cut). Unfortunately, we cannot simply set up a pair of vertices for each variable, connect each pair (to ensure a legal truth assignment), and connect in a triangle the vertices corresponding to literals appearing together in a clause (to ensure a satisfying truth assignment), because the result is generally not a graph due to the creation of multiple edges between the same two vertices. The problem is due to the interaction of the triangles corresponding to clauses. Another aspect of the same problem is that
triangles that do not correspond to any clause but are formed by edges derived from three "legitimate" triangles may appear in the graph. Each aspect is illustrated in Figure 7.2. The obvious solution is to keep these triangles separate from each other and thus also from the single edges that connect each uncomplemented literal to its complement. Such separation, however, leads to a new problem: consistency. Since we now have a number of vertices corresponding to the same literal (if a literal appears in k clauses, we have k + 1 vertices corresponding to it), we need to ensure that all of these vertices end up on the same side of the partition together. To this end, we must connect all of these vertices in some suitable manner. The resulting construction thus comprises three parts, which are characteristic of transformations derived from a satisfiability problem: a part to ensure that the solution corresponds to a legal truth assignment; a part to ensure that the solution corresponds to a satisfying truth assignment; and a part to ensure such consistency in the solution as will match consistency in the assignment of truth values.

Figure 7.2 Problems with the naive transformation for MxC: panels (a) and (b).

Specifically, given an instance of NAE3SAT with n variables and k clauses, we transform it in polynomial time into an instance of MxC with 2n + 3k vertices and n + 6k edges as follows. For each variable, we set up two vertices (corresponding to the complemented and uncomplemented literals) connected by an edge. For each clause, we set up a triangle, where each vertex of the triangle is connected by an edge to the complement of the corresponding "literal vertex." Finally, we set the minimum number of edges to be cut to n + 5k. This transformation is clearly feasible in polynomial time. Figure 7.3 shows the result of the transformation applied to a simple instance of NAE3SAT. Given a satisfying truth assignment for the instance of NAE3SAT, we put all vertices corresponding to true literals on one side of the partition and
all others on the other side. Since the truth assignment is valid, each edge between a literal and its complement is cut, thereby contributing a total of n to the cut sum. Since the truth assignment is a solution, each triangle is cut (not all three vertices may be on the same side, as this would correspond to a clause with three false or three true literals), thereby contributing a total of 5k to the cut sum. Hence we have a solution to MxC. Conversely, observe that n + 5k is the maximum attainable cut sum: we cannot do better than cut each clause triangle and each segment between complementary literals. (If all three vertices of a clause triangle are placed on the same side of the partition, at most three of the six edges associated with the clause can be cut.) Moreover, the cut sum of n + 5k can be reached only by cutting all triangles and segments. Hence a solution to MxC yields a solution to NAE3SAT: cutting each segment ensures a valid truth assignment and cutting each triangle ensures a satisfying truth assignment. Q.E.D.

Figure 7.3 The construction used in Theorem 7.4.

It is worth repeating that instances produced by a many-one reduction need not be representative of the target problem. As we saw with Satisfiability and as Figure 7.3 makes clear for Maximum Cut, the instances produced are often highly specialized. Referring back to Figure 6.4, we observe that the subset f(A) of instances produced through the transformation, while infinite, may be a very small and atypical sample of the set B of all instances of the target problem. In effect, the transformation f identifies a collection f(A) of hard instances of problem B; these hard instances suffice to make problem B hard, but we have gained no information about the instances in B - f(A). Setting up triangles corresponding to clauses is a common approach when transforming satisfiability problems to graph problems. The next proof uses the same technique.
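The transformation of Theorem 7.4 is small enough to try out directly. The following Python sketch builds the Maximum Cut instance from an NAE3SAT instance and compares the best cut, found by brute force, with the bound n + 5k. The vertex names, the signed-integer encoding of literals, and the function names are assumptions made for this illustration only; the brute-force search is exponential and is meant for tiny instances.

```python
from itertools import product

def nae3sat_to_maxcut(n, clauses):
    """Build the MxC instance; variables are 1..n, a literal is +v or -v.

    Hypothetical vertex names: ("lit", l) for a literal vertex and
    ("cl", j, p) for position p of the triangle of clause j.
    """
    edges = []
    for v in range(1, n + 1):
        edges.append((("lit", v), ("lit", -v)))      # segment per variable
    for j, clause in enumerate(clauses):
        tri = [("cl", j, p) for p in range(3)]
        edges += [(tri[0], tri[1]), (tri[1], tri[2]), (tri[0], tri[2])]
        for p, lit in enumerate(clause):
            edges.append((tri[p], ("lit", -lit)))    # tie to the complement
    return edges, n + 5 * len(clauses)

def max_cut_value(edges):
    """Largest number of cut edges over all two-way partitions."""
    vertices = list({v for e in edges for v in e})
    best = 0
    for sides in product((0, 1), repeat=len(vertices)):
        side = dict(zip(vertices, sides))
        best = max(best, sum(side[a] != side[b] for a, b in edges))
    return best

def nae_satisfiable(n, clauses):
    """True if some assignment gives every clause both a true and a false literal."""
    for bits in product((False, True), repeat=n):
        def value(lit):
            return bits[abs(lit) - 1] == (lit > 0)
        if all(len({value(l) for l in c}) == 2 for c in clauses):
            return True
    return False

# Example instance (an assumption for the demo): two clauses over three variables.
example = [(1, 2, 3), (-1, 2, -3)]
example_edges, example_bound = nae3sat_to_maxcut(3, example)
```

On this example the graph has 2n + 3k = 12 vertices and n + 6k = 15 edges, so exhaustive search over all partitions remains cheap.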

Theorem 7.5 Graph Three-Colorability is NP-complete.

Proof. Membership in NP is easily established. Given a coloring, we need only verify that at most three colors are used, then look at each edge in turn, verifying that its endpoints are colored differently. Since the input (the graph) has size O(|E| log |V|) and since the verification of a certificate takes Θ(|V| + |E|) time, the checker runs in polynomial time.

Transforming a satisfiability problem into G3C presents a small puzzle: from a truth assignment, which "paints" each variable in one of two "colors," how do we go to a coloring using three colors? (This assumes that we intend to make vertices correspond to variables and a coloring to a truth assignment, which is surely not the only way to proceed but has the virtue of simplicity.) One solution is to let two of the colors correspond to truth values and use the third for other purposes, or for a "third" truth value, namely the "don't care" encountered in NAE3SAT.

Starting from an instance of NAE3SAT, we set up a triangle for each variable and one for each clause. All triangles corresponding to variables have a common vertex, which preempts one color, so that the other two vertices of each such triangle, corresponding to the complemented and uncomplemented literals, must be assigned two different colors chosen from a set of two. Assigning these two colors corresponds to a legal truth assignment. To ensure that the only colorings possible correspond to satisfying truth assignments, we connect each vertex of a clause triangle to the corresponding (in this case, the complement) literal vertex. Each such edge forces its two endpoints to use different colors. The reader can easily verify that a clause triangle can be colored if and only if not all three of its corresponding literal vertices have been given the same color, that is, if and only if not all three literals in the clause have been assigned the same truth value. Thus the transformed instance admits a solution if and only if the original instance does. The transformation takes an instance with n variables and k clauses and produces a graph with 2n + 3k + 1 vertices and 3(n + 2k) edges; it is easily done in polynomial time. A sample transformation is illustrated in Figure 7.4. Q.E.D.

Theorem 7.6 Vertex Cover is NP-complete.

Exercise 7.2 Prove this theorem by reducing 3SAT to VC, using a construction similar to one of those used above.

Not all proofs of NP-completeness involving graphs are as simple as the previous three. We now present a graph-oriented proof of medium difficulty that requires the design of a graph fragment with special properties. Such
fragments, called gadgets, are typical of many proofs of NP-completeness for graph problems. How to design a gadget remains quite definitely an art. Ours is a rather simple piece; more complex gadgets have been used for problems where the graph is restricted to be planar or to have a bounded degree. Section 8.1 presents several such gadgets.

Figure 7.4 The construction used in Theorem 7.5.

Theorem 7.7 Hamiltonian Circuit is NP-complete.

Proof. Membership in NP is easily established. Given a guess at the circuit (that is, a permutation of the vertices), it suffices to scan each vertex in turn, verifying that an edge exists between the current vertex and the previous one (and verifying that an edge exists between the last vertex and the first).

In order to transform a problem involving truth assignments to one involving permutation of vertices, we need to look at our problem in a different light. Instead of requiring the selection of a permutation of vertices, we can look at it as requiring the selection of certain edges. Then a truth assignment can be regarded as the selection of one of two edges; similarly, setting exactly one out of three literals to true can be regarded as selecting one out of three edges. Forcing a selection in the HC problem can then be done by placing all of these selection pieces within a simple loop, adding vertices of degree 2 to force any solution circuit to travel along the loop, as illustrated in Figure 7.5. It remains somehow to tie up edges representing clause literals and edges representing truth assignments. Ideal for this purpose would be a gadget that acts as a logical exclusive-OR (XOR) between edges. With such a tool, we could then set up a graph where the use of one truth value for a variable
(i.e., the use of one of the two edges associated with the variable) prevents the use of any edge associated with the complementary literal (since this literal is false, it cannot be used to satisfy any clause). Hence the specific construction we need is one which, given one "edge" (a pair of vertices, say $\{a, b\}$) and a collection of other "edges" (pairs of vertices, say $\{x_1, y_1\}, \ldots, \{x_k, y_k\}$), is such that any Hamiltonian path from a to b either does not use any edge of the gadget, in which case any path from $x_i$ to $y_i$ must use only edges from the gadget, or uses only edges from the gadget, in which case no path from $x_i$ to $y_i$ may use any edge of the gadget. The graph fragment shown in Figure 7.6 fulfills those conditions.

Figure 7.5 The key idea for the Hamiltonian circuit problem (a loop through the variable and clause pieces).

Figure 7.6 The graph fragment used as exclusive OR and its symbolic representation: (a) the fragment; (b) its symbolic representation; (c) its use for multiple edges.
Figure 7.7 How the XOR gadget works: (a) a path through the gadget; (b) its symbolic representation.

To verify that our gadget works as advertised, first note that all middle vertices (those between the "edge" from a to b and the other "edges") have degree 2, so that all edges drawn vertically in the figure must be part of any Hamiltonian circuit. Hence only alternate horizontal edges can be selected, with the alternation reversed between the "edge" from a to b and the other "edges," as illustrated in Figure 7.7, which shows one of the two paths through the gadget. It follows that a Hamiltonian path from a to b using at least one edge from the fragment must visit all internal vertices of the fragment, so that any path from $x_i$ to $y_i$ must use only edges external to the fragment. The converse follows from the same reasoning, so that the fragment of Figure 7.6 indeed fulfills our claim. Notice that the "edge" from a to b or from $x_i$ to $y_i$ is not really an edge but a chain of edges. This property allows us to set up two or three such "edges" between two vertices without violating the structure of a graph (in which there can be at most one edge between any two vertices). In the remainder of the construction, we use the graphical symbolism illustrated in the second part of Figure 7.6 to represent our gadget.

Now the construction is simple: for each clause we set up two vertices connected by three "edges" and for each variable appearing in at least one clause we set up two vertices connected by two "edges." We then connect all of these components in series into a single loop, adding one intermediate vertex between any two successive components. Finally we tie variable and clause pieces together with our XOR connections. (If a variable appears only in complemented or only in uncomplemented form, then one of the two "edges" in its truth-setting component is a real edge, since it is not part of any XOR construct. Since we constructed truth-setting components only for those variables that appear at least once in a clause, there is no risk of creating duplicate edges.) This construction is illustrated in Figure 7.8; it takes an instance with n variables and k clauses and produces a graph with 3n + 39k vertices and 4n + 53k edges and can be done in polynomial time.
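The certificate check from the membership argument at the start of this proof can be sketched in a few lines of Python. The representation (vertices numbered 0 to n - 1, edges as pairs, the guess as a list) and the function name are assumptions for this illustration.

```python
def is_hamiltonian_circuit(n_vertices, edges, order):
    """Check a guessed circuit given as a permutation of the vertices 0..n-1."""
    if n_vertices < 3 or sorted(order) != list(range(n_vertices)):
        return False                      # not a permutation of the vertices
    adjacent = {frozenset(e) for e in edges}
    # Index i - 1 wraps around at i = 0, which checks the edge between
    # the last vertex and the first.
    return all(
        frozenset((order[i - 1], order[i])) in adjacent
        for i in range(n_vertices)
    )

# A four-cycle with no chords (a made-up example).
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

The scan makes one set lookup per vertex, so the check is polynomial in the size of the graph, as the proof requires.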

Figure 7.8 The entire construction for the Hamiltonian circuit problem.

Any Hamiltonian circuit must traverse exactly one "edge" in each component. This ensures a legal truth assignment by selecting one of two "edges" in each truth-setting component and also a satisfying truth assignment by selecting exactly one "edge" in each clause component. (The actual truth assignment sets to false each literal corresponding to the edge traversed in the truth-setting component, because of the effect of the XOR.) Conversely, given a satisfying truth assignment, we obtain a Hamiltonian circuit by traversing the "edge" corresponding to the true literal in each clause and the edge corresponding to the value false in each truth-setting component. Hence the transformed instance admits a solution if and only if the original one does. Q.E.D.

Exercise 7.3 Prove that Hamiltonian Path is NP-complete. An instance of the problem is a graph and the question asks whether or not the graph has a Hamiltonian path, i.e., a simple path that includes all vertices.

Set cover problems provide very useful starting points for many transformations. However, devising a first transformation to a set cover problem calls for techniques somewhat different from those used heretofore.

Theorem 7.8 Exact Cover by Three-Sets is NP-complete.

Proof. Membership in NP is obvious. Given the guessed cover, we need only verify, by scanning each subset in the cover in turn, that all set elements are covered.

We want to reduce Positive 1in3SAT to X3C. The first question to address is the representation of a truth assignment; one possible solution is to set up two three-sets for each variable and to ensure that exactly one
of the two three-sets is selected in any solution. Since a variable may occur in several clauses of the 1in3SAT problem and since each element of the X3C problem may be covered only once, we need several copies of the construct corresponding to a variable (to provide an "attaching point" for each literal). This in turn raises the issue of consistency: all copies must be "set" to the same value. The latter can be achieved by taking advantage of the requirement that the cover be exact: once a three-set is picked, any other three-set that overlaps with it is automatically excluded.

Let an instance of Positive 1in3SAT have n variables and k clauses. For each variable, we construct a component with two attaching points (one corresponding to true, the other to false) for each of its occurrences, as illustrated in Figure 7.9 and described below. Let variable x occur $n_x$ times (we assume that each variable considered occurs at least once). We set up $4n_x$ elements, of which $2n_x$ will be used as attaching points while the others will ensure consistency. Call the attaching points $X_i^t$ and $X_i^f$, for $1 \le i \le n_x$; call the other points $p_x^i$, for $1 \le i \le 2n_x$. Now we construct three-sets. The component associated with variable x has $2n_x$ sets:
* $\{p_x^{2i-1}, p_x^{2i}, X_i^t\}$ for $1 \le i \le n_x$,
* $\{p_x^{2i}, p_x^{2i+1}, X_i^f\}$ for $1 \le i < n_x$, and
* $\{p_x^{2n_x}, p_x^{1}, X_{n_x}^f\}$.

Figure 7.9 The component used in Theorem 7.8 (shown for a variable with three occurrences).

For each clause $c = \{x, y, z\}$, we set up six elements, $x_c$, $y_c$, $z_c$, $t_c$, $f_c'$, and $f_c''$. The first three will represent the three literals, while the other three will distinguish the true literal from the two false literals.
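The consistency property of the variable component can be checked by exhaustive search. The sketch below assumes the component has the shape read from the text: sets pairing $p^{2i-1}, p^{2i}$ with the "true" attaching point of occurrence i, sets pairing $p^{2i}, p^{2i+1}$ with its "false" attaching point, and one closing set that wraps around from $p^{2n_x}$ to $p^{1}$. The tuple encoding of elements and the function names are hypothetical.

```python
from itertools import combinations

def variable_component(nx):
    """Three-sets of the component for a variable with nx occurrences.

    Hypothetical element names: ("p", i) for 1 <= i <= 2*nx, and
    ("X", i, "t") / ("X", i, "f") for the attaching points.
    """
    sets = []
    for i in range(1, nx + 1):
        sets.append(frozenset({("p", 2 * i - 1), ("p", 2 * i), ("X", i, "t")}))
    for i in range(1, nx):
        sets.append(frozenset({("p", 2 * i), ("p", 2 * i + 1), ("X", i, "f")}))
    sets.append(frozenset({("p", 2 * nx), ("p", 1), ("X", nx, "f")}))
    return sets

def exact_covers_of_p(nx):
    """Attaching points used by each exact cover of the p elements."""
    sets = variable_component(nx)
    p_elements = {("p", i) for i in range(1, 2 * nx + 1)}
    found = []
    # Every set holds exactly two p elements, so a cover uses nx sets.
    for chosen in combinations(sets, nx):
        covered = [e for s in chosen for e in s if e[0] == "p"]
        if len(covered) == len(set(covered)) and set(covered) == p_elements:
            found.append({e[1:] for s in chosen for e in s if e[0] == "X"})
    return found
```

Under these assumptions the p elements form a cycle in which consecutive sets overlap, so only two exact covers exist: one uses every "true" attaching point and the other every "false" one, leaving the opposite points free for the clause components.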

Each clause $c = \{x, y, z\}$ gives rise to nine three-sets, three for each literal. The first of these sets, if picked for the cover, indicates that the associated literal is the one set to true in the clause; for literal x in clause c, this set is $\{x_c, t_c, X_i^t\}$ for some attaching point i. If one of the other two is picked, the associated literal is set to false; for our literal, these sets are $\{x_c, f_c', X_i^f\}$ and $\{x_c, f_c'', X_i^f\}$. Overall, our transformation produces an instance of X3C with 18k elements and 15k three-sets and is easily carried out in polynomial time.

Now notice that, for each variable x, the element $p_x^1$ can be covered only by one of two sets: $\{p_x^1, p_x^2, X_1^t\}$ or $\{p_x^{2n_x}, p_x^1, X_{n_x}^f\}$. If the first is chosen, then the second cannot be chosen too, so that the element $p_x^{2n_x}$ must be covered by the only other three-set in which it appears, namely $\{p_x^{2n_x-1}, p_x^{2n_x}, X_{n_x}^t\}$. Continuing this chain of reasoning, we see that the choice of cover for $p_x^1$ entirely determines the cover for all $p_x^i$, in the process covering either (i) all of the $X_i^t$ and none of the $X_i^f$ or (ii) the converse. Thus a covering of the components associated with variables corresponds to a legal truth assignment, where the uncovered elements correspond to literal values. Turning to the components associated with the clauses, notice that exactly three of the nine sets must be selected for the cover. Whichever set is selected to cover the element $t_c$ must include a true literal, thereby ensuring that at least one literal per clause is true. The other two sets chosen cover $f_c'$ and $f_c''$ and thus must contain one false literal each, ensuring that at most one literal per clause is true. Our conclusion follows. Q.E.D.

From this reduction and the preceding ones, we see that a typical reduction from a satisfiability problem to another problem uses a construction with three distinct components, as summarized in Table 7.3. We often transform an asymmetric satisfiability problem into a symmetric problem in order to take advantage of the rigidity of 1in3SAT. In such cases, we must provide an indication of which part of the solution is meant to represent true and which false. This is often done by means of enforcers (in the terminology of Garey and Johnson). The following proof presents a simple example of the use of enforcers; it also illustrates another important technique: creating exponentially large numbers out of sets.

Theorem 7.9 Partition is NP-complete.

Proof. Once again, membership in NP is easily established. Given a guess for the partition, we just sum the weights on each side and compare the results, which we can do in linear time (that is, in time proportional to the length of the words representing the weights, not in time proportional to the weights themselves).

Table 7.3 The components used in reductions from satisfiability problems.
* Truth Assignment: This component corresponds to the variables of the satisfiability instance; typically, there is one piece for each variable. The role of this component is to ensure that any solution to the transformed instance must include elements that correspond to a legal truth assignment to the variables of the satisfiability instance. (By legal, we mean that each variable is assigned one and only one truth value.)
* Satisfiability Checking: This component corresponds to the clauses of the satisfiability instance; typically, there is one piece for each clause. The role of this component is to ensure that any solution to the transformed instance must include elements that correspond to a satisfying truth assignment; typically, each piece ensures that its corresponding clause has to be satisfied.
* Consistency: This component typically connects clause (satisfiability checking) components to variable (truth assignment) components. The role of this component is to ensure that any solution to the transformed instance must include elements that force consistency among all parts corresponding to the same literal in the satisfiability instance. (It prevents using one truth value in one clause and a different one in another clause for the same variable.)

Since we intend to reduce 1in3SAT, a problem without numbers, to Partition, the key to the transformation resides in the construction of the weights. In addition, our construction must provide means of distinguishing one side of the partition from the other, assuming that the transformation follows the obvious intent of regarding one side of the partition as corresponding to true values and the other as corresponding to false values.

The easiest way to produce numbers is to set up a string of digits in some base, where each digit corresponds to some feature of the original instance. A critical point is to prevent any carry or borrow in the arithmetic operations applied to these numbers: as long as no carry or borrow arises, each digit can be considered separately, so that we are back to individual features of the original instance. With these observations we can proceed with our construction.

We want a literal and its complement to end up on opposite sides of the partition (thereby ensuring a legal truth assignment); also, for each clause, we want two of its literals on one side of the partition and the remaining literal on the other side. These observations suggest setting up two elements per variable (one for the uncomplemented literal and one for the complemented literal), each assigned a weight of k + n digits (where n is the number of variables and k the number of clauses). The last n digits are used to identify each variable: in the weights of the two elements
corresponding to the ith variable, all such digits are set to 0, except for the ith, which is set to 1. The first k digits characterize membership in each clause; the jth of these digits is set to 1 in the weight of a literal if this literal appears in the jth clause, otherwise it is set to 0. Thus, for instance, if variable $x_2$ (out of four variables) appears uncomplemented in the first clause and complemented in the second (out of three clauses), then the weight of the element corresponding to the literal $x_2$ will be 1000100 and that of the element corresponding to $\bar{x}_2$ will be 0100100. The two weights have the same last four digits, 0100, identifying them as belonging to elements corresponding to the second of four variables. Observe that the sum of all 2n weights is a number, the first k digits of which are all equal to 3, and the last n digits of which are all equal to 2: a number which, while a multiple of 2, is not divisible by 2 without carry operations. It remains to identify each side of the partition; we do this by adding an enforcer, in the form of an element with a uniquely identifiable weight, which also ensures that the total sum becomes divisible by 2 on a digit by digit basis. A suitable choice of weight for the enforcer sets the last n digits to 0 (which makes this weight uniquely identifiable, since all other weights have one of these digits set to 1) and the first k digits to 1. Now the overall sum is a number, the first k digits of which are all equal to 4, and the last n digits of which are all equal to 2 (which indicates that considering these numbers to be written in base 5 or higher will prevent any carry); this number is divisible by 2 without borrowing. Figure 7.10 shows a sample encoding. The side of true literals will be flagged by the presence of the enforcer.

              c1  c2  c3    x   y   z   w
  $x$          1   1   0    1   0   0   0
  $\bar{x}$    0   0   1    1   0   0   0
  $y$          1   0   0    0   1   0   0
  $\bar{y}$    0   1   1    0   1   0   0
  $z$          0   1   0    0   0   1   0
  $\bar{z}$    1   0   0    0   0   1   0
  $w$          0   0   1    0   0   0   1
  $\bar{w}$    0   0   0    0   0   0   1
  enforcer     1   1   1    0   0   0   0

Figure 7.10 The encoding for Partition of the 1in3SAT instance given by $c_1 = \{x, y, \bar{z}\}$, $c_2 = \{x, \bar{y}, z\}$, and $c_3 = \{\bar{x}, \bar{y}, w\}$.

The complete construction takes an instance of 1in3SAT
with n variables and k clauses and produces (in no more than quadratic time) an instance of Partition with 2n + 1 elements, each with a weight of n + k digits. In our example, the instance produced by the transformation has 9 elements with decimal weights 1,101,000; 11,000; 1,000,100; 110,100; 100,010; 1,000,010; 10,001; 1; and 1,110,000, for a total weight of 4,442,222. A solution groups the elements with weights 11,000; 1,000,100; 100,010; 1; and 1,110,000 on one side, corresponding to the assignment x <- false, y <- true, z <- true, and w <- false. Note that, with just four variables and three clauses, the largest weights produced already exceed a million.

We claim that the transformed instance admits a solution if and only if the original does. Assume then that the transformed instance admits a solution. Since the sum of all weights on either side must be 22...211...1 (k digits equal to 2 followed by n digits equal to 1), each side must include exactly one literal for each variable, which ensures a legal truth assignment. Since the enforcer contributes a 1 in each of the first k positions, the "true" side must include exactly one literal per clause, which ensures a satisfying truth assignment. Conversely, assume that the instance of 1in3SAT admits a satisfying truth assignment. We place all elements corresponding to true literals on one side of the partition together with the enforcer and all other elements on the other side. Thus each side has one element for each variable, so that the last n digits of the sum of all weights on either side are all equal to 1. Since each clause has exactly one true literal and two false ones, the "true" side includes exactly one element per clause and the "false" side includes exactly two elements per clause; in addition, the "true" side also includes the enforcer, which contributes a 1 in each of the first k positions. Thus the first k digits of the sum of weights on each side are all equal to 2. Hence the sum of the weights on each side is equal to 22...211...1 and our proof is complete. Q.E.D.

Notice that exponentially large numbers must be created, because any instance of Partition with small numbers is solvable in polynomial time using dynamic programming. The dynamic program is based upon the recurrence

$$f(i, M) = \max\bigl(f(i-1, M),\; f(i-1, M - s_i)\bigr)$$
$$f(0, 0) = 1$$
$$f(0, j) = 0 \quad \text{for } j \neq 0$$

where $f(i, M)$ equals 1 or 0, indicating whether there exists a subset of the first i elements that sums to M. Given an instance of Partition with n elements and a total sum of N, this program produces an answer in $O(n \cdot N^2)$ time. Since the size of the input is $O(n \log N)$, this is not a polynomial-time algorithm; however, it behaves as one whenever N is a polynomial function of n.
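Both the encoding and the dynamic program are easy to experiment with. The Python sketch below builds the weights for a 1in3SAT instance and then decides Partition with the subset-sum recurrence. Several choices here are assumptions for illustration: literals are signed integers, weights are written in base 10, the function names are invented, and the recurrence is stored as a set of reachable sums rather than as a full table indexed by M (the same recurrence restricted to sums that actually occur). The sample instance is our reading of the one in Figure 7.10.

```python
def encode_partition(n, clauses):
    """Weights for a 1in3SAT instance with variables 1..n (literal = +v or -v).

    Each weight has k clause digits followed by n variable digits; the list
    holds the two literals of each variable in turn, then the enforcer.
    """
    k = len(clauses)
    weights = []
    for v in range(1, n + 1):
        for lit in (v, -v):
            digits = ["1" if lit in c else "0" for c in clauses]
            digits += ["1" if u == v else "0" for u in range(1, n + 1)]
            weights.append(int("".join(digits)))
    weights.append(int("1" * k + "0" * n))   # enforcer: clause digits only
    return weights

def has_partition(weights):
    """Decide Partition using f(i, M) = max(f(i-1, M), f(i-1, M - s_i))."""
    total = sum(weights)
    if total % 2:
        return False
    half = total // 2
    reachable = {0}                  # the sums M for which f(i, M) = 1
    for s in weights:
        reachable |= {m + s for m in reachable if m + s <= half}
    return half in reachable

# x, y, z, w are variables 1..4; clauses as read from Figure 7.10.
figure_clauses = [(1, 2, -3), (1, -2, 3), (-1, -2, 4)]
figure_weights = encode_partition(4, figure_clauses)
```

For this instance the weights sum to 4,442,222, and the search finds a side summing to 2,221,111, in agreement with the digit-by-digit argument of the proof.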

. F}. 1 through n. while a simple backtracking search provides an algorithm that is linear in log N but exponential in n. There are 2n such strings.246
Proving Problems Hard
algorithm. E1 . (The reader will note that producing such numbers does not force our transformation to take exponential time. z}. each subexpression will describe all truth assignments that make its corresponding clause evaluate to false.i. The alphabet E has two characters. say {T. each clause will give rise to a subexpression and the final expression will be the union of all such subexpressions. The intent of this expression is to describe all truth assignments that make the collection of clauses evaluate to false. the difference is that. Each subexpression is very similar to the one expression above.) Partition is interesting in that its complexity depends intimately on two factors: the subset selection (as always) and the large numbers involved. we construct an instance A. Given an instance of 3SATwith variables V = {xi. In order to avoid problems of permutations. One of the regular expressions will denote all possible strings of n variables. The second expression is derived from the clauses. It uses the freedom inherent in having two separate structures. instead of the union of both truth values. the values described. and where the variables appear in order.10 Star-Free RegularExpression Inequivalence (SF-REI) is NPE complete. Theorem 7.i . it behaves as one whenever N is a polynomial function
of n. x} and clauses C = {{. . 5i. . since n bits are sufficient to describe a number of size 2 '..1 that this problem is in NP. 1 . where each variable may appear complemented or not. constructing one to reflect the details of the instance and the other as a uniform "backdrop" (reflecting only the size of the instance) against which to set off the first. The interest of our next problem lies in its proof. For instance.
(T + F)
The intent of this expression is to describe all possible truth assignments. the clause
. but exponential in log N. the number of elements involved. Thus the instances of Partitionproduced by our transformation must involve numbers of size Q (2 n) so as not to be trivially tractable. which can be denoted by a regular expression with n terms:
El = (T + F) * (T + F)
. The dynamic programming algorithm provides an algorithm that is linear in n. Proof We have seen in Proposition 6. for the literals mentioned in the corresponding clause. We prove it NP-complete by transforming 3SAT to it. only the false truth value appears in a term.r). . however. we also require that variables appear in order I through n. . E2 of SF-REI as follows.

$\{\bar{x}_1, x_2, \bar{x}_4\}$ gives rise to the subexpression:

$$T \cdot F \cdot (T + F) \cdot T \cdot (T + F) \cdots (T + F)$$

This construction clearly takes polynomial time. Now suppose there is a satisfying truth assignment for the variables; then this assignment makes all clauses evaluate to true, so that the corresponding string is not in the language denoted by $E_2$ (although it, like all other strings corresponding to legal truth assignments, is in the language denoted by $E_1$). Conversely, if a string denoted by $E_1$ is not denoted by $E_2$, then the corresponding truth assignment must be a satisfying truth assignment; if it were not, then at least one clause would not be satisfied and our string would appear in $E_2$ by its association with that clause, contradicting our hypothesis. Q.E.D.

Finally, we turn to a geometric construction. Such constructions are often difficult for two reasons: first, we need to design an appropriate collection of geometric gadgets and secondly, we need to ensure that all coordinates are computable (and representable) in time and space polynomial in the input size.

Theorem 7.11 Art Gallery is NP-complete.

Proof. A well-known algorithm in computational geometry decomposes a simple polygon into triangles, each vertex of which is a vertex of the polygon, in low polynomial time. The planar dual of this decomposition (obtained by placing one vertex in each triangle and connecting vertices corresponding to triangles that share an edge) is a tree. A certificate for our problem will consist of a triangulation of the polygon and its dual tree, a placement of the guards, and for each guard a description of the area under its control, given in terms of the dual tree and the triangulation. This last includes bounding segments as needed to partition a triangle as well as the identity of each guard whose responsibility includes each adjacent piece of the partitioned triangle. (The certificate does not describe all of the area covered by each guard; instead, it arbitrarily assigns multiply-covered areas to some of its guards so as to generate a partition of the interior of the simple polygon.) In polynomial time, we can then verify that the triangulation and its dual tree are valid and that all triangles of the triangulation of the simple polygon are covered by at least one guard. Finally, we verify in polynomial time that each piece (triangle or fraction thereof) is indeed visible in its entirety by its assigned guard. We do not go into details
here but refer the reader to one of the standard texts on computational geometry¹ for a description of the algorithms involved.

¹ For instance, see Computational Geometry by F. Preparata and M. Shamos.

To prove that the problem is NP-complete, we reduce the known NP-complete problem Vertex Cover to it. An instance of Vertex Cover is given by a graph, G = (V, E), and a bound B. Let the graph have n vertices, n = |V|. Our basic idea is to produce a convex polygon of n vertices, then to augment it (and make it nonconvex) with constructs that reflect the edges. A single guard suffices for a convex art gallery: by definition of convexity, any point in a convex polygon can see any other point inside the polygon. Thus our additional constructs will attach to the basic convex polygon pieces that cannot be seen from everywhere; indeed, that can be seen in their entirety only from the vertices corresponding to the two endpoints of an edge. We shall place two additional constructs for each graph edge; these constructs will be deep and narrow "pockets" (as close as possible to segments) aligned with the (embedding of the corresponding) graph edge and projecting from the polygon at each end of the edge. Figure 7.11 illustrates the concept of pocket for one edge of the graph.

Figure 7.11 The two pockets corresponding to the graph edge {u, v}.

Now we need to demonstrate that the vertices on the perimeter of the resulting polygon can be produced (including their coordinates) in perimeter order in polynomial time. We begin with the vertices of the polygon before adding the pockets. Given a graph of n vertices, we create n points $p_0, \ldots, p_{n-1}$; we place point $p_0$ at the origin and point $p_i$, for $1 \leq i \leq n-1$, at coordinates $(ni(i-1), 2ni)$. The resulting polygon, illustrated in Figure 7.12(a), is convex: the slope from $p_i$ to $p_{i+1}$ (for $i \geq 1$) is $1/i$, which decreases as i increases. In general, the slope between $p_i$ and $p_j$ (for $j > i \geq 1$) is $2/(i+j-1)$; these slopes determine the sides of the pockets. Thus all quantities are polynomial in the input size. In order to specify the pockets for a single vertex, we need to construct the union of the pockets that were set up for each edge incident upon this vertex and thus need to

compute the intersection of successive pockets. Each intersection is easily computed from the slopes and positions of the lines involved and the resulting coordinates remain polynomial in the input size. Pockets of depth n and width 1 (roughly, that is: we set the depth and width by using the floor of square roots rather than their exact values in order to retain rational values) are deep and narrow enough to ensure that only two vertices of the original convex polygon can view either pocket in its entirety, namely the two vertices corresponding to the edge for which the pockets were built. Figure 7.12(b) illustrates the result: if a vertex has degree d in the graph, it has d associated pockets in the corresponding instance of AG, with 3d vertices in addition to the original. In all, we construct from a graph G = (V, E) a simple polygon with |V| + 6|E| vertices.

Figure 7.12 The construction for the Art Gallery problem: (a) the convex polygon for 6 vertices; (b) the pockets for a vertex of degree 3.

Now it is easy to see that a solution to the instance of VC immediately yields a solution to the transformed instance; we just place guards at the vertices (of the original convex polygon) corresponding to the vertices in the cover. The converse is somewhat obscured by the fact that guards could be placed at some of the additional 6|E| vertices defining the pockets, vertices that have no immediate counterpart in the original graph instance. However, note that we can always move a guard from one of the additional vertices to the corresponding vertex of the original convex polygon without decreasing coverage of the polygon (if two guards had been placed along the pockets of a single original vertex, then we can even save a guard in the process). The result is a solution to the instance of AG that has a direct counterpart as a solution to the original instance of VC. Our conclusion follows. Q.E.D.

In all of the reductions used so far, including this latest reduction from a problem other than satisfiability, we established an explicit and

direct correspondence between certificates for "yes" instances of the two problems. We summarize this principle as the last of our various guidelines for NP-completeness proofs in Table 7.4.

Table 7.4 Developing a transformation between instances.
* List the characteristics of each instance.
* List the characteristics of a certificate for each instance.
* Use the characteristics of the certificates to develop a conceptual correspondence between the two problems, then develop it into a correspondence between the elements of the two instances.
* Where gadgets are needed, carefully list their requisite attributes before setting out to design them.

From these basic problems, we can very easily prove that several other problems are also NP-complete (when phrased as decision problems, of course). The proof technique in the six cases of Theorem 7.12 is restriction, by far the simplest method available. Restriction is a simplified reduction where the transformation used is just the identity, but we may choose to look at it the other way. The problem to be proved NP-complete is shown to restrict to a known NP-complete problem; in other words, it is shown to contain all instances of this NP-complete problem as a special case. The following theorem demonstrates the simplicity of restriction proofs.

Theorem 7.12 The following problems are NP-complete:
1. Chromatic Number: Given a graph and a positive integer bound, can the graph be colored with no more colors than the given bound?
2. Set Cover: Given a set, a collection of subsets of the set, and a positive integer bound, can a subcollection including no more subsets than the given bound be found which covers the set?
3. Knapsack: Given a set of elements, a positive integer "size" for each element, a positive integer "value" for each element, a positive integer size bound, and a positive integer value bound, can a subset of elements be found such that the sum of the sizes of its elements is no larger than the size bound and the sum of the values of its elements is no smaller than the value bound?
4. Subset Sum: Given a set of elements, a positive integer size for each element, and a positive integer goal, can a subset of elements be found such that the sum of the sizes of its elements is exactly equal to the goal?

5. Binpacking: Given a set of elements, a positive integer size for each element, a positive integer "bin size," and a positive integer bound, can the elements be partitioned into no more subsets than the given bound and so that the sum of the sizes of the elements of any subset is no larger than the bin size?
6. 0-1 Integer Programming: Given a set of pairs (x, b), where each x is an m-tuple of integers and b an integer, and given an m-tuple of integers c and an integer B, does there exist an m-tuple of integers y, each component of which is either 0 or 1, with $(x, y) \leq b$ for each pair (x, b) and with $(c, y) \geq B$? Here (a, b) denotes the scalar product of a and b.

Proof. All proofs are by restriction. We indicate only the necessary constraints, leaving the reader to verify that the proofs thus sketched are indeed correct. Membership in NP is trivial for all six problems.
1. We restrict Chromatic Number to G3C by allowing only instances with a bound of 3.
2. We restrict Set Cover to X3C by allowing only instances where the set has a number of elements equal to some multiple of 3, where all subsets have exactly three elements, and where the bound is equal to a third of the size of the set.
3. We restrict Knapsack to Partition by allowing only instances where the size of each element is equal to its value, where the sum of all sizes is a multiple of 2, and where the size bound and the value bound are both equal to half the sum of all sizes.
4. We restrict Subset Sum to Partition by allowing only instances where the sum of all sizes is a multiple of 2 and where the goal is equal to half the sum of all sizes.
5. We restrict Binpacking to Partition by allowing only instances where the sum of all sizes is a multiple of 2, where the bin size is equal to half the sum of all sizes, and where the bound on the number of bins is 2.
6. We restrict 0-1 Integer Programming to Knapsack by allowing only instances with a single (x, b) pair and where all values are natural numbers. Then x denotes the sizes, b the size bound, c the values, and B the value bound of an instance of Knapsack.
Q.E.D.

A restriction proof works by placing restrictions on the type of instance allowed, not on the type of solution. For instance, we could not "restrict" Set Cover to Minimum Disjoint Cover (a version of Set Cover where all subsets in the cover must be disjoint) by requiring that any solution

be composed only of disjoint sets. Such a requirement would change the question and hence the problem itself, whereas a restriction only narrows down the collection of possible instances.

The idea of restriction can be used for apparently unrelated problems. For instance, our earlier reduction from HC to Traveling Salesman (in their decision versions) can be viewed as a restriction of TSP to those instances where all intercity distances have values of 1 or 2; this subproblem is then seen to be identical (isomorphic) to HC. The following theorems provide two more examples.

Theorem 7.13 Clique is NP-complete. An instance of the problem is given by a graph, G, and a bound, k; the question is "Does G contain a clique (a complete subgraph) of size k or larger?"

Proof. Our restriction here is trivial (no change) because the problem as stated is already isomorphic to Vertex Cover. Vertices correspond to vertices; wherever the instance of Clique has an edge, the corresponding instance of VC has none and vice versa; and the bound for VC equals the number of vertices minus the bound for Clique. Q.E.D.

We leave a proof of the next theorem to the reader.

Exercise 7.4 Prove that Subgraph Isomorphism is NP-complete by restricting it to Clique. An instance of the problem is given by two graphs, G and H, where the first graph has at least as many vertices and at least as many edges as the second; the question is "Does G contain a subgraph isomorphic to H?"

As a last example, we consider a slightly more complex use of restriction; note, however, that this proof remains much simpler than any of our reductions from the 3SAT problems, confirming our earlier advice.

Theorem 7.14 k-Clustering is NP-complete. An instance of the problem is given by a set of elements, a positive integer measure of "dissimilitude" between pairs of elements, a natural number k no larger than the cardinality of the set, and a positive integer bound. The question is "Can the set be partitioned into k nonempty subsets such that the sum over all subsets of the sums of the dissimilitudes between pairs of elements within the same subset does not exceed the given bound?"

Proof. Membership in NP is obvious. We restrict the problem to instances where k equals 2 and all measures of dissimilitude have value 0 or 1. The resulting problem is isomorphic to Maximum Cut, where each vertex corresponds to an element, where there exists an edge between two vertices exactly when the dissimilitude between the corresponding vertices

equals 1, and where the bound of MxC equals the sum of all dissimilitudes minus the bound of k-Clustering. Q.E.D.

At this point, the reader will undoubtedly have noticed several characteristics of NP-complete problems. Perhaps the most salient characteristic is that the problem statement must allow some freedom in the choice of the solution structure. When this bit of leeway is absent, a problem, even when suspected not to be in P, may not be NP-complete. A good example is the graph isomorphism problem: while subgraph isomorphism is NP-complete, graph isomorphism (is a given graph isomorphic to another one) is not known, and not believed, to be so. Many of the NP-complete problems discussed so far involve the selection of a loosely structured subset, by which we mean a subset such that the inclusion of an element does not lead directly to the inclusion of another. The difficulty of the problem resides in the subset search. The property obeyed by the subset need not be difficult to verify; indeed the definition of NP guarantees that such a property is easily verifiable.

Another striking aspect of NP-complete problems is the distinction between the numbers 2 and 3: 3SAT is NP-complete, but 2SAT is solvable in linear time; G3C is NP-complete, but G2C, which just asks whether a graph is bipartite, is solvable in linear time; X3C is NP-complete, but X2C is in P; three-dimensional matching is NP-complete (see Exercise 7.20), but two-dimensional matching is just "normal" matching and is in P. (At times, there also appears to be a difference between 1 and 2, such as scheduling tasks on 1 or 2 processors. This apparent difference is just an aspect of subset search, however: while scheduling tasks on one processor is just a permutation problem, scheduling them on two processors requires selecting which tasks to run on which machine.) This difference appears mostly due to the effectiveness of matching techniques on many problems characterized by pairs. (The boundary may be higher: G3C is NP-complete for all graphs of bounded degree when the bound is no smaller than 4, but it is solvable in polynomial time for graphs of degree 3.) Such characteristics may help in identifying potential NP-complete problems.

7.2 Some P-Completeness Proofs

P-complete problems derive their significance mostly from the need to distinguish between P and L or between P and the class of problems that are profitably parallelizable, a class that (as we shall see in Section 9.4) is a subset of P ∩ PolyL. That distinction alone might not justify the inclusion

of proofs of P-completeness here; however, the constraints imposed by the resource bound used in the transformation (logarithmic space) lead to an interesting style of transformation, basically functions implemented by a few nested loops. The difference between polynomial time and logarithmic space not being well understood (obviously, since we do not know whether L is a proper subset of P), any illustration of potential differences is useful. In this spirit, we present proofs of P-completeness, two of them through reductions from PSA, for three different problems:
* Unit Resolution: Given a collection of clauses (disjuncts), can the empty clause be derived by unit resolution, that is, by resolving a one-literal clause with another clause that contains the literal's complement? For instance, $\{x\}$ and $\{\bar{x}, y_1, \ldots, y_n\}$ yield $\{y_1, \ldots, y_n\}$.
* Circuit Value (CV): A circuit (a combinational logic circuit realizing some Boolean function) is represented by a sequence $a_1, \ldots, a_n$, where each $a_i$ is one of three entities: (i) a logic value (true or false); (ii) an AND gate, with the indices of its two inputs (both of them less than i); or (iii) an OR gate, with the indices of its two inputs (both of them less than i). The output of the circuit is the output of the last gate, $a_n$. The question is simply "Is the output of the circuit true?"
* Depth-First Search: Given a rooted graph (directed or not) and two distinguished vertices of the graph, u and v, will u be visited before or after v in a recursive depth-first search of the graph?

Theorem 7.15 Unit Resolution is P-complete.

Proof. The problem is in P. An exhaustive search algorithm uses each single-literal clause in turn and attempts to resolve it against every other clause, storing the result; the single-literal clause is then discarded. The resolution process will typically create some new single-literal clauses, which get used in the same fashion. Eventually, either all single-literal clauses (including newly generated ones) are used or the empty clause is derived. Each application of unit resolution decreases the total number of literals involved. This process works in polynomial time because, with n variables and m initial clauses, at most 2mn resolutions can ever be made. Intuitively, the problem is P-complete because we need to store newly generated clauses.

We prove the problem P-complete by transforming PSA to it. Essentially, the elements of PSA become variables in our problem; each initially reachable element in PSA, as well as each terminal element, becomes a one-literal clause in our problem, while the triples of PSA become clauses of three elements in our problem. Specifically, for each initially reachable element x, we set

a one-literal clause $\{x\}$; for each terminal element y, we set a one-literal clause $\{\bar{y}\}$; and for each triple in the relation (x, y, z), we set a three-literal clause $\{\bar{x}, \bar{y}, z\}$, which is logically equivalent to the implication $x \wedge y \Rightarrow z$. We can carry out this transformation on a strictly local basis because all we need store is the current element or triple being transformed; thus our transformation runs in logarithmic space. We claim that, at any point in the resolution process, there is a clause $\{x\}$ only if x is accessible; moreover, if x is accessible, the clause $\{x\}$ can be derived. This claim is easy to prove by induction and the conclusion follows. Q.E.D.

This negative result affects the interpretation of logic languages based on unification (such as Prolog). Since unit resolution is an extremely simplified form of unification, we can conclude that unification-based languages cannot be executed at great speeds on parallel machines.

Theorem 7.16 Circuit Value is P-complete.

Actually, our construction uses only AND and OR gates, so we are proving the stronger statement that Monotone Circuit Value is P-complete.

Proof. That the problem is in P is clear. We can propagate the logic values from the input to each gate in turn until the output value has been computed. Intuitively, what makes the problem P-complete is the need to store intermediate computations. We prove that CV is P-complete by a reduction from PSA. We use the version of PSA produced in our original proof of P-completeness (Theorem 6.8), which has a single element in its target set. The basic idea is to convert a triple (x, y, z) of PSA into an AND gate with inputs x and y and output z, since this mimics exactly what takes place in PSA. The circuit will have all elements of PSA as inputs, with those inputs that correspond to elements of the initial set of PSA set to true and all other inputs set to false.

The real problem is to propagate logical values for each of the inputs. In PSA, elements can become accessible through the application of the proper sequence of triples; in our circuit, this corresponds to transforming certain inputs from false to true because of the true output of an AND gate. Thus what we need is to propagate truth values from the inputs to the current stage and eventually to the output, which is simply the final value of one of the elements; each step in the propagation corresponds to the application of one triple from PSA. A step in the propagation may not yield anything new; indeed, the output of the AND gate could be false, even though the value that we are propagating is in fact already true; that is, although accessibility is never lost once gained, truth values could fluctuate.
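As a small illustration of the sequence representation of Circuit Value and of the propagation of logic values from inputs to output, the following Python sketch evaluates a monotone circuit gate by gate. The names `Gate` and `circuit_value` and the string tags "AND" and "OR" are assumptions made for this example only; they are not notation from the text. The two sample circuits show the fluctuation just described: a bare AND gate for a triple (x, y, z) can output false although z is already true, whereas also taking the old value of z into account keeps it true.

```python
from typing import List, Tuple, Union

# Hypothetical encoding: an entry is either a logic value (bool) or a tuple
# (kind, i, j) with kind in {"AND", "OR"} and 1-based input indices i, j that
# are both smaller than the index of the entry itself.
Gate = Union[bool, Tuple[str, int, int]]


def circuit_value(gates: List[Gate]) -> bool:
    """Return the output of the last entry, storing every intermediate value."""
    if not gates:
        raise ValueError("a circuit needs at least one entry")
    values: List[bool] = []
    for index, gate in enumerate(gates, start=1):
        if isinstance(gate, bool):
            values.append(gate)
            continue
        kind, left, right = gate
        if not (1 <= left < index and 1 <= right < index):
            raise ValueError(f"gate {index} must read from earlier entries")
        a, b = values[left - 1], values[right - 1]
        if kind == "AND":
            values.append(a and b)
        elif kind == "OR":
            values.append(a or b)
        else:
            raise ValueError(f"unknown gate type {kind!r}")
    return values[-1]


# Triple (x, y, z): x is not accessible, y and z already are.
x, y, z = False, True, True
bare_and = [x, y, z, ("AND", 1, 2)]                    # new z := x AND y
and_then_or = [x, y, z, ("AND", 1, 2), ("OR", 3, 4)]   # new z := z OR (x AND y)

print(circuit_value(bare_and))     # the AND gate alone loses the value of z
print(circuit_value(and_then_or))  # keeping the old value preserves it
```

The evaluator keeps one stored value per entry, which matches the remark that the difficulty of the problem lies in the need to store intermediate computations.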

We should therefore combine the "previous" truth value for our element and the output of the AND gate through an OR gate to obtain the new truth value for the element. Thus for each element z of PSA, we will set up a "propagation line" from input to output. This line is initialized to a truth value (true for elements of the initial set, false otherwise) and is updated at each "step," i.e., for each triple of PSA that has z as its third element. The update is accomplished by a circuit fragment made of an AND gate feeding into an OR gate: the AND gate implements the triple while the OR gate combines the potential new information gained through the triple with the existing information. Figure 7.13(a) illustrates the circuit fragment corresponding to the triple (x, y, z). When all propagation is complete, the "line" that corresponds to the element in the target set of PSA is the output of the circuit.

Figure 7.13 The real and "fake" circuit fragments: (a) the real circuit fragment; (b) the fake circuit fragment.

The remaining problem is that we have no idea of the order in which we should process the triples. The order could be crucial, since, if we use a triple too early, it may not produce anything new because one of the first two elements is not yet accessible, whereas, if applied later, it could produce a new accessible element. We have to live with some fixed ordering, yet this ordering could be so bad as to produce only one new accessible element. Thus we have to repeat the process, each time with the same ordering. How many times do we need to repeat it? Since, in order to make a difference, each pass through the ordering must produce at least one newly accessible element, $n - 1$ stages (where n is the total number of elements) always suffice. (Actually, we can use just $n - k - l + 1$ stages, where k is the number of initially accessible elements and l is the size of the target set; however, $n - 1$ is no larger asymptotically and extra stages cannot hurt.) From an instance of PSA with n elements and m triples, we produce a circuit with n "propagation lines," each with a total of $m(n-1)$ propagation steps grouped into $n - 1$ stages. We can view this circuit more or less as a matrix of n rows and $m(n-1)$ columns, in which the values

of the (i+1)st column are derived from those of the ith column by keeping $n - 1$ values unchanged and by using the AND-OR circuit fragment on the one row affected by the triple considered at the (i+1)st propagation stage. If this triple is (x, y, z), then the circuit fragment implements the logical function $z(i+1) = z(i) \vee (x(i) \wedge y(i))$. In order to make index computations perfectly uniform, it is helpful to place a circuit fragment at each row, not just at the affected row; the other $n - 1$ circuit fragments have exactly the same size but do not affect the new value. For instance, they can implement the logical function $z(i+1) = z(i) \vee (z(i) \wedge z(i))$, as illustrated in Figure 7.13(b). Figure 7.14 illustrates the entire construction for a very small instance of PSA, given by X = {a, b, c, d}, S = {a}, T = {d}, and R = {(a, a, b), (a, b, c), (b, c, d)}.

Figure 7.14 The complete construction for a small instance of PSA.

Now the entire transformation can be implemented with nested loops:

    for i=1 to n-1 do        (* n-1 stages for n elements *)
      for j=1 to m do        (* one pass through all m triples *)
        (* current triple is, say, (x,y,z) *)
        for k=1 to n do      (* update column values *)
          if k=z then place the real AND-OR circuit fragment
                 else place the fake AND-OR circuit fragment

Indices of gates are simple products of the three indices and of the constant size of the AND-OR circuit fragment and so can be computed on the fly in the inner loop. Thus the transformation takes only logarithmic space (for the three loop indices and the current triple). Q.E.D.

Our special version of PSA with a single element forming the entire initially accessible set can be used for the reduction, showing that Monotone CV

remains P-complete even when exactly one of the inputs is set to true and all others are set to false. We could also replace our AND-OR circuit fragments by equivalent circuit fragments constructed from a single, universal gate type, i.e., NOR or NAND gates.

Circuit Value is, in effect, a version of Satisfiability where we already know the truth assignment and simply ask whether it satisfies the Boolean formula represented by the circuit. As such, it is perhaps the most important P-complete problem, giving us a full scale of satisfiability problems from P-complete (CV) all the way up to ExpSpace-complete (blind Peek).

Theorem 7.17 Depth-First Search is P-complete.

Proof. The problem is clearly in P, since we need only traverse the graph in depth-first order (a linear-time process), noting which of u or v is first visited. Intuitively, the problem is P-complete because we need to mark visited vertices in order to avoid infinite loops. We prove this problem P-complete by transforming CV to it. (This problem is surprisingly difficult to show complete; the transformation is rather atypical for a P-complete problem, in being at least as complex as a fairly difficult NP-hardness proof.) To simplify the construction, we use the version of CV described earlier in our remarks: the circuit has a single input set to true, has a single output, and is composed entirely of NOR gates.

We create a gadget that we shall use for each gate. This graph fragment has two vertices to connect it to the inputs of the gate and as many vertices as needed for the fan-out of the gate. Specifically, if gate i has inputs In(i, 1) and In(i, 2), and output Out(i), which is used as inputs to m further gates, with indices $j_1, \ldots, j_m$, we set up a gadget with m + 6 vertices. These vertices are: an entrance vertex E(i) and an exit vertex X(i), which we shall use to connect gadgets in a chain; two vertices In(i, 1) and In(i, 2) that correspond to the inputs of the gate; two vertices S(i) and T(i) that serve as beginning and end of an up-and-down chain of m vertices that connect to the outputs of the gate; and m vertices Out(i, 1), ..., Out(i, m) that correspond to the fan-out of the gate. The gadget is illustrated in Figure 7.15.

We can verify that there are two ways of traversing this gadget from the entrance to the exit vertex. One way is to proceed from E(i) through In(i, 1) and In(i, 2) to S(i), then down the chain by picking up all vertices (in other gadgets) that are outputs of this gate, ending at T(i), then moving to X(i). This traversal visits all of the vertices in the gadget, plus all of the vertices in other gadgets (vertices labeled In($j_x$, y), where y is 1 or 2 and $1 \leq x \leq m$) that correspond to the fan-out of the gate. The other way moves from E(i) to T(i), ascends that chain of m vertices without visiting any of the input vertices in other gadgets, reaches S(i), and from there moves to

X(i). This traversal does not visit any vertex corresponding to inputs, not even the two, In(i, 1) and In(i, 2), that belong to the gadget itself. The two traversals visit S(i) and T(i) in opposite order.

Figure 7.15 The gadget for depth-first search. (Vertex labels in the figure: Out(i, 1) = In($j_1$, -), ..., Out(i, m-1) = In($j_{m-1}$, -), Out(i, m) = In($j_m$, -).)

We chain all gadgets together by connecting X(i) to E(i + 1) (of course, the gadgets are already connected through their input/output vertices). The complete construction is easily accomplished in logarithmic space, as it is very uniform: a simple indexing scheme allows us to use a few nested loops to generate the digraph.

We claim that the output of the last gate, gate n, is true if and only if the depth-first search visits S(n) before T(n). The proof is an easy induction: the output of the NOR gate is true if and only if both of its inputs are false, so that, by induction, the vertices In(n, 1) and In(n, 2) of the last gate have not been visited in the traversal of the previous gadgets and thus must be visited in the traversal of the last gadget, which can be done only by using the first of the two possible traversals, which visits S(n) before T(n). The converse is similarly established. Q.E.D.

Since depth-first search is perhaps the most fundamental algorithm for state-space exploration, whether in simple tasks, such as connectivity of graphs, or in complex ones, such as game tree search, this result shows that

parallelism is unlikely to lead to major successes in a very large range of endeavors.

7.3 From Decision to Optimization and Enumeration

With the large existing catalog of NP-complete problems and with the rich hierarchy of complexity classes that surrounds NP, complexity theory has been very successful (assuming that all of the standard conjectures hold true) at characterizing difficult decision problems. But what about search, optimization, and enumeration problems? We deliberately restricted our scope to decision problems at the beginning of this chapter; our purpose was to simplify our study, while we claimed that generality remained unharmed, as all of our work on decision problems would extend to optimization problems. We now examine how this generalization works.

7.3.1 Turing Reductions and Search Problems

As part of our restriction to decision problems, we also chose to restrict ourselves to many-one reductions. Our reasons were: (i) complexity classes are generally closed under many-one reductions, while the use of Turing reductions enlarges the scope to search and optimization problems, problems for which no completeness results could otherwise be obtained, since they do not belong to complexity classes as we have defined them; and (ii) the less powerful many-one reductions could lead to finer discrimination. In considering optimization problems, we use the first argument in reverse, taking advantage of the fact that search and optimization problems can be reduced to decision problems.

We begin by extending the terminology of Chapter 6; there we defined hard, easy, and equivalent problems in terms of complete problems, using the same type of reduction for all four. Since our present intent is to address search and optimization problems, we generalize the concepts of hard, easy, and equivalent problems to search and optimization versions by using Turing reductions from these versions to decision problems. In particular, note that our generalization does not make use of complete problems, so that it is applicable to classes such as PolyL, which do not have such problems. We give the definition only for NP, the class of most interest to us, but obvious analogs exist for any complexity class.

Definition 7.1 A problem is NP-hard if every problem in NP Turing reduces to it in polynomial time; it is NP-easy if it Turing reduces to some problem

in NP in polynomial time; and it is NP-equivalent if it is both NP-hard and NP-easy.

The characteristics of hard problems are respected with this generalization; in particular, an NP-hard problem is solvable in polynomial time only if P equals NP, in which case all NP-easy problems are tractable. Since many-one reductions may be viewed as (special cases of) Turing reductions, any NP-complete problem is automatically NP-equivalent; in fact, NP-equivalence is the generalization through Turing reductions of NP-completeness.² In particular, an NP-equivalent problem is tractable if and only if P equals NP.

² Whether Turing reductions are more powerful than many-one reductions within NP itself is not known; however, Turing reductions are known to be more powerful than many-one reductions within Exp.

Since L, P, Exp, and other such classes are restricted to decision problems, we shall prefix the class name with an F to denote the class of all functions computable within the resource bounds associated with the class; hence FL denotes the class of all functions computable in logarithmic space, FP the class of all functions computable in polynomial time, and so forth. We use this notation only with deterministic classes, since we have not defined nondeterminism beyond Boolean-valued functions.

Exercise 7.5 Prove that FP is exactly the class of P-easy problems.

We argued in a previous section that the decision version of an optimization problem always reduces to the optimization version. Hence any optimization problem, the decision version of which is NP-complete, is itself NP-hard. For instance, Traveling Salesman, Maximum Cut, k-Clustering, and Set Cover are all NP-hard in decision, search, and optimization versions. Can the search and optimization versions of these problems be reduced to their decision versions? For all of the problems that we have seen, the answer is yes. The technique of reduction is always the same: first we find the optimal value of the objective function by a process of binary search (a step that is necessary only for optimization versions); then we build the optimal solution structure piece by piece, verifying each choice through calls to the oracle for the decision version. The following reduction from the optimization version of Knapsack to its decision version illustrates the two phases.

Theorem 7.18 Knapsack is NP-easy.

Proof. Let an instance of Knapsack have n objects, integer-valued weight function w, integer-valued value function v, and weight bound B. Such an instance is described by an input string of size $O(n \log w_{\max} + n \log v_{\max})$,
where $w_{\max}$ is the weight of the heaviest object and $v_{\max}$ is the value of the most valuable object. First note that the value of the optimal solution is larger than zero and no larger than $n\,v_{\max}$; while this range is exponential in the input size, it can be searched with a polynomial number of comparisons using binary search. We use this idea to determine the value of the optimal solution. Our algorithm issues $\log n + \log v_{\max}$ queries to the decision oracle; the value bound is initially set at $\lfloor n\,v_{\max}/2 \rfloor$ and then modified according to the progress of the search. At the outcome of the search, the value of the optimal solution is known; call it $V_{opt}$.

Now we need to ascertain the composition of an optimal solution. We proceed one object at a time: for each object in turn, we determine whether it may be included in an optimal solution. Initially, the partial solution under construction includes no objects. To pick the first object, we try each in turn: when trying object i, we ask the oracle whether there exists a solution to the new knapsack problem formed of (n - 1) objects (all but the ith), with weight bound set to $B - w(i)$ and value bound set to $V_{opt} - v(i)$. If the answer is "no," we try with the next object; eventually the answer must be "yes," since a solution with value $V_{opt}$ is known to exist, and the corresponding object, say j, is included in the partial solution. The weight bound is then updated to $B - w(j)$ and the value bound to $V_{opt} - v(j)$, and the process is repeated until the updated value bound reaches zero. At worst, for a solution including k objects, we shall have examined n - k + 1 objects (and thus called the decision routine n - k + 1 times) for our first choice, n - k for our second, and so on, for a total of $kn - \frac{3}{2}k(k-1)$ calls. Hence the construction phase requires only a polynomial number of calls to the oracle.

The complete procedure is given in Figure 7.16; it calls upon the oracle a polynomial number of times (at most $\log v_{\max} + \log n + n(n+1)/2$ times) and does only a polynomial amount of additional work in between the calls, so that the complete reduction runs in polynomial time. Hence the optimization version of Knapsack Turing reduces to its decision version in polynomial time. Q.E.D.

(We made our reduction unnecessarily complex by overlooking the fact that objects eliminated in the choice of the next object to include need not be considered again. Obviously, this fact is of paramount importance in a search algorithm that attempts to solve the problem, but it makes no difference to the correctness of the proof.) While all reductions follow this model, not all are as obvious; we often have to rephrase the problem to make it amenable to reduction.

    Procedure Knapsack(n, limit: integer; weight, value: intarray;
                       var solution: boolarray);
    (* 1..n is the range of objects to choose from; limit is the weight limit
       on any packing; value, weight are arrays of natural numbers of size n;
       solution is a boolean array of size n: true means that the
       corresponding element is part of the optimal solution.
       The oracle for the decision version takes the range of objects to
       choose from and one more parameter, the target value, and returns true
       if the target value can be reached or exceeded; on an empty range it
       returns true exactly when the target is at most zero. *)
    var i, sum, low, high, mid, optimal,
        currentvalue, currentweight, index: integer;
    begin
      (* The sum of all values is a safe upper bound. *)
      sum := 0;
      for i := 1 to n do sum := sum + value[i];
      (* Use binary search to determine the optimal value. *)
      low := 0; high := sum;
      while low < high do
        begin
          mid := (low + high + 1) div 2;  (* round up so the interval shrinks *)
          if oracle(1, n, limit, weight, value, mid)
            then low := mid               (* value mid can be reached *)
            else high := mid - 1          (* value mid is out of reach *)
        end;
      optimal := low;
      (* Build the optimal knapsack one object at a time. index points to the
         next candidate element for inclusion; currentvalue is the sum of the
         values of the objects included so far; currentweight plays the same
         role for weights. *)
      for i := 1 to n do solution[i] := false;
      currentvalue := 0; currentweight := 0; index := 0;
      while currentvalue < optimal do
        begin
          (* Find next element that can be added: it must fit, and objects
             index+1..n must be able to supply the rest of the value. *)
          index := index + 1;
          if weight[index] <= limit - currentweight then
            if oracle(index + 1, n, limit - currentweight - weight[index],
                      weight, value,
                      optimal - currentvalue - value[index]) then
              begin
                solution[index] := true;
                currentvalue := currentvalue + value[index];
                currentweight := currentweight + weight[index]
              end
        end
    end; (* Knapsack *)

Figure 7.16 Turing reduction from optimization to decision version of Knapsack.
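The two phases of the reduction can also be tried out in a few lines of Python. The following is only a sketch under stated assumptions: the names `decision_oracle` and `knapsack_via_oracle` are hypothetical, and the oracle is simulated by exhaustive search over subsets (so the whole thing runs in exponential time); a true oracle would answer each query in one step. The point is that the driver consults the instance only through oracle calls, first in a binary search and then once per candidate object.

```python
from itertools import combinations


def decision_oracle(weights, values, limit, target):
    """Stand-in for the decision oracle (assumption: brute force over subsets).

    Returns True if some subset has total weight <= limit and
    total value >= target.
    """
    n = len(weights)
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if (sum(weights[i] for i in subset) <= limit
                    and sum(values[i] for i in subset) >= target):
                return True
    return False


def knapsack_via_oracle(weights, values, limit, oracle=decision_oracle):
    """Return (optimal value, chosen indices) using only oracle queries."""
    # Phase 1: binary search for the optimal value in [0, sum of values].
    low, high = 0, sum(values)
    while low < high:
        mid = (low + high + 1) // 2   # round up so the interval always shrinks
        if oracle(weights, values, limit, mid):
            low = mid                 # a packing of value >= mid exists
        else:
            high = mid - 1            # value mid is out of reach
    optimal = low

    # Phase 2: decide each object in turn with one oracle call.
    chosen, cur_w, cur_v = [], 0, 0
    for i in range(len(weights)):
        if cur_v == optimal:
            break
        if cur_w + weights[i] > limit:
            continue                  # object i does not fit at all
        # Can the objects after i supply the value still missing once i is taken?
        if oracle(weights[i + 1:], values[i + 1:],
                  limit - cur_w - weights[i],
                  optimal - cur_v - values[i]):
            chosen.append(i)
            cur_w += weights[i]
            cur_v += values[i]
    return optimal, chosen
```

The construction phase here makes at most one call per object, which reflects the remark that eliminated objects need not be considered again.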

Exercise 7.6 Prove that Minimum Test Set (see Exercise 2.10) is NP-easy. (Hint: a direct approach at first appears to fail, because there is no way to set up new instances with partial knowledge when tests are given as subsets of classes. However, recasting the problem in terms of pairs separated so far and of pairs separated by each test allows an easy reduction.)

The two key points in the proof are: (i) the range of values of the objective function grows at most exponentially with the size of the instance, thereby allowing a binary search to run in polynomial time; and (ii) the completion problem (given a piece of the solution structure, can the piece be completed into a full structure of appropriate value) has the same structure as the optimization problem itself, thereby allowing it to reduce easily to the decision problem. A search or optimization problem is termed self-reducible whenever it reduces to its own decision version. Of course, self-reducibility is not even necessary. In order for the problem to be NP-easy, it is sufficient that it reduces to some NP-complete decision problem (in fact, to some collection of NP-complete problems), as a result of the following lemma, the proof of which we leave to the reader.

Lemma 7.1 Let Π be some NP-complete problem; then an oracle for any problem in NP, or for any finite collection of problems in NP, can be replaced by the oracle for Π with at most a polynomial change in the running time and number of oracle calls.

For instance, Traveling Salesman is NP-easy, although completing a partially built tour differs considerably from building a tour from scratch: in completing a tour, what is needed is a simple path between two distinct cities that includes all remaining cities, not a cycle including all remaining cities.

Exercise 7.7 Prove that Traveling Salesman is NP-easy. (Hint: it is possible to set up a configuration in which obtaining an optimal tour is equivalent to completing a partial tour in the original problem, so that a direct self-reduction works. However, it may be simpler to show that completing a partial tour is itself NP-complete and to reduce the original problem to both its decision version and the completion problem.)

In fact, all NP-complete decision problems discussed in the previous sections are easily seen to have NP-equivalent search or optimization versions, versions that are all self-reducible. Table 7.5 summarizes the steps in a typical Turing reduction from a search or optimization problem to its decision version. While we have not presented any example of hardness or equivalence proofs for problems, the decision versions of which are in classes other than
NP, similar techniques apply. Hence Turing reductions allow us to extend our classification of decision problems to their search or optimization versions with little apparent difficulty. It should be noted, however, that Turing reductions, being extremely powerful, mask a large amount of structure. In the following sections, we examine in more detail the structure of NP-easy decision problems and related questions; researchers have also used special reductions among optimization problems to study the fine structure of NP-easy optimization problems (see the bibliography).

Table 7.5 The structure of a typical proof of NP-easiness.

* If dealing with an optimization problem, establish lower and upper bounds for the value of the objective function at an optimal solution; then use binary search with the decision problem oracle to determine the value of the optimal solution.
* Determine what changes (beyond the obvious) need to be made to an instance when a first element of the solution has been chosen. This step may require considerable ingenuity.
* Build up the solution, one element at a time, from the empty set. In order to determine which element to place in the solution next, try all remaining elements, reflect the changes, and then interrogate the oracle on the existence of an optimal solution to the instance formed by the remaining pieces changed as needed.

7.3.2 The Polynomial Hierarchy

One of the distinctions (among decision problems) that Turing reductions blur is that between a problem and its complement. As previously mentioned, it does not appear that the complement of a problem in NP is necessarily also in NP (unless, of course, the problem is also in P), since negative answers to problems in NP need not have concise certificates, but rather may require an exhaustive elimination of all possible solution structures. Since each problem in NP has a natural complement, we shall set up a new class to characterize these problems.

Definition 7.2 The class coNP is composed of the complements of the problems in NP.

Thus for each problem in NP, we have a corresponding problem in coNP, with the same set of valid instances, but with "yes" and "no" instances reversed by negating the question. For instance, Unsatisfiability is in coNP as is Non-Three-Colorability. As usual, it is conjectured that the new class, coNP, is distinct from the old, NP; this is a stronger conjecture than P ≠ NP,
since NP ≠ coNP implies P ≠ NP, while we could have NP = coNP and yet P ≠ NP.

Exercise 7.8 Prove that NP ≠ coNP implies P ≠ NP.

It is easily seen that, given any problem complete for NP, this problem's complement is complete for coNP; moreover, in view of the properties of complete problems, no NP-complete problem can be in coNP (and vice versa) if the conjecture holds.

Exercise 7.9 Prove that, if an NP-complete problem belongs to coNP, then NP equals coNP.

The world of decision problems, under our new conjecture, is pictured in Figure 7.17.

Figure 7.17 The world of decision problems around NP.

The definition of coNP from NP can be generalized to any nondeterministic class to yield a corresponding co-nondeterministic class. This introduction of co-nondeterminism restores the asymmetry that we have often noted. For instance, while nondeterministic machines, in the rabbit analogy, can carry out an arbitrarily large logical "OR" at no cost, co-nondeterministic machines can do the same for a logical "AND." As another example, problems in nondeterministic classes have concise certificates for their "yes" instances, while those in co-nondeterministic classes have them for their "no" instances. While it is conjectured that NP differs from coNP and that NExp differs from coNExp, it turns out, somewhat surprisingly, that NL equals coNL and, in general, that $NL^k$ equals $coNL^k$, a result known as the Immerman-Szelepcsényi theorem (see Exercise 7.56).

The introduction of co-nondeterministic classes, in particular coNP, prompts several questions. The first is suggested by Figure 7.17: What are the classes NP ∩ coNP and NP ∪ coNP? Is NP ∩ coNP equal to P?

in fact. But. to date. does the minimum cover for the graph have size K? * Minimal Unsatisfiability:Given an instance of SAT. is it the case that it is unsatisfiable. is it satisfiable by exactly one truth assignment? * Traveling Salesman Factor: Given an instance of TSP and a natural number i. Such excellent behavior is taken as evidence that a polynomial-time algorithm is "just around the corner. but that removing any one clause makes it satisfiable? * Unique Satisfiability: Given an instance of SAT. with P c NP n coNP. Duality of linear programs ensures that linear programming is in both NP and coNP. Surprisingly. the following problems: * Optimal Vertex Cover: Given a graph and a natural number K. F]
267
. using the number of calls to the decision oracle as the resource bound! Consider. primality is also in NP-that is. current primality testing algorithms run in time proportional to nloglogn for an input of size n. linear programming is in P. is the length of the optimal tour a multiple of i? (Incidentally." The similar question. this is yet another open question.10 Prove that these problems are NP-equivalent. notice the large variety of decision problems that can be constructed from a basic optimization problem. "Is NP U coNP equal to the set of all NP-easy decision problems?" has a more definite answer: the answer is "no" under the standard conjecture.7. belong to P. as usual in such cases.3 From Decision to Optimization and Enumeration The second question is of importance because it is easier to determine membership in NP n coNP than in P-the latter requires the design of an algorithm. however. although. every prime number has a succinct certificate-so that primality testing is in NP n coNP. In fact. if the extended Riemann hypothesis of number theory is true. membership in NP n coNP appears to be an indication that the problem may. Unfortunately. is clearly in NP: a single nontrivial divisor constitutes a succinct certificate.) Exercise 7. as shown by Khachian [1979] with the ellipsoid algorithm. Even without this hypothesis. the complement of primality. which is hardly worse than polynomial.P were linear programming and primality testing. Compositeness. but the former needs only verification that both "yes" and "no" instances admit succinct certificates. for example. it is strongly suspected that primality is in P. while this has not yet been proved. pay particular attention to the number of oracle calls used in each Turing reduction. the standard conjecture is that the two classes differ. However. Indeed. we can even build a potentially infinite hierarchy between the two classes. Two early candidates for membership in (NP n coNP) . 
then primality is definitely in P.
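To make the role of the number of oracle calls concrete, here is a small Python sketch for Optimal Vertex Cover. It is an illustration under assumptions, not a definitive implementation: the function names are hypothetical, and the oracle for the NP-complete Vertex Cover decision problem ("is there a cover of size at most k?") is simulated by brute force. The "exact answer" question is settled by two oracle queries, one that must come back "yes" and one that must come back "no".

```python
from itertools import combinations


def has_cover_of_size_at_most(edges, vertices, k):
    """Stand-in NP oracle (assumption: brute force over vertex subsets)."""
    vertices = list(vertices)
    for r in range(min(k, len(vertices)) + 1):
        for candidate in combinations(vertices, r):
            chosen = set(candidate)
            if all(u in chosen or v in chosen for u, v in edges):
                return True
    return False


def optimal_vertex_cover_is(edges, vertices, k,
                            oracle=has_cover_of_size_at_most):
    """Decide whether the minimum vertex cover has size exactly k.

    Uses at most two oracle calls: a cover of size <= k must exist,
    and a cover of size <= k - 1 must not.
    """
    if k < 0:
        return False
    at_most_k = oracle(edges, vertices, k)
    at_most_k_minus_1 = k > 0 and oracle(edges, vertices, k - 1)
    return at_most_k and not at_most_k_minus_1
```

For a triangle, `optimal_vertex_cover_is([(1, 2), (2, 3), (1, 3)], {1, 2, 3}, 2)` holds, since two vertices cover all three edges and one vertex cannot.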

For each of the first three problems, the set of "yes" instances can be obtained as the intersection of two sets of "yes" instances, one of a problem in NP and the other of a problem in coNP. Such problems are common enough (it is clear that each of these three problems is representative of a large class: "exact answer" for the first, "criticality" for the second, and "uniqueness" for the third) that a special class has been defined for them.

Definition 7.3 The class DP is the class of all sets, Z, that can be written as Z = X ∩ Y, for X ∈ NP and Y ∈ coNP.

From its definition, we conclude that DP contains both NP and coNP; in fact, it is conjectured to be a proper superset of NP ∪ coNP, as we can show that DP = NP ∪ coNP holds if and only if NP = coNP does (see Exercise 7.41). The separation between these classes can be studied through complete problems: the first two problems are many-one complete for DP while the fourth is many-one complete for the class of NP-easy decision problems. (The exact situation of Unique Satisfiability is unknown: along with most uniqueness versions of NP-complete problems, it is in DP, cannot be in NP unless NP equals coNP, yet is not known to be DP-complete.)

Exercise 7.11 Let us relax Minimal Unsatisfiability by not requiring that the original instance be unsatisfiable; prove that the resulting problem is simply NP-complete.

The basic DP-complete problem is, of course, a satisfiability problem, namely, the SAT-UNSAT problem. An instance of this problem is given by two sets of clauses on two disjoint sets of variables and the question asks whether or not the first set is satisfiable and the second unsatisfiable.

Theorem 7.19 SAT-UNSAT is DP-complete.

Proof. We need to show that SAT-UNSAT is in DP and that any problem in DP many-one reduces to it in polynomial time. The first part is easy: SAT-UNSAT is the intersection of a version of SAT (where the question is "Is the collection of clauses represented by the first half of the input satisfiable?") and a version of UNSAT (where the question is "Is the collection of clauses represented by the second half of the input unsatisfiable?"). The second part comes down to figuring out how to use the knowledge that (i) any problem X ∈ DP can be written as the intersection $X = Y_1 \cap Y_2$ of a problem $Y_1 \in$ NP and a problem $Y_2 \in$ coNP and (ii) SAT is NP-complete while UNSAT is coNP-complete, so that we have $Y_1 \le_m^p$ SAT and $Y_2 \le_m^p$ UNSAT. We can easily reduce SAT to SAT-UNSAT by the simple device of tacking onto the SAT instance a known unsatisfiable set of clauses on a different set of variables; similarly, we can easily reduce UNSAT to
SAT-UNSAT. It is equally easy to reduce SAT and UNSAT simultaneously to SAT-UNSAT: just tack the UNSAT instance onto the SAT one; the resulting transformed instance is a "yes" instance if and only if both original instances are "yes" instances. Now our reduction is very simple: given an instance of problem X, say x, we know that it is also an instance of problems $Y_1$ and $Y_2$. So we apply to x the known many-one reductions from $Y_1$ to SAT, yielding instance $x_1$, and from $Y_2$ to UNSAT, yielding instance $x_2$. We then concatenate these two instances into the new instance $z = x_1 \# x_2$ of SAT-UNSAT. The reduction from x to z is a many-one polynomial time reduction with the desired properties. Q.E.D.

Another question raised by Figure 7.17 is whether NP-easy problems constitute the set of all problems solvable in polynomial time if P equals NP. As long as the latter equality remains possible, characterizing the set of all problems that would thereby be tractable is of clear interest. Any class with this property, i.e., any class (such as NP, coNP, and DP) that collapses into P if P equals NP, exists as a separate entity only under our standard assumption. In fact, there is a potentially infinite hierarchy of such classes, known as the polynomial hierarchy. In order to understand the mechanism for its construction, consider the class of all NP-easy decision problems: it is the class of all decision problems solvable in polynomial time with the help of an oracle for some suitable NP-complete problem. With an oracle for one NP-complete problem, we can solve in polynomial time any problem in NP, since all can be transformed into the given NP-complete problem; hence an oracle for some NP-complete problem may be considered as an oracle for all problems in NP. (This is the substance of Lemma 7.1.) Thus the class of NP-easy decision problems is the class of problems solvable in polynomial time with the help of an oracle for the class NP; we denote this symbolically by $P^{NP}$, using a superscript for the oracle. If P were equal to NP, an oracle for NP would just be an oracle for P, which adds no power whatsoever, since we can always solve problems in P in polynomial time; hence we would have $P^{NP} = P^{P} = P$, as expected.

Assuming that P and NP differ, we can combine nondeterminism, co-nondeterminism, and the oracle mechanism to define further classes. To begin with, rather than using a deterministic polynomial-time Turing machine with our oracle for NP, what if we used a nondeterministic one? The resulting class would be denoted by $NP^{NP}$. As with $P^{NP}$, this class depends on our standard conjectures for its existence: if P were equal to NP, we would have $NP^{NP} = NP^{P} = NP = P$. Conversely, since we have $NP \subseteq NP^{NP}$ (because an oracle can only add power), if problems in $NP^{NP}$ were solvable in polynomial
time, then so would problems in NP, so that we must then have P = NP. In other words, problems in $NP^{NP}$ are solvable in polynomial time if and only if P equals NP, just like the NP-easy problems! Yet we do not know that such problems are NP-easy, so that $NP^{NP}$ may be a new complexity class.

Exercise 7.12 Present candidates for membership in $NP^{NP} - P^{NP}$; such candidates must be solvable with the help of a guess, a polynomial amount of work, and a polynomial number of calls to oracles for NP.

We can now define the class consisting of the complements of problems in $NP^{NP}$, a class which we denote by $coNP^{NP}$. These two classes are similar to NP and coNP but are one level higher in the hierarchy. Pursuing the similarity, we can define a higher-level version of NP-easy problems, the $NP^{NP}$-easy problems. Another level can now be defined on the basis of these $NP^{NP}$-easy problems, so that a potentially infinite hierarchy can be erected. Problems at any level of the hierarchy have the property that they can be solved in polynomial time if and only if P equals NP, hence the name of the hierarchy.

To simplify notation, the three types of classes in the hierarchy are referred to by Greek letters indexed by the level of the class in the hierarchy. The deterministic classes are denoted by Δ, the nondeterministic ones by Σ, and the co-nondeterministic ones by Π. Since these are the names used for the classes in Kleene's arithmetic hierarchy (which is indeed similar), a superscript p is added to remind us that these classes are defined with respect to polynomial time bounds. Thus P, at the bottom, is $\Delta_0^p$ (and also, because an oracle for P is no better than no oracle at all, $\Delta_1^p$, $\Sigma_0^p$, and $\Pi_0^p$), while NP is $\Sigma_1^p$ and coNP is $\Pi_1^p$. At the next level, the class of NP-easy decision problems, $P^{NP}$, is $\Delta_2^p$, while $NP^{NP}$ is $\Sigma_2^p$ and $coNP^{NP}$ is $\Pi_2^p$.

Definition 7.4 The polynomial hierarchy is formed of three types of classes, each defined recursively: the deterministic classes $\Delta_k^p$, the nondeterministic classes $\Sigma_k^p$, and the co-nondeterministic classes $\Pi_k^p$. These classes are defined recursively as:

$$\Delta_0^p = \Sigma_0^p = \Pi_0^p = P$$
$$\Delta_{k+1}^p = P^{\Sigma_k^p}$$
$$\Sigma_{k+1}^p = NP^{\Sigma_k^p}$$
$$\Pi_{k+1}^p = \text{co-}\Sigma_{k+1}^p$$

The infinite union of these classes is denoted PH.

The situation at a given level of the hierarchy is illustrated in Figure 7.18; compare this figure with Figure 7.17.

Figure 7.18 The polynomial hierarchy: one level.

Complete problems are known at each level of the hierarchy, although, if the hierarchy is infinite, no complete problem can exist for PH itself (as is easily verified using the same reasoning as for PolyL). Complete problems for the $\Sigma_k^p$ and $\Pi_k^p$ classes are just SAT problems with a suitable alternation of existential and universal quantifiers. For instance, the following problem is complete for $\Sigma_2^p$. An instance is given by a Boolean formula in the variables $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$; the question is "Does there exist a truth assignment for the $x_i$ variables such that, for any truth assignment for the $y_i$ variables, the formula evaluates to true?" In general, a complete problem for $\Sigma_k^p$ has k alternating quantifiers, with the outermost existential, while a complete problem for $\Pi_k^p$ is similar but has a universal outermost quantifier. A proof of completeness is not very difficult but is rather long and not particularly enlightening, for which reasons we omit it. Note the close connection between these complete problems and QSAT: the only difference is in the pattern of alternation of quantifiers, which is fixed for each complete problem within the polynomial hierarchy but unrestricted for QSAT, which gives another way of verifying that PH is contained within PSPACE.

An alternate characterization of problems within the polynomial hierarchy can be based on certificates. For instance, a problem A is in $\Sigma_2^p$ if there exist a deterministic Turing machine M and a polynomial p() such that, for each yes instance x of A, there exist a concise certificate $c_x$ and an exponential family of concise certificates $F_x$ such that M accepts each triple of inputs $(x, c_x, z)$, for any string $z \in F_x$, in time bounded by $p(|x|)$. The $c_x$ certificate gives the values of the existentially quantified variables of x and the family $F_x$ describes all possible truth assignments for the universally quantified variables. Similar characterizations obtain for all nondeterministic and co-nondeterministic classes within PH.
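The quantified question at the second level can be made concrete with a small Python sketch. It is only an illustration under assumptions: the function name `exists_forall` and the representation of the formula as a Python callable on two tuples of Booleans are hypothetical choices, and the search is plain exhaustive enumeration, so it takes exponential time. The outer loop plays the role of the certificate $c_x$ (an assignment to the existential variables); the inner `all` ranges over the family of assignments to the universal variables.

```python
from itertools import product


def exists_forall(formula, n_x, n_y):
    """Return True if some assignment to the x variables makes the formula
    true for every assignment to the y variables.

    formula: callable taking (xs, ys), two tuples of booleans.
    """
    for xs in product([False, True], repeat=n_x):        # guess the certificate
        if all(formula(xs, ys)
               for ys in product([False, True], repeat=n_y)):  # check all y
            return True
    return False
```

For example, `exists_forall(lambda xs, ys: (xs[0] or ys[0]) and (xs[0] or not ys[0]), 1, 1)` holds, because setting the single x variable to true satisfies the formula whatever y is, whereas the formula `xs[0] == ys[0]` admits no such x.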

While these complete problems are somewhat artificial, more natural problems have been shown complete for various classes within PH. We had already mentioned that Traveling Salesman Factor is complete for $\Delta_2^p$; so are Unique Traveling Salesman Tour (see Exercise 7.48) and Integer Expression Inequivalence (see Exercise 7.50). A natural problem that can be shown complete for $\Delta_3^p$ is the Double Knapsack problem.³ An instance of this problem is given by an n-tuple of natural numbers $x_1, x_2, \ldots, x_n$, an m-tuple of natural numbers $y_1, y_2, \ldots, y_m$, and natural numbers N and k; the question is "Does there exist a natural number M (to be defined) with a 1 as its kth bit?" To define the number M, we let S be the set of natural numbers such that, for each $s \in S$, there exists a subset $I_s \subseteq \{1, 2, \ldots, n\}$ with $\sum_{i \in I_s} x_i = s$ and there does not exist any subset $J_s \subseteq \{1, 2, \ldots, m\}$ with $\sum_{j \in J_s} y_j = s$. M is then the largest number in S that does not exceed N, or 0 if S is empty.

³ The problem might more naturally be called Double Subset Sum; however, cryptographers, who devised the problem to avoid some of the weaknesses of Subset Sum as a basis for encryption, generally refer to the encryption schemes based on Subset Sum as knapsack schemes.

Overall, the polynomial hierarchy is an intriguing theoretical construct and illustrates the complexity of the issues surrounding our fundamental question of the relationship between P and NP. As we shall see in Chapter 8, several questions of practical importance are equivalent to questions concerning the polynomial hierarchy. It is not known (obviously, since such knowledge would solve the question P vs. NP) whether the hierarchy is truly infinite or collapses into some $\Sigma_k^p$. In particular, we could have P ≠ NP but NP = coNP, with the result that the whole hierarchy would collapse into NP. The hierarchy also answers the question that we asked earlier: "What are the problems solvable in polynomial time if P equals NP?" Any problem that is Turing-reducible to some problem in PH (we could call such a problem PH-easy) possesses this property, so that, if the hierarchy does not collapse, such problems form a proper superset of the NP-easy problems.

7.3.3 Enumeration Problems

There remains one type of problem to consider: enumeration problems. Every problem that we have seen, whether decision, search, or optimization, has an enumeration version, asking how many (optimal or feasible) solutions exist for a given instance. While decision problems may be regarded as computing the Boolean-valued characteristic function of a set, enumeration problems include all integer-valued functions. There is no doubt that
enumeration versions are as hard as decision versions: knowing how many solutions exist, we need only check whether the number is zero in order to solve the decision version. In most cases, we would consider enumeration versions to be significantly harder than decision, search, or optimization versions; after all, knowing how to find, say, one Hamiltonian circuit for a graph does not appear to help very much in determining how many distinct Hamiltonian circuits there are in all, especially since there may be an exponential number of them. Moreover, enumeration problems appear difficult even when the corresponding decision problems are simple: counting the number of different perfect matchings or of different cycles or of different spanning trees in a graph seems distinctly more complex than the simple task of finding one such perfect matching or cycle or spanning tree. However, some enumeration tasks can be solved in polynomial time; counting the number of spanning trees of a graph and counting the number of Eulerian paths of a graph are two nontrivial examples. Simpler examples include all problems, the optimization version of which can be solved in polynomial time using dynamic programming techniques.

Exercise 7.13 Use dynamic programming to devise a polynomial-time algorithm that counts the number of distinct optimal solutions to the matrix chain product problem.

Definition 7.5 An integer-valued function f belongs to #P (read "number P" or "sharp P") if there exist a deterministic Turing machine T and a polynomial p() such that, for each input string x, the value of the function, f(x), is exactly equal to the number of distinct concise certificates for x (that is, strings $c_x$ such that T, started with x and $c_x$ on its tape, stops and accepts x after at most $p(|x|)$ steps).

In other words, there exists a nondeterministic polynomial-time Turing machine that can accept x in exactly f(x) different ways. By definition, the enumeration version of any problem in NP is in #P. Completeness for #P is defined in terms of polynomial-time Turing reductions rather than in terms of polynomial transformations, since the problems are not decision problems. However, polynomial transformations may still be used if they preserve the number of solutions, that is, if the number of solutions to the original instance equals the number of solutions of the transformed instance. Such transformations are called parsimonious. Due to its properties, a parsimonious transformation not only is a polynomial transformation between two decision problems, but also automatically induces a Turing reduction between the associated enumeration problems. Hence parsimonious transformations are the tool of
choice for proving #P-completeness. In fact, it is enough that the transformation be weakly parsimonious, allowing the number of solutions to the original problem to be computed in polynomial time from the number of solutions to the transformed problem.

Most proofs of NP-completeness, including Cook's proof, can be modified so as to make the transformation weakly parsimonious. That the generic transformation used in the proof of Cook's theorem can be made such is particularly important. Observe that the transformations we used in Section 7.1 in the proofs of NP-completeness of MxC, HC, and Partition are already parsimonious; moreover, all restriction proofs use an identity transformation, which is strictly parsimonious. The remaining transformations involved can be made weakly parsimonious, so that all NP-complete problems of Section 7.1 have #P-complete enumeration versions.

Indeed, were #P-complete problems limited to the enumeration versions of NP-complete problems, they would be of very little interest. However, some enumeration problems associated with decision problems in P are nevertheless #P-complete. One such problem is counting the number of perfect matchings in a bipartite graph. We know that finding one such matching (or determining that none exists) is solvable in low polynomial time, yet counting them is #P-complete. A closely related problem is computing the permanent of a square matrix. Recall that the permanent of an n × n matrix $A = (a_{ij})$ is the number $\sum_{\pi} \prod_{i=1}^{n} a_{i\pi(i)}$, where the sum is taken over all permutations π of the indices. If the matrix is the adjacency matrix of a graph, then all of its entries are either 0 or 1; in consequence, each product term in the definition of the permanent equals either 0 or 1. Hence computing the permanent of a 0/1 matrix may be viewed as counting the number of nonzero product terms. In the adjacency matrix of a bipartite graph, each product term equals 1 if and only if the corresponding permutation of indices denotes a perfect matching; hence counting the number of perfect matchings in a bipartite graph is equivalent to computing the permanent of the adjacency matrix of that graph. Although the permanent of a matrix is defined in a manner similar to the determinant (in fact, in the definition in terms of cofactors, the only difference derives from the lack of alternation of signs in the definition of the permanent), mathematicians have long known how to compute the determinant in low polynomial time, while they have so far been unable to devise any polynomial algorithm to compute the permanent. The proof
Proving Problems Hard choice in proving #P-completeness results for NP-hard problems.
restricting ourselves to parsimonious transformations for this purpose is unnecessary. as it gives us our first #P-complete problem: counting the number of satisfying truth assignments for a collection of clauses. the same statement can be made about all known NP-complete problems.
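The correspondence between the permanent of a 0/1 matrix and the number of perfect matchings can be checked on small cases. The following Python sketch is only an illustration, not an efficient method: both functions take exponential time, and the names `permanent` and `count_perfect_matchings` are hypothetical. It assumes that row i of the matrix stands for the i-th vertex on one side of the bipartite graph and column j for the j-th vertex on the other side.

```python
from itertools import permutations


def permanent(a):
    """Permanent of a square matrix, expanded directly over all n! permutations."""
    n = len(a)
    total = 0
    for pi in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= a[i][pi[i]]
            if term == 0:  # this product term contributes nothing
                break
        total += term
    return total


def count_perfect_matchings(a):
    """Count perfect matchings by backtracking: match left vertex i to an
    unused right vertex j whenever a[i][j] is 1."""
    n = len(a)

    def extend(i, used):
        if i == n:  # every left vertex has been matched
            return 1
        return sum(extend(i + 1, used | {j})
                   for j in range(n) if a[i][j] and j not in used)

    return extend(0, frozenset())


# Small example: three vertices on each side.
a = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
print(permanent(a), count_perfect_matchings(a))
```

On any 0/1 matrix the two functions should agree, since each nonzero product term of the permanent selects exactly one perfect matching.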

The proof that computing the permanent is a #P-complete problem provides the first evidence that no such algorithm may exist.

How do #P-complete problems compare with other hard problems? Since they are all NP-hard (because the decision version of an NP-complete problem Turing reduces to its #P-complete enumeration version), they cannot be solved in polynomial time unless P equals NP. However, even if P equals NP, #P-complete problems may remain intractable. In other words, #P-hardness appears to be very strong evidence of intractability. While it is difficult to compare #P, a class of functions, with our other complexity classes, which are classes of sets, we can instead use the #P-easy decision problems, that is, the class $\mathrm{P}^{\#\mathrm{P}}$ in our oracle-based notation. It is easy to see that this class is contained in PSPACE; it contains PH, a result that we shall not prove.

It should be pointed out that many counting problems, while in #P and apparently hard, do not seem to be #P-complete. Some are NP-easy (such as counting the number of distinct isomorphisms between two graphs, which is no harder than deciding whether the two graphs are isomorphic; see Exercise 7.54). Others are too restricted (such as counting how many graphs of n vertices possess a certain property, problems that have only one instance for each value of n; see Exercise 7.55).

7.4 Exercises

Exercise 7.14* Prove that these two variants of 3SAT are both in P.
* Strong 3SAT requires that at least two literals be set to true in each clause.
* Odd 3SAT requires that an odd number of literals be set to true in each clause.

Exercise 7.15* Does Vertex Cover remain NP-complete when we also require that each edge be covered by exactly one vertex? (Clearly, not all graphs have such covers, regardless of the value of the bound; for instance, a single triangle cannot be covered in this manner.)

Exercise 7.16 Consider a slight generalization of Maximum Cut, in which each edge has a positive integer weight and the bound is on the sum of the weights of the cut edges. Will the naive transformation first attempted in our proof for MxC work in this case, with the following change: whenever multiple edges between a pair of vertices arise, say k in number, replace them with a single edge of weight k?

Exercise 7.17 What is wrong with this reduction from G3C to Minimum Vertex-Deletion Bipartite Subgraph (delete at most K vertices such that the resulting graph is bipartite)?
* Given an instance G = (V, E) of G3C, just let the instance of MVDBS be G itself, with bound $K = \lfloor |V|/3 \rfloor$.
Identify what makes the transformation fail and provide a specific instance of G3C that gets transformed into an instance with opposite answer.

Exercise 7.18 Prove that Exact Cover by Four-Sets is NP-complete. An instance of the problem is given by a set S of size 4k for some positive integer k and by a collection of subsets of S, each of size 4; the question is "Do there exist k subsets in the collection that together form a partition of S?" (Hint: use a transformation from X3C.)

Exercise 7.19 Prove that Cut into Acyclic Subgraphs is NP-complete. An instance of the problem is given by a directed graph; the question is "Can the set of vertices be partitioned into two subsets such that each subset induces an acyclic subgraph?" (Hint: transform one of the satisfiability problems.)

Exercise 7.20 Prove that Three-Dimensional Matching is NP-complete. An instance of the problem is given by three sets of equal cardinality and a set of triples such that the ith element of each triple is an element of the ith set, $1 \le i \le 3$; the question is "Does there exist a subset of triples such that each set element appears exactly once in one of the triples?" Such a solution describes a perfect matching of all set elements into triples. (Hint: this problem is very similar to X3C.)

Exercise 7.21 Prove that both Vertex- and Edge-Dominating Set are NP-complete. An instance of the problem is given by an undirected graph G = (V, E) and a natural number B; the question is "Does there exist a subset of vertices $V' \subseteq V$ of size at most B such that every vertex (respectively, edge) of the graph is dominated by at least one vertex in V'?" We say that a vertex dominates another if there exists an edge between the two and we say that a vertex dominates an edge if there exist two edges that complete the triangle (from the vertex to the two endpoints of the edge). (Hint: use a transformation from VC; the same construction should work for both problems.)

Exercise 7.22 (Refer to the previous exercise.) Let us further restrict Vertex-Dominating Set by requiring that (i) the dominating set, V', is an independent set (no edges between any two of its members) and (ii) each

vertex in $V - V'$ is dominated by at most one vertex in V'. Prove that the resulting problem remains NP-complete. (Hint: use a transformation from Positive 1in3SAT.)

Exercise 7.23 Prove that Longest Common Subsequence is NP-complete. An instance of this problem is given by an alphabet $\Sigma$, a finite set of strings on the alphabet, $R \subseteq \Sigma^*$, and a natural number K. The question is "Does there exist a string, $w \in \Sigma^*$, of length at least K, that is a subsequence of each string in R?" (Hint: use a transformation from VC.)

Exercise 7.24 Prove that the decision version of Optimal Identification Tree is NP-complete. An instance of the optimization problem is given by a collection of m categories $O_1, O_2, \ldots, O_m$ and a collection of n dichotomous tests $\{T_1, T_2, \ldots, T_n\}$, each of which is specified by an $m \times 1$ binary vector of outcomes. The optimization problem is to construct a decision tree with minimal average path length:
$$\sum_{i=1}^{m} \bigl(\mathrm{depth}(O_i) - 1\bigr)$$

where $\mathrm{depth}(O_i) - 1$ is the number of tests that must be performed to identify an object in category $O_i$. The tree has exactly one leaf for each category; each interior node corresponds to a test. While the same test cannot occur twice on the same path in an optimal tree, it certainly can occur several times in the tree. (Hint: use a transformation from X3C.)

Exercise 7.25 Prove that the decision version of Minimum Test Set (see Exercises 2.10 and 7.6) is NP-complete. (Hint: use a transformation from X3C.)

Exercise 7.26 Prove that Steiner Tree in Graphs is NP-complete. An instance of the problem is given by a graph with a distinguished subset of vertices; each edge has a positive integer length and there is a positive integer bound. The question is "Does there exist a tree that spans all of the vertices in the distinguished subset (and possibly more) such that the sum of the lengths of all the edges in the tree does not exceed the given bound?" (Hint: use a transformation from X3C.)

Exercise 7.27 Although finding a minimum spanning tree is a well-solved problem, finding a spanning tree that meets an added or different constraint is almost always NP-complete. Prove that the following problems (in their decision version, of course) are NP-complete. (Hint: four of them can be restricted to Hamiltonian Path; use a transformation from X3C for the other two.)

1. Bounded-Diameter Spanning Tree: Given a graph with positive integer edge lengths, given a positive integer bound D, no larger than the number of vertices in the graph, and given a positive integer bound K, does there exist a spanning tree for the graph with diameter (the number of edges on the longest simple path in the tree) no larger than D and such that the sum of the lengths of all edges in the tree does not exceed K?
2. Bounded-Degree Spanning Tree: This problem has the same statement as Bounded-Diameter Spanning Tree, except that the diameter bound is replaced by a degree bound (that is, no vertex in the tree may have a degree larger than D).
3. Maximum-Leaves Spanning Tree: Given a graph and an integer bound no larger than the number of vertices, does the graph have a spanning tree with no fewer leaves (nodes of degree 1 in the tree) than the given bound?
4. Minimum-Leaves Spanning Tree: This problem has the same statement as Maximum-Leaves Spanning Tree but asks for a tree with no more leaves than the given bound.
5. Spanning Tree with Specified Leaves: Given a graph and a distinguished subset of vertices, does the graph have a spanning tree, the leaves of which form the given subset?
6. Isomorphic Spanning Tree: Given a graph and a tree, does the graph have a spanning tree isomorphic to the given tree?

Exercise 7.28 Like spanning trees, two-colorings are easy to obtain when not otherwise restricted. The following two versions of the problem, however, are NP-complete. Both have the same instances, composed of a graph and a positive integer bound K.
* Minimum Vertex-Deletion Bipartite Subgraph asks whether or not the graph can be made bipartite by deleting at most K vertices.
* Minimum Edge-Deletion Bipartite Subgraph asks whether or not the graph can be made bipartite by deleting at most K edges.
(Hint: use a transformation from Vertex Cover for the first version and one from MxC for the second.)
Exercise 7.29 Prove that Monochromatic Vertex Triangle is NP-complete. An instance of the problem is given by a graph; the question is "Can the graph be partitioned into two vertex sets such that neither induced subgraph contains a triangle?" The partition can be viewed as a two-coloring of the vertices; in this view, forbidden triangles are those with all three vertices of the same color. (Hint: use a transformation from Positive NAE3SAT; you

must design a small gadget that ensures that its two end vertices always end up on the same side of the partition.)

Exercise 7.30* Repeat the previous exercise, but for Monochromatic Edge Triangle, where the partition is into two edge sets. The same starting problem and general idea for the transformation will work, but the gadget must be considerably more complex, as it must ensure that its two end edges always end up on the same side of the partition. (The author's gadget uses only three extra vertices but a large number of edges.)

Exercise 7.31 Prove that Consecutive Ones Submatrix is NP-complete. An instance of the problem is given by an $m \times n$ matrix with entries drawn from {0, 1} and a positive integer bound K; the question is "Does the matrix contain an $m \times K$ submatrix that has the 'consecutive ones' property?" A matrix has that property whenever its columns can be permuted so that, in each row, all the 1s occur consecutively. (Hint: use a transformation from Hamiltonian Path.)

Exercise 7.32* Prove that Comparative Containment is NP-complete. An instance of the problem is given by a set, S, and two collections of subsets of S, say $B \subseteq 2^S$ and $C \subseteq 2^S$; the question is "Does there exist a subset, $X \subseteq S$, obeying $|\{b \in B \mid X \subseteq b\}| \ge |\{c \in C \mid X \subseteq c\}|$, that is, such that X is contained (as a set) in at least as many subsets in the collection B as in subsets in the collection C?" Use a transformation from Vertex Cover. In developing the transformation, you must face two difficulties typical of a large number of reductions. One difficulty is that the original problem contains a parameter (the bound on the cover size) that has no corresponding part in the target problem; the other difficulty is the reverse: the target problem has two collections of subsets, whereas the original problem only has one.
The first difficulty is overcome by using the bound as part of the transformation, for instance by using it to control the number of elements, of subsets, of copies, or of similar constructs; the second is overcome much as was done in our reduction to SF-REI, by making one collection reflect the structure of the instance and making the other be more general to serve as a foil.

Exercise 7.33* Prove that Betweenness is NP-complete. An instance of this problem is given by a set, S, and a collection of ordered triples from the set, $C \subseteq S \times S \times S$; the question is "Does there exist an indexing of S, $i\colon S \to \{1, 2, \ldots, |S|\}$, such that, for each triple, $(a, b, c) \in C$, we have


either i(a) < i(b) < i(c) or i(c) < i(b) < i(a)?" (Hint: there is a deceptively simple transformation, with two triples per clause, from Positive NAE3SAT.)

Exercise 7.34* Given some NP-complete problem by its certificate-checking machine M, define the following language:
$$L = \{(M, p(), y) \mid \exists x \text{ such that } M \text{ accepts } (x, y) \text{ in } p(|x|) \text{ time}\}$$
In other words, L is the set of all machine/certificate pairs such that the certificate leads the machine to accept some input string or other. What can you say about the complexity of membership in L?

Exercise 7.35 Prove that Element Generation is P-complete. An instance of the problem consists of a finite set S, a binary operation on S denoted $\circ\colon S \times S \to S$, a subset $G \subseteq S$ of generators, and a target element $t \in S$. The question is "Can the target element be produced from the generators through the binary operation?" In other words, does there exist some parenthesized expression involving only the generators and the binary operation that evaluates to the target element? (Hint: the binary operation $\circ$ is not associative; if it were, the problem would become simply NL-complete, for which see Exercise 7.38.)

Exercise 7.36 Prove that CV (but not Monotone CV!) remains P-complete even when the circuit is planar.

Exercise 7.37 Prove that Digraph Reachability is (logspace) complete for NL (you must use a generic transformation, since this is our first NL-complete problem). An instance of the problem is given by a directed graph (a list of vertices and list of arcs); the question is "Can vertex n (the last in the list) be reached from vertex 1?"

Exercise 7.38 Using the result of the previous exercise, prove that Associative Generation is NL-complete. This problem is identical to Element Generation (see Exercise 7.35), except that the operation is associative.

Exercise 7.39 Prove that Two-Unsatisfiability is NL-complete. An instance of this problem is an instance of 2SAT; the question is whether or not the collection of clauses is unsatisfiable.

Exercise 7.40 Prove that Optimal Identification Tree (see Exercise 7.24 above) is NP-equivalent.

Exercise 7.41* Prove that the following three statements are equivalent:
* = $\mathrm{NP} \cup \mathrm{coNP}$
* $\mathrm{DP} = \mathrm{NP} \cup \mathrm{coNP}$
* $\mathrm{NP} = \mathrm{coNP}$
The only nontrivial implication is from the second to the third statement. Use a nondeterministic polynomial-time many-one reduction from the known DP-complete problem SAT-UNSAT to a known coNP-complete problem. Since the reduction clearly leaves NP unchanged, it shows that SAT-UNSAT belongs to NP only if NP = coNP. Now define a mirror image that reduces SAT-UNSAT to a known NP-complete problem.

Exercise 7.42 Prove that, if we had a solution algorithm that ran in $O(n^{\log n})$ time for some NP-complete problem, then we could solve any problem in PH in $O(n^{\log^k n})$ time, for suitable k (which depends on the level of the problem within PH).

Exercise 7.43 Prove that the enumeration version of SAT is #P-complete (that is, show that the generic transformation used in the proof of Cook's theorem can be made weakly parsimonious).

Exercise 7.44 Consider the following three decision problems, all variations on SAT; an instance of any of these problems is simply an instance of SAT.
* Does the instance have at least three satisfying truth assignments?
* Does the instance have at most three satisfying truth assignments?
* Does the instance have exactly three satisfying truth assignments?
Characterize as precisely as possible (using completeness proofs where possible) the complexity of each version.

Exercise 7.45* Prove that Unique Satisfiability (described in Section 7.3.2) cannot be in NP unless NP equals coNP.

Exercise 7.46* Prove that Minimal Unsatisfiability (described in Section 7.3.2) is DP-complete. (Hint: Develop separate transformations to this problem from SAT and from UNSAT; then show that you can reduce two instances of this problem to a single one. The combined reduction is then a valid reduction from SAT-UNSAT. A reduction from either SAT or UNSAT can be developed by adding a large collection of new variables and clauses so that specific "regions" of the space of all possible truth assignments are covered by a unique clause.)
Exercise 7.47** Prove that Optimal Vertex Cover (described in Section 7.3.2) is DP-complete. (Hint: develop separate transformations from SAT and UNSAT and combine them into a single reduction from SAT-UNSAT.)

Exercise 7.48** Prove that Unique Traveling Salesman Tour is complete for $\Delta_2^p$. An instance of this problem is given by a list of cities and a (symmetric) matrix of intercity distances; the question is whether or not the optimal tour is unique. (Hint: a problem cannot be complete for $\Delta_2^p$ unless solving it requires a supralogarithmic number of calls to the decision oracle. Since solving this problem can be done by finding the value of the optimal solution and then making two oracle calls, the search must take a supralogarithmic number of steps. Thus the distances produced in your reduction must be exponentially large.)

Exercise 7.49 Prove that Minimal Boolean Expression is in $\Pi_2^p$. An instance of the problem is given by a Boolean formula and the question is "Is this formula the shortest among all equivalent Boolean formulae?" Does the result still hold if we also require that the minimal formula be unique?

Exercise 7.50** Prove that Integer Expression Inequivalence is complete for $\Sigma_2^p$. This problem is similar to SF-REI but is given in terms of arithmetic rather than regular expressions. An instance of the problem is given by two integer expressions. An integer expression is defined inductively as follows. The binary representation of an integer n is the integer expression denoting the set {n}; if e and f are two integer expressions denoting the sets E and F, then $e \cup f$ is an integer expression denoting the set $E \cup F$ and e + f is an integer expression denoting the set $\{i + j \mid i \in E \text{ and } j \in F\}$. The question is "Do the two given expressions denote different sets?" (In contrast, note that Boolean Expression Inequivalence is in NP.)

Exercise 7.51* Let $\mathcal{C}$ denote a complexity class and M a Turing machine in that class. Prove that the set $\{(M, x) \mid M \in \mathcal{C} \text{ and } M \text{ accepts } x\}$ is undecidable for $\mathcal{C} = \mathrm{NP} \cap \mathrm{coNP}$. Contrast this result with that of Exercise 6.25; classes of complexity for which this set is undecidable are often called semantic classes.
Exercise 7.52 Refer to Exercises 6.25 and 7.51, although you need not have solved them in order to solve this exercise. If the bounded halting problem for NP is NP-complete but that for $\mathrm{NP} \cap \mathrm{coNP}$ is undecidable, why can we not conclude immediately that NP differs from $\mathrm{NP} \cap \mathrm{coNP}$ and thus, in particular, that P is unequal to NP?

Exercise 7.53* Show that the number of distinct Eulerian circuits of a graph can be computed in polynomial time.

Exercise 7.54* Verify that computing the number of distinct isomorphisms between two graphs is no harder than deciding whether or not the two graphs are in fact isomorphic.


Exercise 7.55* A tally language is a language in which every string uses only one symbol from the alphabet; if we denote this symbol by a, then every tally language is a subset of {a}*. In particular, a tally language has at most one string of each length. Show that a tally language cannot be NP-complete unless P equals NP.

Exercise 7.56* Develop the proof of the Immerman-Szelepcsényi theorem as follows. To prove the main result, NL = coNL, we first show that a nondeterministic Turing machine running in logarithmic space can compute the number of vertices reachable from vertex 1 in a digraph, a counting version of the NL-complete problem Digraph Reachability. Verify that the following program either quits or returns the right answer and that there is always a sequence of guesses that enables it to return the right answer.
    |S(0)| = 1;
    for i=1 to |V|-1 do   (* compute |S(i)| from |S(i-1)| *)
      size_Si = 0;
      for j=1 to |V| do   (* increment size_Si if j is in S(i) *)
        (* j is in S(i) if it is 0 or 1 step away from a vertex of S(i-1) *)
        in_Si = false;
        size_Si_1 = 0;   (* recompute as a consistency check *)
        for k=1 to |V| while not in_Si do
          (* consider only those vertices k in S(i-1) *)
          (* k is in S(i-1) if we can guess a path of i-1 vertices from 1 to k *)
          guess i-1 vertices;
          if (guessed vertices form a path from 1 to k)
            then size_Si_1 = size_Si_1 + 1;   (* k is in S(i-1) *)
                 if j=k or {j,k} in E
                   then in_Si = true   (* j is in S(i) *)
          (* implicit else: bad guess or k not in S(i-1) *)
        if in_Si
          then size_Si = size_Si + 1
          else if size_Si_1 <> |S(i-1)| then quit;
            (* inconsistency flags a bad guess of i-1 vertices
               when testing vertices for membership in S(i-1) *)
      |S(i)| = size_Si

Now the main result follows easily: given a nondeterministic Turing machine for a problem in NL, we construct another nondeterministic Turing machine that also runs in logarithmic space and solves the complement of the problem. The new machine with input x runs the code just given on the digraph formed by the IDs of the first machine run on x. If it ever encounters an accepting ID, it rejects the input; if it computes $|S(|V| - 1)|$ without having found an accepting ID, it accepts the input. Verify that this new machine works as claimed.
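To see which quantities the program of Exercise 7.56 is meant to produce, it may help to compute the same counts deterministically. The Python sketch below is not the logarithmic-space nondeterministic procedure and does not solve the exercise: it stores each whole set S(i-1), which takes linear space, whereas the program above keeps only the count and re-guesses paths. The function name `layer_sizes` and the representation of arcs as ordered pairs (k, j), meaning an arc from k to j, are assumptions made for illustration.

```python
def layer_sizes(n, arcs):
    """Return [|S(0)|, ..., |S(n-1)|], where S(i) is the set of vertices
    reachable from vertex 1 by a path of at most i arcs.
    Vertices are numbered 1..n; arcs is an iterable of pairs (k, j)."""
    arc_set = set(arcs)
    prev = {1}            # S(0) contains only the start vertex
    sizes = [1]
    for i in range(1, n):
        cur = set()
        for j in range(1, n + 1):
            # j is in S(i) if it is 0 or 1 step away from some k in S(i-1)
            if any(j == k or (k, j) in arc_set for k in prev):
                cur.add(j)
        sizes.append(len(cur))
        prev = cur
    return sizes


# Vertex 4 has an arc into vertex 1 but cannot be reached from it.
print(layer_sizes(4, [(1, 2), (2, 3), (4, 1)]))
```

The nondeterministic program replaces the stored set `prev` by the single number |S(i-1)|, using it to detect when a sequence of guesses has missed a member of S(i-1).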


7.5

Bibliography

Garey and Johnson [1979] wrote the standard text on NP-completeness and related subjects; in addition to a lucid presentation of the topics, their text contains a categorized and annotated list of over 300 known NP-hard problems. New developments are covered by D.S. Johnson in "The NP-Completeness Column: An Ongoing Guide," which appears irregularly in the Journal of Algorithms and is written in the same style as the Garey and Johnson text. Papadimitriou [1994] wrote the standard text on complexity theory; it extends our coverage to other classes not mentioned here as well as to more theoretical topics. Our proofs of NP-completeness are, for the most part, original (or at least independently derived), compiled from the material of classes taught by the author and from Moret and Shapiro [1985]. The XOR construct used in the proof of NP-completeness of HC comes from Garey, Johnson, and Tarjan [1976]. An exhaustive reference on the subject of P-complete problems is the text of Greenlaw et al. [1994]. Among studies of optimization problems based on reductions finer than the Turing reduction, the work of Krentel [1988a] is of particular interest; the query hierarchy that we mentioned briefly (based on the number of oracle calls) has been studied by, among others, Wagner [1988]. The proof that co-nondeterminism is equivalent to nondeterminism in space complexity is due independently to Immerman [1988] and to Szelepcsenyi [1987]. Miller [1976] proved that Primality is in P if the extended Riemann hypothesis holds, while Pratt [1975] showed that every prime has a concise certificate. Leggett and Moore [1981] pioneered the study of $\Delta_2^p$ in relation with NP and coNP and proved that many optimality problems ("Does the optimal solution have value K?") are not in $\mathrm{NP} \cup \mathrm{coNP}$ unless NP equals coNP; Exercise 7.50 is taken from their work.
The class DP was introduced by Papadimitriou and Yannakakis [1984], from which Exercise 7.47 is taken, and further studied by Papadimitriou and Wolfe [1988], where the solution of Exercise 7.46 can be found; Unique Satisfiability is the subject of Blass and Gurevich [1982]. Unique Traveling Salesman Tour was proved to be $\Delta_2^p$-complete (Exercise 7.48) by Papadimitriou [1984], while the Double Knapsack problem was shown $\Delta_2^p$-complete by Krentel [1988b]. The polynomial hierarchy is due to Stockmeyer [1976]. The class #P was introduced by Valiant [1979a], who gave several #P-complete counting problems. Valiant [1979b] proved that the permanent is #P-complete; further #P-hard problems can be found in Provan [1986]. Simon [1977] had introduced parsimonious transformations in a similar context.

CHAPTER 8

Complexity Theory in Practice

Knowing that a problem is NP-hard or worse does not make the problem disappear; some solution algorithm must still be designed. All that has been learned is that no practical algorithm that always returns the optimal solution can be designed. Many options remain open. We may hope that real-world instances will present enough structure that an optimal algorithm (using backtracking or branch-and-bound techniques) will run quickly; that is, we may hope that all of the difficult instances are purely theoretical constructs, unlikely to arise in practice. We may decide to restrict our attention to special cases of the problem, hoping that some of the special cases are tractable, while remaining relevant to the application at hand. We may rely on an approximation algorithm that runs quickly and returns good, albeit suboptimal, results. We may opt for a probabilistic approach, using efficient algorithms that return optimal results in most cases, but may fail miserably (possibly to the point of returning altogether erroneous answers) on some instances. The algorithmic issues are not our current subject; let us just note that very little can be said beforehand as to the applicability of a specific technique to a specific problem. However, guidance of a more general type can be sought from complexity theory once again, since the applicability of a technique depends on the nature of the problem. This chapter describes some of the ways in which complexity theory may help the algorithm designer in assessing hard problems. Some of the issues just raised, such as the possibility that real-world instances have sufficient added constraints to allow a search for optimal solutions, cannot at present be addressed within the framework of complexity theory, as we know of no mechanism with which to characterize the structure of instances. Others, however, fall within the purview of

there are other "dimensions" along which we can vary the requirements placed on instances: for instance.
8. From a practical standpoint. We know that the general SAT problem is NPcomplete and that it remains so even if it is restricted to instances where each clause contains exactly three literals. when taken in their full generality.1. some means of predicting the running time would be very welcome. these form the topics of this chapter. since these terms are defined only for infinite classes of instances. Yet all instructors in programming classes routinely decide whether student programs are correct.I-SAT
. we could consider the number of times that a variable may appear among all clauses. then. although such problems are undecidable in their full generality. Since possible restrictions are infinitely varied. we cannot measure the time required by an algorithm on a single instance in our usual terms (polynomial or exponential). easily solvable just means that our solution algorithm runs quickly on most or all instances to which it is applied.1
Circumscribing Hard Problems
The reader will recall from Chapter 4 that many problems. such fundamental questions as whether a program is correct or whether it halts under certain inputs are undecidable. Consequently. We may hope that the same principle applies to (provably or probably-we shall use the term without qualifiers) intractable problems and that most instances are in fact easily solvable. the value of approximation methods. we have completely classified all variants of the problem. The moral is that. we are led to consider restricted versions of our hard problems and to examine their complexity. However. and the power of probabilistic approaches. are undecidable.
8. most of their instances are quite easily handled.1
Restrictions of Hard Problems
... enabling us to use our algorithm judiciously. ... current methodologies, such as the analysis of subproblems; ... we must be content here with presenting some typical restrictions for our most important problems in order to illustrate the methodology.

We have already done quite a bit of work in this direction for the satisfiability problem. In terms of the number of literals per clause, we know that the problem remains NP-complete with three literals per clause (the 3SAT problem); we also know that the problem becomes tractable when restricted to instances where each clause contains at most two literals. Call a satisfiability problem k,l-SAT if it is restricted to instances where each clause contains k literals and each variable appears at most l times among all clauses; in that notation, our familiar 3SAT problem becomes 3,l-SAT. We know that 2,l-SAT is solvable in polynomial time for any l; a rather different approach yields a polynomial-time algorithm that solves k,2-SAT.

Exercise 8.1 Prove that k,2-SAT is solvable in polynomial time for any k.

Consider now the k,k-SAT problem. It is also solvable in polynomial time; in fact, it is a trivial problem because all of its instances are satisfiable! We derive this rather surprising result by reducing k,k-SAT to a bipartite matching problem. Notice that a satisfying assignment in effect singles out one literal per clause and sets it to true; what happens to the other literals in the clause is quite irrelevant, as long as it does not involve a contradiction. Thus an instance of k,l-SAT may be considered as a system of m (the number of clauses) sets of k elements each (the variables of the clause); a solution is then a selection of n elements (the true literals) such that each of the m sets contains at least one variable corresponding to a selected element. In other words, a satisfying assignment is a set of not necessarily distinct representatives, with the constraint that the selection of one element always prohibits the selection of another (the complement of the literal selected). Thus the k,l-SAT problem reduces to a much generalized version of the Set of Distinct Representatives problem.

In a k,k-SAT problem, however, none of the k variables contained in a clause is contained in more than k clauses in all, so that any i clauses taken together always contain at least i distinct variables (the i clauses contain ik variable occurrences, and no variable accounts for more than k of them). This condition fulfills the hypothesis of Hall's theorem (see Exercise 2.27), so that the transformed problem always admits a set of distinct representatives. In terms of the satisfiability problem, there exists an injection from the set of clauses to the set of variables such that each clause contains the variable to which it is mapped. Given this map, it is easy to construct a satisfying truth assignment: just set each variable to the truth value that satisfies the corresponding clause; since all representative variables are distinct, there can be no conflict.

How many occurrences of a literal may be allowed before a satisfiability problem becomes NP-complete? The following theorem shows that allowing three occurrences is sufficient to make the general SAT problem NP-complete, while four are needed to make 3SAT NP-complete, thereby completing our classification of 3,l-SAT problems, since we have just shown that 3,3-SAT is trivial.

Theorem 8.1 3,4-SAT is NP-complete.
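Before turning to the proof, the matching argument for k,k-SAT can be made concrete. The following is a minimal sketch, not the text's own algorithm: it assumes clauses are given as lists of nonzero signed integers (a negative number standing for a complemented variable), that no clause repeats a variable, and it uses a simple augmenting-path matching; the function names are hypothetical.

```python
def solve_kk_sat(clauses):
    """Build a satisfying assignment from a set of distinct representatives.

    clauses: list of lists of nonzero ints; literal v means variable v,
    literal -v means its complement (an encoding assumed for this sketch).
    """
    representative_of = {}  # variable -> index of the clause it represents

    def try_assign(ci, seen):
        # Augmenting-path step: find a free variable for clause ci,
        # re-routing previously matched clauses if necessary.
        for lit in clauses[ci]:
            v = abs(lit)
            if v in seen:
                continue
            seen.add(v)
            if v not in representative_of or try_assign(representative_of[v], seen):
                representative_of[v] = ci
                return True
        return False

    for ci in range(len(clauses)):
        if not try_assign(ci, set()):
            raise ValueError("no set of distinct representatives exists; "
                             "the input is not a k,k-SAT instance")

    assignment = {}
    for v, ci in representative_of.items():
        lit = next(l for l in clauses[ci] if abs(l) == v)
        assignment[v] = lit > 0  # value that makes the representative literal true
    for clause in clauses:       # remaining variables are irrelevant; pick True
        for lit in clause:
            assignment.setdefault(abs(lit), True)
    return assignment


def satisfies(clauses, assignment):
    """Check that every clause contains at least one true literal."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)


# A 3,3-SAT instance: three literals per clause, each variable appears 3 times.
example = [[1, 2, 3], [-1, -2, -3], [1, -4, 5], [-2, 4, -5], [3, 4, 5]]
example_assignment = solve_kk_sat(example)
```

Because each clause receives its own variable, setting that variable to satisfy its clause can never conflict with the choice made for another clause, which is exactly the injection described above.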

Proof. We first provide a simple reduction from 3SAT to a relaxed version of 3,3-SAT where clauses are allowed to have either two or three literals each. This proves that three occurrences of each variable are enough to make SAT NP-complete; we then finish the transformation to 3,4-SAT.

Let an instance of 3SAT be given. If no variable appears more than three times, we are done. If variable $x$ appears $k$ times (in complemented or uncomplemented form), with $k > 3$, we replace it by $k$ variables, $x_1, \ldots, x_k$, and replace its $i$th occurrence by $x_i$ (complemented if that occurrence was complemented). To ensure that all $x_i$ variables be given the same truth value, we write a circular list of implications:

\[ x_1 \Rightarrow x_2,\quad x_2 \Rightarrow x_3,\quad \ldots,\quad x_{k-1} \Rightarrow x_k,\quad x_k \Rightarrow x_1 \]

which we can rewrite as a collection of clauses of two literals each:

\[ \{\bar{x}_1, x_2\},\ \{\bar{x}_2, x_3\},\ \ldots,\ \{\bar{x}_{k-1}, x_k\},\ \{\bar{x}_k, x_1\} \]

The resulting collection of clauses includes clauses of two literals and clauses of three literals, has no variable occurring more than three times, and can easily be produced from the original instance in polynomial time. Hence SAT is NP-complete even when restricted to instances where no clause has more than three literals and where no variable appears more than three times.

Now we could use the padding technique of Theorem 7.1 to turn the two-literal clauses into three-literal clauses. The padding would duplicate all our "implication" clauses, thereby causing the substitute variables $x_i$ to appear five times. The result is a transformation into instances of 3,5-SAT, showing that the latter is NP-complete. A transformation to 3,4-SAT requires a more complex padding technique, which we elaborate below.

For each two-literal clause, say $c = \{\hat{x}, \hat{y}\}$, we use four additional variables, $f_c$, $p_c$, $q_c$, and $r_c$. Variable $f_c$ is added to the two-literal clause and we write five other clauses to force it to assume the truth value "false." These clauses are:

\[ \{\bar{p}_c, q_c, \bar{f}_c\},\quad \{\bar{q}_c, r_c, \bar{f}_c\},\quad \{\bar{r}_c, p_c, \bar{f}_c\},\quad \{p_c, q_c, r_c\},\quad \{\bar{p}_c, \bar{q}_c, \bar{r}_c\} \]

The first three clauses are equivalent to the implications:

\[ p_c \wedge \bar{q}_c \Rightarrow \bar{f}_c,\qquad q_c \wedge \bar{r}_c \Rightarrow \bar{f}_c,\qquad r_c \wedge \bar{p}_c \Rightarrow \bar{f}_c \]

The last two clauses assert that one or more of the preconditions are met:

\[ (p_c \wedge \bar{q}_c) \vee (q_c \wedge \bar{r}_c) \vee (r_c \wedge \bar{p}_c) \]
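The claim that the five padding clauses force the added variable to be false can be checked by brute force over all sixteen assignments. The sketch below assumes the clause set is read as {not p, q, not f}, {not q, r, not f}, {not r, p, not f}, {p, q, r}, {not p, not q, not r}, with the subscript c dropped; that reading, the encoding of literals as (name, value) pairs, and the function names are assumptions of this illustration.

```python
from itertools import product

# Each literal is (variable name, truth value that makes the literal true).
# Assumed reading of the five padding clauses written for one clause c.
PADDING_CLAUSES = [
    [("p", False), ("q", True), ("f", False)],   # p and not q  implies  not f
    [("q", False), ("r", True), ("f", False)],   # q and not r  implies  not f
    [("r", False), ("p", True), ("f", False)],   # r and not p  implies  not f
    [("p", True), ("q", True), ("r", True)],     # p, q, r are not all false
    [("p", False), ("q", False), ("r", False)],  # p, q, r are not all true
]
VARIABLES = ("f", "p", "q", "r")


def satisfies(assignment, clauses):
    """True if every clause contains at least one satisfied literal."""
    return all(any(assignment[name] == value for name, value in clause)
               for clause in clauses)


def satisfying_assignments(clauses, variables=VARIABLES):
    """Enumerate all truth assignments that satisfy the clause set."""
    result = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(assignment, clauses):
            result.append(assignment)
    return result


def occurrence_counts(clauses):
    """Count how many times each variable appears among the clauses."""
    counts = {}
    for clause in clauses:
        for name, _ in clause:
            counts[name] = counts.get(name, 0) + 1
    return counts


models = satisfying_assignments(PADDING_CLAUSES)
counts = occurrence_counts(PADDING_CLAUSES)
```

Under this reading the clause set is satisfiable, every model sets f to false, and f appears three times here (four once it is added to the original two-literal clause), while p, q, and r appear four times each.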

Example 8. a step that takes at most quadratic time. where all elements have sizes at least equal to a third of the bin size.4-SAT. of elements. We might embark upon an exhaustive program of classification for reasonable 1 variants of a given hard problem. the running time of this algorithm is dominated by the running time of the matching algorithm. We have just described a transformation into instances of 3. or three elements. we give just one more example here. the problem is solvable in polynomial time. so that its addition to the original two-literal clause does not affect the logical value of the clause. which is a low polynomial. but the reader will find more examples in the exercises and references. as well as in the next section. specifically.1 Consider the Binpacking problem. This preprocessing phase takes at most linear time.) Now we need only select the largest subset of pairs that do not share any elementthat is. with each such group filling one bin. in that case. we collect all elements of size B/3 and group them by threes. the case of three elements is uniquely identifiable. The goal is to pack all of the elements into the smallest number of bins. two. which completes our proof. The richest field by far has proved
'By "reasonable" we do not mean only plausible. Now let us restrict this problem to instances where all elements are large. Now we identify all possible pairs of elements that can fit together in a bin. any elements not in the matching are assigned their own bin. but also easily verifiable. s: S -A N.E. Thus we can begin by checking whether B is divisible by three. Once a maximum matching has been identified. since every element involved must have size equal to B/3.8. Many otherwise hard problems may become tractable when restricted to special instances. A simple case analysis shows that the algorithm is optimal. If it is. Thus a reasonable variant is one that is characterized by easily verifiable features. (Elements too large to fit with any other in a bin will occupy their own bin. S. Q.1 Circumscribing Hard Problems Hence the five clauses taken together force f.
289
. Each of the additional variables appears exactly four times and no other variable appears more than three times. a problem for which many polynomial-time solutions exist. we need to solve a maximum matching problem. each with a size. B. Recall that we must be able to distinguish erroneous input from correct input inpolynomial time. A bin will contain one.D. This transformation is easily accomplished in polynomial time. Recall that an instance of this problem is given by a set. Overall. to be set to false. We claim that. so that the resulting instance has no variable appearing more than four times. leftover elements of size B/3 are placed back with the other elements. and by a bin size.
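Returning to Example 8.1, the restricted Binpacking procedure can be sketched as follows. This is an illustration under stated assumptions rather than the example's exact procedure. First, two elements fit together exactly when their sizes sum to at most B, so the sketch swaps the general maximum-matching routine for the standard two-pointer greedy on sorted sizes, which gives the fewest bins when no bin holds more than two elements. Second, because an element of size B/3 can also share a bin with an element of size 2B/3 (for B = 6 and sizes 2, 2, 2, 4, 4, 4, pairing each 2 with a 4 uses three bins), the sketch tries every possible number of full triples instead of fixing that number in advance. The function names are hypothetical.

```python
def pair_bins(sizes, bin_size):
    """Fewest bins when each bin holds at most two elements.

    Two-pointer greedy: the largest remaining element shares a bin with
    the smallest remaining element if they fit, otherwise it goes alone.
    """
    s = sorted(sizes)
    i, j, bins = 0, len(s) - 1, 0
    while i <= j:
        if i < j and s[i] + s[j] <= bin_size:
            i += 1          # smallest element rides along with the largest
        j -= 1              # largest element is now packed
        bins += 1
    return bins


def pack_large_elements(sizes, bin_size):
    """Number of bins for instances where every size is at least bin_size / 3."""
    if any(3 * x < bin_size or x > bin_size for x in sizes):
        raise ValueError("every size must lie between bin_size/3 and bin_size")
    thirds = [x for x in sizes if 3 * x == bin_size]   # candidates for triples
    others = [x for x in sizes if 3 * x != bin_size]
    best = None
    for triples in range(len(thirds) // 3 + 1):
        # Elements of size bin_size/3 are interchangeable, so only the
        # number of full triples matters, not which elements form them.
        rest = others + thirds[3 * triples:]
        bins = triples + pair_bins(rest, bin_size)
        if best is None or bins < best:
            best = bins
    return best
```

For instance, `pack_large_elements([3, 3, 4, 5, 6], 9)` pairs 3 with 6 and 3 with 5 and leaves 4 alone, for three bins. Sorting dominates each round and there are at most a linear number of rounds, so the sketch stays polynomial.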

The richest field by far has proved to be that of graph problems: the enormous variety of named "species" of graphs provides a wealth of ready-made subproblems for any graph problem. We refer the reader to the current literature for results in this area and present only a short discussion of the graph coloring and Hamiltonian circuit problems.

We know that deciding whether an arbitrary graph is three-colorable is NP-complete: this is the G3C problem of the previous chapter. In fact, the problem is NP-complete for any fixed number of colors larger than two; it is easily solvable in linear time for two colors, a problem equivalent to asking whether the graph is bipartite. What about the restriction of the problem to planar graphs, corresponding, more or less, to the problem of coloring maps? Since planarity can be tested in linear time, such a restriction is reasonable. The celebrated four-color theorem states that any planar graph can be colored with four colors; thus graph coloring for planar graphs is trivial for any fixed number of colors larger than or equal to four.² The following theorem shows that G3C remains hard when restricted to planar graphs.

² The problem is trivial only as a decision problem. Finding a coloring is a very different story. Although the proof of the four-color theorem is in effect a polynomial-time coloring algorithm, its running time and overhead are such that it cannot be applied to graphs of any significant size, leaving one to rely upon heuristics and search methods.

Theorem 8.2 Planar G3C is NP-complete.

Proof. This proof is an example of the type discussed in Section 7.1, as it requires the design of a gadget. We reduce G3C to its planar version by providing a "crossover" gadget to be used whenever the embedding of the graph in the plane produces crossing edges. With all crossings removed from the embedding, the result is a planar graph; if we can ensure that it is three-colorable if and only if the original graph is, we will have proved our result. Two crossing edges cannot share an endpoint; hence they are independent of each other from the point of view of coloring. Thus a crossing gadget must replace the two edges in such a way that: (i) the gadget is planar and three-colorable (of course); (ii) the coloring of the endpoints of one original edge in no way affects the coloring of the endpoints of the other; and (iii) the two endpoints of an original edge cannot be given the same color. We design a planar, three-colorable gadget with four endpoints, x, x', y, and y', which can be colored only by assigning the same color to x and x' and, independently, the same color to y and y'. The reader can verify that the graph fragment illustrated in the first part of Figure 8.1 fulfills these requirements. The second part of the figure shows how to use this gadget to remove crossings from some edge {a, b}. One endpoint of
the edge (here a) is part of the leftmost gadget, while the other endpoint remains distinct and connected to the rightmost gadget by an edge that acts exactly like the original edge. Embedding an arbitrary graph in the plane, detecting all edge crossings, and replacing each crossing with our gadget are all easily done in polynomial time. Q.E.D.

Figure 8.1 The gadget used to replace edge crossings in graph colorability: (a) the graph fragment; (b) how to use the fragment.

Planarity is not the only reasonable parameter involved in the graph colorability problem. Another important parameter in a graph is the maximum degree of its vertices. A theorem due to Brooks [1941] (that we shall not prove here) states that the chromatic number of a connected graph never exceeds the maximum vertex degree of the graph by more than one; moreover, the bound is reached if and only if the graph is a complete graph or an odd circuit. In particular, a graph having no vertex degree larger than three is three-colorable if and only if it is not the complete graph on four vertices, a condition that can easily be checked in linear time. Thus G3C restricted to graphs of degree three is in P. However, an abrupt transition takes place, as vertices are allowed to have a degree equal to four.

Theorem 8.3 G3C is NP-complete even when restricted to instances where no vertex degree may exceed four.

Proof. This time we need to replace any vertex of degree larger than four with a gadget such that: (i) the gadget is three-colorable and contains no vertex with degree larger than four (obviously); (ii) there is one "attaching point" for each vertex to which the original vertex was connected; and (iii) all attaching points must be colored identically. A building block for our gadget that possesses all of these properties is shown in Figure 8.2(a); this building block provides three attaching points (the three "corners" of
the "triangle"). More attaching points are provided by stringing together several such blocks as shown in Figure 8.2(b). A new block is attached to the existing component by sharing one attaching point, with a net gain of one attaching point, so that a string of k building blocks provides k + 2 attaching points. The transformation preserves colorability and is easily carried out in polynomial time. Q.E.D.

Figure 8.2 The gadget used to reduce vertex degree for graph colorability: (a) the building block; (b) combining building blocks into a component.

The reader will have observed that the component used in the proof is planar, so that the transformations used in Theorems 8.2 and 8.3 may be combined (in that order) to show that G3C is NP-complete even when restricted to planar graphs where no vertex degree exceeds four.

A similar analysis can be performed for the Hamiltonian circuit problem. We know from Section 7.1 that deciding whether an arbitrary graph has a Hamiltonian circuit is NP-complete. Observe that our proof of this result produces graphs where no vertex degree exceeds four and that can be embedded in the plane in such a way that the only crossings involve XOR components. Thus we can show that the Hamiltonian circuit problem remains NP-complete when restricted to planar graphs of degree not exceeding three by producing: (i) a degree-reducing gadget to substitute for each vertex of degree 4; (ii) a crossing gadget to replace two XOR components that cross each other in an embedding; and (iii) a clause gadget to prevent crossings between XOR components and clause pieces.

Theorem 8.4 HC is NP-complete even when restricted to planar graphs where no vertex degree exceeds three.

Proof. The degree-reducing gadget must allow a single path to enter from any connecting edge and exit through any other connecting edge while visiting every vertex in the component; at the same time, the gadget must not allow two separate paths to pass through while visiting all vertices. The reader can verify that the component illustrated in Figure 8.3(a) has the required properties; moreover, the gadget itself is planar, so that we can combine it with the planar reduction that we now describe.

Figure 8.3 The gadgets for the Hamiltonian circuit problem: (a) the degree-reducing gadget; (b) the XOR crossing gadget; (c) the clause gadget (the concept and the graph fragment).

We must design a planar gadget that can be substituted for the crossing of two independent XOR components. We can achieve this goal by combining XOR components themselves with the idea underlying their design; the result is illustrated in Figure 8.3(b). Observe that the XOR components combine transitively (because the gadget includes an odd number of them) to produce an effective XOR between the vertical edges. The XOR between the horizontal edges is obtained in exactly the same fashion as in the original XOR component, by setting up four segments, each of which must be traversed, that connect enlarged versions of the horizontal edges. The reader may easily verify that the resulting graph piece effects the desired crossing.

Finally, we must also remove a crossing between an XOR piece and a "segment" corresponding to a literal in a clause piece or in a variable piece. We can trivially avoid crossings with variable pieces by considering instances derived from Positive 1in3SAT rather than from the general 1in3SAT; in such instances, XOR components touch only one of the two segments and so all crossings can be avoided. However, crossings with the segments of the clause pieces remain, so that we must design a new gadget that will replace the triple edges of each clause. We propose to place the three edges (from which any valid circuit must choose only one) in series rather than in parallel as in our original construction; by placing them in series and facing the inside of the constructed loop, we avoid any crossings with XOR
components. Our gadget must then ensure that any path through it uses exactly one of the three series edges. Figure 8.3(c) illustrates the concept and shows a graph fragment with the desired properties. The reader may verify by exhaustion that any path through the gadget that visits every vertex must cross exactly one of the three critical edges. Q.E.D.

With the methodology used so far, every hard special case needs its own reduction; in the case of graph problems, each reduction requires its own gadgets. We should be able to use more general techniques in order to prove that entire classes of restrictions remain NP-complete with just one or two reductions and a few gadgets. Indeed, we can use our results on (3,4)-SAT to derive completeness results for graph problems with limited degree. In most transformations from SAT, the degree of the resulting graph is determined by the number of appearances of a literal in the collection of clauses, because each appearance must be connected by a consistency component to the truth-setting component. Such is the case for Vertex Cover: all clause vertices have degree 3, but the truth-setting vertices have degree equal to m + 1, where m is the number of appearances of the corresponding literal. Since we can limit this number to 4 through Theorem 8.1, we can ensure that the graph produced by the transformation has no vertex of degree exceeding 5.

Proposition 8.1 Vertex Cover remains NP-complete when limited to graphs of degree 5.

In fact, we can design a gadget to reduce the degree down to 3 (Exercise 8.10).

A common restriction among graph problems is the restriction to planar graphs. Let us examine that restriction in some detail. For many graph problems, we have a proof of NP-completeness for the general version of the problem, typically done by reduction from one of the versions of 3SAT. In order to show that the planar versions of these graph problems remain NP-complete, we could, as we did so far, proceed problem by problem, developing a separate reduction with its associated crossing gadgets for each problem. Alternately, we could design special "planar" versions of the standard 3SAT problems, such that the graphs produced by the existing reduction from the standard 3SAT version to the general graph version produce only planar graphs when applied to our "planar" 3SAT version and thus can be viewed as a reduction to the planar version of the problem, thereby proving that version to be NP-complete. In order for this scheme to work, the planar 3SAT version must, of course, be NP-complete itself and must combine with the general reduction so as to produce only planar
graphs. As we have observed (see Table 7.3), most constructions from 3SAT are made of three parts: (i) a part (one fragment per variable) that ensures legal truth assignments; (ii) a part (one fragment per clause) that ensures satisfying truth assignments; and (iii) a part that ensures consistency of truth assignments among clauses and variables. In transforming to a graph problem, planarity is typically lost in the third part. Hence we seek a version of 3SAT that leads to planar connection patterns between clause fragments and variable fragments.

Definition 8.1 The Planar Satisfiability problem is the Satisfiability problem restricted to "planar" instances. An instance of SAT is deemed planar if its graph representation (to be defined) is planar.

The simplest way to define a graph representation for an instance of Satisfiability is to set up a vertex for each variable, a vertex for each clause, and an edge between a variable vertex and a clause vertex whenever the variable appears in the clause, thereby mimicking the skeleton of a typical transformation from SAT to a graph problem. However, some additional structure may be desirable (the more we can add, the better). Many graph constructions from SAT connect the truth-assignment fragments together (as in our construction for G3C); others connect the satisfaction fragments together; still others do both (as in our construction for HC). Another constraint to consider is the introduction of polarity: with a single vertex per variable and a single vertex per clause, no difference is made between complemented and uncomplemented literals, whereas using an edge between two vertices for each variable would provide such a distinction (and would more closely mimic our general reductions). Let us define two versions:

* Polar representation: Each variable gives rise to two vertices connected by an edge, each clause gives rise to a single vertex, and edges connect clauses to all vertices corresponding to literals that appear within the clause.
* Nonpolar representation: Variables and clauses give rise to a single vertex each, edges connect clauses to all vertices corresponding to variables that appear within the clause, and all variable vertices are connected together in a circular chain.

Theorem 8.5 With the representations defined above, the polar and nonpolar versions of Planar Three-Satisfiability are NP-complete.

For a proof, see Exercise 8.8.

Corollary 8.1 Planar Vertex Cover is NP-complete.

Proof. It suffices to observe that our reduction from 3SAT uses only local replacement, uses a clause piece (a triangle) that can be assimilated to a single vertex in terms of planarity, and does not connect clause pieces. The conclusion follows immediately from the NP-completeness of the polar version of Planar 3SAT. Q.E.D.

The reader should not conclude from Proposition 8.1 and Corollary 8.1 that Vertex Cover remains NP-complete when restricted to planar graphs of degree 5: the two reductions cannot be combined, since they do not start from the same problem. We would need a planar version of (3,4)-SAT in order to draw this conclusion.

Further work shows that Planar 1in3SAT is also NP-complete; surprisingly, however, Planar NAE3SAT is in P, in both polar and nonpolar versions.

Exercise 8.2 Use the result of Exercise 8.11 and our reduction from NAE3SAT to MxC (Theorem 7.4) to show that the polar version of Planar NAE3SAT is in P. To prove that the nonpolar version of Planar NAE3SAT is also in P, modify the reduction.

This somewhat surprising result leaves open the possibility that graph problems proved complete by transformation from NAE3SAT may become tractable when restricted to planar instances. (Given the direction of the reduction, this is the strongest statement we can make.) Indeed, such is the case for at least one of these problems, Maximum Cut, as discussed in Exercise 8.11.

The use of these planar versions remains somewhat limited. For instance, even though we reduced 1in3SAT to HC, our reduction connected both the variable pieces and the clause pieces and also involved pieces that cannot be assimilated to single vertices in terms of planarity (because of the crossings between XOR components and variable or clause edges). The second problem can be disposed of by using Positive 1in3SAT (which then has to be proved NP-complete in its planar version) and by using the clause gadget of Figure 8.3(c). The first problem, however, forces the definition of a new graph representation for satisfiability problems. Thus in order to be truly effective, this technique requires us to prove that a large number of planar variants of Satisfiability remain NP-complete; the savings over an ad hoc, problem-by-problem approach are present but not enormous. Table 8.1 summarizes our two approaches to proving that special cases remain NP-complete.

The most comprehensive attempt at classifying the variants of a problem is the effort of a group of researchers at the Mathematisch Centrum in Amsterdam who endeavored to classify thousands of deterministic scheduling
problems. The Centrum group started by systematizing the scheduling problems and unifying them all into a single problem with a large number of parameters. Each assignment of values to the parameters defines a specific type of scheduling problem, such as "scheduling unit-length tasks on two identical machines under general precedence constraints and with release times and completion deadlines in order to minimize overall tardiness." Thus the parameters include, but are not limited to, the type of tasks, the type of precedence order, the number and type of machines, additional relevant constraints (such as the existence of deadlines or the permissibility of preemption), and the function to be optimized. In a recent study by the Centrum group, the parameters included allowed for the description of 4,536 different types of scheduling problems.

Table 8.1 How to Prove the NP-Completeness of Special Cases

* The (semi)generic approach: Consider the reduction used in proving the general version to be NP-hard; the problem used in that reduction may have a known NP-complete special case that, when used in the reduction, produces only the type of instance you need.
* The ad hoc approach: Use a reduction from the general version of the problem to its special case; this reduction will typically require one or more specific gadgets. You may want to combine this approach with the generic approach, in case the generic approach restricted the instances to a subset of the general problem but a superset of your problem.

When faced with these many different cases, one cannot tackle each case individually. The parameterization of a problem induces an obvious partial order on the variants of the problem: variant B is no harder than variant A if variant A is a generalization of variant B. For instance, scheduling tasks under the constraint of precedence relations (which determine a partial order) is a generalization of the problem of scheduling independent tasks; similarly, scheduling tasks of arbitrary lengths is a generalization of the problem of scheduling tasks of unit length. Notice that, if variant B is a generalization of variant A, then variant A reduces to variant B in polynomial time by simple restriction, as used in Section 7.1. Recall that, if A reduces to B and B is tractable, then A is tractable and that, conversely, if A reduces to B and A is intractable, then B is intractable. Thus the partial order becomes a powerful tool for the classification of parameterized problems.
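The way the partial order spreads known results can be sketched in a few lines. The sketch below is an illustration, not the Centrum group's program: tractability is pushed down to every restriction of a tractable variant and hardness is pushed up to every generalization of a hard one. The variant names A through E, the input format, and the function name are hypothetical.

```python
from collections import deque


def classify(variants, restrictions, known):
    """Propagate known results along the generalization order.

    variants:     iterable of variant names
    restrictions: iterable of (special, general) pairs, meaning that
                  `general` is a generalization of `special`
    known:        dict mapping some variants to "easy" or "hard"
    Returns a dict holding the label of every variant that can be classified.
    """
    more_general = {v: [] for v in variants}
    more_special = {v: [] for v in more_general}
    for special, general in restrictions:
        more_general[special].append(general)
        more_special[general].append(special)

    labels = dict(known)
    queue = deque(known)
    while queue:
        v = queue.popleft()
        label = labels[v]
        # "easy" flows to restrictions, "hard" flows to generalizations.
        targets = more_special[v] if label == "easy" else more_general[v]
        for w in targets:
            if w not in labels:
                labels[w] = label
                queue.append(w)
            elif labels[w] != label:
                raise ValueError(f"inconsistent results for {w!r}")
    return labels


# Hypothetical order: A is a restriction of B, C and E; B and C of D.
order = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"), ("A", "E")]
labels = classify("ABCDE", order, {"B": "easy", "C": "hard"})
unclassified = sorted(set("ABCDE") - set(labels))
```

In this made-up example, A inherits "easy" from B, D inherits "hard" from C, and E stays unclassified because neither rule reaches it.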

The Centrum group wrote a simple program that takes all known results (hardness and tractability) about the variants of a parameterized problem and uses the partial order to classify as many more variants as possible. From the partial order, it is also possible to compute maximal tractable problems (the most general versions that are still solvable in polynomial time) as well as minimal hard problems (the most restricted versions that are still NP-hard). Furthermore, the program can also find extremal unclassified problems, that is, the easiest and hardest of unclassified problems. Such problems are of interest since a proof of hardness for the easiest problems (or a proof of tractability for the hardest problems) would allow an immediate classification of all remaining unclassified problems. With respect to the 4,536 scheduling problems examined by the Centrum group, the distribution in 1982 was 3,730 hard variants, 416 tractable variants, and 390 unclassified variants, the latter with 67 extremal problems.

The results of such an analysis can be used to determine the course of future complexity research in the area. In order to complete the classification of all variants as tractable or hard, all we need to do is to identify a minimal subset of the extremal unclassified problems that, when classified, automatically leads to the classification of all remaining unclassified problems. The trouble is that this problem is itself NP-complete!

Theorem 8.6 The Minimal Research Program problem is NP-complete.

An instance of this problem is given by a set of (unclassified) problems $S$, a partial order on $S$ denoted $\le$, and a bound $B$. The question is whether or not there exists a subset $S' \subseteq S$, with $|S'| \le B$, and a complexity classification function $c\colon S' \to \{\text{hard}, \text{easy}\}$ such that $c$ can be extended to a total function on $S$ by applying the two rules: (i) $x \le y$ and $c(y) = \text{easy}$ implies $c(x) = \text{easy}$; and (ii) $x \le y$ and $c(x) = \text{hard}$ implies $c(y) = \text{hard}$. For a proof, see Exercise 8.12.

8.1.2 Promise Problems

All of the restrictions that we have considered so far have been reasonable restrictions, characterized by easily verifiable features. Only such restrictions fit within the framework developed in the previous chapters: since illegal instances must be detected and rejected, checking whether an instance obeys the stated restriction cannot be allowed to dominate the execution time. In particular, restrictions of NP-complete problems must be verifiable in polynomial time. Yet this feature may prove unrealistic: it is quite conceivable that all of the instances generated in an application must, due to the nature of the process, obey certain conditions that cannot easily be verified. Given such a collection of instances, we might be able to devise
an algorithm that works correctly and efficiently only when applied to the instances of the collection. As a result, we should observe an apparent contradiction: the restricted problem would remain hard from the point of view of complexity theory, yet it would be well solved in practice.

An important example of such an "unreasonable"³ restriction is the restriction of graph problems to perfect graphs. One of many definitions of perfect graphs states that a graph is perfect if and only if the chromatic number of every induced subgraph equals the size of the largest clique of the subgraph. Deciding whether or not an arbitrary graph is perfect is NP-easy (we can guess a subgraph with a chromatic number larger than the size of its largest clique and obtain both the chromatic number and the size of the largest clique through oracle calls) but not known (nor expected) to be in P, making the restriction difficult to verify. Several problems that are NP-hard on general graphs are solvable in polynomial time on perfect graphs; examples are Chromatic Number, Independent Set, and Clique. That we cannot verify in polynomial time whether a particular instance obeys the restriction is irrelevant if the application otherwise ensures that all instances will be perfect graphs. Moreover, even if we do not have this guarantee, there are many classes of perfect graphs that are recognizable in polynomial time (such as chordal graphs, interval graphs, and permutation graphs). Knowing the result about perfect graphs avoids a lengthy reconstruction of the algorithms for each special case.

³ By "unreasonable" we simply mean hard to verify; we do not intend it to cover altogether silly restrictions, such as restricting the Hamiltonian circuit problem to instances that have a Hamiltonian circuit.

We can bring complexity theory to bear upon such problems by introducing the notion of a promise problem. Formally, a promise problem is stated as a regular problem, with the addition of a predicate defined on instances: the promise. An algorithm solves a promise problem if it returns the correct solution within the prescribed resource bounds for any instance that fulfills the promise. No condition whatsoever is placed on the behavior of the algorithm when run on instances that do not fulfill the promise: the algorithm could return the correct result, return an erroneous result, exceed the resource bounds, or even fail to terminate. There is still much latitude in these definitions as the type of promise can make an enormous difference on the complexity of the task.

Exercise 8.3 Verify that there exist promises (not altogether silly, but plainly unreasonable) that turn some undecidable problems into decidable ones and others that turn some intractable problems into tractable ones.
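The definition of solving a promise problem can be illustrated with a deliberately small analogy from outside the NP-hard setting; it is an assumption of this sketch, not an example from the text. The problem is membership in a list, the promise is that the list is sorted, and checking the promise would take linear time and so dominate the logarithmic running time of the search. The function name is hypothetical.

```python
def contains(values, target):
    """Binary search for target.

    Promise: `values` is sorted in nondecreasing order. The promise is
    never checked; if it fails, the answer is unspecified.
    """
    lo, hi = 0, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(values) and values[lo] == target


# Promise fulfilled: the answers are correct within the logarithmic bound.
found = contains([1, 3, 5, 7], 5)
missing = contains([1, 3, 5, 7], 4)

# Promise broken: nothing is guaranteed. Here the algorithm happens to
# return an erroneous result, since 1 does occur in the list.
broken = contains([5, 1, 7, 3], 1)
```

The algorithm still counts as solving the promise problem, because its behavior on the unsorted list is not constrained in any way.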

The next step is to define some type(s) of promise; once that is done, then we can look at the classes of complexity arising from our definition. An intriguing and important type of promise is that of uniqueness; that is, the promise is that each valid instance admits at most one solution. Such a promise arises naturally in applications to cryptology: whatever cipher is chosen, it must be uniquely decodable (if, that is, the string under consideration is indeed the product of encryption). Thus in cryptology, there is no need to verify the validity of the promise of uniqueness.

How does such a promise affect our NP-hard and #P-hard problems? Some are immediately trivialized. For instance, any symmetric problem (such as NAE3SAT) becomes solvable in constant time: since it always has an even number of solutions, the promise of uniqueness is tantamount to a promise of nonexistence. Such an outcome is somewhat artificial, however: if the statement of NAE3SAT asked for a partition of the variables rather than for a truth assignment, the problem would be unchanged and yet the promise of uniqueness would not trivialize it. As another example of trivialization, the #P-complete problem of counting perfect matchings becomes, with a promise of uniqueness, a simple decision problem, which we know how to solve in polynomial time. Other problems become tractable in a more interesting manner. For instance, the Chromatic Index problem (dual of the chromatic number, in that it considers edge rather than vertex coloring) is solvable in polynomial time with a promise of uniqueness, as a direct consequence of a theorem of Thomason's [1978] stating that the only graph that has a unique edge-coloring requiring k colors, with k ≥ 4, is the k-star.

Finally, some problems apparently remain hard, but how do we go about proving that? Since we cannot very well deal with promise problems, we introduce the notion of a problem's completion. Formally, a "normal" problem is the completion of a promise problem if the two problems have the same answers for all instances that fulfill the promise. Thus a completion problem is an extension of a promise problem, with answers defined arbitrarily for all instances not fulfilling the promise. Completion is the reverse of restriction: we can view the promise problem as a restriction of the normal problem to those instances that obey the promise. The complexity of a promise problem is then precisely the complexity of the easiest of its completions, which returns us to the world of normal problems. Proving that a promise problem is hard then reduces to proving that none of its completions is in P, or rather, since we cannot prove any such thing for most interesting problems, proving that none of its completions is in P unless some widely believed conjecture is false. The conjecture that we have used most often so far is P ≠ NP; in this case, however, it appears that a stronger conjecture is needed. In Section 8.4 we introduce several classes of randomized complexity, including the class RP, which lies between P and

NP; as always, the inclusions are believed to be proper. Thus the conjecture $RP \neq NP$ implies $P \neq NP$. The following theorem is quoted without proof (the proof involves a suitable, i.e., randomized, reduction from the promise version of SAT to SAT itself).

Theorem 8.7 Uniquely Promised SAT (SAT with a promise of uniqueness) cannot be solved in polynomial time unless RP equals NP.

From this hard promise problem, other problems with a promise of uniqueness can be proved hard by the simple means of a parsimonious transformation. Since a parsimonious transformation preserves the number of solutions, it preserves uniqueness as a special case and thus preserves the partition of instances into those fulfilling the promise and those not fulfilling it. (The transformation must be strictly parsimonious; the weak version of parsimony where the number of solutions to one instance is easily related to the number of solutions to the transformed instance is insufficient here.) An immediate consequence of our work with strictly parsimonious transformations in Section 7.3.3 is that a promise of uniqueness does not make the following problems tractable: 3SAT, Hamiltonian Circuit, Traveling Salesman, Maximum Cut, Partition, Subset Sum, Binpacking, Knapsack, and 0-1 Integer Programming. The fact that Subset Sum remains hard under a promise of uniqueness is of particular interest in cryptology, as this problem forms the basis for the family of knapsack ciphers (even though the knapsack ciphers are generally considered insecure due to other characteristics).

Verifying the promise of uniqueness is generally hard for hard problems. Compare for instance Uniquely Promised SAT, a promise problem, with Unique Satisfiability, which effectively asks to verify the promise of uniqueness: the first is in NP and not believed to be NP-complete, whereas the second is in $\Delta_2^p - (NP \cup coNP)$. As mentioned earlier, deciding the question of uniqueness appears to be in $\Delta_2^p - (NP \cup coNP)$ for most NP-complete problems. There are exceptions, such as Chromatic Index. However, Chromatic Index is an unusual problem in many respects: among other things, its search version is the same as its decision version, as a consequence of a theorem of Vizing's (see Exercise 8.14), which states that the chromatic index of a graph either equals the maximum degree of the graph or is one larger.

8.2 Strong NP-Completeness

We noted in Section 7.1 that the Partition problem was somehow different from our other basic NP-complete problems in that it required the presence

of large numbers in the description of its instances in order for it to be NP-hard. Viewed in a more positive light, Partition is tractable when restricted to instances with (polynomially) small element values. This characteristic is in fact common to a class of problems, which we now proceed to study.

Let us begin by reviewing our knowledge of the Partition problem. An instance of it is given by a set of elements, $\{x_1, x_2, \ldots, x_n\}$, where each element $x_i$ has size $s_i$, and the question is "Can this set be partitioned into two subsets in such a way that the sum of the sizes of the elements in one subset equals the sum of the sizes of the elements in the other subset?" We can assume that the sum of all the sizes is some even integer $N$. This problem can be solved by dynamic programming, using the recurrence

$$f(0, 0) = 1$$
$$f(0, j) = 0 \quad \text{for } j \neq 0$$
$$f(i, M) = \max\bigl(f(i-1, M),\ f(i-1, M - s_i)\bigr)$$

where $f(i, M)$ equals 1 or 0, indicating whether or not there exists a subset of the first $i$ elements that sums to $M$; this algorithm runs in $O(n \cdot N)$ time. As we observed before, the running time is not polynomial in the input size, since the latter is $O(n \log N)$. However, this conclusion relies on our convention about reasonable encodings. If, instead of using binary notation for the $s_i$ values, we had used unary notation, then the input size would have been $O(n \cdot N)$ and the dynamic programming algorithm would have run in quadratic time, measured in terms of unary inputs.

This abrupt change in behavior between unary and binary notation is not characteristic of all NP-complete problems. For instance, the Maximum Cut problem, where an instance is given by a graph, $G = (V, E)$, and a bound, $B \leq |E|$, remains NP-hard even when encoded in unary. In binary, we can encode the bound with $\log B = O(\log |E|)$ bits and the graph with $O(|E| \log |V| + |V|)$ bits, as discussed in Chapter 4. In unary, the bound $B$ now requires $B = O(|E|)$ symbols and the graph requires $O(|E| \cdot |V| + |V|)$ symbols. While the unary encoding is not as succinct as the binary encoding, it is in fact hardly longer and is bounded by a polynomial function in the length of the binary encoding, so that it is also a reasonable encoding. It immediately follows that the problem remains NP-complete when coded in unary. In order to capture this essential difference between Partition and Maximum Cut, we define a special version of polynomial time.

Definition 8.2 An algorithm runs in pseudo-polynomial time on some problem if it runs in time polynomial in the length of the input encoded in unary.
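The dynamic program for Partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `can_partition` is hypothetical, sizes are assumed to be positive integers, and the two-dimensional table $f(i, M)$ is collapsed into a single array indexed by $M$, which is a common space-saving variant rather than the formulation used in the text.

```python
def can_partition(sizes):
    """Decide Partition with the pseudo-polynomial dynamic program.

    `can_partition` is a hypothetical name; `sizes` is assumed to be a
    list of positive integers.  reachable[M] plays the role of f(i, M)
    after the first i elements have been processed.
    """
    total = sum(sizes)
    if total % 2 == 1:
        return False  # an odd total can never be split evenly
    half = total // 2
    reachable = [False] * (half + 1)
    reachable[0] = True  # f(0, 0) = 1; every other f(0, j) is 0
    for s in sizes:
        # Scan downward so that each element is used at most once.
        for m in range(half, s - 1, -1):
            if reachable[m - s]:
                reachable[m] = True  # f(i, M) = max(f(i-1, M), f(i-1, M - s_i))
    return reachable[half]
```

The two nested loops make the running time proportional to $n \cdot N$: polynomial in the value $N$, hence in the length of a unary encoding, but not in the $O(n \log N)$ length of a binary one.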

For convenience, we shall denote the length of a reasonable binary encoding of instance $I$ by len($I$) and the length of a unary encoding of $I$ by max($I$). Since unary encodings are always at least as long as binary encodings, it follows that any polynomial-time algorithm is also a pseudo-polynomial time algorithm. The dynamic programming algorithm for Partition provides an example of a pseudo-polynomial time algorithm that is not also a polynomial-time one.

A pseudo-polynomial time solution may in fact prove very useful in practice, since its running time may remain quite small for practical instances. Moreover, the existence of such a solution also helps us circumscribe the problem (the goal of our first section in this chapter), as it implies the existence of a subproblem in P: simply restrict the problem to those instances $I$ where len($I$) and max($I$) remain polynomially related. Hence a study of pseudo-polynomial time appears quite worthwhile.

Under our standard assumption of $P \neq NP$, no NP-complete problem can be solved by a polynomial-time algorithm. However, Partition, at least, can be solved by a pseudo-polynomial time algorithm; further examples include the problems Subset Sum, Knapsack, Binpacking into Two Bins, and some scheduling problems that we have not mentioned. On the other hand, since unary and binary encodings remain polynomially related for all instances of the Maximum Cut problem, it follows that any pseudo-polynomial time algorithm for this problem would also be a polynomial-time algorithm; hence, under our standard assumption, there cannot exist a pseudo-polynomial time algorithm for Maximum Cut.

Definition 8.3 An NP-complete problem is strongly NP-complete if it cannot be solved in pseudo-polynomial time unless P equals NP.

The same reasoning applies to any problem that does not include arbitrarily large numbers in the description of its instances, since all such problems have reasonable unary encodings; hence problems such as Satisfiability, Graph Three-Colorability, Vertex Cover, Hamiltonian Circuit, Set Cover, and Betweenness are all strongly NP-complete problems. Such results are rather trivial; the real interest lies in problems that cannot be reasonably encoded in unary, that is, where max($I$) cannot be bounded by a polynomial in len($I$). Beside Partition, Knapsack, and the other problems mentioned earlier, a list of such problems includes Traveling Salesman, k-Clustering, Steiner Tree in Graphs, Bounded Diameter Spanning Tree, and many others.

Our transformation between Hamiltonian Circuit and Traveling Salesman produced instances of the latter where all distances equal 1 or 2 and where the bound equals the number of cities. This restricted version of

TSP is itself NP-complete but, quite clearly, can be encoded reasonably in unary: it does not include arbitrarily large numbers in its description. Hence this special version of TSP, and by implication the general Traveling Salesman problem, is a strongly NP-complete problem. Thus we find another large class of strongly NP-complete problems: all those problems that remain hard when their "numbers" are restricted into a small range. Indeed, this latter characterization is equivalent to our definition.

Proposition 8.2 A problem $\Pi \in NP$ is strongly NP-complete if and only if there exists some polynomial $p(\,)$ such that the restriction of $\Pi$ to those instances with max($I$) $\leq p(\text{len}(I))$ is itself NP-complete.

(We leave the easy proof to the reader.) This equivalent characterization shows that the concept of strong NP-completeness stratifies problems according to the values contained in their instances; if the set of instances containing only polynomially "small" numbers is itself NP-complete, then the problem is strongly NP-complete. Hence the best way to show that a problem is strongly NP-complete is to construct a transformation that produces only small numbers; in such a manner, we can show that k-Clustering, Steiner Tree in Graphs, and the various versions of spanning tree problems are all strongly NP-complete problems. More interestingly, we can define problems that are strongly NP-complete yet do not derive directly from a "numberless" problem.

Theorem 8.8 Subset Product is strongly NP-complete. An instance of this problem is given by a finite set $S$, a size function $s\colon S \to \mathbb{N}$, and a bound $B$. The question is "Does there exist a subset $S'$ of $S$ such that the product of the sizes of the elements of $S'$ is equal to $B$?"

Proof. Membership in NP is obvious. We transform X3C to our problem using prime encoding. Given an instance of X3C, say $T = \{x_1, \ldots, x_{3n}\}$ and $C \subseteq 2^T$ (with $c \in C \Rightarrow |c| = 3$), let the first $3n$ primes be denoted by $p_1, \ldots, p_{3n}$. We set up an instance of Subset Product with set $S = C$, size function $s\colon \{x_i, x_j, x_k\} \mapsto p_i p_j p_k$, and bound $B = \prod_{i=1}^{3n} p_i$. That the transformation is a valid many-one reduction is an obvious consequence of the unique factorization theorem. Observe that $B$ (the largest number involved) can be computed in time $O(n \log p_{3n})$ given the first $3n$ primes; finding the $i$th prime itself need not take longer than $O(p_i)$ divisions (by the brute-force method of successive divisions). We now appeal to a result from number theory that we shall not prove: $p_i$ is $O(i^2)$. Using this information, we conclude that the complete transformation runs in polynomial time. Hence the problem is NP-complete. Finally, since the numbers produced by the transformation are only polynomially large, it follows that the problem is in fact strongly NP-complete. Q.E.D.
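The prime encoding used in the proof for Subset Product can be made concrete with a small sketch. Everything named here is an assumption made for illustration: elements of $T$ are represented by the integers $0, \ldots, 3n-1$, each triple is a 3-tuple of such integers, and `first_primes`, `x3c_to_subset_product`, and `has_subset_with_product` are hypothetical helper names. The last helper is an exponential brute-force check, included only to confirm the construction on tiny instances.

```python
from itertools import combinations
from math import prod


def first_primes(count):
    """Return the first `count` primes by successive trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def x3c_to_subset_product(num_elements, triples):
    """Map an X3C instance to (sizes, bound) for Subset Product.

    Element i is encoded by the (i+1)-st prime; each triple gets the
    product of its three primes; the bound is the product of all primes.
    """
    primes = first_primes(num_elements)
    sizes = [primes[i] * primes[j] * primes[k] for (i, j, k) in triples]
    bound = prod(primes)
    return sizes, bound


def has_subset_with_product(sizes, bound):
    """Brute-force check (exponential time), for tiny examples only."""
    for r in range(len(sizes) + 1):
        for combo in combinations(sizes, r):
            if prod(combo) == bound:
                return True
    return False
```

By unique factorization, a subcollection of sizes multiplies to the bound exactly when the corresponding triples use every prime once, that is, when they form an exact cover.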

While Subset Product has a more "genuine" flavor than Traveling Salesman, it nevertheless is not a particularly useful strongly NP-complete problem.

Theorem 8.9 For each fixed $k \geq 3$, k-Partition is strongly NP-complete. An instance of this problem (for $k \geq 3$) is given by a set of $kn$ elements, each with a positive integer size; the sum of all the sizes is a multiple of $n$, say $Bn$, and the size $s(x)$ of each element $x$ obeys the inequality $B/(k+1) < s(x) < B/(k-1)$. The question is "Can the set be partitioned into $n$ subsets such that the sum of the sizes of the elements in each subset equals $B$?"

(The size restrictions amount to forcing each subset to contain exactly $k$ elements, hence the name k-Partition.) This problem has no clear relative among the "numberless" problems; in fact, its closest relative appears to be our standard Partition. The proof, while conceptually simple, involves very detailed assignments of sizes and lengthy arguments about modulo arithmetic; the interested reader will find references in the bibliography.

Corollary 8.2 Binpacking is strongly NP-complete.

The proof merely consists in noting that an instance of k-Partition with $kn$ elements and total size $Bn$ may also be regarded as an instance of Binpacking with bin capacity $B$ and number of bins bounded by $n$.

While these two problems are strongly NP-complete in their general formulation, they are both solvable in pseudo-polynomial time for each fixed value of $n$. That is, Binpacking into k Bins and Partition into k Subsets are solvable in pseudo-polynomial time for each fixed $k$.

An additional interest of strongly NP-complete problems is that we may use them as starting points in reductions that can safely ignore the difference between the value of a number and the length of its representation. Consider reducing the standard Partition problem to some problem $\Pi$ and let $x_i$ be the size of the $i$th element in an instance of the standard Partition problem. Creating a total of $x_i$ pieces in an instance of $\Pi$ to correspond in some way to the $i$th element of Partition cannot be allowed, as such a construction could take more than polynomial time (because $x_i$ need not be polynomial in the size of the input). However, the same technique is perfectly safe when reducing k-Partition to $\Pi$ or to some other problem.

Definition 8.4 A many-one reduction, $f$, from problem $\Pi$ to problem $\Pi'$ is a pseudo-polynomial transformation if there exist polynomials $p(\,,)$, $q_1(\,)$, and $q_2(\,,)$ such that, for an arbitrary instance $I$ of $\Pi$:
1. $f(I)$ can be computed in $p(\text{len}(I), \text{max}(I))$ time;
2. $\text{len}(I) \leq q_1(\text{len}'(f(I)))$; and
3. $\text{max}'(f(I)) \leq q_2(\text{len}(I), \text{max}(I))$.
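The observation behind Corollary 8.2 above, that a k-Partition instance can be read directly as a Binpacking instance, can be sketched as follows. The function name `k_partition_as_binpacking` and the tuple it returns are assumptions made for this illustration; the strict size restrictions $B/(k+1) < s(x) < B/(k-1)$ are checked by cross-multiplication so that only integer arithmetic is needed.

```python
def k_partition_as_binpacking(sizes, k):
    """View a k-Partition instance as (sizes, bin_capacity, number_of_bins).

    Hypothetical helper: raises ValueError when `sizes` does not satisfy
    the k-Partition restrictions stated in the text.
    """
    if k < 3 or not sizes or len(sizes) % k != 0:
        raise ValueError("need kn elements with k >= 3 and n >= 1")
    n = len(sizes) // k
    total = sum(sizes)
    if total % n != 0:
        raise ValueError("total size must be a multiple of n")
    capacity = total // n  # this is B
    for s in sizes:
        # B/(k+1) < s < B/(k-1), written without division
        if not (capacity < s * (k + 1) and s * (k - 1) < capacity):
            raise ValueError("size restriction violated")
    # Same elements, bin capacity B, at most n bins.
    return list(sizes), capacity, n
```

No number is changed by this reading, which is why the strong NP-completeness of k-Partition carries over to Binpacking.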

The differences between a polynomial transformation and a pseudo-polynomial one are concentrated in the first condition: the latter can take time polynomial in both len($I$) and max($I$), not just in len($I$) like the former. In some cases, this looser requirement may allow a pseudo-polynomial transformation to run in exponential time on a subset of instances! The other two conditions are technicalities: the second forbids very unusual transformations that would shrink the instance too much, and the third prevents us from creating exponentially large numbers during a pseudo-polynomial transformation in the same way that we did, for instance, in our reduction from 1in3SAT to Partition.

In terms of such a transformation, our observation can be rephrased as follows.

Proposition 8.3 If $\Pi$ is strongly NP-complete, $\Pi'$ belongs to NP, and $\Pi$ reduces to $\Pi'$ through a pseudo-polynomial transformation, then $\Pi'$ is strongly NP-complete.

Proof. We appeal to our equivalent characterization of strongly NP-complete problems. Since $\Pi$ is strongly NP-complete, there exists some polynomial $r(\,)$ such that $\Pi$ restricted to instances $I$ with $\text{max}(I) \leq r(\text{len}(I))$ is NP-complete; denote by $\Pi_r$ this restricted version of $\Pi$. Now consider the effect of $f$ on $\Pi_r$: it runs in time polynomial in len($I$) (in at most $p(\text{len}(I), r(\text{len}(I)))$ time, to be exact), and it creates instances $I'$ of $\Pi'$ that all obey the inequality $\text{max}(I') \leq q_2(r(q_1(\text{len}'(I'))), q_1(\text{len}'(I')))$. Thus it is a polynomial-time transformation between $\Pi_r$ and $\Pi'_{r'}$, where $r'(x)$ is the polynomial $q_2(r(q_1(x)), q_1(x))$. Hence $\Pi'_{r'}$ is NP-complete, so that $\Pi'$ is strongly NP-complete. Q.E.D.

The greater freedom inherent in pseudo-polynomial transformations can be very useful, not just in proving other problems to be strongly NP-complete, but also in proving NP-complete some problems that lack arbitrarily large numbers or in simple proofs of NP-completeness. We begin with a proof of strong NP-completeness.

Theorem 8.10 Minimum Sum of Squares is strongly NP-complete. An instance of this problem is given by a set, $S$, of elements, a size for each element, $s\colon S \to \mathbb{N}$, and positive integer bounds $N \leq |S|$ and $J$. The question is "Can $S$ be partitioned into $N$ disjoint subsets (call them $S_i$, $1 \leq i \leq N$) such that the sum over all $N$ subsets of $\bigl(\sum_{x \in S_i} s(x)\bigr)^2$ does not exceed $J$?"

Proof. For convenience in notation, set $B = \sum_{x \in S} s(x)$. That the problem is NP-complete in the usual sense is obvious, as it restricts to Partition by setting $N = 2$ and $J = B^2/2$. We can also restrict our problem to k-Partition,

thereby proving our problem to be strongly NP-complete. We restrict our problem to those instances where $|S|$ is a multiple of $k$, with $N = |S|/k$ and $J = B^2/N$. Both restrictions rely on the fact that the minimal sum of squares is obtained when all subsets have the same total size, as is easily verified through elementary algebra. Q.E.D.

The transformation used in reducing k-Partition to Minimum Sum of Squares did not make use of the freedom inherent in pseudo-polynomial transformations; in fact, the transformation is a plain polynomial-time transformation. The following reduction shows how the much relaxed constraint on time can be put to good use.

Theorem 8.11 Edge Embedding on a Grid is strongly NP-complete. An instance of this problem is given by a graph, $G = (V, E)$, and two natural numbers, $M$ and $N$. The question is "Can the vertices of $G$ be embedded in an $M \times N$ grid?" In other words, does there exist an injection $f\colon V \to \{1, 2, \ldots, M\} \times \{1, 2, \ldots, N\}$ such that each edge of $G$ gives rise to a vertical or horizontal segment in the grid? (Formally, given edge $\{u, v\} \in E$ and letting $f(u) = (u_x, u_y)$ and $f(v) = (v_x, v_y)$, we must have either $u_x = v_x$ or $u_y = v_y$.)

Proof. The problem is clearly in NP. We reduce Three-Partition to it with a pseudo-polynomial time transformation as follows. Let an instance of Three-Partition be given by a set $S$, with $|S| = 3n$, and size function $s\colon S \to \mathbb{N}$, with $\sum_{x \in S} s(x) = nB$. We shall assume that each element of $S$ has size at least as large as $\max\{3, n+1\}$; if such were not the case, we could simply multiply all sizes by that value. For each element $x \in S$, we set up a distinct subgraph of $G$; this subgraph is simply the complete graph on $s(x)$ vertices. The total graph $G$ is thus made of $3n$ separate complete graphs of varying sizes. Finally, we set $M = B$ and $N = n$; call the $M$ dimension horizontal and the $N$ dimension vertical. The number of grid points is $nB$, which is exactly the number of vertices of $G$.

Since each subgraph has at least three vertices, it can be embedded on a grid in only one way: with all of its vertices on the same horizontal or vertical; otherwise, at least one of the edge embeddings would be neither vertical nor horizontal. Since subgraphs have at least $n + 1$ vertices and the grid has height $n$, subgraphs can be embedded only horizontally. Finally, since the horizontal dimension is precisely $B$, the problem as posed reduces to one of grouping the subgraphs of $G$ into $n$ subsets such that each subset contains exactly $B$ vertices in all, which is exactly equivalent to Three-Partition.

The time taken by the transformation is polynomial in $nB$ but not in $n \log B$; the former is max($I$), while the latter is len($I$). Thus the

transformation runs in pseudo-polynomial time, but not in polynomial time. Since the transformed instance has size $O(B^2)$ (because each complete subgraph on $s(x)$ vertices has $O(s^2(x))$ edges) and since its largest number is $M = B$, which is basically the same as the largest number in an instance of k-Partition, our transformation is a valid pseudo-polynomial time transformation, from which our conclusion follows. Q.E.D.

We should thus modify our description of the structure of a typical proof of NP-completeness (Table 7.1, page 228): if the known NP-complete problem is in fact strongly NP-complete, then the last step, "verify that the reduction can be carried out in polynomial time," should be amended by replacing "polynomial time" with "pseudo-polynomial time."

8.3 The Complexity of Approximation

Having done our best to circumscribe our problem, we may remain faced with an NP-complete problem. What strategy should we now adopt? Once again, we can turn to complexity theory for guidance and ask about the complexity of certain types of approximation for our problem. Some approximations may rely on the probability distribution of the solutions. For instance, given a fixed number of colors, a randomly chosen graph has a vanishingly small probability of being colorable with this many colors; a randomly chosen dense graph is almost certain to include a Hamiltonian circuit; and so on. Other approximations (heuristics) do well in practice but have so far defied formal analysis. However, some approximations provide certain guarantees, either deterministic or probabilistic, both in terms of performance and in terms of running time. We take up approximations with deterministic guarantees in this section and those with probabilistic guarantees in Section 8.4.

8.3.1 Definitions

If our problem is a decision problem, only probabilistic approaches can succeed; after all, "yes" is a very poor approximation for "no." Let us then assume that we are dealing with an optimization problem. Recall that an optimization problem is given by a collection of instances, a collection of (feasible) solutions, and an objective function defined over the solutions; the goal is to return the solution with the best objective value. Since this definition includes any type of optimization problem and we want to focus

on those optimization problems that correspond to decision problems in NP, we formalize and narrow our definition.

Definition 8.5 An NP-optimization (NPO) problem is given by:
* a collection of instances recognizable in polynomial time;
* a polynomial $p$ and, for each instance $x$, a collection of feasible solutions, $S(x)$, such that each feasible solution $y \in S(x)$ has length bounded by $p(|x|)$ and such that membership in $S(x)$ of strings of polynomially bounded length is decidable in polynomial time; and
* an objective function defined on the space of all feasible solutions and computable in polynomial time.
The class NPO is the set of all NPO problems.

The goal for an instance of an NPO problem is to find a feasible solution that optimizes (maximizes or minimizes, depending on the problem) the value of the objective function. Our definition of NPO problems ensures that all such problems have concise and easily recognizable feasible solutions, just as decision problems in NP have concise and easily checkable certificates. Our definition also ensures that the value of a feasible solution is computable in polynomial time. An immediate consequence of our definition is that the decision version of an NPO problem (does the instance admit a solution at least as good as some bound $B$?) is solvable in polynomial time whenever the NPO problem itself is. Therefore if the decision version of an NPO problem is NP-complete, the NPO problem itself cannot be solved in polynomial time unless P equals NP; hence our interest in approximations for NPO problems. We can similarly define the class PO of optimization problems solvable in polynomial time.

Exercise 8.4 Verify that P equals NP if and only if PO equals NPO.

Let us assume that our approximation algorithm returns a solution with value $\hat{f}(I)$, while the optimal solution to instance $I$ has value $f(I)$. To gauge the worth of our algorithm, we can look at the difference between the values of the two solutions, $|f(I) - \hat{f}(I)|$, or at the ratio of that difference to the value of the (optimal or approximate) solution. The difference measure is of interest only when it can be bounded over all possible instances by a constant; otherwise the ratio measure is preferable. (The reader will encounter ratio measures defined without recourse to differences. In such measures, the ratio for a minimization problem is that of the value of the optimal solution to the value of the approximate solution, with the fraction reversed for maximization problems.) The ratio measure can be defined over all instances or only in asymptotic terms, the rationale for

the latter being that sophisticated approximation methods may need large instances to show their worth, just like sophisticated algorithms often show their fast running times only on large instances. Finally, any of these three measures (difference, ratio, and asymptotic ratio) can be defined as a worst-case measure or an average-case measure. In practice, as for time and space complexity measures, it is very difficult to define average-case behavior of approximation methods. Hence we consider three measures of the quality of an approximation method.

Definition 8.6 Let $\Pi$ be an optimization problem, let $\mathcal{A}$ be an approximation algorithm for $\Pi$, and, for an arbitrary instance $I$ of $\Pi$, let $f(I)$ be the value of the optimal solution to $I$ and $\hat{f}(I)$ the value of the solution returned by $\mathcal{A}$. Define the approximation ratio for $\mathcal{A}$ on instance $I$ of a maximization problem to be $R_{\mathcal{A}}(I) = |f(I) - \hat{f}(I)| / f(I)$ and that of a minimization problem to be $R_{\mathcal{A}}(I) = |f(I) - \hat{f}(I)| / \hat{f}(I)$.
* The absolute distance of $\mathcal{A}$ is $D_{\mathcal{A}} = \sup_{I \in \Pi} |f(I) - \hat{f}(I)|$.
* The absolute ratio of $\mathcal{A}$ is $R_{\mathcal{A}} = \inf\{r \geq 0 \mid R_{\mathcal{A}}(I) \leq r \text{ for all } I \in \Pi\}$.
* The asymptotic ratio of $\mathcal{A}$ is $R_{\mathcal{A}}^{\infty} = \inf\{r \geq 0 \mid R_{\mathcal{A}}(I) \leq r \text{ for all } I \in S\}$, where $S$ is any set of instances of $\Pi$ for which there exists some positive integer bound $N$ such that $f(I) \geq N$ holds for all $I$ in $S$.

Under these definitions, an exact algorithm has $D_{\mathcal{A}} = R_{\mathcal{A}} = 0$, while approximation algorithms have ratios between 0 and 1. A ratio of 1/2, for instance, denotes an algorithm that cannot err by more than 100%. A ratio of 1 denotes an algorithm that can return arbitrarily bad solutions. Yet another variation on these measures is one that measures the ratio of the error introduced by the approximation to the maximum error possible for the instance, that is, one that measures the ratio of the difference between the approximate and optimal solutions to the difference between the pessimal and optimal solutions. Since the pessimal value for many maximization problems is zero, the two measures often coincide.

Determining the quality of an approximation belongs to the domain of algorithm analysis. Our concern here is to determine the complexity of approximation guarantees for NP-hard optimization problems. We want to know whether such problems can be approximated in polynomial time to within a constant distance or ratio or whether such guarantees make the approximation as hard to obtain as the optimal solution. Our main difficulty stems from the fact that many-one polynomial-time reductions, which served us so well in analyzing the complexity of exact problems, are much less useful in analyzing the complexity of approximations,

because they do not preserve the quality of approximations. We know of NP-equivalent optimization problems for which the optimal solution can be approached within one unit and of others where obtaining an approximation within any constant ratio is NP-hard.

Example 8.2 Consider the twin problems of Vertex Cover and Independent Set. We have seen that, given a graph $G = (V, E)$, the subset of vertices $V'$ is a minimum vertex cover for $G$ if and only if the subset $V - V'$ is a maximum independent set of $G$. Hence a reduction between the two decision problems consists of copying the graph unchanged and complementing the bound, from $B$ in VC to $|V| - B$ in Independent Set. The reduction is an isomorphism and thus about as simple a reduction as possible. Yet, while there exists a simple approximation for VC that never returns a cover of size larger than twice that of the minimal cover (simply do the following until all edges have been removed: select a remaining edge at random, place both its endpoints in the cover, and remove it and all edges adjacent to its two endpoints; see Exercise 8.23), we do not know of an approximation algorithm for Independent Set that would provide a ratio guarantee. That we cannot use the VC approximation and transform the solution through our reduction is easily seen. If we have, say, $2n$ vertices and a minimum cover of $n - 1$ vertices and thus a maximum independent set of $n + 1$ vertices, then our VC approximation would always return a cover of no more than $2n - 2$ vertices, corresponding to an independent set of at least two vertices. For the VC approximation, we have $R_{\mathcal{A}} = 1/2$, but for the corresponding Independent Set, we get $R_{\mathcal{A}} = (n-1)/(n+1)$, which grows arbitrarily close to 1 (and thus arbitrarily bad) for large values of $n$.

In the following pages, we begin by examining the complexity of certain guarantees; we then ascertain what, if anything, is preserved by reductions among NP-complete problems; finally, we develop new polynomial-time reductions that preserve approximation guarantees and use them in erecting a primitive hierarchy of approximation problems.

8.3.2 Constant-Distance Approximations

We begin with the strictest of performance guarantees: that the approximation remains within constant distance of the optimal solution. Under such circumstances, the absolute ratio is bounded by a constant. That ensuring such a guarantee can be any easier than finding the optimal solution appears nearly impossible at first sight. A little thought eventually reveals some completely trivial examples, in which the value of the optimal solution never exceeds some constant; almost any such problem has an easy

approximation to within a constant distance. Chromatic Number of Planar Graphs is a good example: since all planar graphs are four-colorable, any planar graph can be colored with five colors in low polynomial time, yet deciding three-colorability (the G3C problem) is NP-complete. Almost as trivial is the Chromatic Index problem, which asks how many colors are needed for a valid edge-coloring of a graph; the decision version is known to be NP-complete. As mentioned earlier, Vizing's theorem (Exercise 8.14) states that the chromatic index of a graph equals either its maximum degree or its maximum degree plus one. Moreover the constructive proof of the theorem provides an $O(|E| \cdot |V|)$ algorithm that colors the edges with $d_{\max} + 1$ colors.

Our first nontrivial problem is a variation on Partition.

Definition 8.7 An instance of Maximum Two-Binpacking is given by a set $S$, a size function $s\colon S \to \mathbb{N}$, a bin capacity $C$, and a positive integer bound $k$. The question is "Does there exist a subset of $S$ with at least $k$ elements that can be partitioned into two subsets, each of which has total size not exceeding $C$?"

This problem is obviously NP-complete, since it suffices to set $k = |S|$ and $C = \frac{1}{2} \sum_{x \in S} s(x)$ in order to restrict it to the Partition problem. Consider the following simple solution. Let $k'$ be the largest index such that the sum of the $k'$ smallest elements does not exceed $C$ and let $k''$ be the largest index such that the sum of the $k''$ smallest elements does not exceed $2C$. We can pack the $k'$ smallest elements in one bin, ignore the $(k'+1)$st smallest element, and pack the next $k'' - k' - 1$ smallest elements in the second bin, thereby packing a total of $k'' - 1$ elements in all. However, the optimal solution cannot exceed $k''$, so that our approximation is at most one element away from it. Figure 8.4 illustrates the idea. The same idea can clearly be extended to the Maximum k-Binpacking problem, but now the deviation from the optimal may be as large as $k - 1$ (which remains a constant for any fixed $k$).

[Figure 8.4: The simple distance-one approximation for Maximum Two-Binpacking.]

Our second problem is more complex.
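The distance-one approximation for Maximum Two-Binpacking described above can be sketched as follows. The function name `two_bin_pack` and the choice to return the two bins as lists of sizes are assumptions made for this illustration; sizes are assumed to be positive integers.

```python
def two_bin_pack(sizes, capacity):
    """Distance-one approximation for Maximum Two-Binpacking (a sketch).

    Hypothetical helper: returns (first_bin, second_bin) as lists of sizes,
    each with total size at most `capacity`.
    """
    ordered = sorted(sizes)

    def largest_prefix(limit):
        # Largest count such that the `count` smallest elements sum to <= limit.
        count, running = 0, 0
        for s in ordered:
            if running + s > limit:
                break
            running += s
            count += 1
        return count

    k1 = largest_prefix(capacity)      # k' in the text
    k2 = largest_prefix(2 * capacity)  # k'' in the text
    first_bin = ordered[:k1]
    # Ignore the (k'+1)st smallest element, then take the next k''-k'-1 elements.
    second_bin = ordered[k1 + 1:k2]
    return first_bin, second_bin
```

The second bin cannot overflow: the $k''$ smallest elements sum to at most $2C$, while the $k'+1$ smallest already sum to more than $C$, so the elements placed in the second bin sum to less than $C$.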

Definition 8.8 An instance of Safe Deposit Boxes is given by a collection of deposit boxes $\{B_1, B_2, \ldots, B_n\}$, each containing a certain amount $s_{ij}$, $1 \le i \le m$ and $1 \le j \le n$, of each of m currencies and by target currency amounts $b = (b_i)$, $1 \le i \le m$, and a target bound k > 0. The question is "Does there exist a subcollection of at most k safe deposit boxes that among themselves contain sufficient currency to meet target b?"

(The goal, then, is to break open the smallest number of safe deposit boxes in order to collect sufficient amounts of each of the currencies.) This problem arises in resource allocation in operating systems, where each process requires a certain amount of each of a number of different resources in order to complete its execution and release these resources. Should a deadlock arise (where the processes all hold some amount of resources and all need more resources than remain available in order to proceed), we may want to break it by killing a subset of processes that among themselves hold sufficient resources to allow one of the remaining processes to complete. What makes the problem difficult is that the currencies (resources) are not interchangeable. This problem is NP-complete for each fixed number of currencies larger than one (see Exercise 8.17) but admits a constant-distance approximation (see Exercise 8.18).

Our third problem is a variant on a theme explored in Exercise 7.27, in which we asked the reader to verify that the Bounded-Degree Spanning Tree problem is NP-complete. If we ignore the total length of the tree and focus instead on minimizing the degree of the tree, we obtain the Minimum-Degree Spanning Tree problem, which is also NP-complete.

Theorem 8.12 The Minimum-Degree Spanning Tree problem can be approximated to within one from the minimum degree.

The approximation algorithm proceeds through successive iterations from an arbitrary initial spanning tree; see Exercise 8.19.

In general, however, NP-hard optimization problems cannot be approximated to within a constant distance unless P equals NP. We give one example of the reduction technique used in all such cases.

Theorem 8.13 Unless P equals NP, no polynomial-time algorithm can find a vertex cover that never exceeds the size of the optimal cover by more than some fixed constant.

Proof. We shall reduce the optimization version of the Vertex Cover problem to its approximation version by taking advantage of the fact that the value of the solution is an integer. Let the constant of the theorem be k. Let G = (V, E) be an instance of Vertex Cover and assume that an optimal

vertex cover for G contains m vertices. We produce the new graph $G_{k+1}$ by making (k + 1) distinct copies of G, so that $G_{k+1}$ has (k + 1)|V| vertices and (k + 1)|E| edges; more interestingly, an optimal vertex cover for $G_{k+1}$ has (k + 1)m vertices. We now run the approximation algorithm on $G_{k+1}$: the result is a cover for $G_{k+1}$ with at most (k + 1)m + k vertices. The vertices of this collection are distributed among the (k + 1) copies of G; moreover, the vertices present in any copy of G form a cover of G, so that, in particular, at least m vertices of the collection must appear in any given copy. Thus at least (k + 1)m of the vertices are accounted for, leaving only k vertices; but these k vertices are distributed among (k + 1) copies, so that one copy did not receive any additional vertex. For that copy of G, the supposed approximation algorithm actually found a solution with m vertices, that is, an optimal solution. Identifying that copy is merely a matter of scanning all copies and retaining that copy with the minimum number of vertices in its cover. Hence the optimization problem reduces in polynomial time to its constant-distance approximation version. Q.E.D.

The same technique of "multiplication" works for almost every NP-hard optimization problem, although not always through simple replication. For instance, in applying the technique to the Knapsack problem, we keep the same collection of objects, the same object sizes, and the same bag capacity, but we multiply the value of each object by (k + 1). Exercises at the end of the chapter pursue some other, more specialized "multiplication" methods. Table 8.2 summarizes the key features of these methods.

8.3.3 Approximation Schemes

We now turn to ratio approximations. The ratio guarantee is only one part of the characterization of an approximation algorithm: we can also ask whether the approximation algorithm can provide only some fixed ratio guarantee or, for a price, any nonzero ratio guarantee, and if so, at what price. We define three corresponding classes of approximation problems.

Definition 8.9
* An optimization problem Π belongs to the class Apx if there exists a precision requirement, ε, and an approximation algorithm, $\mathcal{A}$, such that $\mathcal{A}$ takes as input an instance I of Π, runs in time polynomial in |I|, and obeys $R_{\mathcal{A}} \le \varepsilon$.
* An optimization problem Π belongs to the class PTAS (and is said to be p-approximable) if there exists a polynomial-time approximation scheme, that is, a family of approximation algorithms, $\{\mathcal{A}_i\}$, such that,

for each fixed precision requirement ε > 0, there exists an algorithm in the family, say $\mathcal{A}_j$, that takes as input an instance I of Π, runs in time polynomial in |I|, and obeys $R_{\mathcal{A}_j} \le \varepsilon$.
* An optimization problem Π belongs to the class FPTAS (and is said to be fully p-approximable) if there exists a fully polynomial-time approximation scheme, that is, a single approximation algorithm, $\mathcal{A}$, that takes as input both an instance I of Π and a precision requirement ε, runs in time polynomial in |I| and 1/ε, and obeys $R_{\mathcal{A}} \le \varepsilon$.

Table 8.2 How to Prove the NP-Hardness of Constant-Distance Approximations.
* Assume that a constant-distance approximation with distance k exists.
* Transform an instance x of the problem into a new instance f(x) of the same problem through a type of "multiplication" by (k + 1); specifically, the transformation must ensure that
  - any solution for x can be transformed easily to a solution for f(x), the value of which is (k + 1) times the value of the solution for x;
  - the transformed version of an optimal solution for x is an optimal solution for f(x); and
  - a solution for x can be recovered from a solution for f(x).
* Verify that one of the solutions for x recovered from a distance-k approximation for f(x) is an optimal solution for x.
* Conclude that no such constant-distance approximation can exist unless P equals NP.

From the definition, we clearly have PO ⊆ FPTAS ⊆ PTAS ⊆ Apx ⊆ NPO. The definition for FPTAS is a uniform definition, in the sense that a single algorithm serves for all possible precision requirements and its running time is polynomial in the precision requirement. The definition for PTAS does not preclude the existence of a single algorithm but allows its running time to grow arbitrarily with the precision requirement, or simply allows entirely distinct algorithms to be used for different precision requirements.

Very few problems are known to be in FPTAS. None of the strongly NP-complete problems can have optimization versions in FPTAS, a result that ties together approximation and strong NP-completeness in an intriguing way.

Theorem 8.14 Let Π be an optimization problem; if its decision version is strongly NP-complete, then Π is not fully p-approximable.

Proof. Let $\Pi_d$ denote the decision version of Π. Since the bound, B, introduced in Π to turn it into the decision problem $\Pi_d$, ranges up to (for a maximization problem) the value of the optimal solution and since, by definition, we have $B \le \max(I)$, it follows that the value of the optimal solution cannot exceed max(I). Now set $\varepsilon = \frac{1}{\max(I)+1}$ so that an ε-approximate solution must be an exact solution. If Π were fully p-approximable, then there would exist an ε-approximation algorithm $\mathcal{A}$ running in time polynomial in (among other things) 1/ε. However, time polynomial in 1/ε is time polynomial in max(I), which is pseudo-polynomial time. Hence, if Π were fully p-approximable, there would exist a pseudo-polynomial time algorithm solving it, and thus also $\Pi_d$, exactly, which would contradict the strong NP-completeness of $\Pi_d$. Q.E.D.

This result leaves little room in FPTAS for the optimization versions of NP-complete problems, since most NP-complete problems are strongly NP-complete. It does, however, leave room for the optimization versions of problems that do not appear to be in P and yet are not known to be NP-complete.

The reader is familiar with at least one fully p-approximable problem: Knapsack. The simple greedy heuristic based on value density, with a small modification (pick the single item of largest value if it gives a better packing than the greedy packing), guarantees a packing of value at least half that of the optimal packing (an easy proof). We can modify this algorithm by doing some look-ahead: try all possible subsets of k or fewer items, complete each subset by the greedy heuristic, and then keep the best of the completed solutions. While this improved heuristic is expensive, since it takes time $\Omega(n^k)$, it does run in polynomial time for each fixed k; moreover its approximation guarantee is $R_{\mathcal{A}} = 1/k$ (see Exercise 8.25), which can be made arbitrarily good. Indeed, this family of algorithms shows that Knapsack is p-approximable. However, the running time of an algorithm in this family is proportional to $n^{1/\varepsilon}$ and thus not a polynomial function of the precision requirement. In order to show that Knapsack is fully p-approximable, we must make real use of the fact that Knapsack is solvable in pseudo-polynomial time.

Given an instance with n items where the item of largest value has value V and the item of largest size has size S, the dynamic programming solution runs in $O(n^2 V \log(nSV))$ time. Since the input size is $O(n \log(SV))$, only one term in the running time, the linear term V, is not actually polynomial. If we were to scale all item values down by some factor F, the new running time would be $O(n^2 \frac{V}{F} \log(nS\frac{V}{F}))$; with the right choice for F, we can make this expression polynomial in the input size. The value of the optimal solution

to the scaled instance, call it $f_F(I_F)$, can be easily related to the value of the optimal solution to the original instance, f(I), as well as to the value in unscaled terms of the optimal solution to the scaled version, $f(I_F)$:

$$f(I_F) \ge F \cdot f_F(I_F) \ge f(I) - nF$$

How do we select F? In order to ensure a polynomial running time, F should be of the form V/x for some parameter x; in order to ensure an approximation independent of n, F should be of the form y/n for some parameter y (since then the value $f(I_F)$ is within y of the optimal solution). Let us simply set $F = \frac{V}{kn}$ for some natural number k. The dynamic programming solution now runs on the scaled instance in $O(kn^3 \log(kn^2 S))$ time, which is polynomial in the size of the input, and the solution returned is at least as large as $f(I) - V/k$. Since we could always place in the knapsack the one item of largest value, thereby obtaining a solution of value V, we have $f(I) \ge V$; hence the ratio guarantee of our algorithm is

$$R_{\mathcal{A}} = \frac{f(I) - f(I_F)}{f(I)} \le \frac{V/k}{V} = \frac{1}{k}$$

In other words, we can obtain the precision requirement ε = 1/k with an approximation algorithm running in $O(\frac{1}{\varepsilon} n^3 \log(\frac{1}{\varepsilon} n^2 S))$ time, which is polynomial in the input size and in the precision requirement. Hence we have derived a fully polynomial-time approximation scheme for the Knapsack problem. In fact, the scaling mechanism can be used with a variety of problems solvable in pseudo-polynomial time, as the following theorem states.

Theorem 8.15 Let Π be an optimization problem with the following properties:
1. f(I) and max(I) are polynomially related through len(I), that is, there exist bivariate polynomials p and q such that we have both $f(I) \le p(\mathrm{len}(I), \max(I))$ and $\max(I) \le q(\mathrm{len}(I), f(I))$;
2. the objective value of any feasible solution varies linearly with the parameters of the instance; and,
3. Π can be solved in pseudo-polynomial time.
Then Π is fully p-approximable.

(For a proof, see Exercise 8.27.) This theorem gives us a limited converse of Theorem 8.14 but basically mimics the structure of the Knapsack problem. Other than Knapsack and its close relatives, very few NPO problems that have an NP-complete decision version are known to be in FPTAS.
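The scaling argument for Knapsack can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the function names `knapsack_by_value` and `knapsack_fptas` are invented here, the dynamic program is one of several possible pseudo-polynomial formulations (it is indexed by value and stores the chosen items directly, which is wasteful but short), and items too large for the knapsack are discarded up front so that the item of largest value V always fits, as the argument above assumes.

```python
def knapsack_by_value(values, sizes, capacity):
    """Exact pseudo-polynomial DP: for each reachable total value, keep the
    least total size achieving it (and one subset that does so)."""
    best = {0: (0, ())}  # value -> (least size, indices of chosen items)
    for i, (v, s) in enumerate(zip(values, sizes)):
        # Iterate over a snapshot so that item i is used at most once.
        for val, (size, chosen) in list(best.items()):
            new_val, new_size = val + v, size + s
            if new_size <= capacity and (
                new_val not in best or new_size < best[new_val][0]
            ):
                best[new_val] = (new_size, chosen + (i,))
    return list(best[max(best)][1])


def knapsack_fptas(values, sizes, capacity, k):
    """Scale values by F = V/(kn), solve the scaled instance exactly, and
    return the chosen item indices; the value lost is at most V/k."""
    fit = [i for i in range(len(values)) if sizes[i] <= capacity]
    if not fit:
        return []
    n = len(fit)
    V = max(values[i] for i in fit)
    if V == 0:
        return []
    # floor(v / F) with F = V/(kn), computed in exact integer arithmetic.
    scaled = [(values[i] * k * n) // V for i in fit]
    chosen = knapsack_by_value(scaled, [sizes[i] for i in fit], capacity)
    return [fit[j] for j in chosen]
```

With values 60, 100, 120, sizes 10, 20, 30, capacity 50, and k = 10, the scaled values are 15, 25, 30 and the sketch selects the last two items, which is also the optimal packing for the unscaled instance.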

since. Clique. Partitionis p-simple by virtue of its dynamic
programming solution. Vertex Cover. It is p-simple if there exists a fixed bivariate polynomial. proves the result for minimization problems. for each fixed B. not sufficient condition for membership in PTAS(for instance. the same line of reasoning. and Set Cover are simple. the set of instances with optimal values not exceeding B is decidable in polynomial time. Chromatic Number is not simple. * If n is fully p-approximable (H E FPTAS).B. q.16 Let H be an optimization problem. in which the optimal value is bounded by 4. such that the set of instances I with optimal values not exceeding B is decidable in q(l1I.) Thus we have f(I) . since it remains NPcomplete for planar graphs. Theorem 8. * If 11 is p-approximable (H E PTAS). We give the proof for a maximization problem. Our definition of simplicity moves from simple problems to p-simple problems by adding a uniformity condition. Definition 8. Clique. the set of instances with optimal values bounded by B can
be solved in polynomial time by exhaustive search of all
(n)
collections of
B items (vertices or subsets).= Bl 2in time polynomial in the size of the input instance !. cannot be in PTAS unless P equals NP. much like our change from p-approximable to fully p-approximable. El For instance. with the obvious changes. On the other hand. for each fixed B. 1
Proof. The approximation scheme can meet any precision requirement . But f (I) is the value of the optimal solution. B) time. Simplicity is a necessary but.318
Complexity Theory in Practice An alternate characterization of problems that belong to PTAS or FPTAS can be derived by stratifying problems on the basis of the size of the solution rather than the size of the instance. (Our choice of B + 2 instead of B is to take care of boundary conditions. then it is p-simple. so we obviously can only have f (I) 3 B
.10 An optimization problem is simple if. alas. then it is simple.f(I)
1
B+2
f(I)
or
f(I)
f(I)
B +1
B+2
Hence we can have f (I) . as we shall shortly see). while simple.B only when we also have f (I) .

for each instance I of FI. f (I) and max(I) are polynomially related through len(I). Q. we write f (S) for the value of a subset and f (x) for the value of a single element. so is the second. let J be a feasible solution of size k.
provides a general technique for building approximations schemes for a class of NPO problems. each with a value. that is. since the first inequality is decidable in polynomial time. see Exercise 8. The feasible solutions of the instance form an independence system. Theorem 8. Adding uniformity to the running time of the approximation algorithm adds uniformity to the decision procedure for the instances with optimal values not exceeding B and thus proves the second statement of our theorem.8. the problem is simple.17 If 11 is an NPO problem with an NP-complete decision version and. to return some completion J obeying
f ()
3.D. running on J. Further let 1* be an optimal solution. The goal is to maximize the sum of the a values of the items included in the solution.11 An instance of a maximum independent subset problem is given by a collection of items. every subset of a feasible solution is also a feasible solution. We want our algorithm. and let j* be the best possible feasible superset of J. Hence we conclude
and.E.25).3 The Complexity of Approximation
when we also have f (I) > B.) The class PTAS is much richer than the class FPTAS. the best possible completion of J. Definition 8.
k
f (J*) -f
k
(I*)-
xEJ1 I
max f (x)
.e.. (For a proof.28. then 1l is p-simple if and only if it can be solved in : pseudo-polynomial time. Since the set of instances I that have optimal values not exceeding B is thus decidable in polynomial time. Our first attempt at providing an approximation scheme for Knapsack. through an exhaustive
search of all
(n)
subsets and their greedy completions (Exercise 8. one that can ensure the desired approximation by not "losing too much" as it fills in the rest of the feasible solution. i. Now we want to define a well-behaved completion algorithm. Let f be the objective function of our maximum independent subset problem. We can further tie together our results with the following observation. that is.

then the algorithm will find it directly. we have JO* = *. the subset Jo will be tested and the completion algorithm will return a solution at least as good as Jo. The key aspect of the technique is its examination of a large. We call such an algorithm a polynomialtime k-completion algorithm. Basically. it then chooses the best of the k approximate solutions
. let Jo be the subset of I* of size k that contains the k largest elements of I*. otherwise. Now we must have f (J)
k f (Jo*)-I f (I . we can derive a somewhat more useful technique for building approximation schemes-the shifting technique. the approximation algorithm solves each subproblem (each group) and merges the solutions to the subproblems into an approximate solution to the entire problem. number of different solutions. Theorem 8. Since all subsets of size k are tested. The best possible completion of Jo is I* itself-that is. it leaves it untouched. ii The other half of the characterization is the source of the specific subtractive terms in the required lower bound. For each of the k choices. Applying the same principle to other problems. The problem with this technique lies in proving that the completion algorithm is indeed well behaved. its output is not defined. it admits a polynomial-time k-completion algorithm. We have just proved the simpler half of the following characterization theorem. for any k. The grouping process has no predetermined boundaries and so we have k distinct choices obtained by shifting (hence the name) the boundaries between the groups. if it is not given a feasible solution.i0
max f (x)
k+ I
f (JO*)
since the optimal completion has at least k + 1 elements. If the optimal solution has k or fewer elements. Now.320
Complexity Theory in Practice
and running in polynomial time. We claim that such a completion algorithm will guarantee a ratio of 3 /k.18 A maximum independent subset problem is in PTAS if and only if. the shifting technique decomposes a problem into suitably sized "adjacent" subpieces and then creates subproblems by grouping a number of adjacent subpieces.m-Jax(O)
=
k
f (I)
because we have
xEJO . yet polynomial. if the algorithm is given a feasible solution of size less than k. We can think of a linear array of some kl subpieces in which the subpieces end up in I groups of k consecutive subpieces each.

thus obtained. In effect, this technique is a compromise between a dynamic programming approach, which would examine all possible groupings of subpieces, and a divide-and-conquer approach, which would examine a single grouping. One example must suffice here; exercises at the end of the chapter explore some other examples.

Consider then the Disk Covering problem: given n points in the plane and disks of fixed diameter D, cover all points with the smallest number of disks. (Such a problem can model the location of emergency facilities such that no point is farther away than D/2 from a facility.) Our approximation algorithm divides the area in which the n points reside (from minimum to maximum abscissa and from minimum to maximum ordinate) into vertical strips of width D; we ignore the fact that the last strip may be somewhat narrower. For some natural number k, we can group k consecutive strips into a single strip of width kD and thus partition the area into vertical strips of width kD. By shifting the boundaries of the partition by D, we obtain a new partition; this step can be repeated k - 1 times to obtain a total of k distinct partitions (call them $P_1, P_2, \ldots, P_k$) into vertical strips of width kD. (Again, we ignore the fact that the strips at either end may be somewhat narrower.) Figure 8.5 illustrates the concept.

[Figure 8.5: The partitions created by shifting.]

Suppose we have an algorithm $\mathcal{A}$ that finds good approximate solutions within strips of width at most kD; we can apply this algorithm to each strip in partition $P_i$ and take the union of the disks returned for each strip to obtain an approximate solution for the complete problem. We can repeat the process k times, once for each partition, and choose the best of the k approximate solutions thus obtained.

Theorem 8.19 If algorithm $\mathcal{A}$ has absolute approximation ratio $R_{\mathcal{A}}$, then the shifting algorithm has absolute approximation ratio $\frac{kR_{\mathcal{A}}}{k+1}$.

Proof. Denote by N the number of disks in some optimal solution. Since $\mathcal{A}$ yields $R_{\mathcal{A}}$-approximations, the number of disks returned by our algorithm

for partition $P_i$ is bounded by $\frac{1}{R_{\mathcal{A}}} \sum_{j \in P_i} N_j$, where $N_j$ is the optimal number of disks needed to cover the points in vertical strip j in partition $P_i$ and where j ranges over all such strips. By construction, a disk cannot cover points in two elementary strips (the narrow strips of width D) that are not adjacent, since the distance between nonadjacent elementary strips exceeds the diameter of a disk. Thus if we could obtain locally optimal solutions within each strip (i.e., solutions of value $N_j$ for strip j), taking their union would yield a solution that exceeds N by at most the number of disks that, in a globally optimal solution, cover points in two adjacent strips. Denote this last quantity by $O_i$; that is, $O_i$ is the number of disks in the optimal solution that cover points in two adjacent strips of partition $P_i$. Our observation can be rewritten as

$$\sum_{j \in P_i} N_j \le N + O_i$$

Because each partition has a different set of adjacent strips and because each partition is shifted from the previous one by a full disk diameter, none of the disks that cover points in adjacent strips of $P_i$ can cover points in adjacent strips of $P_j$, for $i \ne j$, as illustrated in Figure 8.6. Thus the total number of disks that can cover points in adjacent strips in any partition is at most N, the total number of disks in an optimal solution.

[Figure 8.6: Why disks cannot cover points from adjacent strips in two distinct partitions.]

Hence we can write

$$\sum_{i=1}^{k} O_i \le N$$

By summing our first inequality over all k partitions and substituting our second inequality, we obtain

$$\sum_{i=1}^{k} \sum_{j \in P_i} N_j \le (k+1) \cdot N$$

and thus we can write

$$\min_i \sum_{j \in P_i} N_j \le \frac{1}{k} \sum_{i=1}^{k} \sum_{j \in P_i} N_j \le \frac{k+1}{k} \cdot N$$

Using now our first bound for our shifting algorithm, we conclude that its approximation is bounded by $\frac{1}{R_{\mathcal{A}}} \cdot \frac{k+1}{k} \cdot N$ and thus has an absolute approximation ratio of $\frac{kR_{\mathcal{A}}}{k+1}$, as desired. Q.E.D.

This result generalizes easily to coverage by uniform convex shapes other than disks, with suitable modifications regarding the effective diameter of the shape. It gives us a mechanism by which to extend the use of an expensive approximation algorithm to much larger instances; effectively, it allows us to use a divide-and-conquer strategy and limit the divergence from optimality. However, it presupposes the existence of a good, if expensive, approximation algorithm.

In the case of Disk Covering, we do not have any algorithm yet for covering the points in a vertical strip. Fortunately, what works once can be made to work again. Our new problem is to minimize the number of disks of diameter D needed to cover a collection of points placed in a vertical strip of width kD, for some natural number k. With no restriction on the height of the strip, deriving an optimal solution by exhaustive search could take exponential time. However, we can repeat the divide-and-conquer strategy: we now divide each vertical strip into elementary rectangles of height D and then group k adjacent rectangles into a single square of side kD (again, the end pieces may fall short). The result is k distinct partitions of the vertical strip into a collection of squares of side kD. Theorem 8.19 applies again, so that we need only devise a good approximation algorithm for placing disks to cover the points within a square, a problem for which we can actually afford to compute the optimal solution as follows.

We begin by noting that a square of size kD can easily be covered completely by $(k+1)^2 + k^2 = O(k^2)$ disks of diameter D, as shown in Figure 8.7. (This covering pattern is known in quilt making as the double wedding ring; it is not quite optimal, but its leading term, $2k^2$, is the same as that of the optimal covering.) Since $(k+1)^2 + k^2$ is a constant for any constant k, we need only consider a constant number of disks for covering a square. Moreover, any disk that covers at least two points can always be assumed to have these two points on its periphery; since there are two possible circles of diameter D that pass through a pair of points, we have to consider at most $2\binom{n_i}{2}$ disk positions for the $n_i$ points present within some square i. Hence we need only consider $O(n_i^{O(k^2)})$ distinct arrangements of disks in square i; each arrangement can be checked in $O(n_i k^2)$ time, since we need only check that each point resides

within one of the disks. Overall, we see that an optimal disk covering can be obtained for each square in time polynomial in the number of points present within the square. Putting all of the preceding findings together, we obtain a polynomial-time approximation scheme for Disk Covering.

[Figure 8.7: How to cover a square of side kD with $(k+1)^2 + k^2$ disks of diameter D (shown for k = 3).]

Theorem 8.20 There is an approximation scheme for Disk Covering such that, for every natural number k, the scheme provides an absolute approximation ratio of $\left(\frac{k}{k+1}\right)^2$ and runs in $O(k^4 n^{O(k^2)})$ time.

8.3.4 Fixed-Ratio Approximations

In the previous sections, we established some necessary conditions for membership in PTAS as well as some techniques for constructing approximation schemes for several classes of problems. However, there remains a very large number of problems that have some fixed-ratio approximation and thus belong to Apx but do not appear to belong to PTAS, although they obey the necessary condition of simplicity. Examples include Vertex Cover (see Exercise 8.23), Maximum Cut (see Exercise 8.24), and the most basic problem of all, namely Maximum 3SAT (Max3SAT), the optimization version of 3SAT. An instance of this problem is given by a collection of clauses of three literals each, and the goal is to return a truth assignment that maximizes the number of satisfied clauses. Because this problem is the optimization version of our most fundamental NP-complete problem, it is natural to regard it as the key problem in Apx. Membership of Max3SAT or MaxkSAT (for any fixed k) in Apx is easy to establish.

Theorem 8.21 MaxkSAT has a $2^{-k}$-approximation.
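One way to realize a guarantee of this kind is a weight-driven greedy assignment: give every unresolved clause with j unassigned literals the weight $2^{-j}$, pick a variable, set it to whichever truth value satisfies the larger total clause weight, and repeat. The Python sketch below illustrates that idea under some assumptions made here for convenience: the name `weighted_greedy_maxsat` is hypothetical, clauses are lists of nonzero integers in the DIMACS style (a positive integer for a variable, its negation for the complemented literal), and exact fractions are used so that weight comparisons are not affected by rounding.

```python
from fractions import Fraction


def weighted_greedy_maxsat(clauses):
    """Greedy assignment driven by clause weights 2^(-number of unassigned
    literals). Returns (assignment, number of clauses left unsatisfied)."""
    # Empty input clauses can never be satisfied; count them immediately.
    unsatisfied = sum(1 for c in clauses if not c)
    remaining = [set(c) for c in clauses if c]
    assignment = {}

    while remaining:
        # Pick any variable that appears in some remaining clause.
        x = abs(next(iter(remaining[0])))
        pos = sum(Fraction(1, 2 ** len(c)) for c in remaining if x in c)
        neg = sum(Fraction(1, 2 ** len(c)) for c in remaining if -x in c)
        value = pos > neg  # true only if the uncomplemented weight is larger
        assignment[x] = value
        true_lit, false_lit = (x, -x) if value else (-x, x)

        updated = []
        for c in remaining:
            if true_lit in c:
                continue              # clause is now satisfied
            c = c - {false_lit}       # clause loses a literal (weight doubles)
            if c:
                updated.append(c)
            else:
                unsatisfied += 1      # clause reduced to a falsehood
        remaining = updated

    return assignment, unsatisfied
```

On the eight distinct three-literal clauses over three variables, every assignment falsifies exactly one clause, and the sketch indeed leaves one clause unsatisfied, matching the bound $8 \cdot 2^{-3} = 1$.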

Proof. Consider the following simple algorithm.
* Assign to each remaining clause $c_i$ weight $2^{-|c_i|}$; thus every unassigned literal left in a clause halves the weight of that clause. (Intuitively, the weight of a clause is inversely proportional to the number of ways in which that clause could be satisfied.)
* Pick any variable x that appears in some remaining clause. Set x to true if the sum of the weights of the clauses in which x appears as an uncomplemented literal exceeds the sum of the weights of the clauses in which it appears as a complemented literal; set it to false otherwise.
* Update the clauses and their weights and repeat until all clauses have been satisfied or reduced to a falsehood.

We claim that this algorithm will leave at most $m2^{-k}$ unsatisfied clauses (where m is the number of clauses in the instance); since the best that any algorithm could do would be to satisfy all m clauses, our conclusion follows. Note that $m2^{-k}$ is exactly the total weight of the m clauses of length k in the original instance; thus our claim is that the number of clauses left unsatisfied by the algorithm is bounded by $\sum_{i=1}^{m} 2^{-|c_i|}$, the total weight of the clauses in the instance. This is a somewhat more general claim, since it applies to instances with clauses of variable length.

To prove our claim, we use induction on the number of clauses. With a single clause, the algorithm clearly returns a satisfying truth assignment and thus meets the bound. Assume then that the algorithm meets the bound on all instances of m or fewer clauses. Let x be the first variable set by the algorithm and denote by $m_t$ the number of clauses satisfied by the assignment, $m_f$ the number of clauses losing a literal as a result of the assignment, and $m_u = m + 1 - m_t - m_f$ the number of clauses unaffected by the assignment. Also let $w_{m+1}$ denote the total weight of all the clauses in the original instance, $w_t$ the total weight of the clauses satisfied by the assignment, $w_u$ the total weight of the unaffected clauses, and $w_f$ the total weight of the clauses losing a literal before the loss of that literal; thus we can write $w_{m+1} = w_t + w_u + w_f$. Because we must have had $w_t \ge w_f$ in order to assign x as we did, we can write $w_{m+1} = w_t + w_u + w_f \ge w_u + 2w_f$. The remaining $m + 1 - m_t = m_u + m_f$ clauses now have a total weight of $w_u + 2w_f$, because the weight of every clause that loses a literal doubles. By inductive hypothesis, our algorithm will leave at most $w_u + 2w_f$ clauses unsatisfied among these clauses and thus also in the original problem; since we have, as noted above, $w_{m+1} \ge w_u + 2w_f$, our claim is proved. Q.E.D.

How are we going to classify problems within the classes NPO, Apx, and PTAS? By using reductions, naturally. However, the type of reduction

we now need is quite a bit more complex than the many-one reduction used in completeness proofs for decision problems. We need to establish a correspondence between solutions as well as between instances; moreover, the correspondence between solutions must preserve approximation ratios. The reason for these requirements is that we need to be able to retrieve a good approximation for problem $\Pi_1$ from a reduction to a problem $\Pi_2$ for which we already have an approximate solution algorithm with certain guarantees. Figure 8.8 illustrates the scheme of the reduction. By using map f between instances and map g between solutions, along with known algorithm $\mathcal{A}$, we can obtain a good approximate solution for our original problem. In fact, by calling in succession the routines implementing the map f, the approximation algorithm $\mathcal{A}$ for $\Pi_2$, and the map g, we are effectively defining the new approximation algorithm $\mathcal{A}'$ for problem $\Pi_1$ (in mathematical terms, we are making the diagram commute).

[Figure 8.8: The requisite style of reduction between approximation problems.]

Of course, we may want to use different reductions depending on the classes we want to separate: as we noted in Chapter 6, the tool must be adapted to the task. Since all of our classes reside between PO and NPO, all of our reductions should run in polynomial time; thus both the f map between instances and the g map between solutions must be computable in polynomial time. Differences among possible reductions thus come from the requirements they place on the handling of the precision requirement. We choose a definition that gives us sufficient generality to prove results regarding the separation of NPO, Apx, and PTAS; we achieve the generality by introducing a third function that maps precision requirements for $\Pi_1$ onto precision requirements for $\Pi_2$.

Definition 8.12 Let $\Pi_1$ and $\Pi_2$ be two problems in NPO. We say that $\Pi_1$ PTAS-reduces to $\Pi_2$ if there exist three functions, f, g, and h, such that

* for any instance $x$ of $\Pi_1$, $f(x)$ is an instance of $\Pi_2$ and is computable in time polynomial in $|x|$;
* for any instance $x$ of $\Pi_1$, any solution $y$ for instance $f(x)$ of $\Pi_2$, and any rational precision requirement $\varepsilon$ (expressed as a fraction), $g(x, y, \varepsilon)$ is a solution for $x$ and is computable in time polynomial in $|x|$ and $|y|$;
* $h$ is a computable injective function on the set of rationals in the interval $[0, 1)$;
* for any instance $x$ of $\Pi_1$, any solution $y$ for instance $f(x)$ of $\Pi_2$, and any precision requirement $\varepsilon$ (expressed as a fraction), if the value of $y$ obeys precision requirement $h(\varepsilon)$, then the value of $g(x, y, \varepsilon)$ obeys the precision requirement $\varepsilon$.

This reduction has all of the characteristics we have come to associate with reductions in complexity theory.

Proposition 8.4
* PTAS-reductions are reflexive and transitive.
* If $\Pi_1$ PTAS-reduces to $\Pi_2$ and $\Pi_2$ belongs to Apx (respectively, PTAS), then $\Pi_1$ belongs to Apx (respectively, PTAS).

Exercise 8.5 Prove these statements.

We say that an optimization problem is complete for NPO (respectively, Apx) if it belongs to NPO (respectively, Apx) and every problem in NPO (respectively, Apx) PTAS-reduces to it. Furthermore, we define one last class of optimization problems to reflect our sense that Max3SAT is a key problem.

Definition 8.13 The class OPTNP is exactly the class of problems that PTAS-reduce to Max3SAT.

We define OPTNP-completeness as we did for NPO- and Apx-completeness. In view of Theorem 8.21 and Proposition 8.4, we have OPTNP $\subseteq$ Apx. We introduce OPTNP because we have not yet seen natural problems that are complete for NPO or Apx, whereas OPTNP, by its very definition, has at least one, Max3SAT itself. The standard complete problems for NPO and Apx are, in fact, generalizations of Max3SAT.

Theorem 8.22 The Maximum Weighted Satisfiability (MaxWSAT) problem has the same instances as Satisfiability, with the addition of a weight function mapping each variable to a natural number. The objective is to find a satisfying truth assignment that maximizes the total weight of the true variables. An instance of the Maximum Bounded Weighted Satisfiability

problem is an instance of MaxWSAT with a bound $W$ such that the sum of the weights of all variables in the instance must lie in the interval $[W, 2W]$.
* Maximum Weighted Satisfiability is NPO-complete.
* Maximum Bounded Weighted Satisfiability is Apx-complete.

Proof. We prove only the first result; the second requires a different technique, which is explored in Exercise 8.33. That MaxWSAT is in NPO is easily verified. Let $\Pi$ be a problem in NPO and let $M$ be a nondeterministic machine that, for each instance of $\Pi$, guesses a solution, checks that it is feasible, and computes its value. If the guess fails, then $M$ halts with a 0 on the tape; otherwise it halts with the value of the solution, written in binary and "in reverse," with its least significant bit on square 1 and increasing bits to the right of that position. By definition of NPO, $M$ runs in polynomial time. For $M$ and any instance $x$, the construction used in the proof of Cook's theorem yields a Boolean formula of polynomial size that describes exactly those computation paths of $M$ on input $x$ and guess $y$ that lead to a nonzero answer. (That is, the Boolean formula yields a bijection between satisfying truth assignments and accepting paths.) We assign a weight of 0 to all variables used in the construction, except for those that denote that a tape square contains the character 1 at the end of computation, and that only for squares to the right of position 0. That is, only the tape squares that contain a 1 in the binary representation of the value of the solution for $x$ will count toward the weight of the MaxWSAT solution. Using the notation of Table 6.1, we assign weight $2^{i-1}$ to variable $t(p(|x|), i, 1)$, for each $i$ from 1 to $p(|x|)$, so that the weight of the MaxWSAT solution equals the value of the solution computed by $M$. This transformation between instances can easily be carried out in polynomial time; a solution for the original problem can be recovered by looking at the assignment of the variables describing the initial guess (to the left of square 0 at time 1); and the precision-mapping function $h$ is just the identity. Q.E.D.

Strictly speaking, our proof showed only that any maximization problem in NPO PTAS-reduces to MaxWSAT; to finish the proof, we would need to show that any minimization problem in NPO also PTAS-reduces to MaxWSAT (see Exercise 8.31). Unless P equals NP, no NPO-complete problem can be in Apx and no Apx-complete problem can be in PTAS. OPTNP-complete problems interest us because, in addition to Max3SAT, they include many natural problems: Bounded-Degree Vertex Cover, Bounded-Degree Independent Set, Maximum Cut, and many others. In addition, we can use PTAS-reductions

from Max3SAT, many of them similar (with respect to instances) to the reductions used in proofs of NP-completeness, to show that a number of optimization problems are OPTNP-hard, including Vertex Cover, Traveling Salesman with Triangle Inequality, Clique, and many others. Such results are useful because OPTNP-hard problems cannot be in PTAS unless P equals NP, as we now proceed to establish.

Proving that an NPO problem does not belong to PTAS (unless, of course, P equals NP) is based on the use of gap-preserving reductions. In its strongest and simplest form, a gap-preserving reduction actually creates a gap: it maps a decision problem onto an optimization problem and ensures that all "yes" instances map onto instances with optimal values on one side of the gap and that all "no" instances map onto instances with optimal values on the other side of the gap. Our NP-completeness proofs provide several examples of such gap-creating reductions. For instance, our reduction from NAE3SAT to G3C was such that all satisfiable instances were mapped onto three-colorable graphs, whereas all unsatisfiable instances were mapped onto graphs requiring at least four colors. It follows immediately that no polynomial-time algorithm can approximate G3C with an absolute ratio better than 3/4, since such an algorithm could then be used to solve NAE3SAT. We conclude that, unless P equals NP, G3C cannot be in PTAS.

In defining the reduction, we need only specify the mapping between instances and some condition on the behavior of optimal solutions. (For simplicity, we give the definition for a reduction between two maximization problems; obvious modifications make it applicable to reductions between two minimization problems or between a minimization problem and a maximization problem.)

Definition 8.14 Let $\Pi_1$ and $\Pi_2$ be two maximization problems; denote the value of an optimal solution for an instance $x$ by $\mathrm{opt}(x)$. A gap-preserving reduction from $\Pi_1$ to $\Pi_2$ is a polynomial-time map $f$ from instances of $\Pi_1$ to instances of $\Pi_2$, together with two pairs of functions, $(c_1, r_1)$ and $(c_2, r_2)$, such that $r_1$ and $r_2$ return values no smaller than 1 and the following implications hold:

$$\mathrm{opt}(x) \ge c_1(x) \;\Rightarrow\; \mathrm{opt}(f(x)) \ge c_2(f(x))$$

$$\mathrm{opt}(x) \le \frac{c_1(x)}{r_1(x)} \;\Rightarrow\; \mathrm{opt}(f(x)) \le \frac{c_2(f(x))}{r_2(f(x))}$$

Observe that the definition imposes no condition on the behavior of the transformation for instances with optimal values that lie within the gap.

The typical use of a gap-preserving reduction is to combine it with a gap-creating reduction such as the one described for G3C. We just saw that the reduction $g$ used in the proof of NP-completeness of G3C gave rise to the implications

$$x \text{ satisfiable} \;\Rightarrow\; \mathrm{opt}(g(x)) = 3$$

$$x \text{ not satisfiable} \;\Rightarrow\; \mathrm{opt}(g(x)) \ge 4$$

Assume that we have a gap-preserving reduction $h$, with pairs $(3, 3/4)$ and $(c', r')$, from G3C to some minimization problem $\Pi'$. We can combine $g$ and $h$ to obtain

$$x \text{ satisfiable} \;\Rightarrow\; \mathrm{opt}(h(g(x))) \le c'(h(g(x)))$$

$$x \text{ not satisfiable} \;\Rightarrow\; \mathrm{opt}(h(g(x))) \ge \frac{c'(h(g(x)))}{r'(h(g(x)))}$$

so that the gap created in the optimal solutions of G3C by $g$ is translated into another gap in the optimal solutions of $\Pi'$; the gap is preserved (although it can be enlarged or shrunk). The consequence is that approximating $\Pi'$ with an absolute ratio greater than $r'$ is NP-hard.

Up until 1991, gap-preserving reductions were of limited interest, because the problems for which we had a gap-creating reduction were relatively few and had not been used much in further transformations. In particular, nothing was known about OPTNP-complete problems or even about several important OPTNP-hard problems such as Clique. Through a novel characterization of NP in terms of probabilistic proof checking (covered in Section 9.5), it has become possible to prove that Max3SAT, and thus any of the OPTNP-hard problems, cannot be in PTAS unless P equals NP.

Theorem 8.23 For each problem $\Pi$ in NP, there is a polynomial-time map $f$ from instances of $\Pi$ to instances of Max3SAT and a fixed $\varepsilon > 0$ such that, for any instance $x$ of $\Pi$, the following implications hold:

$$x \text{ is a "yes" instance} \;\Rightarrow\; \mathrm{opt}(f(x)) = |f(x)|$$

$$x \text{ is a "no" instance} \;\Rightarrow\; \mathrm{opt}(f(x)) < (1 - \varepsilon)|f(x)|$$

where $|f(x)|$ denotes the number of clauses in $f(x)$.

In other words, $f$ is a gap-creating reduction to Max3SAT.

Proof. We need to say a few words about the alternate characterization of NP. The gist of this characterization is that a "yes" instance of a problem

in NP has a certificate that can be verified probabilistically in polynomial time by inspecting only a constant number of bits of the certificate, chosen with the help of a logarithmic number of random bits. If $x$ is a "yes" instance, then the verifier will accept it with probability 1 (that is, it will accept no matter what the random bits are); otherwise, the verifier will reject it with probability at least 1/2 (that is, at least half of the random bit sequences will lead to rejection).

Since $\Pi$ is in NP, a "yes" instance of size $n$ has a certificate that can be verified in polynomial time with the help of at most $c_1 \log n$ random bits and by reading at most $c_2$ bits from the certificate. Consider any fixed sequence of random bits; there are $2^{c_1 \log n} = n^{c_1}$ such sequences in all. For a fixed sequence, the computation of the verifier depends on $c_2$ bits from the certificate and is otherwise a straightforward deterministic polynomial-time computation. We can examine all $2^{c_2}$ possible outcomes that can result from looking up these $c_2$ bits. Each outcome determines a computation path; some paths lead to acceptance and some to rejection, each in at most a polynomial number of steps. Because there is a constant number of paths and each path is of polynomial length, we can examine all of these paths, determine which are accepting and which rejecting, and write a formula of constant size that describes the accepting paths in terms of the bits of the certificate read during the computation. This formula is a disjunction of at most $2^{c_2}$ conjuncts, where each conjunct describes one path and thus has at most $c_2$ literals. Each such formula is satisfiable if and only if the $c_2$ bits of the certificate examined under the chosen sequence of random bits can assume values that lead the verifier to accept its input.

We can then take all $n^{c_1}$ such formulae, one for each sequence of random bits, and place them into a single large conjunction. The resulting large conjunction is satisfiable if and only if there exists a certificate such that, for each choice of $c_1 \log n$ random bits (i.e., for each choice of the $c_2$ certificate bits to be read), the verifier accepts its input. The formula, unfortunately, is not in 3SAT form: it is a conjunction of $n^{c_1}$ disjunctions, each composed of conjuncts of literals. However, we can rewrite each disjunction as a conjunction of disjuncts, each with at most $2^{c_2}$ literals, then use our standard trick to cut the disjuncts into a larger collection of disjuncts with three literals each. Since all manipulations involve only constant-sized entities (depending solely on $c_2$), the number of clauses in the final formula is a constant times $n^{c_1}$, say $kn^{c_1}$.

If the verifier rejects its input, then it does so for at least one half of the possible choices of random bits. Therefore, at least one half of the constant-size formulae are unsatisfiable. But then at least one out of every $k$ clauses must be false for these $\frac{1}{2}n^{c_1}$ formulae, so that we must have at

* If the problem is not p-simple or if its decision version is strongly NP-complete, then it is not in FPTAS unless P equals NP. * If the problem is not simple or if it is OPTNP-hard, then it is not in PTAS unless P equals NP.

least $\frac{1}{2}n^{c_1}$ unsatisfied clauses in any assignment. Thus if the verifier accepts its input, then all $kn^{c_1}$ clauses are satisfied, whereas, if it rejects its input, then at most $(k - \frac{1}{2})n^{c_1} = (1 - \frac{1}{2k})kn^{c_1}$ clauses can be satisfied. Since $k$ is a fixed constant, we have obtained the desired gap, with $\varepsilon = \frac{1}{2k}$. Q.E.D.

Corollary 8.3 No OPTNP-hard problem can be in PTAS unless P equals NP.

We defined OPTNP to capture the complexity of approximating our basic NP-complete problem, 3SAT; this definition proved extremely useful in that it allowed us to obtain a number of natural complete problems and, more importantly, to prove that PTAS is a proper subset of OPTNP unless P equals NP. However, we have not characterized the relationship between OPTNP and Apx, at least not beyond the simple observation that the first is contained in the second. As it turns out, the choice of Max3SAT was justified even beyond the results already derived, as the following theorem (which we shall not prove) indicates.

Theorem 8.24 Maximum Bounded Weighted Satisfiability PTAS-reduces to Max3SAT.

In view of Theorem 8.22, we can immediately conclude that Max3SAT is Apx-complete! This result immediately settles the relationship between OPTNP and Apx.

Corollary 8.4 OPTNP equals Apx.

Thus Apx does have a large number of natural complete problems: all of the OPTNP-complete problems discussed earlier. Table 8.3 summarizes what we have learned about the hardness of polynomial-time approximation schemes.

8.3.5 No Guarantee Unless P Equals NP

Superficially, it would appear that Theorem 8.23 is limited to ruling out membership in PTAS and that we need other tools to rule out membership in Apx. Yet we can still use the same principle; we just need bigger gaps or some gap-amplifying mechanism. We give just two examples, one in which we can directly produce enormous gaps and another in which a modest gap is amplified until it is large enough to use in ruling out membership in Apx.

Theorem 8.25 Approximating the optimal solution to Traveling Salesman within any constant ratio is NP-hard.

Proof. We proceed by contradiction. Assume that we have an approximation algorithm $\mathcal{A}$ with absolute ratio $R_{\mathcal{A}} = \varepsilon$. We reuse our transformation from HC, but now we produce large numbers tailored to the assumed ratio. Given an instance of HC with $n$ vertices, we produce an instance of TSP with one city for each vertex and where the distance between two cities is 1 when there exists an edge between the two corresponding vertices and $\lceil n/\varepsilon \rceil$ otherwise. This reduction produces an enormous gap. If an instance $x$ of HC admits a solution, then the corresponding optimal tour uses only graph edges and thus has total length $n$. However, if $x$ has no solution, then the very best tour must move at least once between two cities not connected by an edge and thus has total length at least $n - 1 + \lceil n/\varepsilon \rceil$. The resulting gap exceeds the ratio $\varepsilon$, a contradiction. (Put differently, we could use $\mathcal{A}$ to decide any instance $x$ of HC in polynomial time by testing whether the length of the approximate tour $\mathcal{A}(x)$ exceeds $n/\varepsilon$.) Q.E.D.
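The gap construction in the proof of Theorem 8.25 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, vertices are numbered from 0, and a brute-force exact solver (exponential time) stands in for the assumed approximation algorithm, since no polynomial-time one is claimed to exist.

```python
from itertools import permutations
from math import ceil

def tsp_instance_from_graph(n, edges, eps):
    """Distance matrix of the proof: 1 on graph edges, ceil(n/eps) on non-edges."""
    big = ceil(n / eps)
    edge_set = {frozenset(e) for e in edges}
    return [[0 if i == j else (1 if frozenset((i, j)) in edge_set else big)
             for j in range(n)] for i in range(n)]

def optimal_tour_length(dist):
    """Exact tour length by exhaustive search; a stand-in for the assumed
    approximation algorithm (an exact answer trivially meets any ratio)."""
    n = len(dist)
    best = None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if best is None or length < best:
            best = length
    return best

def has_hamiltonian_circuit(n, edges, eps, approx=optimal_tour_length):
    """Decide HC by testing whether the returned tour length exceeds n/eps."""
    dist = tsp_instance_from_graph(n, edges, eps)
    return approx(dist) <= n / eps
```

Any algorithm with absolute ratio `eps` could be passed as `approx`: on a graph with a Hamiltonian circuit it must return a tour of length at most `n / eps`, whereas every tour of a graph without one already costs at least `n - 1 + ceil(n / eps)`.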

Thus the general version of TSP is not in Apx, unlike its restriction to instances obeying the triangle inequality, for which a 2/3-approximation is known.

Theorem 8.26 Approximating the optimal solution to Clique within any constant ratio is NP-hard.

Proof. We develop a gap-amplifying procedure, show that it turns any constant-ratio approximation into an approximation scheme, then appeal to Theorem 8.23 to conclude that no constant-ratio approximation can exist. Let $G$ be any graph on $n$ vertices. Consider the new graph $G^2$ on $n^2$ vertices, where each vertex of $G$ has been replaced by a copy of $G$ itself, and vertices in two copies corresponding to two vertices joined by an edge in the original are connected with all possible $n^2$ edges connecting a vertex in one copy to a vertex in the other. Figure 8.9 illustrates the construction for a small graph. We claim that $G$ has a clique of size $k$ if and only if $G^2$ has a clique of size $k^2$. The "only if" part is trivial: the $k$ copies of the clique of $G$ corresponding to the $k$ clique vertices in $G$ form a clique of size $k^2$ in $G^2$. The "if" part is slightly harder, since we have no a priori constraint

Figure 8.9 Squaring a graph. (Panels: the graph; its square, made up of copy 1, copy 2, and copy 3.)

on the composition of the clique in $G^2$. However, two copies of $G$ in the larger graph are either fully connected to each other or not at all. Thus if two vertices in different copies belong to the large clique, then the two copies must be fully connected and an edge exists in $G$ between the vertices corresponding to the copies. On the other hand, if two vertices in the same copy belong to the large clique, then these two vertices are connected by an edge in $G$. Thus every edge used in the large clique corresponds to an edge in $G$. Therefore, if the large clique has vertices in $k$ or more distinct copies, then $G$ has a clique of size $k$ or more and we are done. If the large clique has vertices in at most $k$ distinct copies, then it must include at least $k$ vertices from some copy (because it has $k^2$ vertices in all) and thus $G$ has a clique of size at least $k$. Given a clique of size $k^2$ in $G^2$, this line of reasoning shows not only the existence of a clique of size $k$ in $G$, but also how to recover it from the large clique in $G^2$ in polynomial time.

Now assume that we have an approximation algorithm $\mathcal{A}$ for Clique with absolute ratio $\varepsilon$. Then, given some graph $G$ with a largest clique of size $k$, we compute $G^2$; run $\mathcal{A}$ on $G^2$, yielding a clique of size at least $\varepsilon k^2$; and then recover from this clique one of size at least $\sqrt{\varepsilon k^2} = k\sqrt{\varepsilon}$. This new procedure, call it $\mathcal{A}'$, runs in polynomial time if $\mathcal{A}$ does and has ratio $R_{\mathcal{A}'} = \sqrt{R_{\mathcal{A}}}$. But we can use the same idea again to derive procedure $\mathcal{A}''$ with ratio $R_{\mathcal{A}''} = \sqrt{R_{\mathcal{A}'}} = \sqrt[4]{R_{\mathcal{A}}}$. More generally, $i$ applications of this scheme yield procedure $\mathcal{A}^{(i)}$ with absolute ratio $\sqrt[2^i]{\varepsilon}$. Given any desired approximation ratio $\varepsilon'$, we can apply the scheme $\lceil \log_2 (\log \varepsilon / \log \varepsilon') \rceil$ times to obtain a procedure with the desired ratio. Since $\lceil \log_2 (\log \varepsilon / \log \varepsilon') \rceil$ is a constant and since each application of the scheme runs in polynomial time, we have derived a polynomial-time approximation scheme for Clique. But Clique is OPTNP-hard and thus, according to Theorem 8.23, cannot be in PTAS, the desired contradiction. Q.E.D.
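The squaring construction and the recovery step used in the proof of Theorem 8.26 can be sketched as follows. The encoding is an assumption made for illustration only: vertices are numbered from 0 and vertex `v` of copy `c` is stored as the integer `c * n + v`; the function names are hypothetical.

```python
from itertools import combinations

def square_graph(n, edges):
    """Return (number of vertices, edge set) of the square of a graph on n vertices."""
    adj = {frozenset(e) for e in edges}
    pairs = [(c, v) for c in range(n) for v in range(n)]
    squared = set()
    for (c1, v1), (c2, v2) in combinations(pairs, 2):
        if c1 == c2:
            joined = frozenset((v1, v2)) in adj   # inside a copy: the edges of G
        else:
            joined = frozenset((c1, c2)) in adj   # between copies: all edges or none
        if joined:
            squared.add((c1 * n + v1, c2 * n + v2))
    return n * n, squared

def recover_clique(n, big_clique):
    """From a clique of the squared graph, return a clique of G whose size is
    at least the square root of the size of the given clique."""
    copies = {}
    for u in big_clique:
        copies.setdefault(u // n, set()).add(u % n)
    largest_copy = max(copies.values(), key=len)
    # Either the copies touched by the clique, or the part of the clique
    # inside its fullest copy, is large enough; both are cliques of G.
    return set(copies) if len(copies) >= len(largest_copy) else largest_copy
```

If the large clique has `m` vertices spread over `a` copies with at most `b` vertices in any one copy, then `a * b >= m`, so the larger of the two candidate sets has at least `sqrt(m)` vertices, which is the bound the proof relies on.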
Exercise 8.6 Verify that, as a direct consequence of our various results in the preceding sections, the sequence of inclusions, PO $\subseteq$ FPTAS $\subseteq$ PTAS $\subseteq$ OPTNP $=$ Apx $\subseteq$ NPO, is proper (at every step) if and only if P does not equal NP.

8.4 The Power of Randomization


A randomized algorithm uses a certain number of random bits during its execution. Thus its behavior is unpredictable for a single execution, but we can often obtain a probabilistic characterization of its behavior over a number of runs, typically of the type "the algorithm returns a correct answer with a probability of at least c." While the behavior of a randomized algorithm must be analyzed with probabilistic techniques, many of them similar to the techniques used in analyzing the average-case behavior of a deterministic algorithm, there is a fundamental distinction between the two. With randomized algorithms, the behavior depends only on the algorithm, not on the data; whereas, when analyzing the average-case behavior of a deterministic algorithm, the behavior depends on the data as well as on the algorithm: it is the data that induces a probability distribution. Indeed, one of the benefits of randomization is that it typically suppresses data dependencies.

As a simple example of the difference, consider the familiar sorting algorithm quicksort. If we run quicksort with the partitioning element chosen as the first element of the interval, we have a deterministic algorithm. Its worst-case running time is quadratic and its average-case running time is $O(n \log n)$ under the assumption that all input permutations are equally likely, a data-dependent distribution. On the other hand, if we choose the partitioning element at random within the interval (with the help of $O(\log n)$ random bits), then the input permutation no longer matters: the expectation is now taken with respect to our random bits. The worst case remains quadratic, but it can no longer be triggered repeatedly by the same data sets; no adversary can cause our algorithm to perform really poorly. Randomized algorithms have been used very successfully to speed up existing solutions to tractable problems and also to provide approximate solutions for hard problems.
Indeed, no other algorithm seems suitable for the approximate solution of a decision problem: after all, "no" is a very poor approximation for "yes." A randomized algorithm applied to a decision problem returns "yes" or "no" with a probabilistic guarantee as to the correctness of the answer; if statistically independent executions of the algorithm can be used, this probability can be improved to any level desired by the user. Now that we have learned about nondeterminism, we can put randomized algorithms in another perspective: while a nondeterministic algorithm always makes the correct decision whenever faced with a choice, a randomized algorithm approximates a nondeterministic one by making a random decision. Thus if we view the process of solving an instance of the problem as a computation tree, with a branch at each decision point, a nondeterministic algorithm unerringly follows a path to an accepting leaf, if any, while a randomized algorithm follows a random path to some leaf.
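The quicksort comparison made above can be tried out with a small sketch. The code is illustrative only: it is a list-based (not in-place) quicksort, the pivot rule is passed in as a function, and all names are hypothetical rather than taken from the text.

```python
import random

def quicksort(items, choose_pivot):
    """Sort a list; return (sorted list, number of comparisons against pivots).

    choose_pivot(length) must return an index in range(length).
    """
    if len(items) <= 1:
        return list(items), 0
    pivot_index = choose_pivot(len(items))
    pivot = items[pivot_index]
    rest = items[:pivot_index] + items[pivot_index + 1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    left, left_count = quicksort(smaller, choose_pivot)
    right, right_count = quicksort(larger, choose_pivot)
    return left + [pivot] + right, len(rest) + left_count + right_count

def first_element(length):
    """Deterministic rule: always partition around the first element."""
    return 0

def random_element(rng):
    """Randomized rule: partition around a uniformly chosen element."""
    return lambda length: rng.randrange(length)
```

On an already sorted input of 200 items, the deterministic rule performs 199 + 198 + ... + 1 comparisons every time it is run, whereas with `random_element(random.Random())` the count varies from run to run and no fixed input can force the quadratic behavior on every run.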

Figure 8.10 A binary decision tree for the function xy + xz + yw. (Left branches correspond to "true," right branches to "false.")

As usual, we shall focus on decision problems. Randomized algorithms are also used to provide approximate solutions for optimization problems, but that topic is outside the scope of this text. A Monte Carlo algorithm runs in polynomial time but may err with probability less than some constant (say 1/2); a one-sided Monte Carlo decision algorithm never errs when it returns one type of answer, say "no," and errs with probability less than some constant (say 1/2) when it returns the other, say "yes." Thus, given a "no" instance, all of the leaves of the computation tree are "no" leaves and, given a "yes" instance, at least half of the leaves of the computation tree are "yes" leaves. We give just one example of a one-sided Monte Carlo algorithm.

Example 8.3 Given a Boolean function, we can construct for it a binary decision tree. In a binary decision tree, each internal node represents a variable of the function and has two children, one corresponding to setting that variable to "true" and the other corresponding to setting that variable to "false." Each leaf is labeled "true" or "false" and represents the value of the function for the (partial) truth assignment represented by the path from the root to the leaf. Figure 8.10 illustrates the concept for a simple Boolean function. Naturally a very large number of binary decision trees represent the same Boolean function. Because binary decision trees offer concise representations of Boolean functions and lead to a natural and efficient evaluation of the function they represent, manipulating such trees is of interest in a number of areas, including compiling and circuit design. One fundamental question that arises is whether or not two trees represent the same Boolean function. This problem is clearly in coNP: if the two trees represent distinct functions, then there is at least one truth
assignment under which the two functions return different values, so that we can guess this truth assignment and verify that the two binary decision

trees return distinct values. To date, however, no deterministic polynomial-time algorithm has been found for this problem, nor has anyone been able to prove it coNP-complete. Instead of guessing a truth assignment to the $n$ variables and computing a Boolean value, thereby condensing a lot of computations into a single bit of output and losing discriminations made along the way, we shall use a random assignment of integers in the range $S = [0, 2n - 1]$, and compute (modulo $p$, where $p$ is a prime at least as large as $|S|$) an integer as characteristic of the entire tree under this assignment. If variable $x$ is assigned value $i$, then we assign value $1 - i$ (modulo $p$) to its complement, so that the sum of the values of $x$ and of $\bar{x}$ is 1. For each leaf of the tree labeled "true," we compute (modulo $p$) the product of the values of the variables encountered along the path; we then sum (modulo $p$) all of these values. The two resulting numbers (one per tree) are compared. If they differ, our algorithm concludes that the trees represent different functions, otherwise it concludes that they represent the same function. The algorithm clearly gives the correct answer whenever the two values differ but may err when the two values are equal. We claim that at least $(|S| - 1)^n$ of the possible $|S|^n$ assignments of values to the $n$ variables will yield distinct values when the two functions are distinct; this claim immediately implies that the probability of error is bounded by

$$1 - \frac{(|S| - 1)^n}{|S|^n} = 1 - \left(\frac{2n - 1}{2n}\right)^n \le \frac{1}{2}$$

and that we have a one-sided Monte Carlo algorithm for the problem. The claim trivially holds for functions of one variable; let us then assume that it holds for functions of $n$ or fewer variables and consider two distinct functions, $f$ and $g$, of $n + 1$ variables. Consider the two functions of $n$ variables obtained from $f$ by fixing some variable $x$; denote them $f_{x=0}$ and $f_{x=1}$, so that we can write $f = \bar{x} f_{x=0} + x f_{x=1}$. If $f$ and $g$ differ, then $f_{x=0}$ and $g_{x=0}$ differ, or $f_{x=1}$ and $g_{x=1}$ differ, or both. In order to have the value computed for $f$ equal that computed for $g$, we must have $(1 - |x|)|f_{x=0}| + |x||f_{x=1}| = (1 - |x|)|g_{x=0}| + |x||g_{x=1}|$ (where we denote the value assigned to $x$ by $|x|$ and the value computed for $f$ by $|f|$). But if $|f_{x=0}|$ and $|g_{x=0}|$ differ, we can write
$$|x| \left( |f_{x=1}| - |f_{x=0}| - |g_{x=1}| + |g_{x=0}| \right) = |g_{x=0}| - |f_{x=0}|$$

which has at most one solution for $|x|$ since the right-hand side is nonzero. Thus we have at least $(|S| - 1)$ assignments to $x$ that maintain the difference in values for $f$ and $g$ given a difference in values for $|f_{x=0}|$ and $|g_{x=0}|$; since, by inductive hypothesis, the latter can be obtained with at least $(|S| - 1)^n$ assignments, we conclude that at least $(|S| - 1)^{n+1}$ assignments will result in different values whenever $f$ and $g$ differ, the desired result.

A Las Vegas algorithm never errs but may not run in polynomial time on all instances. Instead, it runs in polynomial time on average; that is, assuming that all instances of size $n$ are equally likely and that the running time on instance $x$ is $f(x)$, the expression $\sum_x 2^{-n} f(x)$, where the sum is taken over all instances $x$ of size $n$, is bounded by a polynomial in $n$. Las Vegas algorithms remain rare; perhaps the best known is an algorithm for primality testing.

Compare these situations with that holding for a nondeterministic algorithm. Here, given a "no" instance, the computation tree has only "no" leaves, while, given a "yes" instance, it has at least one "yes" leaf. We could attempt to solve a problem in NP by using a randomized method: produce a random certificate (say encoded in binary) and verify it. What guarantee would we obtain? If the answer returned by the algorithm is "yes," then the probability of error is 0, as only "yes" instances have "yes" leaves in their computation tree. If the answer is "no," on the other hand, then the probability of error remains large. Specifically, since there are $2^{|x|}$ possible certificates and since only one of them may lead to acceptance, the probability of error is bounded by $(1 - 2^{-|x|})$ times the probability that instance $x$ is a "yes" instance. Since the bound depends on the input size, we cannot achieve a fixed probability of error by using a fixed number of trials, quite unlike Monte Carlo algorithms. In a very strong sense, a nondeterministic algorithm is a generalization of a Monte Carlo algorithm (in particular, both are one-sided), with the latter itself a generalization of a Las Vegas algorithm.
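The Monte Carlo test of Example 8.3 can be sketched as below. The tree encoding is an assumption chosen for illustration: a leaf is a Python `bool` and an internal node is a tuple `(variable, subtree_if_true, subtree_if_false)`, with each variable tested at most once along any path. The function names, the trial-division prime search, and the `trials` parameter are likewise hypothetical conveniences.

```python
import random

def next_prime(m):
    """Smallest prime >= m, by trial division (adequate for small m)."""
    def is_prime(q):
        return q >= 2 and all(q % d for d in range(2, int(q ** 0.5) + 1))
    while not is_prime(m):
        m += 1
    return m

def characteristic(tree, values, p):
    """Sum over "true" leaves of the product of the values met along the path,
    modulo p; a variable contributes v on its true branch and 1 - v on its
    false branch."""
    if isinstance(tree, bool):
        return 1 if tree else 0
    var, if_true, if_false = tree
    v = values[var]
    return (v * characteristic(if_true, values, p)
            + (1 - v) * characteristic(if_false, values, p)) % p

def probably_equivalent(tree1, tree2, variables, trials=20, rng=random):
    """False is always correct; True may be wrong, with probability at most
    1/2 per independent trial."""
    n = len(variables)
    p = next_prime(2 * n)
    for _ in range(trials):
        values = {x: rng.randrange(2 * n) for x in variables}  # S = [0, 2n - 1]
        if characteristic(tree1, values, p) != characteristic(tree2, values, p):
            return False
    return True
```

For instance, `('x', ('y', True, False), False)` and `('y', ('x', True, False), False)` both encode "x and y" and always receive the same characteristic, while a tree for "x or y" is told apart from them as soon as one random assignment yields different values.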
These considerations justify a study of the classes of (decision) problems solvable by randomized methods. Our model of computation is that briefly suggested earlier, a random Turing machine. This machine is similar to a nondeterministic machine in that it has a choice of (two) moves at each step and thus must make decisions, but unlike its nondeterministic cousin, it does so by tossing a fair coin. Thus a random Turing machine defines a binary computation tree where a node at depth $k$ is reached with probability $2^{-k}$. A random Turing machine operates in polynomial time if the height of its computation tree is bounded by a polynomial function of the instance size. Since aborting the computation after a polynomial number of moves may prevent the machine from reaching a conclusion, leaves of a polynomially bounded computation tree are marked by one of "yes," "no," or "don't

know." Without loss of generality, we shall assume that all leaves are at the same level, say $p(|x|)$ for instance $x$. Then the probability that the machine answers yes is simply equal to $N_y 2^{-p(|x|)}$, where $N_y$ is the number of "yes" leaves; similar results hold for the other two answers. We define the following classes.

Definition 8.15
* PP is the class of all decision problems for which there exists a polynomial-time random Turing machine such that, for any instance $x$ of $\Pi$:
  - if $x$ is a "yes" instance, then the machine accepts $x$ with probability larger than 1/2;
  - if $x$ is a "no" instance, then the machine rejects $x$ with probability larger than 1/2.
* BPP is the class of all decision problems for which there exists a polynomial-time random Turing machine and a positive constant $\varepsilon \le 1/2$ (but see also Exercise 8.34) such that, for any instance $x$ of $\Pi$:
  - if $x$ is a "yes" instance, then the machine accepts $x$ with probability no less than $1/2 + \varepsilon$;
  - if $x$ is a "no" instance, then the machine rejects $x$ with probability no less than $1/2 + \varepsilon$.
  (The "B" indicates that the probability is bounded away from 1/2.)
* RP is the class of all decision problems for which there exists a polynomial-time random Turing machine and a positive constant $\varepsilon \le 1$ such that, for any instance $x$ of $\Pi$:
  - if $x$ is a "yes" instance, then the machine accepts $x$ with probability no less than $\varepsilon$;
  - if $x$ is a "no" instance, then the machine always rejects $x$.

Since RP is a one-sided class, we define its complementary class, coRP, in the obvious fashion. The class RP $\cup$ coRP embodies our notion of problems for which (one-sided) Monte Carlo algorithms exist, while RP $\cap$ coRP corresponds to problems for which Las Vegas algorithms exist. This last class is important, as it can also be viewed as the class of problems for which there exist probabilistic algorithms that never err.

Lemma 8.1 A problem $\Pi$ belongs to RP $\cap$ coRP if and only if there exists a polynomial-time random Turing machine and a positive constant $\varepsilon \le 1$ such that
* the machine accepts or rejects an arbitrary instance with probability no less than $\varepsilon$;
* the machine accepts only "yes" instances and rejects only "no" instances.

We leave the proof of this result to the reader. This new definition is almost the same as the definition of NP $\cap$ coNP: the only change needed is to make $\varepsilon$ dependent upon the instance rather than only upon the problem. This same change turns the definition of RP into the definition of NP, the definition of coRP into that of coNP, and the definition of BPP into that of PP.

Exercise 8.7 Verify this statement.

We can immediately conclude that RP $\cap$ coRP is a subset of NP $\cap$ coNP, RP is a subset of NP, coRP is a subset of coNP, and BPP is a subset of PP. Moreover, since all computation trees are limited to polynomial height, it is obvious that all of these classes are contained within PSPACE. Finally, since no computation tree is required to have all of its leaves labeled "yes" for a "yes" instance and labeled "no" for a "no" instance, we also conclude that P is contained within all of these classes.

Continuing our examination of relationships among these classes, we notice that the $\varepsilon$ value given in the definition of RP could as easily have been specified larger than 1/2. Given a machine M with some $\varepsilon$ no larger than 1/2, we can construct a machine M' with an $\varepsilon$ larger than 1/2 by making M' iterate M for a number of trials sufficient to bring up $\varepsilon$. (This is just the main feature of Monte Carlo algorithms: their probability of error can be decreased to any fixed value by running a fixed number of trials.) Hence the definition of RP and coRP is just a strengthened (on one side only) version of the definition of BPP, so that both RP and coRP are within BPP. We complete this classification by proving the following result.

Theorem 8.27 NP (and hence also coNP) is a subset of PP.

Proof. As mentioned earlier, we can use a random Turing machine to approximate the nondeterministic machine for a problem in NP. Comparing definitions for NP and PP, we see that we need only show how to take the nondeterministic machine M for our problem and turn it into a suitable random machine M'. As noted, M accepts a "yes" instance with probability larger than zero but not necessarily larger than any fixed constant (if only one leaf in the computation tree is labeled "yes," the instance is a "yes" instance, but the probability of acceptance is only $2^{-p(|x|)}$). We need to make this probability larger than 1/2. We can do this through the simple expedient of tossing

[Figure 8.11: The hierarchy of randomized complexity classes.]

one coin before starting any computation and accepting the instance a priori if the toss produces, say, heads. This procedure introduces an a priori probability of acceptance, call it $p_a$, of 1/2; thus the probability of acceptance of a "yes" instance x is now at least $1/2 + 2^{-p(|x|)}$. We are not quite done, however, because the probability of rejection of a "no" instance, which was exactly 1 without the coin toss, is now $1 - p_a = 1/2$. The solution is quite simple: it is enough to make $p_a$ less than 1/2, while still large enough so that $p_a + 2^{-p(|x|)} > 1/2$. Tossing an additional $p(|x|)$ coins will suffice: M' accepts a priori exactly when the first toss returns heads and the next $p(|x|)$ tosses do not all return tails, so that $p_a = 1/2 - 2^{-p(|x|)-1}$. Hence a "yes" instance is accepted with probability $p_a + 2^{-p(|x|)} = 1/2 + 2^{-p(|x|)-1}$ and a "no" instance is rejected with probability $1 - p_a = 1/2 + 2^{-p(|x|)-1}$. Since M' runs in polynomial time if and only if M does, our conclusion follows. Q.E.D.
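The a priori acceptance trick in this proof can be checked numerically on small cases. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and it assumes that the a priori coin tosses are independent of the choices made by M, so that the exact acceptance probability is $p_a$ plus $(1 - p_a)$ times the fraction of accepting leaves. Under that assumption both probabilities still land strictly on the correct side of 1/2.

```python
from fractions import Fraction

def a_priori_acceptance(p: int) -> Fraction:
    """p_a: first toss is heads AND the next p tosses are not all tails."""
    return Fraction(1, 2) * (1 - Fraction(1, 2 ** p))

def pp_machine(p: int, accepting_leaves: int):
    """Exact (accept, reject) probabilities of M' when M has `accepting_leaves`
    accepting leaves among 2**p equally likely computation paths.
    Assumption: the a priori tosses are independent of M's own choices."""
    pa = a_priori_acceptance(p)
    accept_m = Fraction(accepting_leaves, 2 ** p)
    accept = pa + (1 - pa) * accept_m   # accept a priori, or else run M
    return accept, 1 - accept

# Worst case for a "yes" instance: a single accepting leaf.
yes_accept, _ = pp_machine(p=3, accepting_leaves=1)
# A "no" instance has no accepting leaf at all.
_, no_reject = pp_machine(p=3, accepting_leaves=0)
print(a_priori_acceptance(3), yes_accept, no_reject)
```

Exact rational arithmetic is used so that the tiny margins above 1/2 are not lost to floating-point rounding.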

The resulting hierarchy of randomized classes and its relation to P, NP, and PSPACE is shown in Figure 8.11. Before we proceed with our analysis of these classes, let us consider one more class of complexity, corresponding to the Las Vegas algorithms, that is, corresponding to algorithms that always return the correct answer but have a random execution time, the expectation of which is polynomial. The class of decision problems solvable with this type of algorithm is denoted by ZPP (where the "Z" stands for zero error probability). As it turns out, we already know about this class, as it is no other than RP $\cap$ coRP.

Theorem 8.28 ZPP equals RP $\cap$ coRP.

Proof. We prove containment in each direction.

(ZPP $\subseteq$ RP $\cap$ coRP) Given a machine M for a problem in ZPP, we construct a machine M' that answers the conditions for RP $\cap$ coRP by simply cutting the execution of M after a polynomial amount of time. This cutoff may prevent M from returning a result, so that the resulting machine M', while running in polynomial time and never returning a wrong answer, has a small probability of not returning any answer. It remains only to show that this probability is bounded above by some constant $\varepsilon < 1$. Let $q()$ be the polynomial bound on the expected running time of M. We define M' by stopping M on all paths exceeding some polynomial bound $r()$, where we choose polynomials $r()$ and $r'()$ such that $r(n) + r'(n) = q(n)$ and such that $r()$ provides the desired $\varepsilon$ (we shall shortly see how to do that). Without loss of generality, we assume that all computation paths that lead to a leaf within the bound $r()$ do so in exactly $r(n)$ steps. Denote by $p_x$ the probability that M' does not give an answer. On an instance of size n, the expected running time of M is given by $(1 - p_x)\, r(n) + p_x\, t_{\max}(n)$, where $t_{\max}(n)$ is the average number of steps on the paths that require more than polynomial time. By hypothesis, this expression is bounded by $q(n) = r(n) + r'(n)$. Solving for $p_x$, we obtain

$$p_x \le \frac{r'(n)}{t_{\max}(n) - r(n)}$$

This quantity is always less than 1, as the difference $t_{\max}(n) - r(n)$ is superpolynomial by assumption. Since we can pick $r()$ and $r'()$, we can make $p_x$ smaller than any given $\varepsilon > 0$.

(RP $\cap$ coRP $\subseteq$ ZPP) Given a machine M for a problem in RP $\cap$ coRP, we construct a machine M' that answers the conditions for ZPP. Let $1/k$ (for some rational number $k > 1$) be the bound on the probability that M does not return an answer, let $r()$ be the polynomial bound on the running time of M, and let $k^{q(n)}$ be a bound on the time required to solve an instance of size n deterministically. (We know that this last bound is correct as we know that the problem, being in RP $\cap$ coRP, is in NP.) On an instance of size n, M' simply runs M for up to $q(n)$ trials. As soon as M returns an answer, M' returns the same answer and stops; on the other hand, if none of the $q(n)$ successive runs of M returns an answer, then M' solves the instance deterministically. Since the probability that M does not return any answer in $q(n)$ trials is $k^{-q(n)}$, the expected running time of M' is bounded by

$$(1 - k^{-q(n)})\, r(n) + k^{-q(n)}\, k^{q(n)} = 1 + (1 - k^{-q(n)})\, r(n)$$

Hence the expected running time of M' is bounded by a polynomial in n. Q.E.D.
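The second half of this proof is constructive, and the construction of M' can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: `las_vegas`, `is_even_trial`, and `is_even_exact` are hypothetical names, and the toy trial (which answers "don't know" half of the time) merely stands in for a machine M that never errs but may fail to answer.

```python
import random
from typing import Callable, Optional

def las_vegas(instance,
              trial: Callable[[object, random.Random], Optional[bool]],
              solve_exactly: Callable[[object], bool],
              max_trials: int,
              rng: random.Random) -> bool:
    """Run the never-erring randomized trial up to max_trials times;
    fall back on the (slow) deterministic solver if every trial abstains."""
    for _ in range(max_trials):
        answer = trial(instance, rng)
        if answer is not None:       # M returned "yes" or "no": always correct
            return answer
    return solve_exactly(instance)   # reached with probability at most k**(-max_trials)

# Toy stand-ins (assumptions for illustration only).
def is_even_exact(n: int) -> bool:
    return n % 2 == 0

def is_even_trial(n: int, rng: random.Random) -> Optional[bool]:
    if rng.random() < 0.5:           # "don't know" with probability 1/k, here k = 2
        return None
    return n % 2 == 0

rng = random.Random(0)
print(all(las_vegas(n, is_even_trial, is_even_exact, 5, rng) == is_even_exact(n)
          for n in range(50)))
```

Whatever the random choices, the returned answer is correct; only the running time varies, which is the defining property of a Las Vegas algorithm.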

Since all known randomized algorithms are Monte Carlo algorithms, Las Vegas algorithms, or ZPP algorithms, the problems that we can now address with randomized algorithms appear to be confined to a subset of RP $\cup$ coRP. Moreover, as the membership of an NP-complete problem in RP would imply NP = RP, an outcome considered unlikely (see Exercise 8.39 for a reason), it follows that this subset of RP $\cup$ coRP does not include any NP-complete or coNP-complete problem. Hence randomization, in its current state of development, is far from being a panacea for hard problems.

What of the other two classes of randomized complexity? Membership in BPP indicates the existence of randomized algorithms that run in polynomial time with an arbitrarily small, fixed probability of error.

Theorem 8.29 Let $\Pi$ be a problem in BPP. Then, for any $\delta > 0$, there exists a polynomial-time randomized algorithm that accepts "yes" instances and rejects "no" instances of $\Pi$ with probability at least $1 - \delta$.

Proof. Since $\Pi$ is in BPP, it has a polynomial-time randomized algorithm A that accepts "yes" instances and rejects "no" instances of $\Pi$ with probability at least $1/2 + \varepsilon$, for some constant $\varepsilon > 0$. Consider the following new algorithm, where k is an odd integer to be defined shortly.

    yes_count := 0;
    for i := 1 to k do
      if A(x) accepts
        then yes_count := yes_count + 1;
    if yes_count > k div 2
      then accept
      else reject

If x is a "yes" instance of $\Pi$, then A(x) accepts with probability at least $1/2 + \varepsilon$; thus, for j not exceeding $k/2$, the probability of observing exactly j acceptances (and thus $k - j$ rejections) in the k runs of A(x) is at most

$$\binom{k}{j} (1/2 + \varepsilon)^j (1/2 - \varepsilon)^{k-j}$$

(for such j, raising the acceptance probability above $1/2 + \varepsilon$ can only decrease this value). We can derive a simplified bound for this value, still for j not exceeding $k/2$, by equalizing the two powers to $k/2$:

$$\binom{k}{j} (1/2 + \varepsilon)^j (1/2 - \varepsilon)^{k-j} \le \binom{k}{j} (1/4 - \varepsilon^2)^{k/2}$$

Summing these probabilities for values of j not exceeding $k/2$, we obtain a bound on the probability that our new algorithm will reject a "yes" instance:

$$\sum_{j=0}^{\lfloor k/2 \rfloor} \binom{k}{j} (1/2 + \varepsilon)^j (1/2 - \varepsilon)^{k-j} \le (1/4 - \varepsilon^2)^{k/2} \sum_{j=0}^{\lfloor k/2 \rfloor} \binom{k}{j} \le (1/4 - \varepsilon^2)^{k/2}\, 2^k = (1 - 4\varepsilon^2)^{k/2}$$

Now we choose k so as to ensure $(1 - 4\varepsilon^2)^{k/2} \le \delta$, which gives us the condition

$$k \ge \frac{2 \log \delta}{\log(1 - 4\varepsilon^2)}$$

so that k is a constant depending only on the input constant $\delta$.

Q.E.D.
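The choice of k in this proof can be made concrete. The sketch below is an illustration under stated assumptions rather than the author's code: `trials_needed`, `majority_error`, and `amplified` are hypothetical names, and the base algorithm is modelled simply as a callable that accepts a "yes" instance with probability exactly $1/2 + \varepsilon$, with $0 < \varepsilon < 1/2$. The first function evaluates the bound on k, the second computes the exact probability that the majority vote errs, and the third mirrors the majority-vote algorithm itself.

```python
import math
import random

def trials_needed(eps: float, delta: float) -> int:
    """Smallest odd k with k >= 2 log(delta) / log(1 - 4 eps^2); needs 0 < eps < 1/2."""
    k = max(1, math.ceil(2 * math.log(delta) / math.log(1 - 4 * eps * eps)))
    return k if k % 2 == 1 else k + 1

def majority_error(k: int, eps: float) -> float:
    """Exact probability of at most k div 2 acceptances in k independent runs,
    each accepting with probability 1/2 + eps (error on a "yes" instance)."""
    p = 0.5 + eps
    return sum(math.comb(k, j) * p ** j * (1 - p) ** (k - j)
               for j in range(k // 2 + 1))

def amplified(accepts, x, k: int) -> bool:
    """Majority vote over k runs of the base algorithm `accepts`."""
    yes_count = sum(1 for _ in range(k) if accepts(x))
    return yes_count > k // 2

k = trials_needed(eps=0.1, delta=0.01)
print(k, majority_error(k, 0.1))

# Toy base algorithm (assumption): accepts every input with probability 0.6.
rng = random.Random(1)
print(amplified(lambda x: rng.random() < 0.6, "some yes instance", k))
```

The exact tail is usually far below $\delta$, since the bound in the proof replaces a partial sum of binomial coefficients by $2^k$.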

Thus BPP is the correct generalization of P through randomization; stated differently, the class of tractable decision problems is BPP. Since BPP includes both RP and coRP, we may hope that it will contain new and interesting problems and take us closer to the solution of NP-complete problems. However, few, if any, algorithms for natural problems use the full power implicit in the definition of BPP. Moreover, BPP does not appear to include many of the common hard problems; the following theorem (which we shall not prove) shows that it sits fairly low in the hierarchy.

Theorem 8.30 BPP is a subset of $\Sigma_2^p \cap \Pi_2^p$ (where these two classes are the nondeterministic and co-nondeterministic classes at the second level of the polynomial hierarchy discussed in Section 7.3.2).

If NP is not equal to coNP, then neither NP nor coNP is closed under complementation, whereas BPP clearly is; thus under our standard conjecture, BPP cannot equal NP or coNP. A result that we shall not prove states that adding to a machine for the class BPP an oracle that solves any problem in BPP itself does not increase the power of the machine; in our notation, $\mathrm{BPP}^{\mathrm{BPP}}$ equals BPP. By comparison, the same result holds trivially for the class P (reinforcing the similarity between P and BPP), while it does not appear to hold for NP, since we believe that $\mathrm{NP}^{\mathrm{NP}}$ is a proper superset of NP. An immediate consequence of this result and of Theorem 8.30 is that, if we had NP $\subseteq$ BPP, then the entire polynomial hierarchy would collapse into BPP, something that would be very surprising. Hence BPP does not appear to contain any NP-complete problem, so that the scope of randomized algorithms is indeed fairly restricted.

What then of the largest class, PP? Membership in PP is not likely to be of much help, as the probabilistic guarantee on the error bound is very poor. The amount by which the probability exceeds the bound of 1/2 may depend on the instance size; for a problem in NP, we have seen that this quantity is only $2^{-p(n)}$ for an instance of size n. Reducing the probability of error to a small fixed value for such a problem requires an exponential number of trials. PP is very closely related to #P, the class of enumeration problems corresponding to decision problems in NP. We know that a complete problem (under Turing reductions) for #P is "How many satisfying truth assignments are there for a given 3SAT instance?" The very similar problem "Do more than half of the possible truth assignments satisfy a given 3SAT instance?" is complete for PP (Exercise 8.36). In a sense, PP contains the decision version of the problems in #P: instead of asking for the number of certificates, the problems ask whether the number of certificates meets a certain bound. As a result, an oracle for PP is as good as an oracle for #P, that is, $\mathrm{P}^{\mathrm{PP}}$ is equal to $\mathrm{P}^{\#\mathrm{P}}$.

In conclusion, randomized algorithms have the potential for providing efficient and elegant solutions for many problems, as long as said problems are not too hard. Whether or not a randomized algorithm indeed makes a difference remains unknown; the hierarchy of classes described earlier is not firm, as it rests on the usual conjecture that all containments are proper. If we had NP $\subseteq$ BPP, for instance, we would have RP = NP and BPP = PH, which would indicate that randomized algorithms have more potential than suspected. However, if we had P = ZPP = RP = coRP = BPP $\subset$ NP, then no gain at all could be achieved through the medium of randomized algorithms (except in the matter of providing faster algorithms for problems in P). Our standard study tool, namely complete problems, appears inapplicable here, since neither RP nor BPP appear to have complete problems (Exercise 8.39).

Another concern about randomized algorithms is their dependence on the random bits they use. In practice, these bits are not really random, since they are generated by a pseudorandom number generator. Indeed, the randomized algorithms that we can actually run are entirely deterministic: for a fixed choice of seed, the entire computation is completely fixed! Much work has been devoted to this issue, in particular to the minimization of the number of truly random bits required. Many amplification mechanisms have been developed, as well as mechanisms to remove biases from nonuniform generators. The bibliographic section offers suggestions for further exploration of these topics.
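The relationship between the #P counting question and the PP threshold question discussed in this section can be illustrated by brute force on tiny formulas. This is a minimal sketch, assuming a clause representation that is not in the text: each clause is a list of nonzero integers, where i stands for variable i and -i for its negation; the function names are hypothetical, and the enumeration takes exponential time by design.

```python
from itertools import product

def count_satisfying(clauses, num_vars: int) -> int:
    """#P-style question: how many truth assignments satisfy every clause?"""
    count = 0
    for bits in product([False, True], repeat=num_vars):
        # A literal l is true when variable |l| has the polarity of l.
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            count += 1
    return count

def majority_satisfied(clauses, num_vars: int) -> bool:
    """PP-style question: do more than half of all assignments satisfy the formula?"""
    return 2 * count_satisfying(clauses, num_vars) > 2 ** num_vars

single_clause = [[1, 2, 3]]   # (x1 or x2 or x3)
print(count_satisfying(single_clause, 3), majority_satisfied(single_clause, 3))
```

The decision procedure only compares the count against a threshold, which is the sense in which PP holds the decision versions of the problems in #P.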

8.5 Exercises

Exercise 8.8* Prove that Planar 3SAT is NP-complete, in polar and nonpolar versions.

Exercise 8.9* (Refer to the previous exercise.) Prove that Planar 1in3SAT is NP-complete, in polar and nonpolar versions.

Exercise 8.10 Prove that the following problems remain NP-complete when restricted to graphs where no vertex degree may exceed three. (Design an appropriate component to substitute for each vertex of degree larger than three.)
1. Vertex Cover
2. Maximum Cut

Exercise 8.11* Show that Max Cut restricted to planar graphs is solvable in polynomial time. (Hint: set it up as a matching problem between pairs of adjacent planar faces.)

Exercise 8.12* Prove Theorem 8.6. (Hint: use a transformation from Vertex Cover.)

Exercise 8.13* A curious fact about uniqueness is that the question "Does the problem have a unique solution?" appears to be harder for some NP-complete problems than for others. In particular, this appears to be a harder question for TSP than it is for SAT or even HC. We saw in the previous chapter that Unique Traveling Salesman Tour is complete for $\Delta_2^p$ (Exercise 7.48), while Unique Satisfiability is in $D^p$, a presumably proper subset of $\Delta_2^p$. Can you explain that? Based on your explanation, can you propose other candidate problems for which the question should be as hard as for TSP? no harder than for SAT?

Exercise 8.14 Prove Vizing's theorem: the chromatic index of a graph either equals the maximum degree of the graph or is one larger. (Hint: use induction on the degree of the graph.)

Exercise 8.15 Prove that Matrix Cover is strongly NP-complete. An instance of this problem is given by an $n \times n$ matrix $A = (a_{ij})$ with nonnegative integer entries and a bound K. The question is whether there exists a function $f\colon \{1, 2, \ldots, n\} \to \{-1, 1\}$ with

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\, f(i)\, f(j) \le K$$

(Hint: transform Maximum Cut so as to produce only instances with "small" numbers.)

Exercise 8.16* Prove that Memory Management is strongly NP-complete. An instance of this problem is given by a memory size M and collection of requests S, each with a size $s\colon S \to \mathbb{N}$, a request time $f\colon S \to \mathbb{N}$, and a release time $l\colon S \to \mathbb{N}$, where $l(x) > f(x)$ holds for each $x \in S$. The question is "Does there exist a memory allocation scheme $\sigma\colon S \to \{1, 2, \ldots, M\}$ such that allocated intervals in memory do not overlap during their existence?" Formally, the allocation scheme must be such that $[\sigma(x), \sigma(x) + s(x) - 1] \cap [\sigma(y), \sigma(y) + s(y) - 1] \ne \emptyset$ implies that one of $l(x) \le f(y)$ or $l(y) \le f(x)$ holds.

Exercise 8.17 Prove that the decision version of Safe Deposit Boxes is NP-complete for each fixed $k \ge 2$.

Exercise 8.18* In this exercise, we develop a polynomial-time approximation algorithm for Safe Deposit Boxes for two currencies that returns a solution using at most one more box than the optimal solution. As we noted in the text, the problem is hard only because we cannot convert the second currency into the first. We sketch an iterative algorithm, based in part upon the monotonicity of the problem (because all currencies have positive values and any exchange rate is also positive) and in part upon the following observation (which you should prove): if some subset of k boxes, selected in decreasing order by total value under some exchange rate, fails to meet the objective for both currencies, then the optimal solution must open at least $k + 1$ boxes. The interesting part in this result is that the exchange rate under which the k boxes fail to satisfy either currency requirement need not be the "optimal" exchange rate nor the extremal rates of 1 : 0 and of 0 : 1.

Set the initial currency exchange ratio to be 1 : 0 and sort the boxes according to their values in the first currency, breaking any ties by their values in the second currency. Let the values in the first currency be $a_1, a_2, \ldots, a_n$ and those in the second currency $b_1, b_2, \ldots, b_n$; thus, in our ordering, we have $a_1 \ge a_2 \ge \cdots \ge a_n$. Select the first k boxes in the ordering such that the resulting collection, call it S, fulfills the requirement on the first currency; we know that $k = |S|$ is a lower bound on the value of the optimal solution. If the requirement on the second currency is also met, we have an optimal solution and stop. Otherwise we start an iterative process of corrections to the ordering (and, incidentally, the exchange rate). Our algorithm will maintain a collection, S, of boxes with known properties. At the beginning of each iteration, this collection meets the requirement on

the first but not on the second currency. Define the values $\beta(i, j)$ to be the ratios

$$\beta(i, j) = \frac{b_j - b_i}{a_i - a_j}$$

Consider all $\beta(i, j)$ where we have both $a_i > a_j$ and $b_j > b_i$ (i.e., with $\beta(i, j) > 0$) and sort them. Now examine each $\beta(i, j)$ in turn. Set the exchange rate to $1 : \beta(i, j)$. If boxes i and j both belong to S or neither belongs to S, this change does not alter S. On the other hand, if box i belongs to S and box j does not, then we replace box i by box j in S, a change that increases the amount of the second currency and decreases the amount of the first currency. Four cases can arise:
1. The resulting collection now meets both requirements: we have a solution of size k and thus an optimal solution. Stop.
2. The resulting collection fails to meet the requirement on the first currency but satisfies the requirement on the second. We place box i back into the collection S; the new collection now meets both requirements with $k + 1$ boxes and thus is a distance-one approximation. Stop.
3. The resulting collection continues to meet the requirement on the first currency and continues to fail the requirement on the second, albeit by a lesser amount. Iterate.
4. The resulting collection fails to meet both requirements. From our observation, the optimal solution must contain at least $k + 1$ boxes. We place box i back into the collection S, thereby ensuring that the new S meets the requirement on the first currency, and we proceed to case 1 or 3, as appropriate.

Verify that the resulting algorithm returns a distance-one approximation to the optimal solution in $O(n^2 \log n)$ time. An interesting consequence of this algorithm is that there exists an exchange rate, specifically a rate of $1 : \beta(i, j)$ for a suitable choice of i and j, under which selecting boxes in decreasing order of total value yields a distance-one approximation. Now use this two-currency algorithm to derive an $(m - 1)$-distance approximation algorithm for the m-currency version of the problem that runs in polynomial time for each fixed m. (A solution that runs in $O(n^{m+1})$ time is possible.)

Exercise 8.19* Consider the following algorithm for the Minimum-Degree Spanning Tree problem.
1. Find a spanning tree, call it T.

2. Let k be the degree of T. Mark all vertices of T of degree $k - 1$ or k; we call these vertices "bad." Remove the bad vertices from T, leaving a forest F.
3. While there exists some edge {u, v} not in T connecting two components (which need not be trees) in F and while all vertices of degree k remain marked:
(a) Consider the cycle created in T by {u, v} and unmark any bad vertices in that cycle.
(b) Combine all components of F that have a vertex in the cycle into one component.
4. If there is an unmarked vertex w of degree k, it is unmarked because we unmarked it in some cycle created by T and some edge {u, v}. Add {u, v} to T, remove from T one of the cycle edges incident upon w, and return to Step 2. Otherwise T is the approximate solution.

Prove that this algorithm is a distance-one approximation algorithm. (Hint: prove that removing m vertices from a graph and thereby disconnecting the graph into d connected components indicates that the minimum-degree spanning tree for the graph must have degree at least $(m + d - 1)/m$. Then verify that the vertices that remain marked when the algorithm terminates have the property that their removal creates a forest F in which no two trees can be connected by an edge of the graph.)

Exercise 8.20 Use the multiplication technique to show that none of the following NP-hard problems admits a constant-distance approximation unless P equals NP.
1. Finding a set cover of minimum cardinality.
2. Finding the truth assignment that satisfies the largest number of clauses in a 2SAT problem.
3. Finding a minimum subset of vertices of a graph such that the graph resulting from the removal of this subset is bipartite.

Exercise 8.21* Use the multiplication technique to show that none of the following NP-hard problems admits a constant-distance approximation unless P equals NP.
1. Finding an optimal identification tree. (Hint: to multiply the problem, introduce subclasses for each class and add perfectly splitting tests to distinguish between those subclasses.)
2. Finding a minimum spanning tree of bounded degree (contrast with Exercise 8.19).

3. Finding the chromatic number of a graph. (Hint: multiply the graph by a suitably chosen graph. To multiply graph G by graph G', make a copy of G for each node of G' and, for each edge {u, v} of G', connect all vertices in the copy of G corresponding to u to all vertices in the copy of G corresponding to v.)

Exercise 8.22* The concept of constant-distance approximation can be extended to distances that are sublinear functions of the optimal value. Verify that, unless NP equals P, there cannot exist a polynomial-time approximation algorithm A for any of the problems of the previous two exercises that would produce an approximate solution $\hat{f}$ obeying $|\hat{f}(I) - f(I)| \le (f(I))^{1-\varepsilon}$ for some constant $\varepsilon > 0$.

Exercise 8.23 Verify that the following is a 1/2-approximation algorithm for the Vertex Cover problem:
* While there remains an edge in the graph, select any such edge, add both of its endpoints to the cover, and remove all edges covered by these two vertices.

Exercise 8.24* Devise a 1/2-approximation algorithm for the Maximum Cut problem.

Exercise 8.25* Verify that the approximation algorithm for Knapsack that enumerates all subsets of k objects, completing each subset with the greedy heuristic based on value density and choosing the best completion, always returns a solution of value not less than $k/(k+1)$ times the optimal value. It follows that, for each fixed k, there exists a polynomial-time approximation algorithm $A_k$ for Knapsack with ratio $R_{A_k} = 1/k$; hence Knapsack is in PTAS. (Hint: if the optimal solution has at most k objects in it, we are done; otherwise, consider the completion of the subset composed of the k most valuable items in the optimal solution.)

Exercise 8.26 Prove that the product version of the Knapsack problem, that is, the version where the value of the packing is the product of the values of the items packed rather than their sum, is also in FPTAS.

Exercise 8.27 Prove Theorem 8.15; the proof essentially constructs an abstract approximation algorithm in the same style as used in deriving the fully polynomial-time approximation scheme for Knapsack.

Exercise 8.28 Prove Theorem 8.17. Use binary search to find the value of the optimal solution.

Exercise 8.29* This exercise develops an analog of the shifting technique for planar graphs. A planar embedding of a (planar) graph defines faces in the plane: each face is a region of the plane delimited by a cycle of the graph and containing no other face. In any planar embedding of a finite graph, one of the faces is infinite. For instance, a tree defines a single face, the infinite face (because a tree has no cycle), a simple cycle defines two faces, and so on. An outerplanar graph is a planar graph that can be embedded so that all of its vertices are on the boundary of (or inside) the infinite face; for instance, trees and simple cycles are outerplanar. Most planar graphs are not outerplanar, but we can layer such graphs. The "outermost" layer contains the vertices on the boundary of, or inside, the infinite face; the next layer is similarly defined on the planar graph obtained by removing all vertices in the outermost layer, and so on. (If there are several disjoint cycles with their vertices on the infinite face, each cycle receives the same layer number.) Nodes in one layer are adjacent only to nodes in a layer that differs by at most one. If a graph can thus be decomposed into k layers, it is said to be k-outerplanar. It turns out that k-outerplanar graphs (for constant k) form a highly tractable subset of instances for a number of classical NP-hard problems, including Vertex Cover, Independent Set, Dominating Set (for both vertices and edges; see Exercise 7.21), Partition into Triangles, etc. In this exercise, we make use of the existence of polynomial-time exact algorithms for these problems on k-outerplanar graphs to develop approximation schemes for these problems on general planar graphs. Since a general planar graph does not have a constant number of layers, we use a version of the shifting idea to reduce the work to certain levels only. For a precision requirement of $\varepsilon$, we set $k = \lceil 1/\varepsilon \rceil$.
* For the Independent Set problem, we delete from the graph nodes in layers congruent to i mod k, for each $i = 1, \ldots, k$ in turn. This step disconnects the graph, breaking it into components formed of $k - 1$ consecutive layers each, i.e., breaking the graph into a collection of $(k - 1)$-outerplanar subgraphs. A maximum independent set can then be computed for each component; the union of these sets is itself an independent set in the original graph, because vertices from two component sets must be at least two layers apart and thus cannot be connected. We select the best of the k choices resulting from our k different partitions.
* For the Vertex Cover problem, we use a different decomposition scheme. We decompose the graph into subgraphs made up of $k + 1$ consecutive layers, with an overlap of one layer between any two

subgraphs: for each $i = 1, \ldots, k$, we form the subgraph made of layers $i \bmod k$, $(i \bmod k) + 1$, ..., $(i \bmod k) + k$. Each subgraph is a $(k + 1)$-outerplanar graph, so that we can find an optimum vertex cover for it in polynomial time. The union of these covers is a cover for the original graph, since every single edge of the original graph is part of one (or two) of the subgraphs. Again, we select the best of the k choices resulting from our k different decompositions.

Prove that each of these two schemes is a valid polynomial-time approximation scheme.

Exercise 8.30* We say that an NPO problem $\Pi$ satisfies the boundedness condition if there exists an algorithm A, which takes as input an instance of $\Pi$ and a natural number c, together with a constant k, such that
* for each instance x of $\Pi$ and every natural number c, $y = A(x, c)$ is a solution of $\Pi$, the value of which differs from the optimal by at most kc, and
* the running time of $A(x, c)$ is a polynomial function of $|x|$, the degree of which may depend only on c and on the value of $A(x, c)$.

Prove that an NPO problem is in PTAS if and only if it is simple and satisfies the boundedness condition. An analog of this result exists for FPTAS membership: replace "simple" by "p-simple" and replace the constant k in the definition of boundedness by a polynomial in $|x|$.

Exercise 8.31 Prove that any minimization problem in NPO PTAS-reduces to MaxWSAT.

Exercise 8.32 Prove that Maximum Cut is OPTNP-complete.

Exercise 8.33* Prove the second part of Theorem 8.22. The main difficulty in proving hardness is that we do not know how to bound the value of solutions, a handicap that prevents us from following the proof used in the first part. One way around this difficulty is to use the characterization of Apx: a problem belongs to Apx if it belongs to NPO and has a polynomial-time approximation algorithm with some absolute ratio guarantee. This approximation algorithm can be used to focus on instances with suitably bounded solution values.

Exercise 8.34 Verify that replacing the constant $\varepsilon$ by the quantity $1/p(|x|)$ in the definition of BPP does not alter the class.

Exercise 8.35* Use the idea of a priori probability of acceptance or rejection in an attempt to establish $\Sigma_2^p \cap \Pi_2^p \subseteq$ PP (a relationship that is not known to hold). What is the difference between this problem and proving NP $\subseteq$ PP (as done in Theorem 8.27) and what difficulties do you encounter?

Exercise 8.36 Prove that deciding whether at least half of the legal truth assignments satisfy an instance of Satisfiability is PP-complete. (Use Cook's construction to verify that the number of accepting paths is the number of satisfying assignments.) Then verify that the knife-edge can be placed at any fraction, not just at one-half; that is, verify that deciding whether at least $1/\varepsilon$ of the legal truth assignments satisfy an instance of Satisfiability is PP-complete for any $\varepsilon > 1$.

Exercise 8.37 Give a reasonable definition for the class PPSPACE, the probabilistic version of PSPACE, and prove that the two classes are equal.

Exercise 8.38* A set is immune with respect to complexity class $\mathcal{C}$ ($\mathcal{C}$-immune) if and only if it is infinite and has only finite subsets in $\mathcal{C}$. A set is $\mathcal{C}$-bi-immune whenever both it and its complement are $\mathcal{C}$-immune. It is known that a set is P-bi-immune whenever it splits every infinite set in P. A special case solution for a set is an algorithm that runs in polynomial time and answers one of "yes," "no," or "don't know"; answers of "yes" and "no" are correct, i.e., the algorithm only answers "yes" on yes instances and only answers "no" on no instances. (A special case solution has no guarantee on the probability with which it answers "don't know"; with a fixed bound on that probability, a special case solution would become a Las Vegas algorithm.) Prove that a set is P-bi-immune if and only if every special case solution for it answers "don't know" almost everywhere. (In particular, a P-bi-immune set cannot have a Las Vegas algorithm.)

Exercise 8.39* Verify that RP and BPP are semantic classes (see Exercise 7.51); that is, verify that the bounded halting problem $\{(M, x) \mid M \in \mathcal{C} \text{ and } M \text{ accepts } x\}$ is undecidable for $\mathcal{C}$ = RP and $\mathcal{C}$ = BPP.

8.6 Bibliography

Tovey [1984] first proved that n,2-SAT and n,n-SAT are in P and that 3,4-SAT is NP-complete; our proofs generally follow his, albeit in simpler versions. We follow Garey and Johnson [1979] in their presentation of the

while Theorem 8. they could reduce the running time (for two currencies) to O(n 2 ) and extend the algorithm to return (m . and Tarjan [1976].7 is from Valiant and Vazirani [1985]. The work of the Amsterdam Mathematisch Centrum group up to 1982 is briefly surveyed in the article of Lageweg et al. which generalizes constant-distance to distances that are sublinear functions of the optimal value. Dyer and Frieze [1986] proved that Planarlin3SAT is also NP-complete. while Moret [1988] showed that PlanarNAE3SAT is in P. where the reader will find the proof that k-Partition is strongly NPcomplete. through primal-dual techniques. Dinitz [1997] presents an updated version in English. our construction for PlanarHC is inspired from Garey. Jordan [1995] gave a polynomial-time
. Furer and Raghavachari [1 99 2 . as well as a number of NPcomplete problems for which pseudo-polynomial time algorithms exist. then generalized it to minimum-degree Steiner trees. The idea of a promise problem is due to Even and Yacobi [1980]. Johnson [1985] gave a very readable survey of the results concerning uniqueness in his NP-completeness column. Johnson. Exercise 8. Nigmatullin [1975] proved a technical theorem that gives a sufficient set of conditions for the "multiplication" technique of reduction between an optimization problem and its constant-distance approximation version. they discussed various aspects of this property in their text [1979]. including new results on a greedy approach to the problem. Groetschel et al.354
completeness of graph coloring for planar graphs and graphs of degree 3. is also from his work.18 is from Dinic and Karzanov [1978]. The concept of strong NP-completeness is due to Garey and Johnson [1978].1 9 9 4 ] gave the distance-one approximation algorithm for minimum-degree spanning trees (Exercise 8.22. Their list of NP-complete problems includes approximately 30 nontrivial strongly NP-complete problems.14) is from Vizing [1964]. while the proof of NP-completeness for Chromatic Index is due to Holyer [1980]. [1982].1)-distance approximations for m currencies in 0(nm+l) time. Lichtenstein [1982] proved that Planar 3SAT is NP-complete and presented various uses of this result in treating planar restrictions of other difficult problems. Exercise 8.19).6 is from the same paper.4) is the k-star. which also describes the parameterization of scheduling problems and their classification with the help of a computer program. Thomason [1978] showed that the only graph that is uniquely edge-colorable with k colors (for k . Perfect graphs and their applications are discussed in detail by Golumbic [1980]. Theorem 8. [1981] showed that several NP-hard problems are solvable in polynomial-time on perfect graphs and also proved that recognizing perfect graphs is in coNP. they gave the algorithm sketched in the exercise and went on to show that. Vizing's theorem (Exercise 8.

The approximation algorithm for MaxkSAT (Theorem 8.3. Sahni and Gonzalez [1976] and Gens and Levner [1979] gave a number of problems that cannot have bounded-ratio approximations unless P equals NP.17 is from their paper. with connections to structure theory (the theoretical aspects of complexity theory). [1995]. Theorem 8. Our definition of reduction among NPO problems (Definition 8. is from Papadimitriou and Steiglitz [1982]. However.29).2 edges more than the optimal solution (and is provably optimal for k = I and k = 2). while not known to be in FP. Theorem 8. [1980] extended their work and unified strong NP-completeness with simplicity. the fully polynomial-time approximation scheme for Knapsack is due to Ibarra and Kim [19751. Exercise 8. from which Theorem 8.15. is not known to be NP-hard. who gave a number of OPTNP-complete problems. The alternate characterization of NP was developed through a series of papers. most notably Crescenzi and Panconesi [1991]. can be found in the text of Bovet and Crescenzi [1994]. whose approach we follow through much of Section 8. Paz and Moran [1977] introduced the notion of simple problems.30. p.12) is from Ausiello et al. a more thorough treatment can be found in the article of Ausiello et al.19) and its use in the Disk Coveringproblem is from Hochbaum and Maass [1985]. [1995]. Hochbaum [1996] offers an excellent and concise survey of the complexity of approximation.18 on the use of k-completions for polynomial approximation schemes is from Korte and Schrader [1981].23 was taken. An
. including a very useful table (Table 10. who improved on an earlier 1/2-approximation for Max3SAT due to Johnson [1974].2. [1992]. Theorem 8.8. later improved by Lawler [1977].1979] studied the relation between fully p-approximable problems and pseudo-polynomial time algorithms and proved Theorem 8.21) is due to Lieberherr [1980]. culminating in the results of Arora et al.14. Baker [1994] independently derived a similar technique for planar graphs (Exercise 8.6 Bibliography
approximation algorithm for the problem of augmenting a k-connected graph to make it (k + I)-connected that guarantees to add at most k .25 is from Sahni [1975]. The study of OPTNP and Max3SAT was initiated by Papadimitriou and Yannakakis [1988]. [1994]. as is Exercise 8. Theorem 8. Theorem 8. the general problem of k-connectivity augmentation.24 is from Khanna et al. A concise and very readable overview. The "shifting lemma" (Theorem 8.16 is from their paper. Ausiello et al. Sahni and Gonzalez also introduced the notions of papproximable and fully p-approximable problems. Arora and Lund [1996] give a detailed survey of inapproximability results. Garey and Johnson [1978. Several attempts at characterizing approximation problems through reductions followed the work of Paz and Moran. 431) of known results as of 1995. while its generalization.

exhaustive compendium of the current state of knowledge concerning NPO problems is maintained on-line by Crescenzi and Kann at URL www.nada.kth.se/~viggo/problemlist/compendium.html. As mentioned, Arora and Lund [1996] cover the recent results derived from the alternate characterization of NP through probabilistic proof checking; their write-up is also a guide on how to use current results to prove new inapproximability results. In Section 9 of their monograph, Wagner and Wechsung [1986] present a concise survey of many theoretical results concerning the complexity of approximation.

The random Turing machine model was introduced by Gill [1977], who also defined the classes ZPP, RP (which he called VPP), BPP, and PP, proved Theorem 8.27, and provided complete problems for PP. The Monte Carlo algorithm for the equivalence of binary decision diagrams is from Blum et al. [1980]. For more information on binary decision trees, consult the survey of Moret [1982]. Ko [1982] proved that the polynomial hierarchy collapses into $\Sigma_2^p$ if NP is contained in BPP; Theorem 8.30 is from Lautemann [1983]. Johnson [1984] presented a synopsis of the field of random complexity theory in his NP-completeness column, while Welsh [1983] and Maffioli [1986] discussed a number of applications of randomized algorithms. Motwani, Naor, and Raghavan [1996] give a comprehensive discussion of randomized approximations in combinatorial optimization. Motwani and Raghavan [1995] wrote an outstanding text on randomized algorithms that includes chapters on randomized complexity, on the characterization of NP through probabilistic proof checking, on derandomization, and on random number generation.

CHAPTER 9
Complexity Theory: The Frontier

9.1 Introduction

In this chapter, we survey a number of areas of current research in complexity theory. Of necessity, our coverage of each area is superficial. Unlike previous chapters, this chapter has few proofs, and the reader will not be expected to master the details of any specific technique. Instead, we attempt to give the reader the flavor of each of the areas considered.

Complexity theory is the most active area of research in theoretical computer science. Over the last five years, it has witnessed a large number of important results and the creation of several new fields of enquiry. We choose to review here topics that extend the theme of the text, that is, topics that touch upon the practical uses of complexity theory.

We begin by addressing two issues that, if it were not for their difficulty and relatively low level of development, would have been addressed in the previous chapter, because they directly affect what we can expect to achieve when confronting an NP-hard problem. The first such issue is simply the complexity of a single instance: in an application, we are rarely interested in solving a large range of instances (let alone an infinity of them) but instead often have just a few instances with which to work. Can we characterize the complexity of a single instance? If we are attempting to optimize a solution, we should like to hear that our instances are not hard; if we are designing an encryption scheme, we need to hear that our instances are hard. Barring such a detailed characterization, perhaps we can improve on traditional complexity theory, based on worst-case behavior, by considering average-case behavior. Hence our second issue: can we develop complexity classes and completeness results based on average cases? Knowing that a problem is hard in average
instance is a much stronger result than simply knowing that it is hard in the worst case; if nothing else, such a result would go a long way towards justifying the use of the problem in encryption.

Assuming theoretical results are all negative, we might be tempted to resort to desperate measures, such as buying new hardware that promises major leaps in computational power. One type of computing devices for which such claims have been made is the parallel computer; more recently, optical computing, DNA computing, and quantum computing have all had their proponents, along with claims of surpassing the power of conventional computing devices. Since much of complexity theory is about modeling computation in order to understand it, we would naturally want to study these new devices, develop models for their mode of computation, and compare the results with current models. Parallel computing has been studied intensively, so that a fairly comprehensive theory of parallel complexity has evolved. Optical computing differs from conventional parallel computing more at the implementation level than at the logical level, so that results developed for parallel machines apply there too. DNA computing presents quite a different model, although not, to date, a well defined one; in any case, any model proposed so far leads to a fairly simple parallel machine. Quantum computing, on the other hand, appears to offer an entirely new level of parallelism, one in which the amount of available "circuitry" does not directly limit the degree of parallelism. Of all of the models, it alone has the potential for turning some difficult (apparently not P-easy) problems into tractable ones; but it does not, alas, enable us to solve NP-hard problems in polynomial time.

Perhaps the most exciting development in complexity theory has been in the area of proof theory. In our modern view of a proof as an attempt by one party to convince another of the correctness of a statement, studying proofs involves studying communication protocols. Researchers have focused on two distinct models: one where the prover and the checker interact for as many rounds as the prover needs to convince the checker and one where the prover simply writes down the argument as a single, noninteractive communication to the checker. The first model (interactive proofs) is of particular interest in cryptology: a critical requirement in most communications is to establish a certain level of confidence in a number of basic assertions, such as the fact that the party at the other end of the line is who he says he is. The most intriguing result to come out of this line of research is that all problems in NP admit zero-knowledge proof protocols, that is, protocols that allow the prover to convince the checker that an instance of a problem in NP is, with high probability, a "yes" instance without transmitting any information
whatsoever about the certificate! The second model is of even more general interest, as it relates directly to the nature of mathematical proofs, which are typically written arguments designed to be read without interaction between the reader and the writer. This line of research has culminated recently in the characterization of NP as the set of all problems, "yes" instances of which have proofs of membership that can be verified with high probability by consulting just a constant number of randomly chosen bits from the proof. This characterization, in turn, has led to new results about the complexity of approximations, as we saw in the previous chapter.

One major drawback of complexity theory is that, like mathematics, it is an existential theory. A problem belongs to a certain class of complexity if there exists an algorithm that solves the problem and runs within the resource bounds defining the class. This algorithm need not be known, although providing such an algorithm (directly or through a reduction) was until recently the universal method used in proving membership in a class. Results in the theory of graph minors have now come to challenge this model: with these results, it is possible to prove that certain problems are in FP without providing any algorithm or, indeed, any hints as to how to design such an algorithm. Worse yet, it has been shown that this theory is, at least in part, inherently existential in the sense that there must exist problems that can be shown with this theory to belong to FP, but for which a suitable algorithm cannot be designed or, if designed "by accident," cannot be recognized for what it is. Surely, this constitutes the ultimate irony to an algorithm designer: "This problem has a polynomial-time solution algorithm, but you will never find it and would not recognize it if you stumbled upon it."

Along with what this chapter covers, we should say a few words about what it does not cover. Complexity theory has its own theoretical side; what we have presented in this text is really its applied side. The theoretical side addresses mostly internal questions, such as the question of P vs. NP, and attempts to recast difficult unsolved questions in other terms so as to bring out new facets that may offer new approaches to solutions. This theoretical side goes by the name of Structure Theory, since its main subject is the structural relationships among various classes of complexity. Some of what we have covered falls under the heading of structure theory: the polynomial hierarchy is an example, as is (at least for now) the complexity of specific instances. The interested reader will find many other topics in the literature, particularly discussions of oracle arguments and relativizations, density of sets, and topological properties in some unified representation space. In addition, the successes of complexity theory in characterizing hard problems have led to its use in areas that do not fit the traditional model of
finite algorithms. In particular, many researchers have proposed models of computation over the real numbers and have defined corresponding classes of complexity. A somewhat more traditional use of complexity in defining the problem of learning from examples or from a teacher (i.e., from queries) has blossomed into the research area known as Computational Learning Theory. The research into the fine structure of NP and higher classes has also led researchers to look at the fine structure of P with some interesting results, chief among them the theory of Fixed-Parameter Tractability, which studies versions of NP-hard problems made tractable by fixing a key parameter (for instance, fixing the desired cover size in Vertex Cover to a constant $k$, in which case even the brute-force search algorithm that examines every subset of size $k$ runs in polynomial time). Finally, as researchers in various applied sciences became aware of the implications of complexity theory, many have sought to extend the models from countable sets to the set of real numbers; while there is not yet an accepted model for computation over the reals, much work has been done in the area, principally by mathematicians and physicists. All of these topics are of considerable theoretical interest and many have yielded elegant results; however, most results in these areas have so far had little impact on optimization or algorithm design. The bibliographic section gives pointers to the literature for the reader interested in learning more in these areas.

9.2 The Complexity of Specific Instances

Most hard problems, even when circumscribed quite accurately, still possess a large number of easy instances. So what does a proof of hardness really have to say about a problem? And what, if anything, can be said about the complexity of individual instances of the problem? In solving a large optimization problem, we are interested in the complexity of the one or two instances at hand; in devising an encryption scheme, we want to know that every message produced is hard to decipher.

A bit of thought quickly reveals that the theory developed so far cannot be applied to single instances, nor even to finite collections of instances. As long as only a finite number of instances is involved, we can precompute the answers for all instances and store the answers in a table; then we can write a program that "solves" each instance very quickly through table lookup. The cost of precomputation is not included in the complexity measures that we have been using and the costs associated with table storage and table lookup are too small to matter. An immediate consequence is that we
cannot circumscribe a problem only to "hard" instances: no matter how we narrow down the problem, it will always be possible to solve a finite number of its instances very quickly with the table method. The best we can do in this respect is to identify an infinite set of "hard" instances with a finitely changeable boundary. We capture this concept with the following informal definition: a complexity core for a problem is an infinite set of instances, all but a finite number of which are "hard." What is meant by hard needs to be defined; we shall look only at complexity cores with respect to P and thus consider hard anything that is not solvable in polynomial time.

Theorem 9.1 If a set $S$ is not in P, then it possesses an infinite (and decidable) subset, $X \subseteq S$, such that any decision algorithm must take more than a polynomial number of steps almost everywhere on $X$ (i.e., on all but a finite number of instances in $X$).

Proof. Our proof proceeds by diagonalization over all Turing machines. However, this proof is an example of a fairly complex diagonalization: we do not just "go down the diagonal" and construct a set but must check our work to date at every step along the diagonal. Denote the $i$th Turing machine in the enumeration by $M_i$ and the output (if any) that it produces when run with input string $x$ by $M_i(x)$. Let $\{p_i\}$ be the sequence of polynomials $p_i(x) = \sum_{j=0}^{i} x^j$; note that this sequence has the following two properties: (i) for any value of $n > 0$, $i > j \Rightarrow p_i(n) > p_j(n)$; and (ii) given any polynomial $p$, there exists an $i$ such that $p_i(n) > p(n)$ holds for all $n > 0$. Denote by $\chi_S$ the characteristic function of $S$; that is, we have $x \in S \Rightarrow \chi_S(x) = 1$ and $x \notin S \Rightarrow \chi_S(x) = 0$. We construct a sequence of elements of $S$ such that the $n$th element, $x_n$, cannot be accepted in $p_n(|x_n|)$ time by any of the first $n$ Turing machines in the enumeration. We construct $X = \{x_1, x_2, \ldots, x_n, \ldots\}$ element by element as follows:

1. (Initialization.) Let string $y$ be the empty string and let the stage number $n$ be 1.
2. (We are at stage $n$, attempting to generate $x_n$.) For each $i$, $1 \le i \le n$, such that $i$ is not yet cancelled (see below), run machine $M_i$ on string $y$ for $p_i(|y|)$ steps or until it terminates, whichever occurs first. If $M_i$ terminates but does not solve instance $y$ correctly, that is, if we have $M_i(y) \ne \chi_S(y)$, then cancel $i$: we need not consider $M_i$ again, since it cannot decide membership in $S$.
3. For each $i$ not yet cancelled, determine if it passed Step 2 because machine $M_i$ did not stop in time. If so (if none of the uncancelled
$M_i$s was able to process $y$), then let $x_n = y$ and proceed to Step 4. If not (if some $M_i$ correctly processed $y$, so that $y$ is not a candidate for membership in $X$), replace $y$ by the next string in lexicographic order and return to Step 2.
4. (The current stage is completed; prepare the new stage.) Replace $y$ by the next string in lexicographic order, increase $n$ by 1, and return to Step 2.

We claim that, at each stage $n$, this procedure must terminate and produce an element $x_n$, so that the set $X$ thus generated is infinite. Suppose that stage $n$ does not terminate. Then Step 3 will continue to loop back to Step 2, producing longer and longer strings $y$; this can happen only if there exists some uncancelled $i$ such that $M_i$, run on $y$, terminates in no more than $p_i(|y|)$ steps. But then we must have $M_i(y) = \chi_S(y)$ since $i$ is not cancelled, so that machine $M_i$ acts as a polynomial-time decision procedure for our problem on instance $y$. Since this is true for all sufficiently long strings and since we can set up a table for (the finite number of) all shorter strings, we have derived a polynomial-time decision procedure for our problem, which contradicts our assumption that our problem is not in P. Thus $X$ is infinite; it is also clearly decidable, as each successive $x_i$ is higher in the lexicographic order than the previous.

Now consider any decision procedure for our problem (that is, any machine $M_i$ that computes the characteristic function $\chi_S$) and any polynomial $p()$. This precise value of $i$ cannot get cancelled in our construction of $X$; hence for any $n$ with $n \ge i$ and $p_n > p$, machine $M_i$, run on $x_n$, does not terminate within $p_n(|x_n|)$ steps. In other words, for all but a finite number of instances in $X$ (the first $i - 1$ instances), machine $M_i$ must run in superpolynomial time, which proves our theorem. Q.E.D.

Thus every hard problem possesses an infinite and uniformly hard collection of instances. We call the set $X$ of Theorem 9.1 a complexity core for $S$. Unfortunately this result does not say much about the complexity of individual instances: because of the possibility of table lookup, any finite subset of instances can be solved in polynomial time, and removing this subset from the complexity core leaves another complexity core. Moreover, the presence of a complexity core alone does not say that most instances of the problem belong to the core: our proof may create very sparse cores as well as very dense ones. Since most problems have a number of instances that grows exponentially with the size of the instances, it is important to know what proportion of these instances belongs to a complexity core. In the case of NP-complete problems, the number of instances of size $n$ that belong to a complexity core is, as expected, quite large: under our standard
assumptions, it cannot be bounded by any polynomial in $n$, as stated by the next theorem, which we present without proof.

Theorem 9.2 Every NP-complete problem has complexity cores of superpolynomial density.

In order to capture the complexity of a single instance, we must find a way around the table lookup problem. In truth, the table lookup is not so much an impediment as an opportunity, as the following informal definition shows.

Definition 9.1 A hard instance is one that can be solved efficiently only through table lookup.

For large problem instances, the table lookup method imposes large storage requirements. In other words, we can expect that the size of the program will grow with the size of the instance whenever the instance is to be solved by table lookup. Asymptotically, the size of the program will be entirely determined by the size of the instance; thus a large instance is hard if the smallest program that solves it efficiently is as large as the size of the table entry for the instance itself. Naturally the table entry need not be the instance itself: it need only be the most concise encoding of the instance. We have encountered this idea of the most concise encoding of a program (here an instance) before (recall Berry's paradox) and we noted at that time that it was an undecidable property. In spite of its undecidability, the measure has much to recommend itself. For instance, it can be used to provide an excellent definition of a random string: a string is completely random if it is its own shortest description. This idea of randomness was proposed independently by A. Kolmogorov and G. Chaitin and developed into Algorithmic Information Theory by the latter. For our purposes here, we use a very simple formulation of the shortest encoding of a string.

Definition 9.2 Let $I_Y$ be the set of yes instances of some problem $\Pi$, $x$ an arbitrary instance of the problem, and $t()$ some time bound (a function on the natural numbers).
* The $t$-bounded instance complexity of $x$ with respect to $\Pi$, $IC^t(x \mid \Pi)$, is defined as the size of the smallest Turing machine that solves $\Pi$ and runs in time bounded by $t(|x|)$ on instance $x$; if no such machine exists (because $\Pi$ is an unsolvable problem), then the instance complexity is infinite.
* The descriptional complexity (also called information complexity or Kolmogorov complexity) of a string $x$, $K(x)$, is the size of the smallest Turing machine that produces $x$ when started with the empty string.
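As a rough, hands-on illustration of descriptional complexity, the following Python sketch uses the length of a zlib encoding as a computable stand-in. This is an assumption made purely for illustration: the true measure $K(x)$ is undecidable, and a compressor only yields an upper-bound style estimate that is tied to one particular encoding scheme. The helper names `compressed_size` and `looks_incompressible` and the `slack` parameter are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of a zlib encoding of data.

    A computable proxy for descriptional complexity, not K(x) itself.
    """
    return len(zlib.compress(data, 9))


def looks_incompressible(data: bytes, slack: int = 16) -> bool:
    """Heuristic analogue of 'x is its own shortest description':
    the encoding is (almost) as long as the string itself."""
    return compressed_size(data) >= len(data) - slack


# A highly patterned string of 4000 bytes.
regular = b"ab" * 2000

# 4000 pseudo-random bytes; the fixed seed only makes the run repeatable.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(4000))

print(len(regular), compressed_size(regular), looks_incompressible(regular))
print(len(noisy), compressed_size(noisy), looks_incompressible(noisy))
```

The patterned string shrinks to a small fraction of its length, while the pseudo-random one does not shrink at all under zlib. The proxy is easily fooled, however: the "noisy" string is produced by a short program (a generator plus a seed), so its true descriptional complexity is small even though this particular compressor cannot detect that.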

We write $K^t(x)$ if we also require that the Turing machine halt in no more than $t(|x|)$ steps.

Both measures deal with the size of a Turing machine: they measure neither time nor space, although they may depend on a time bound $t()$. We do not claim that the size of a program is an appropriate measure of the complexity of the algorithm that it embodies: we purport to use these size measures to characterize hard instances, not hard problems. The instance complexity captures the size of the smallest program that efficiently solves the given instance; the descriptional complexity captures the size of the shortest encoding (i.e., table entry) for the instance. For large instances $x$, we should like to say that $x$ is hard if its instance complexity is determined by its descriptional complexity. First, though, we must confirm our intuition that, for any problem, a single instance can always be solved by table lookup with little extra work; thus we must show that, for each instance $x$, the instance complexity of $x$ is bounded above by the descriptional complexity of $x$.

Proposition 9.1 For every (solvable decision) problem $\Pi$, there exists a constant $c_\Pi$ such that $IC^t(x \mid \Pi) \le K^t(x) + c_\Pi$ holds for any time bound $t()$ and instance $x$.

Exercise 9.1 Prove this statement. (Hint: combine the minimal machine generating $x$ and some machine solving $\Pi$ to produce a new machine solving $\Pi$ that runs in no more than $t(|x|)$ steps on input $x$.)

Now we can formally define a hard instance.

Definition 9.3 Given constant $c$ and time bound $t()$, an instance $x$ is $(t, c)$-hard for problem $\Pi$ if $IC^t(x \mid \Pi) \ge K(x) - c$ holds.

We used $K(x)$ rather than $K^t(x)$ in the definition, which weakens it somewhat (since $K(x) \le K^t(x)$ holds for any bound $t$ and instance $x$) but makes it less model-dependent (recall our results from Chapter 4). Whereas the technical definition may appear complex, its essence is easily summed up: an instance is $(t, c)$-hard if the size of the smallest program that solves it within the time bound $t$ must grow with the size of the shortest encoding of the instance itself.

Since any problem in P has a polynomial-time solution algorithm of fixed size (i.e., of size bounded by a constant), it follows that the polynomial-bounded instance complexity of any instance of any problem in P should be a constant. Interestingly, the converse statement also holds, so that P can be characterized on an instance basis as well as on a problem basis.

Theorem 9.3 A problem $\Pi$ is in P if and only if there exist a polynomial $p()$ and a constant $c$ such that $IC^p(x \mid \Pi) \le c$ holds for all instances $x$ of $\Pi$.

Exercise 9.2 Prove this result. (Hint: there are only finitely many machines of size not exceeding $c$ and only some of these solve $\Pi$, although not necessarily in polynomial time; combine these few machines into a single machine that solves $\Pi$ and runs in polynomial time on all instances.)

Our results on complexity cores do not allow us to expect that a similarly general result can be shown for classes of hard problems. However, since complexity cores are uniformly hard, we may expect that all but a finite number of their instances are hard instances; with this proviso, the converse result also holds.

Proposition 9.2 A set $X$ is a complexity core for problem $\Pi$ if and only if, for any constant $c$ and polynomial $p$, $IC^p(x \mid \Pi) > c$ holds for almost every instance $x$ in $X$.

Proof. Let $X$ be a complexity core and assume that the stated condition fails; in particular, there are infinitely many instances $x$ in $X$ for which $IC^p(x \mid \Pi) \le c$ holds for some constant $c$. Then there must be at least one machine $M$ of size not exceeding $c$ that solves $\Pi$ and runs in polynomial time on infinitely many instances $x$ in $X$. But a complexity core can have only a finite number of instances solvable in polynomial time, so that $X$ cannot be a core; hence the desired contradiction.

Conversely, assume that $X$ is not a complexity core. Then $X$ must have an infinite number of instances solvable in polynomial time, so that there exists a machine $M$ that solves $\Pi$ and runs in polynomial time on infinitely many instances $x$ in $X$. Let $c$ be the size of $M$ and $p()$ its polynomial bound; then, for these infinitely many instances $x$, we have $IC^p(x \mid \Pi) \le c$, which contradicts the hypothesis. Q.E.D.

Since all but a finite number of the $(p, c)$-hard instances have an instance complexity exceeding any constant (an immediate consequence of the fact that, for any constant $c$, there are only finitely many Turing machines of size bounded by $c$ and thus only finitely many strings of descriptional complexity bounded by $c$), it follows that the set of $(p, c)$-hard instances of a problem either is finite or forms a complexity core for the problem.

One last question remains: while we know that no problem in P has hard instances and that problems with complexity cores are exactly those with an infinity of hard instances, we have no direct characterization of problems not in P. Since they are not solvable in polynomial time, they are "hard"
from a practical standpoint; intuitively then, they ought to have an infinite set of hard instances.

Theorem 9.4 Let $\Pi$ be a problem not in P. Then, for any polynomial $p()$, there exists a constant $c$ such that $\Pi$ has infinitely many $(p, c)$-hard instances.

Exercise 9.3* Prove this result; use a construction by stages with cancellation similar to that used for building a complexity core.

One aspect of (informally) hard instances that the reader has surely noted is that reductions never seem to transform them into easy instances; nor do reductions ever seem to transform easy instances into hard ones. In fact, polynomial transformations preserve complexity cores and individual hard instances.

Theorem 9.5 Let $\Pi_1$ and $\Pi_2$ be two problems such that $\Pi_1$ many-one reduces to $\Pi_2$ in polynomial time through mapping $f$; then there exist a constant $c$ and a polynomial $q()$ such that $IC^{q + p \circ q}(x \mid \Pi_1) \le IC^p(f(x) \mid \Pi_2) + c$ holds for all polynomials $p()$ and instances $x$.

Proof. Let $M_f$ be the machine implementing the transformation and let $q()$ be its polynomial time bound. Let $p()$ be any nondecreasing polynomial. Finally, let $M_x$ be a minimal machine that solves $\Pi_2$ and runs in no more than $p(|f(x)|)$ steps on input $f(x)$. Now define $M'_x$ to be the machine resulting from the composition of $M_f$ and $M_x$. $M'_x$ solves $\Pi_1$ and, when fed instance $x$, runs in time bounded by $q(|x|) + p(|f(x)|)$, that is, bounded by $q(|x|) + p(q(|x|))$. Now we have

$$IC^{q + p \circ q}(x \mid \Pi_1) \le \mathrm{size}(M'_x) \le \mathrm{size}(M_f) + \mathrm{size}(M_x) + c'$$

But $M_f$ is a fixed machine, so that we have

$$\mathrm{size}(M_f) + \mathrm{size}(M_x) + c' = \mathrm{size}(M_x) + c = IC^p(f(x) \mid \Pi_2) + c$$

which completes the proof. Q.E.D.

Hard instances are preserved in an even stronger sense: a polynomial transformation cannot map an infinite number of hard instances onto the same hard instance.

Theorem 9.6 In any polynomial transformation $f$ from $\Pi_1$ to $\Pi_2$, for each constant $c$ and sufficiently large polynomial $p()$, only finitely many $(p, c)$-hard instances $x$ of $\Pi_1$ can be mapped to a single instance $y = f(x)$ of $\Pi_2$.

Exercise 9.4 Prove this result. (Hint: use contradiction; if infinitely many instances are mapped to the same instance, then instances of arbitrarily large descriptional complexity are mapped to an instance of fixed descriptional complexity. A construction similar to that used in the proof of the previous theorem then provides the contradiction for sufficiently large $p$.)

While these results are intuitively pleasing and confirm a number of observations, they are clearly just a beginning. They illustrate the importance of proper handling of the table lookup issue and provide a framework in which to study individual instances, but they do not allow us as yet to prove that a given instance is hard or to measure the instance complexity of individual instances.

9.3 Average-Case Complexity

If we cannot effectively assess the complexity of a single instance, can we still get a better grasp on the complexity of problems by studying their average-case complexity rather than (as done so far) their worst-case complexity? Average-case complexity is a very difficult problem, if only because, when compared to worst-case complexity, it introduces a brand-new parameter, the instance distribution. (Recall our discussion in Section 8.4, where we distinguished between the analysis of randomized algorithms and the average-case analysis of deterministic algorithms: we are now concerned with the latter and thus with the effect of instance distribution on the expected running time of an algorithm.) Yet it is worth the trouble, if only because we know of NP-hard problems that turn out to be "easy" on average under reasonable distributions, while other NP-hard problems appear to resist such an attack.

Example 9.1 Consider the graph coloring problem: a simple backtracking algorithm that attempts to color with some fixed number $k$ of colors a graph of $n$ vertices chosen uniformly at random among all $2^{\binom{n}{2}}$ such graphs runs in constant average time! The basic reason is that most of the graphs on $n$ vertices are dense (there are far more choices for the selection of edges when the graph has $\Theta(n^2)$ edges than when it has only $O(n)$ edges), so that most of these graphs are in fact not $k$-colorable for fixed $k$; in other words, the backtracking algorithm runs very quickly into a clique of size $k + 1$. The computation of the constant is very complex; for $k = 3$, the size of the backtracking tree averages around 200, independently of $n$.
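The behavior described in Example 9.1 can be explored experimentally. The following is only a minimal sketch under our own assumptions: the function names (`random_graph`, `backtrack_coloring`, `average_tree_size`) are hypothetical, vertices are colored in the fixed order 0, 1, ..., n-1, and one node of the backtracking tree is counted per call of the recursive routine. The count it reports therefore need not match the constant quoted in the example, which depends on how the tree is defined.

```python
import random

def random_graph(n, rng):
    """Sample uniformly among all graphs on n labeled vertices:
    each of the n-choose-2 possible edges is present with probability 1/2."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def backtrack_coloring(adj, k):
    """Attempt to k-color the graph by simple backtracking.
    Returns (colorable, nodes), where nodes is the number of calls made,
    used here as the size of the backtracking tree (an assumption)."""
    n = len(adj)
    color = [None] * n
    nodes = 0

    def extend(v):
        nonlocal nodes
        nodes += 1
        if v == n:
            return True
        for c in range(k):
            # Only already-colored neighbors (smaller index) constrain v.
            if all(color[u] != c for u in adj[v] if u < v):
                color[v] = c
                if extend(v + 1):
                    return True
        color[v] = None
        return False

    return extend(0), nodes

def average_tree_size(n, k, trials, seed=0):
    """Average backtracking-tree size over `trials` random graphs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        _, nodes = backtrack_coloring(random_graph(n, rng), k)
        total += nodes
    return total / trials
```

Calling `average_tree_size(n, 3, trials)` for increasing values of `n` lets one observe whether the average levels off, as the example predicts; we make no claim about the exact values this particular sketch produces.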

In the standard style of average-case analysis, as in the well-known average-case analysis of quicksort, we assume some probability distribution $\mu_n$ over all instances of size $n$ and then proceed to bound the sum $\sum_{|x|=n} f(x)\mu_n(x)$, where $f(x)$ denotes the running time of the algorithm on instance $x$. (We use $\mu$ to denote a probability distribution rather than the more common $p$ to avoid confusion with our notation for polynomials.) It is therefore tempting to define polynomial average time under $\mu$ as the set of problems for which there exists an algorithm whose average running time $\sum_{|x|=n} f(x)\mu_n(x)$ is bounded by a polynomial in $n$.

Unfortunately, this definition is not machine-independent! A simple example suffices to illustrate the problem. Assume that the algorithm runs in polynomial time on a fraction $(1 - 2^{-0.1n})$ of the $2^n$ instances of size $n$ and in $2^{0.09n}$ time on the rest; then the average running time is polynomial. But now translate this algorithm from one model to another at quadratic cost: the resulting algorithm still takes polynomial time on a fraction $(1 - 2^{-0.1n})$ of the $2^n$ instances of size $n$ but now takes $2^{0.18n}$ time on the rest, so that the average running time has become exponential! This example shows that a machine-independent definition must somehow balance the probability of difficult instances and their difficulty; roughly put, the longer an instance takes to solve, the rarer it should be. We can overcome this problem with a rather subtle definition.

Definition 9.4 A function $f$ is polynomial on $\mu$-average if there exists a constant $\varepsilon > 0$ such that the sum $\sum_x f^{\varepsilon}(x)\mu(x)/|x|$ converges.

In order to understand this definition, it is worth examining two equivalent formulations.

Proposition 9.3 Given a function $f$, the following statements are equivalent:
* There exists a positive constant $\varepsilon$ such that the sum $\sum_x f^{\varepsilon}(x)\mu(x)/|x|$ converges.
* There exist positive constants $c$ and $d$ such that, for any positive real number $r$, we have $\mu[f(x) > r^d |x|^d] < c/r$.
* There exist positive constants $c$ and $\varepsilon$ such that we have $\sum_{|x| \le n} f^{\varepsilon}(x)\mu_n(x) \le cn$ for all $n$, where $\mu_n(x)$ is the conditional probability of $x$, given that its length does not exceed $n$.

(We skip the rather technical and not particularly revealing proof.) The third formulation is closest to our first attempt: the main difference is that the average is taken over all instances of size not exceeding $n$ rather than over all instances of size $n$.
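The arithmetic behind the machine-independence example, and the way Definition 9.4 repairs it, can be checked numerically. The sketch below rests on assumptions of ours that the text does not make: the "polynomial time" on easy instances is taken to be exactly $n^2$, the hard instances form exactly a fraction $2^{-0.1n}$ of each length, and the total weight given to length $n$ is $(6/\pi^2)/n^2$. The function names are hypothetical.

```python
import math

def average_time(n, hard_exponent, easy_degree=2, hard_fraction_exponent=0.1):
    """Conventional average running time over instances of size n when a
    fraction 2^(-0.1 n) takes 2^(hard_exponent * n) steps and the rest
    take n^easy_degree steps (the n^2 figure is our assumption)."""
    hard_fraction = 2.0 ** (-hard_fraction_exponent * n)
    easy = (1.0 - hard_fraction) * n ** easy_degree
    hard = hard_fraction * 2.0 ** (hard_exponent * n)
    return easy + hard

def partial_sum(eps, hard_exponent, n_max, easy_degree=2,
                hard_fraction_exponent=0.1):
    """Partial sum, over lengths 1..n_max, of f(x)^eps * mu(x) / |x|,
    assuming mu spreads weight (6/pi^2)/n^2 uniformly over length n."""
    total = 0.0
    for n in range(1, n_max + 1):
        rho = (6.0 / math.pi ** 2) / n ** 2
        hard_fraction = 2.0 ** (-hard_fraction_exponent * n)
        easy_term = (1.0 - hard_fraction) * (n ** easy_degree) ** eps
        hard_term = hard_fraction * 2.0 ** (eps * hard_exponent * n)
        total += rho * (easy_term + hard_term) / n
    return total
```

With `hard_exponent=0.09` the conventional average stays close to $n^2$, while with `hard_exponent=0.18` it grows exponentially. Under Definition 9.4, however, choosing `eps=0.5` makes the partial sums settle even for the $2^{0.18n}$ algorithm, because $2^{-0.1n} \cdot 2^{0.09n}$ tends to zero; with `eps=1.0` the partial sums blow up. Both algorithms are thus polynomial on average in the sense of the definition, under the stated assumptions.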

The second formulation is at the heart of the matter. It shows that the running time of the algorithm (our function $f$) cannot exceed a certain polynomial very often, and the larger the polynomial it exceeds, the lower the probability that it will happen. This constraint embodies our notion of balance between the difficulty of an instance (how long it takes the algorithm to solve it) and its probability.

We can easily see that any polynomial function is polynomial on average under any probability distribution. With a little more work, we can also verify that the conventional notion of average-case polynomial time (as we first defined it) also fits this definition in the sense that it implies it (but not, of course, the other way around). We can easily verify that the class of functions polynomial on $\mu$-average is closed under addition, multiplication, and maximum. A somewhat more challenging task is to verify that our definition is properly machine-independent, in the sense that the class is closed under polynomial scaling. Since these functions are well behaved, we can now define a problem to be solvable in average polynomial time under distribution $\mu$ if it can be solved with a deterministic algorithm, the running time of which is bounded by a function polynomial on $\mu$-average.

In this new paradigm, a problem is really a triple: the question, the set of instances with their answers, and a probability distribution; or a conventional problem plus a probability distribution, say $(\Pi, \mu)$. We call such a problem a distributional problem. We can define classes of distributional problems according to the time taken on $\mu$-average, with the clear understanding that the same classical problem may now belong to any number of distributional classes, depending on the associated distribution. Of most interest to us, naturally, is the class of all distributional problems, $(\Pi, \mu)$, such that $\Pi$ is solvable in polynomial time on $\mu$-average; we call this class FAP (because it is a class of functions computable in "average polynomial" time) and denote its subclass consisting only of decision problems by AP. If we limit ourselves to decision problems, we can define a distributional version of each of P, NP, etc., by stating that a distributional NP problem is one, the classical version of which belongs to NP.

A potentially annoying problem with our definition of distributional problems is the distribution itself: nothing prevents the existence of pairs $(\Pi, \mu)$ in, say AP, where the distribution $\mu$ is some horrendously complex function. It makes sense to limit our investigation to distributions that we can specify and compute; unfortunately, most "standard" distributions involve real values, which no finite algorithm can compute in finite time. Thus we must define a computable distribution as one that an algorithm can approximate to any degree of precision in polynomial time.

Definition 9.5 A real-valued function $f: \Sigma^* \to [0, 1]$ is polynomial-time computable if there exists a deterministic algorithm and a bivariate polynomial $p$ such that, for any input string $x$ and natural number $k$, the algorithm outputs in $O(p(|x|, k))$ time a finite fraction $y$ obeying $|f(x) - y| \le 2^{-k}$.

In the average-case analysis of algorithms, the standard assumption made about distributions of instances is uniformity: all instances of size $n$ are generally assumed to be equally likely. While such an assumption works for finite sets of instances, we cannot select uniformly at random from an infinite set. So how do we select a string from $\Sigma^*$? Consider doing the selection in two steps: first pick a natural number $n$, and then select uniformly at random from all strings of length $n$. Naturally, we cannot pick $n$ uniformly, but we can come close by selecting $n$ with probability $\rho(n)$ at least as large as some (fixed) inverse polynomial.

Definition 9.6 A polynomial-time computable distribution $\mu$ on $\Sigma^*$ is said to be uniform if there exists a polynomial $p$ and a distribution $\rho$ on $\mathbb{N}$ such that we can write $\mu(x) = \rho(|x|)2^{-|x|}$ and we have $\rho(n) \ge 1/p(n)$ almost everywhere.

The "default" choice is $\mu(x) = \frac{6}{\pi^2}|x|^{-2}2^{-|x|}$. These "uniform" distributions are in a strong sense representative of all polynomial-time computable distributions; not only can any polynomial-time computable distribution be dominated by a uniform distribution, but, under mild conditions, it can also dominate the same uniform distribution within a constant factor.

Theorem 9.7 Let $\mu$ be a polynomial-time computable distribution. There exists a constant $c \in \mathbb{N}$ and an injective, invertible, and polynomial-time computable function $g: \Sigma^* \to \Sigma^*$ such that, for all $x$, we have $\mu(x) \le c\,2^{-|g(x)|}$. If, in addition, $\mu(x)$ exceeds $2^{-p(|x|)}$ for some polynomial $p$ and for all $x$, then there exists a second constant $b \in \mathbb{N}$ such that, for all $x$, we have $b\,2^{-|g(x)|} \le \mu(x) \le c\,2^{-|g(x)|}$.

We define the class DISTNP to be the class of distributional NP problems $(\Pi, \mu)$ where $\mu$ is dominated by some polynomial-time computable distribution.

In order to study AP and DISTNP, we need reduction schemes. These schemes have to incorporate a new element to handle probability distributions, since we clearly cannot allow a mapping of the high-probability instances of $\Pi_1$ to the low-probability instances of $\Pi_2$. (This is an echo of Theorem 9.6 that showed that we could not map infinite collections of hard instances of one problem onto single instances of the other problem.) We need some preliminary definitions about distributions.
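The two-step selection and the "default" choice $\mu(x) = \frac{6}{\pi^2}|x|^{-2}2^{-|x|}$ can be sketched directly. The code below is illustrative only: the names are hypothetical, strings are taken over the binary alphabet, the empty string is given probability zero (the formula is undefined for length 0), and the sampler truncates lengths at `max_len`, which slightly distorts the tail.

```python
import math
import random

def length_probability(n):
    """rho(n) = (6/pi^2) / n^2, the weight of all strings of length n >= 1."""
    return (6.0 / math.pi ** 2) / (n * n)

def default_mu(x):
    """mu(x) = rho(|x|) * 2^(-|x|) for a nonempty binary string x."""
    n = len(x)
    if n == 0:
        return 0.0  # assumption: the empty string carries no weight
    # ldexp(1.0, -n) is 2^(-n) and underflows quietly for large n.
    return length_probability(n) * math.ldexp(1.0, -n)

def sample_string(rng, max_len=10_000):
    """First pick a length n with probability rho(n), then pick a string
    uniformly at random among the 2^n binary strings of that length.
    Lengths beyond max_len are truncated to max_len (an approximation)."""
    u = rng.random()
    cumulative = 0.0
    n = 0
    while n < max_len:
        n += 1
        cumulative += length_probability(n)
        if u < cumulative:
            break
    return "".join(rng.choice("01") for _ in range(n))
```

Since $\sum_{n \ge 1} 1/n^2 = \pi^2/6$, the length weights sum to 1, and $\rho(n)$ is itself an inverse polynomial, as Definition 9.6 requires.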

Definition 9.7 Let $\mu$ and $\nu$ be two distributions. We say that $\mu$ is dominated by $\nu$ if there exists a polynomial $p$ such that, for all $x$, we have $\mu(x) \le p(|x|)\nu(x)$. Now let $(\Pi_1, \mu)$ and $(\Pi_2, \nu)$ be two distributional problems and $f$ a transformation from $\Pi_1$ to $\Pi_2$. We say that $\mu$ is dominated by $\nu$ with respect to $f$ if there exists a distribution $\mu'$ on $\Pi_1$, such that $\mu$ is dominated by $\mu'$ and we have $\nu(y) = \sum_{f(x)=y} \mu'(x)$.

The set of all instances $x$ of $\Pi_1$ that get mapped under $f$ to the same instance $y$ of $\Pi_2$ has probability $\sum_{f(x)=y} \mu(x)$ in the distributional problem $(\Pi_1, \mu)$; the corresponding single instance $y$ has weight $\nu(y) = \sum_{f(x)=y} \mu'(x)$. But $\mu$ is dominated by $\mu'$, so that there exists some polynomial $p$ such that, for all $x$, we have $\mu(x) \le p(|x|)\mu'(x)$. Substituting, we obtain $\nu(y) \ge \sum_{f(x)=y} \mu(x)/p(|x|)$, showing that the probability of $y$ cannot be much smaller than that of the set of instances $x$ that map to it: the two are polynomially related.

We are now ready to define a suitable reduction between distributional problems. We begin with a reduction that runs in polynomial time in the worst case; perhaps not the most natural choice, but surely the simplest.

Definition 9.8 We say that $(\Pi_1, \mu)$ is polynomial-time reducible to $(\Pi_2, \nu)$ if there is a polynomial-time transformation $f$ from $\Pi_1$ to $\Pi_2$ such that $\mu$ is dominated by $\nu$ with respect to $f$.

These reductions are clearly reflexive and transitive; more importantly, AP is closed under them.

Exercise 9.5 Prove that, if $(\Pi_1, \mu)$ is polynomial-time reducible to $(\Pi_2, \nu)$ and $(\Pi_2, \nu)$ belongs to AP, then so does $(\Pi_1, \mu)$.

Under these reductions, in fact, DISTNP has complete problems, including a version of the natural complete problem defined by bounded halting.

Definition 9.9 An instance of the Distributional Bounded Halting problem for AP is given by a triple, $(M, x, 1^n)$, where $M$ is the index of a deterministic Turing machine, $x$ is a string (the input for $M$), and $n$ is a natural number. The question is "Does $M$, run on $x$, halt in at most $n$ steps?" The distribution $\mu$ for the problem is given by $\mu(M, x, 1^n) = c \cdot n^{-2} \cdot |x|^{-2} \cdot |M|^{-2} \cdot 2^{-|M|-|x|}$, where $c$ is a normalization constant (a positive real number).

Theorem 9.8 Distributional Bounded Halting is DISTNP-complete.

Proof. Let $(\Pi, \mu)$ be an arbitrary problem in DISTNP; then $\Pi$ belongs to NP. Let $M$ be a nondeterministic Turing machine for $\Pi$ and let $g$ be the function of Theorem 9.7, so that we have $\mu(x) \le c\,2^{-|g(x)|}$. We define a new machine $M'$ as follows. On input $y$, if $g^{-1}(y)$ is defined, then $M'$ simulates $M$ run on $g^{-1}(y)$; it rejects $y$ otherwise.

Thus $M$ accepts $x$ if and only if $M'$ accepts $g(x)$; moreover, there exists some polynomial $p$ such that $M'$, run on $g(x)$, completes in $p(|x|)$ time for all $x$. Then we define our transformed instance as the triple $(M', g(x), 1^{p(|x|)})$, so that the mapping is injective and polynomial-time computable. Our conclusion follows easily. Q.E.D.

A few other problems are known to be DISTNP-complete, including a tiling problem and a number of word problems. DISTNP-complete problems capture, at least in part, our intuitive notion of problems that are NP-complete in average instance. The definitions of average complexity given here are robust enough to allow the erection of a formal hierarchy of classes through an analog of the hierarchy theorems. Moreover, average complexity can be combined with nondeterminism and with randomization to yield further classes and results. Because average complexity depends intimately on the nature of the distributions, results that we take for granted in worst-case contexts may not hold in average-case contexts. For instance, it is possible for a problem not to belong to P, yet to belong to AP under every possible polynomial-time computable distribution (although no natural example is known); yet, if the problem is in AP under every possible exponential-time computable distribution, then it must be in P. The reader will find pointers to further reading in the bibliography.

9.4 Parallelism and Communication

9.4.1 Parallelism

Parallelism on a large scale (several thousands of processors) has become a feasible goal in the last few years, although thus far only a few commercial architectures incorporate more than a token amount (a few dozen processors or so) of parallelism. The trade-off involved in parallelism is simple: time is gained at the expense of hardware. On problems that lend themselves to parallelism (not all do, as we shall see) an increase in the number of processors yields a corresponding decrease in execution time. Of course, even in the best of cases, the most we can expect by using, say, $n$ processors, is a gain in execution time by a factor of $n$. An immediate consequence is that parallelism offers very little help in dealing with intractable problems: only with the expense of an exponential number of processors might it become possible to solve intractable problems in polynomial time, and expending an exponential number of processors is even less feasible than expending an exponential amount of time.
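The bound on speed-up can be made concrete with a little arithmetic. The sketch below is ours, with hypothetical function names; it assumes an idealized machine on which the work divides perfectly among processors, so the figures it returns are lower bounds on parallel time, not predictions for any real architecture.

```python
def best_parallel_time(sequential_steps, processors):
    """Smallest conceivable parallel running time: the speed-up cannot
    exceed the number of processors (ceiling division on integers)."""
    return -(-sequential_steps // processors)

def processors_needed(sequential_steps, target_time):
    """Fewest processors that could possibly bring the running time
    down to target_time under the same idealized assumption."""
    return -(-sequential_steps // target_time)
```

For instance, for a sequential algorithm taking $2^n$ steps with $n = 100$, even $n^3$ processors leave `best_parallel_time(2**100, 100**3)` far above $n^{10}$, and `processors_needed(2**100, 100**3)` is itself an exponentially large number.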

With "reasonable" (i.e., polynomial) resource requirements, parallelism is thus essentially useless in dealing with intractable problems: since a polynomial of a polynomial is a polynomial, even polynomial speed-ups cannot take us outside of FP. Restricting our attention to tractable problems, then, we are faced with two important questions. First, do all tractable problems stand to gain from parallelism? Secondly, how much can be gained? (For instance, if some problem admits a solution algorithm that runs in $O(n^k)$ time on a sequential processor, will using $O(n^k)$ processors reduce the execution time to a constant?)

The term "parallel" is used here in its narrow technical sense, implying the existence of overall synchronization. In contrast, concurrent or distributed architectures and algorithms may operate asynchronously. While many articles have been published on the subject of concurrent algorithms, relatively little is known about the complexity of problems as measured on a distributed model of computation. We can state that concurrent execution, while potentially applicable to a larger class of problems than parallel execution, cannot possibly bring about larger gains in execution time, since it uses the same resources as parallel execution but with the added burden of explicit synchronization and message passing. In the following, we concentrate our attention on parallelism; at the end of this section, we take up the issue of communication complexity, arguably a better measure of complexity for distributed algorithms than time or space.

Since an additional resource becomes involved (hardware), the study of parallel complexity hinges on simultaneous resource bounds. Where sequential complexity theory defines, say, a class of problems solvable in polynomial time, parallel complexity theory defines, say, a class of problems solvable in sublinear time with a polynomial amount of hardware: both the time bound and the hardware bound must be obeyed simultaneously.

The most frustrating problem in parallel complexity theory is the choice of a suitable model of (parallel) computation. Recall from Chapter 4 that the choice of a suitable model of sequential execution (one that would offer sufficient mathematical rigor yet mimic closely the capabilities of modern computers) is very difficult. Models that offer rigor and simplicity (such as Turing machines) tend to be unrealistically inefficient, while models that mimic modern computers tend to pose severe problems in the choice of complexity measures. The problem is exacerbated in the case of models of parallel computation; one result is that several dozen different models have been proposed in a period of about five years. Fortunately, all such models exhibit one common behavior, what has become known as the parallel computation thesis: with unlimited hardware, parallel time is equivalent (within a polynomial function) to sequential storage.

This result alone motivates the study of space complexity! The parallel computation thesis has allowed the identification of model-independent classes of problems that lend themselves well to parallelism.

9.4.2 Models of Parallel Computation

As in the case of models of sequential computation, models of parallel computation can be divided roughly in two categories: (i) models that attempt to mimic modern parallel architectures (albeit on a much larger scale) and (ii) models that use more restricted primitives in order to achieve sufficient rigor and unambiguity.

The first kind of model typically includes shared memory and independent processors and is exemplified by the PRAM (or parallel RAM) model. A PRAM consists of an unbounded collection of global registers, each capable of holding an arbitrary integer, together with an unbounded collection of processors, each provided with its own unbounded collection of local registers. All processors are identically programmed; however, at any given step of execution, different processors may be at different locations in their program, so that the architecture is a compromise between SIMD (single instruction, multiple data stream) and MIMD (multiple instruction, multiple data stream) types. Execution begins with the input string $x$ loaded in the first $|x|$ global registers (one bit per register) and with only one processor active. At any step, a processor may execute a normal RAM instruction or it may start up another processor to execute in parallel with the active processors. Normal RAM instructions may refer to local or to global registers; in the latter case, however, only one processor at a time is allowed to write into a given register (if two or more processors attempt a simultaneous write in the same register, the machine crashes). Given the unbounded number of processors, the set of problems that this model can solve in polynomial time is exactly the set of PSPACE-easy problems, an illustration of the parallel computation thesis. Note that PRAMs require a nonnegligible start-up time: in order to activate $f(n)$ processors, a minimum of $\log f(n)$ steps must be executed. (As a result, no matter how many processors are used, PRAMs cannot reduce the execution time of a nontrivial sequential problem to a constant.) The main problem with such models is the same problem that we encountered with RAMs: what are the primitive operations and what should be the cost of each such operation? In addition, there are the questions of addressing the global memory (should not such an access be costlier than an access to local memory?) and of measuring the hardware costs (the number of processors alone is only a lower bound, as much additional hardware must be incorporated to manage the global memory).
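The logarithmic start-up time of a PRAM follows from the fact that the number of active processors can at most double at each step. A minimal simulation, under the assumption (ours) that every active processor spends each step starting exactly one new processor, and with a hypothetical function name:

```python
def activation_steps(target):
    """Steps needed before at least `target` processors are active,
    starting from a single active processor and doubling at each step."""
    active, steps = 1, 0
    while active < target:
        active *= 2   # every active processor starts one more
        steps += 1
    return steps
```

For example, `activation_steps(1024)` returns 10 and `activation_steps(1025)` returns 11; in general the result is the ceiling of $\log_2$ of the target, which matches the lower bound of $\log f(n)$ steps quoted above.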

All shared memory models suffer from similar problems.

A very different type of model, and a much more satisfying one from a theoretical standpoint, is the circuit model. A circuit is just a combinational circuit implementing some Boolean function; in order to keep the model reasonable, we limit the fan-in to some constant.(1) We define the size of a circuit to be the number of its gates and its depth to be the number of gates on a longest path from input to output. The size of a circuit is a measure of the hardware needed for its realization and its depth is a measure of the time required for computing the function realized by the circuit. Given a (Boolean) function, $g$, of $n$ variables, we define the size of $g$ as the size of the smallest circuit that computes $g$; similarly, we define the depth of $g$ as the depth of the shallowest circuit that computes $g$.

Since each circuit computes a fixed function of a fixed number of variables, we need to consider families of circuits in order to account for inputs of arbitrary lengths. Given a set, $L$, of all "yes" instances (encoded as binary strings) for some problem, denote by $L_n$ the set of strings of length $n$ in $L$; $L_n$ defines a Boolean function of $n$ variables (the characteristic function of $L_n$). Then a family of circuits, $\{\xi_n \mid n \in \mathbb{N}\}$ (where each $\xi_n$ is a circuit with $n$ inputs), decides membership in $L$ if and only if $\xi_n$ computes the characteristic function of $L_n$. With these conventions, we can define classes of size and depth complexity; given some complexity measure $f(n)$, we let

$$\mathrm{SIZE}(f(n)) = \{L \mid \exists \{\xi_n\}: \{\xi_n\} \text{ computes } L \text{ and } size(\xi_n) = O(f(n))\}$$
$$\mathrm{DEPTH}(f(n)) = \{L \mid \exists \{\xi_n\}: \{\xi_n\} \text{ computes } L \text{ and } depth(\xi_n) = O(f(n))\}$$

These definitions are quite unusual: the sets that are "computable" within given size or depth bounds may well include undecidable sets! Indeed, basic results of circuit theory state that any function of $n$ variables is computable by a circuit of size $O(2^n/n)$ and also by a circuit of depth $O(n)$, so that $\mathrm{SIZE}(2^n/n)$, or $\mathrm{DEPTH}(n)$, includes all Boolean functions. In particular, the language consisting of all "yes" instances of the halting problem is in $\mathrm{SIZE}(2^n/n)$, yet we have proved in Chapter 4 that this language is undecidable. This apparently paradoxical result can be explained as follows. That there exists a circuit $\xi_n$ for each input size $n$ that correctly decides the halting problem says only that each instance of the halting problem is either a "yes" instance or a "no" instance, i.e., that each instance possesses a well-defined answer.

(1) In actual circuit design, a fan-in of $n$ usually implies a delay of $\log n$, which is exactly what we obtain by limiting the fan-in to a constant. We leave the fan-out unspecified, inasmuch as it makes no difference in the definition of size and depth complexity.
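As a concrete illustration of the size and depth measures, consider a small circuit with fan-in at most 2. The representation below (a dictionary from gate names to an operation and its inputs) and all names in it are our own choices for this sketch, not a standard encoding.

```python
# A circuit maps each gate name to (operation, input names).
# Any name that is not a gate is a circuit input.
XOR_CIRCUIT = {
    "n1": ("NOT", ("a",)),
    "n2": ("NOT", ("b",)),
    "t1": ("AND", ("a", "n2")),
    "t2": ("AND", ("n1", "b")),
    "out": ("OR", ("t1", "t2")),
}

OPS = {
    "AND": lambda x, y: x and y,
    "OR": lambda x, y: x or y,
    "NOT": lambda x: not x,
}

def size(circuit):
    """Size: the number of gates."""
    return len(circuit)

def depth(circuit, output):
    """Depth: the number of gates on a longest path from an input to output."""
    def d(node):
        if node not in circuit:      # a circuit input contributes no gate
            return 0
        _, inputs = circuit[node]
        return 1 + max(d(i) for i in inputs)
    return d(output)

def evaluate(circuit, output, assignment):
    """Value of the output gate under a truth assignment to the inputs."""
    def v(node):
        if node not in circuit:
            return bool(assignment[node])
        op, inputs = circuit[node]
        return OPS[op](*(v(i) for i in inputs))
    return v(output)
```

This circuit computes the exclusive-or of its two inputs with 5 gates and depth 3 (for instance along the path through `n2`, `t1`, and `out`). It is one circuit for one input length; a family in the sense of the text would supply such a circuit for every $n$.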

It does not say that the problem is solvable, because we do not know how to construct such a circuit; all we know is that it exists. In fact, our proof of unsolvability in Chapter 4 simply implies that constructing such circuits is an unsolvable problem. Thus the difference between the existence of a family of circuits that computes a problem and an algorithm for constructing such circuits is exactly the same as the difference between the existence of answers to the instances of a problem and an algorithm for producing such answers.

Our definitions for the circuit complexity classes are thus too general: we should formulate them so that only decidable sets are included, so that only algorithmically constructible families of circuits are considered. Such families of circuits are called uniform; their definitions vary depending on what resource bounds are imposed on the construction process.

Definition 9.10 A family $\{\xi_n\}$ of circuits is uniform if there exists a deterministic Turing machine which, given input $1^n$ (the input size in unary notation), computes the circuit $\xi_n$ (that is, outputs a binary string that encodes $\xi_n$ in some reasonable manner) in space $O(\log size(\xi_n))$.

A similar definition allows $O(depth(\xi_n))$ space instead. As it turns out, which of the two definitions is adopted has no effect on the classes of size and depth complexity. We can now define uniform versions of the classes SIZE and DEPTH.

$$\mathrm{USIZE}(f(n)) = \{L \mid \text{there exists a uniform family } \{\xi_n\} \text{ that computes } L \text{ and has size } O(f(n))\}$$
$$\mathrm{UDEPTH}(f(n)) = \{L \mid \text{there exists a uniform family } \{\xi_n\} \text{ that computes } L \text{ and has depth } O(f(n))\}$$

Uniform circuit size directly corresponds to deterministic sequential time and uniform circuit depth directly corresponds to deterministic sequential space (yet another version of the parallel computation thesis). More precisely, we have the following theorem.

Theorem 9.9 Let $f(n)$ and $\lceil \log g(n) \rceil$ be easily computable (fully space constructible) space complexity bounds; then we have

$$\mathrm{UDEPTH}(f^{O(1)}(n)) = \mathrm{DSPACE}(f^{O(1)}(n))$$
$$\mathrm{USIZE}(g^{O(1)}(n)) = \mathrm{DTIME}(g^{O(1)}(n))$$

In particular, POLYL is equal to POLYLOGDEPTH and P is equal to PSIZE (polynomial size). In fact, the similarity between the circuit measures and the conventional space and time measures carries much farther.

For instance, separating POLYLOGDEPTH from PSIZE presents the same problems as separating POLYL from P, with the result that similar methods are employed, such as logdepth reductions to prove completeness of certain problems. (A corollary to Theorem 9.9 is that any P-complete problem is depth-complete for PSIZE; that much was already obvious for the circuit value problem.)

The uniform circuit model offers two significant advantages. First, its definition is not subject to interpretation (except for the exact meaning of uniformity) and it gives rise to natural, unambiguous complexity measures. Moreover, these two measures are precisely those that we identified as constituting the trade-off of parallel execution, viz., parallel time (depth) and hardware (size). Secondly, every computer is composed of circuits, so that the model is in fact fairly realistic. Its major drawback comes from the combinational nature of the circuits: since a combinational circuit cannot have cycles (feedback loops), it cannot reuse the same subunit at different stages of its computation but must include separate copies of that subunit. In other words, there is no equivalent in the circuit model of the concept of "subroutine"; as a result, the circuit size (but not its depth) may be larger than would be necessary on an actual machine. However, attempts to solve this problem by allowing cycles (leading to such models as conglomerates and aggregates) have so far proved rather unsatisfactory.

9.4.3 When Does Parallelism Pay?

As previously mentioned, only tractable problems may profit from the introduction of parallelism. Even then, parallel architectures may not achieve more than a constant speed-up factor, something that could also be attained by technological improvements or simply by paying more attention to coding. No parallel architecture can speed up execution by a factor larger than the number of processors used; in some sense, then, a successful application of parallelism is one in which this maximum speed-up is realized. However, the real potential of parallel architectures derives from their ability to achieve sublinear execution times (something that is forever beyond the reach of any sequential architecture) at a reasonable expense. Sublinear execution times may be characterized as $\mathrm{DTIME}(\log^k n)$ for some $k$, or POLYLOGTIME in our terminology.(2) The parallel computation thesis tells us that candidates for such fast parallel execution times are exactly those problems in POLYL.

(2) We shall use the names of classes associated with decision problems, even though our entire development is equally applicable to general search and optimization problems. This is entirely a matter of convenience, since we already have a well-developed vocabulary for decision problems, a vocabulary that we lack for search and optimization problems.

To keep hardware expenses within reasonable limits, we impose a polynomial bound on the amount of hardware that our problems may require. The parallel computation thesis then tells us that candidates for such reasonable hardware requirements are exactly those problems in P. We conclude that the most promising field of application for parallelism must be sought within the problems in $\mathrm{P} \cap \mathrm{POLYL}$. The reader may already have concluded that any problem within $\mathrm{P} \cap \mathrm{POLYL}$ is the desired type: a most reasonable conclusion, but one that fails to take into account the peculiarities of simultaneous resource bounds. We require that our problems be solvable jointly in polynomial time and polylogarithmic space, whereas it is conceivable that some problems in $\mathrm{P} \cap \mathrm{POLYL}$ are solvable in polynomial time (but then require polynomial space) or in polylogarithmic space (but then require subexponential time).

Two classes have been defined in an attempt to characterize those problems that lend themselves best to parallelism. One class, known as SC ("Steve's class," named for Stephen Cook, who defined it under another name in 1979), is defined in terms of sequential measures as the class of all problems solvable simultaneously in polynomial time and polylogarithmic space. Using the notation that has become standard for classes defined by simultaneous resource bounds:

$$\mathrm{SC} = \mathrm{DTIME, DSPACE}(n^{O(1)}, \log^{O(1)} n)$$

The other class, known as NC ("Nick's class," named in honor of Nicholas Pippenger, who proposed it in 1979), is defined in terms of (uniform) circuits as the class of all problems solvable simultaneously in polylogarithmic depth and polynomial size:

$$\mathrm{NC} = \mathrm{USIZE, DEPTH}(n^{O(1)}, \log^{O(1)} n)$$

Exercise 9.6 In this definition, uniformity is specified on only one of the resource bounds; does it matter?

Since POLYLOGDEPTH equals POLYL and PSIZE equals P, both classes (restricted to decision problems) are contained within $\mathrm{P} \cap \mathrm{POLYL}$. We might expect that the two classes are in fact equal, since their two resource bounds, taken separately, are identical. Yet classes defined by simultaneous resource bounds are such that both classes are presumably proper subsets of their common intersection class and presumably distinct. In particular, whereas both classes contain L (a trivial result to establish), NC also contains NL (a nondeterministic Turing machine running in logarithmic space can be simulated by a family of circuits of polynomial size and $\log^2 n$ depth, a rather more difficult result) whereas SC is not thought to contain NL.

Figure 9.1 shows the conjectured relationships among NC, SC, and related classes; as always, all containments are thought to be proper.

[Figure 9.1: NC, SC, and related classes. Regions labeled in the diagram: $\mathrm{P} \cap \mathrm{PolyL}$, NC, SC, NL, and L.]

Both classes are remarkably robust, being essentially independent of the choice of model of computation. For SC, this is an immediate consequence of our previous developments, since the class is defined in terms of sequential models. Not only does NC not depend on the chosen definition of uniformity, it also retains its characterization under other models of parallel computation. For instance, an equivalent definition of NC is "the class of all problems solvable in polylogarithmic parallel time on PRAMs with a polynomial number of processors."

Of the two classes, NC appears the more interesting and useful. It is defined directly in terms of parallel models and thus presumably provides a more accurate characterization of fast parallel execution than SC (it is quite conceivable that SC contains problems that do not lend themselves to spectacular speed-ups on parallel architectures). NC also appears to be the more general class. While candidates for membership in $\mathrm{NC} - \mathrm{SC}$ are fairly numerous, natural (such as any NL-complete problem), and important (including matrix operations and various graph connectivity problems), it is very hard to come up with good candidates for $\mathrm{SC} - \mathrm{NC}$ (all existing ones are contrived examples). Finally, even if a given parallel machine cannot achieve sublinear execution times due to hardware limitations, problems in NC still stand to profit more than any others from that architecture. Their very membership in NC suggests that they are easily decomposable and thus admit a variety of efficient parallel algorithms, some of which are bound to work well for the machine at hand.

Exactly what problems are in NC? To begin with, all problems in L are in NC (as well as in SC); they include such important tasks as integer arithmetic operations, sorting, matrix multiplication, and pattern matching.

NC also includes all problems in NL, such as graph reachability and connectivity, shortest paths, and minimum spanning trees.

Exercise 9.7* Prove that Digraph Reachability is in NC.

Finally, NC also contains an assortment of other important problems not known to be in NL: matrix inversion, determinant, and rank; a variety of simple dynamic programming problems such as matrix chain products and optimal binary search trees; and special cases of harder problems, such as maximum flow in planar graphs and linear programming with a fixed number of variables. (Membership of these last problems in NC has been proved in a variety of ways, appealing to one or another of the equivalent characterizations of NC.) The remarkable number of simple, but common and important problems that belong to NC is not only a testimony to the importance of the class but more significantly is an indication of the potential of parallel architectures: while they may not help us with the difficult problems, they can greatly reduce the running time of day-to-day tasks that constitute the bulk of computing.

Equally important, what problems are not in NC? Since the only candidates for membership in NC are tractable problems, the question becomes "What problems are in $\mathrm{P} - \mathrm{NC}$?" (Since the only candidates are in fact problems in $\mathrm{P} \cap \mathrm{PolyL}$, we could consider the difference between this intersection and NC. We proceed otherwise, because membership in $\mathrm{P} \cap \mathrm{PolyL}$ is not always easy to establish even for tractable problems and because it is remarkably difficult to find candidates for membership in this difference. In other words, membership in $\mathrm{P} \cap \mathrm{PolyL}$ appears to be a very good indicator of membership in NC.) In Section 7.2, we discussed a family of problems in P that are presumably not in PolyL and thus, a fortiori, not in NC: the P-complete problems. Thus we conclude that problems such as maximum flow on arbitrary graphs, general linear programming, circuit value, and path system accessibility are not likely to be in NC, despite their tractability.

In practice, effective applications of parallelism are not limited to problems in NC. Adding randomization (in much the same manner as done in Section 8.4) is surprisingly effective. The resulting class, denoted RNC, allows us to develop very simple parallel algorithms for many of the problems in NC and also to parallelize much harder problems, such as maximum matching. Ad hoc hardware can be designed to achieve sublinear parallel execution times for a wider class of problems (P-uniform NC, as opposed to the normal L-uniform NC); however, the need for special-purpose circuitry severely restricts the applications. Several efficient (in the sense that a linear increase in the number of processors affords a linear decrease in the running time) parallel algorithms have been published for some P-complete problems and even for some probably intractable problems of subexponential complexity; however, such algorithms are isolated cases. Ideally, a theory of parallel complexity would identify problems amenable to linear speed-ups through linear increases in the number of processors; however, such a class cuts across all existing complexity classes and is proving very resistant to characterization.

9.4.4 Communication and Complexity

The models of parallel computation discussed in the previous section either ignore the costs of synchronization and interprocess communication or include them directly in their time and space complexity measures. In a distributed system, the cost of communication is related only distantly to the running time of processes. For certain problems that make sense only in a distributed environment, such as voting problems, running time and space for the processes are essentially irrelevant: the real cost derives from the number and size of messages exchanged by the processes. Hence some measure of communication complexity is needed.

In order to study communication complexity, let us postulate the simplest possible model: two machines must compute some function $f(x, y)$; the first machine is given $x$ as input and the second $y$, where $x$ and $y$ are assumed to have the same length [3]; the machines communicate by exchanging alternating messages. Each machine computes the next message to send based upon its share of the input plus the record of all the messages that it has received (and sent) so far. The question is "How many bits must be exchanged in order to allow one of the machines to output $f(x, y)$?" Clearly, an upper bound on the complexity of any function under this model is $|x|$, as the first machine can just send all of $x$ to the second and let the second do all the computing. On the other hand, some nontrivial functions have only unit complexity: determining whether the sum of $x$ and $y$ (considered as integers) is odd requires a single message. For fixed $x$ and $y$, then, we define the communication complexity of $f(x, y)$, call it $c(f(x, y))$, to be the minimum number of bits that must be exchanged in order for one of the machines to compute $f$, allowing messages of arbitrary length.

[3] $x$ and $y$ can be considered as a partition of the string of bits describing a problem instance; for example, an instance of a graph problem can be split into two strings $x$ and $y$ by giving each string half of the bits describing the adjacency matrix.

Let $n$ be the length of $x$ and $y$; since the partition of the input into $x$ and $y$ can be achieved in many different ways, we define the communication complexity of $f$ for inputs of size $2n$, $c(f_{2n})$, as the minimum of $c(f(x, y))$ over all partitions of the input into two strings, $x$ and $y$, of equal length. As was the case for circuit complexity, this definition of communication complexity involves a family of functions, one for each input size.

Let us further restrict ourselves to functions $f$ that represent decision problems, i.e., to Boolean-valued functions. Then communication complexity defines a firm hierarchy.

Theorem 9.10 Let $t(n)$ be a function with $1 < t(n) \le n$ for all $n$ and denote by $\mathrm{Comm}(t(n))$ the set of decision problems $f$ obeying $c(f_{2n}) \le t(n)$ for all $n$. Then $\mathrm{Comm}(t(n))$ is a proper superset of $\mathrm{Comm}(t(n) - 1)$.

The proof is nonconstructive and relies on fairly sophisticated counting arguments to establish that a randomly chosen language has a nonzero probability (indeed an asymptotic probability of 1) of requiring $n$ bits of communication, so that there are languages in $\mathrm{Comm}(n) - \mathrm{Comm}(n - 1)$. An extension of the argument from $n$ to $t(n)$ supplies the desired proof.

Further comparisons are possible with the time hierarchy. Define the nondeterministic communication complexity in the obvious manner: a decision problem is solved nondeterministically with communication cost $t(n)$ if there exists a computation (an algorithm for communication and decision) that recognizes yes instances of size $2n$ using no more than $t(n)$ bits of communication. Does nondeterminism in this setting give rise to the same exponential gaps that are conjectured for the time hierarchy? The answer, somewhat surprisingly, is not only that it seems to create such gaps, but that the existence of such gaps can be proved! First, though, we must show that the gap is no larger than exponential.

Theorem 9.11 $\mathrm{NComm}(t(n)) \subseteq \mathrm{Comm}(2^{t(n)})$.

Proof. All that the second machine needs to know in order to solve the problem is the first machine's answer to any possible sequence of communications. But that is something that the first machine can provide to the second within the stated bounds. The first machine enumerates in lexicographic order all possible sequences of messages of total length not exceeding $t(n)$; with a binary alphabet, there are $2^{t(n)}$ such sequences. The first machine prepares a message of length $2^{t(n)}$, where the $i$th bit encodes its answer to the $i$th sequence of messages. Thus with a single message of length $2^{t(n)}$, the first machine communicates to the second all that the latter needs to know. Q.E.D.
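The table-sending idea in the proof of Theorem 9.11 can be sketched in a few lines. The sketch assumes, for simplicity, that every nondeterministic transcript is padded to exactly $t$ bits and that each machine's behavior is summarized by a predicate (`ok1`, `ok2`) saying whether a transcript is consistent with its input and leads it to accept; these names and the toy "x differs from y" protocol are illustrative assumptions, not part of the text.

```python
from itertools import product

def transcripts(t):
    """All 2**t bit strings of length t, in lexicographic order."""
    return ["".join(bits) for bits in product("01", repeat=t)]

def table_message(x, ok1, t):
    """Machine 1's single message: bit i is its verdict on the i-th transcript."""
    return "".join("1" if ok1(x, s) else "0" for s in transcripts(t))

def decide_from_table(y, table, ok2, t):
    """Machine 2 accepts iff some transcript is acceptable to both machines."""
    return any(bit == "1" and ok2(y, s) for bit, s in zip(table, transcripts(t)))

# Hypothetical toy protocol for "x differs from y" on strings of length 4:
# a transcript names a position (2 bits) followed by the bit x holds there.
def ok1(x, s):
    return x[int(s[:2], 2)] == s[2]

def ok2(y, s):
    return y[int(s[:2], 2)] != s[2]
```

With $t = 3$ the nondeterministic cost is 3 bits, while the deterministic simulation sends a table of $2^3 = 8$ bits, matching the exponential blow-up allowed by the theorem.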

Now we get to the main result.

Theorem 9.12 There is a problem in $\mathrm{NComm}(\alpha \log n)$ that requires $\Omega(n)$ bits of communication in any deterministic solution.

In order to prove this theorem, we first prove the following simple lemma.

Lemma 9.1 Let the function $f(x, y)$, where $x$ and $y$ are binary strings of the same length, be the logical inner product of the two strings (considered as vectors). That is, writing $x = x_1 x_2 \ldots x_n$ and $y = y_1 y_2 \ldots y_n$, we have $f(x, y) = \bigvee_{i=1}^{n} (x_i \wedge y_i)$. Then the (fixed-partition) communication complexity of the decision problem "Is $f(x, y)$ false?" is exactly $n$.

Proof (of lemma). For any string pair $(x, \bar{x})$, where $\bar{x}$ is the bitwise complement of $x$, the inner product of $x$ and $\bar{x}$ is false. There are $2^n$ such pairs for strings of length $n$. We claim that no two such pairs can lead to the same sequence of messages; a proof of the claim immediately proves our lemma, as it implies the existence of $2^n$ distinct sequences of messages for strings of length $n$, so that at least some of these sequences must use $n$ bits. Assume that there exist two pairs, $(x, \bar{x})$ and $(u, \bar{u})$, of complementary strings that are accepted by our two machines with the same sequence of messages. Then our two machines also accept the pairs $(x, \bar{u})$ and $(u, \bar{x})$. For instance, the pair $(x, \bar{u})$ is accepted because the same sequence of messages used for the pair $(x, \bar{x})$ "verifies" that $(x, \bar{u})$ is acceptable. The first machine starts with $x$ and its first message is the same as for the pair $(x, \bar{x})$; then the second machine receives a message that is identical to what the first machine would have sent had its input been string $u$ and thus answers with the same message that it would have used for the pair $(u, \bar{u})$. In that guise both machines proceed, the first as if computing $f(x, \bar{x})$ and the second as if computing $f(u, \bar{u})$. Since the two computations involve the same sequence of messages, neither machine can recognize its error and the pair $(x, \bar{u})$ is accepted. The same argument shows that the pair $(u, \bar{x})$ is also accepted. However, at least one of these two pairs has a true logical inner product and thus is not a yes instance, so that our two machines do not solve the stated decision problem, which yields the desired contradiction. Q.E.D.

Proof (of theorem). Consider the question "Does a graph of $|V|$ vertices given by its adjacency matrix contain a triangle?" The problem is trivial if either side of the partition contains a triangle. If, however, the only triangles are split between the two sides, then a nondeterministic algorithm can pick three vertices for which it knows of no missing edges and send their labels to the other machine, which can then verify that it knows of no missing edges either. Since the input size $n$ is quadratic in $|V|$ and since identifying the three vertices requires $3 \log |V|$ bits, the problem is in $\mathrm{NComm}(\alpha \log n)$ as claimed.
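The nondeterministic triangle protocol just described can be sketched as follows, with an exhaustive loop standing in for the nondeterministic guess. The representation is an assumption made for illustration: each machine holds a dictionary mapping the vertex pairs in its share of the adjacency matrix to `True` (edge present) or `False` (edge missing), and the function names are hypothetical.

```python
from itertools import combinations

def knows_no_missing_edge(triple, my_pairs):
    """True if none of this machine's own pairs inside `triple` is a non-edge."""
    # Pairs held by the other machine default to True: nothing is known against them.
    return all(my_pairs.get(frozenset(p), True) for p in combinations(triple, 2))

def nondeterministic_triangle(vertices, pairs1, pairs2):
    """Return a triple accepted by both machines, or None if no guess succeeds."""
    for triple in combinations(vertices, 3):           # stands in for the guess
        if knows_no_missing_edge(triple, pairs1):      # machine 1 checks, then sends 3 labels
            if knows_no_missing_edge(triple, pairs2):  # machine 2 verifies its share
                return triple
    return None
```

The only communication in an accepting computation is the three vertex labels, which is where the $3 \log |V|$ bound comes from.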

On the other hand, for any deterministic communication scheme, there are graphs for which the scheme can do no better than to send $\alpha n$ bits. We prove this assertion by an adversary argument: we construct graphs for which demonstrating the existence of a triangle is exactly equivalent to computing a logical inner product of two $n$-bit vectors. We start with the complete graph on $|V|$ vertices; consider it to be edge-colored with two colors, say black and white, with the first machine being given all black edges and the second machine all white edges. (Recall that the same amount of data is given to each machine: thus we consider only edge colorings that color half of the edges in white and the other half in black.) Any vertex has $|V| - 1$ edges incident to it; call the vertex "black" if more than 98% of these edges are black, "white" if more than 98% of these edges are white, and "mixed" otherwise. Thus at least 1% of the vertices must be of the mixed type. Hence we can pick a subset of 1% of the vertices such that all vertices in the subset are of mixed type; call these vertices the "top" vertices and call the other vertices "bottom" vertices. Call an edge between two bottom vertices a bottom edge; each such edge is assigned a weight, which is the number of top vertices to which its endpoints are connected by edges of different colors. (Because the graph is complete, the two endpoints of a bottom edge are connected to every top vertex.) From each top vertex there issue at least $|V|/100$ black edges and at least $|V|/100$ white edges; thus, since the graph is complete, there are at least $(|V|/100)^2$ bottom edges connected to each top vertex by edges of different colors. In particular, this implies that the total weight of all bottom edges is $\Omega(|V|^3)$.

Now we construct a subset of edges as follows. First we repeatedly select edges between bottom vertices by picking the remaining edge of largest weight and by removing it and all edges incident to its two endpoints from contention. This procedure constructs a matching on the vertices of the graph of weight $\Omega(|V|^2)$ (this last follows from our lower bound on the total weight of edges and from the fact that selecting an edge removes $O(|V|)$ adjacent edges). Now we select edges between top and bottom vertices: for each edge between bottom vertices, we select all of the white edges from one of its endpoints (which is thus "whitened") to top vertices and all of the black edges from its other endpoint (which is "blackened") to the top vertices. The resulting collection of edges defines the desired graph on $n$ vertices.

The only possible triangles in this graph are composed of two matched bottom vertices and one top vertex; such a triangle exists if and only if the two edges between the top vertex and the bottom vertices exist, in which case these two edges are of different colors.

Thus for each pair of matched (bottom) vertices, the only candidate top vertices are those that are connected to the matching edge by edges of different colors; hence the total number of candidate triangles is exactly equal to the weight of the constructed matching. Since the first machine knows only about white edges and the second only about black edges, deciding whether the graph has a triangle is exactly equivalent to computing the logical inner product of two vectors of equal length. Each vector has length equal to the weight of the matching. For each candidate triangle, the vector has a bit indicating the presence or absence of one or the other edge between a top vertex and a matched pair of vertices (the white edge for the vector of the first machine and the black edge for the vector of the second machine). Since the matching has weight $\Omega(|V|^2) = \Omega(n)$, the vectors have length $\Omega(n)$; the conclusion then follows from our lemma. Q.E.D.

The reader should think very carefully about the sequence of constructions used in this proof, keeping in mind that the proof must establish that, for any partition of the input into two strings of equal length, solving the problem requires a sequence of messages with a total length linear in the size of the input. While it should be clear that the construction indeed yields a partition and a graph for which the problem of detecting triangles has linear communication complexity, it is less apparent that the construction respects the constraint "for any partition."

These results are impressive in view of the fact that communication complexity is a new concept, to which relatively little study has been devoted so far. More results are evidently needed; it is also clear that simultaneous resource bounds ought to be studied in this context. Communication complexity is not useful only in the context of distributed algorithms: it has already found applications in VLSI complexity theory. Whereas communication complexity as described here deals with deterministic or nondeterministic algorithms that collaborate in solving a problem, we can extend the model to include randomized approaches. Section 9.5 introduces the ideas behind probabilistic proof systems, which use a prover and a checker that interact in one or more exchanges.

9.5 Interactive Proofs and Probabilistic Proof Checking

We have mentioned several times the fact that a proof is more in the nature of an interaction between a prover and a checker than a monolithic, absolute composition. Indeed, the class NP is based on the idea of interaction: to prove that an instance is a "yes" instance, a certificate must be found and then checked in polynomial time.

There is a well-defined notion of checker (even if the checker remains otherwise unspecified); on the other hand, the prover is effectively the existential quantifier or the nondeterministic component of the machine. Thus NP, while capturing some aspects of the interaction between prover and checker, is both too broad (because the prover is completely unspecified and the checker only vaguely delineated) and too narrow (because membership in the class requires absolute correctness for every instance). If we view NP as the interactive equivalent of P (for a problem in P, a single machine does all of the work, whereas for a problem in NP, the work is divided between a nondeterministic prover and a deterministic checker), then we would like to investigate, at the very least, the interactive equivalent of BPP. Yet the interaction described in these cases would remain limited to just one round: the prover supplies evidence and the checker verifies it. An interaction between two scientists typically takes several rounds, with the "checker" asking questions of the "prover," questions that depend on the information accumulated so far by the checker. We study below both multi- and single-round proof systems.

9.5.1 Interactive Proofs

Meet Arthur and Merlin. You know about them already: Merlin is the powerful and subtle wizard and Arthur the honest [4] king. In our interaction, Merlin will be the prover and Arthur the checker. Arthur often asks Merlin for advice but, being a wise king, realizes that Merlin's motives may not always coincide with Arthur's own or with the kingdom's best interest. Arthur further realizes that Merlin, being a wizard, can easily dazzle him and might not always tell the truth. So whenever Merlin provides advice, Arthur will ask him to prove the correctness of the advice. However, Merlin can obtain things by magic (we would say nondeterministically!), whereas Arthur can compute only deterministically or, at best, probabilistically. Even then, Arthur cannot hide his random bits from Merlin's magic. In other words, Arthur has the power of P or, at best, BPP, whereas Merlin has (at least) the power of NP.

Definition 9.11 An interactive proof system is composed of a checker, which runs in probabilistic polynomial time, and a prover, which can use unbounded resources. We write such a system $(P, C)$.

[4] Perhaps somewhat naive, though. Would you obey a request to go in the forest, there to seek a boulder in which is embedded a sword, to retrieve said sword by slipping it out of its stone matrix, and to return it to the requester?

A problem $\Pi$ admits an interactive proof if there exists a checker $C$ and a constant $\varepsilon > 0$ such that
* there exists a prover $P^*$ such that the interactive proof system $(P^*, C)$ accepts every "yes" instance of $\Pi$ with probability at least $1/2 + \varepsilon$; and
* for any prover $P$, the interactive proof system $(P, C)$ rejects every "no" instance of $\Pi$ with probability at least $1/2 + \varepsilon$.

(The reader will also see one-sided definitions where "yes" instances are always accepted. It turns out that the definitions are equivalent, in contrast to the presumed situation for randomized complexity classes, where we expect RP to be a proper subset of BPP.) This definition captures the notion of a "benevolent" prover ($P^*$), who collaborates with the checker and can always convince the checker of the correctness of a true statement, and of "malevolent" provers, who are prevented from doing too much harm by the second requirement. We did not place any constraint on the power of the prover, other than limiting it to computable functions, that is! As we shall see, we can then ask exactly how much power the prover needs to have in order to complete certain interactions.

Definition 9.12 The class IP($f$) consists of all problems that admit an interactive proof where, for instance $x$, the parties exchange at most $f(|x|)$ messages. In particular, IP is the class of decision problems that admit an interactive proof involving at most a polynomial number of messages, $\mathrm{IP} = \mathrm{IP}(n^{O(1)})$.

This definition of interactive proofs does not exactly coincide with the definition of Arthur-Merlin games that we used as introduction. In an Arthur-Merlin game, Arthur communicates to Merlin the random bits he uses (and thus need not communicate anything else), whereas the checker in an interactive proof system uses "secret" random bits. Again, it turns out that the class IP is remarkably robust: whether or not the random bits of the checker are hidden from the prover does not alter the class.

So how is an interactive proof system developed? We give the classic example of an interactive, one-sided proof system for the problem of Graph Nonisomorphism: given two graphs, $G_1$ and $G_2$, are they nonisomorphic? This problem is in coNP but not believed to be in NP. One phase of our interactive proof proceeds as follows:

1. The checker chooses at random the index $i \in \{1, 2\}$ and sends to the prover a random permutation $H$ of $G_i$. (Effectively, the checker is asking the prover to decide whether $H$ is isomorphic to $G_1$ or to $G_2$.)
2. The prover tests $H$ against $G_1$ and $G_2$ and sends back to the checker the index of the graph to which $H$ is isomorphic.

3. The checker compares its generated index $i$ with the index sent by the prover; if they agree, the checker accepts the instance, otherwise it rejects it.

(In this scenario, the prover needs only to be able to decide graph isomorphism and its complement; hence it is enough that it be able to solve NP-easy problems.) If $G_1$ and $G_2$ are not isomorphic, a benevolent prover can always decide correctly to which of the two $H$ is isomorphic and send back to the checker the correct answer, so that the checker will always accept "yes" instances. On the other hand, when the two graphs are isomorphic, then the prover finds that $H$ is isomorphic to both. Not knowing the random bit used by the checker, the prover must effectively answer at random, with a probability of 1/2 of returning the value used by the checker and thus fooling the checker into accepting the instance. It follows that Graph Nonisomorphism belongs to IP; since it belongs to coNP but presumably not to NP, and since IP is easily seen to be closed under complementation, we begin to suspect that IP contains both NP and coNP.

Developing an exact characterization of IP turns out to be surprisingly difficult but also surprisingly rewarding. The first surprise is the power of IP: not only can we solve NP problems with a polynomial interactive protocol, we can solve any problem in PSPACE.

Theorem 9.13 IP equals PSPACE.

The second surprise comes from the techniques needed to prove this theorem. Techniques used so far in this text all have the property of relativization: if we equip the Turing machine models used for the various classes with an oracle for some problem, all of the results we have proved so far carry through immediately with the same proof. However, there exists an oracle $A$ (which we shall not develop) under which the relativized version of this theorem is false, that is, under which we have $\mathrm{IP}^A \ne \mathrm{PSPACE}^A$, a result indicating that "normal" proof techniques cannot succeed in proving Theorem 9.13 [5]. In point of fact, one part of the theorem is relatively simple: because all interactions between the prover and the checker are polynomially bounded, verifying that IP is a subset of PSPACE can be done with standard techniques.

[5] That IP equals PSPACE is even more surprising if we dig a little deeper. A long-standing conjecture in Complexity Theory, known as the Random Oracle Hypothesis, stated that any statement true with probability 1 in its relativized version with respect to a randomly chosen oracle should be true in its unrelativized version, ostensibly on the grounds that a random oracle had to be "neutral." However, after the proof of Theorem 9.13 was published, other researchers showed that, with respect to a random oracle $A$, $\mathrm{IP}^A$ differs from $\mathrm{PSPACE}^A$ with probability 1, thereby disproving the random oracle hypothesis.

Exercise 9.8* Prove that IP is a subset of PSPACE. Use our results about randomized classes and the fact that PSPACE is closed under complementation and nondeterminism.

The key to the proof of Theorem 9.13 and a host of other recent results is the arithmetization of Boolean formulae, that is, the encoding of Boolean formulae into low-degree polynomials over the integers, carried out in such a way as to transform the existence of a satisfying assignment for a Boolean formula into the existence of an assignment of 0/1 values to the variables that causes the polynomial to assume a nonzero value. The arithmetization itself is a very simple idea: given a Boolean formula $f$ in 3SAT form, we derive the polynomial function $p_f$ from $f$ by setting up for each Boolean variable $x_i$ a corresponding integer variable $y_i$ and by applying the following three rules:

1. The Boolean literal $x_i$ corresponds to the polynomial $p_{x_i} = 1 - y_i$ and the Boolean literal $\bar{x}_i$ to the polynomial $p_{\bar{x}_i} = y_i$.
2. The Boolean clause $c = \{\hat{x}_{i_1}, \hat{x}_{i_2}, \hat{x}_{i_3}\}$, where each $\hat{x}_{i_j}$ is a literal, corresponds to the polynomial $p_c = 1 - p_{\hat{x}_{i_1}} p_{\hat{x}_{i_2}} p_{\hat{x}_{i_3}}$.
3. The Boolean formula over $n$ variables $f = c_1 \wedge c_2 \wedge \cdots \wedge c_m$ (where each $c_i$ is a clause) corresponds to the polynomial $p_f(y_1, y_2, \ldots, y_n) = p_{c_1} p_{c_2} \cdots p_{c_m}$.

The degree of the resulting polynomial $p_f$ is at most $3m$. This arithmetization suffices for the purpose of proving the slightly less ambitious result that coNP is a subset of IP, since we can use it to encode an arbitrary instance of 3UNSAT.

Exercise 9.9 Verify that $f$ is unsatisfiable if and only if we have

$$\sum_{y_1=0}^{1} \sum_{y_2=0}^{1} \cdots \sum_{y_n=0}^{1} p_f(y_1, y_2, \ldots, y_n) = 0$$

Define partial-sum polynomials as follows:

$$p_f^i(y_1, y_2, \ldots, y_i) = \sum_{y_{i+1}=0}^{1} \sum_{y_{i+2}=0}^{1} \cdots \sum_{y_n=0}^{1} p_f(y_1, y_2, \ldots, y_n)$$

Verify that we have both $p_f^n = p_f$ and

$$p_f^{i-1}(y_1, \ldots, y_{i-1}) = p_f^i|_{y_i=0} + p_f^i|_{y_i=1}$$

so that $f$ is unsatisfiable if and only if $p_f^0$ equals zero.

An interactive protocol for 3UNSAT then checks that $p_f^0$ equals zero, something that takes too long for the checker to test directly because $p_f^0$ has an exponential number of terms.
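The three arithmetization rules can be tried out directly on small formulas. The sketch below assumes a clause is a tuple of nonzero integers, with `+i` standing for the literal $x_i$ and `-i` for its complement; this encoding and the function names are illustrative choices, not part of the text. It evaluates $p_f$ at a 0/1 point and computes the full sum by brute force, which is exactly the exponential computation the checker cannot afford.

```python
from itertools import product

def p_literal(lit, y):
    """Rule 1: x_i maps to 1 - y_i and the complemented literal to y_i."""
    return 1 - y[abs(lit) - 1] if lit > 0 else y[abs(lit) - 1]

def p_clause(clause, y):
    """Rule 2: one minus the product of the literal polynomials."""
    prod = 1
    for lit in clause:
        prod *= p_literal(lit, y)
    return 1 - prod

def p_formula(clauses, y):
    """Rule 3: the product of the clause polynomials."""
    value = 1
    for clause in clauses:
        value *= p_clause(clause, y)
    return value

def sum_over_assignments(clauses, n):
    """Sum of p_f over all 0/1 assignments; zero exactly when f is unsatisfiable."""
    return sum(p_formula(clauses, y) for y in product((0, 1), repeat=n))
```

On 0/1 inputs each clause polynomial is 1 when the clause is satisfied and 0 otherwise, so the sum counts satisfying assignments.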

The checker cannot just ask the prover for an evaluation since the result cannot be verified; instead, the checker will ask the prover to send (a form of) each of the partial-sum polynomials, $p_f^i$, in turn. What the checker will do is to choose (at random) the value of a variable, send that value to the prover, and ask the prover to return the partial-sum polynomial for that value of the variable (and past values of other variables fixed in previous exchanges). On the basis of the value it has chosen for that variable and of the partial-sum polynomial received in the previous exchange, the checker is able to predict the value of the next partial-sum polynomial and thus can check, when it receives it from the prover, whether it evaluates to the predicted value.

Overall, the protocol uses $n + 1$ rounds. In the zeroth round, the checker computes $p_f$ and the prover sends to the checker a large prime $p$ (of order $2^n$), which the checker tests for size and for primality (we have mentioned that Primality belongs to ZPP); finally, the checker sets $b_0 = 0$. In each succeeding round, the checker picks a new random number $r$ in the set $\{0, 1, \ldots, p - 1\}$, assigns it to the next variable, and computes a new value $b$. Thus at the beginning of round $i$, the numbers $r_1, r_2, \ldots, r_{i-1}$ have been chosen and the numbers $b_0, b_1, \ldots, b_{i-1}$ have been computed. In round $i$, then, the checker sends $b_{i-1}$ to the prover and asks for the coefficients of the single-variable polynomial $q_i(x) = p_f^i(r_1, \ldots, r_{i-1}, x)$. On receiving the coefficients defining $q_i'(x)$ (the prime denotes that the checker does not know if the prover sent the correct coefficients), the checker evaluates $q_i'(0) + q_i'(1)$ and compares the result with $b_{i-1}$. If the values agree, the checker selects the next random value $r_i$ and sets $b_i = q_i'(r_i)$. At any time, the checker stops and rejects the instance if any of its tests fails; if the end of the $n$th round is reached, the checker runs one last test, comparing $b_n$ and $p_f(r_1, r_2, \ldots, r_n)$, accepting if the two are equal.

Exercise 9.10* (Requires some knowledge of probability.) Verify that the protocol described above, for a suitable choice of the prime $p$, establishes membership of 3UNSAT in IP.

The problem with this elegant approach is that the polynomial resulting from an instance of the standard PSPACE-complete problem Q3SAT will have very high degree because each universal quantifier will force us to take a product over the two values of the quantified variable. (Each existential quantifier forces us to take a sum, which does not raise the degree and so does not cause a problem.) The high resulting degree prevents the checker from evaluating the polynomial within its resource bounds. Therefore, in order to carry the technique from coNP to PSPACE, we need a way to generate a (possibly much larger) polynomial of bounded degree.
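The round structure of the 3UNSAT protocol described above can be simulated on tiny formulas. The sketch below makes several simplifying assumptions that differ from the text: the prime `p` is fixed by the caller rather than sent by the prover, the prover sends the values of $q_i$ at the points $0, \ldots, 3m$ instead of its coefficients (the checker recovers $q_i'(x)$ by Lagrange interpolation mod `p`), and the honest prover computes its partial sums by brute force. Clauses are tuples of nonzero integers (`+i` for $x_i$, `-i` for its complement); all names are hypothetical. It needs Python 3.8 or later for the modular inverse `pow(den, -1, p)`.

```python
import random
from itertools import product

def p_f(clauses, y, p):
    """Arithmetized formula evaluated mod p at the point y."""
    value = 1
    for clause in clauses:
        prod = 1
        for lit in clause:
            v = y[abs(lit) - 1]
            prod = prod * ((1 - v) if lit > 0 else v) % p
        value = value * (1 - prod) % p
    return value

def honest_q(clauses, n, rs, p, deg):
    """Prover: values of q_i at 0..deg, summing p_f over the unassigned variables."""
    free = n - len(rs) - 1
    return [sum(p_f(clauses, list(rs) + [x] + list(t), p)
                for t in product((0, 1), repeat=free)) % p
            for x in range(deg + 1)]

def evaluate(values, x, p):
    """Checker: Lagrange interpolation mod p through the points (j, values[j])."""
    total = 0
    for j, vj in enumerate(values):
        num = den = 1
        for k in range(len(values)):
            if k != j:
                num = num * (x - k) % p
                den = den * (j - k) % p
        total = (total + vj * num * pow(den, -1, p)) % p
    return total

def check_unsat(clauses, n, p, prover=honest_q, rng=random):
    """Return True if the checker accepts the claim that the formula is unsatisfiable."""
    deg, b, rs = 3 * len(clauses), 0, []      # b_0 = 0 is the claimed total sum
    for _ in range(n):
        q = prover(clauses, n, rs, p, deg)
        if (evaluate(q, 0, p) + evaluate(q, 1, p)) % p != b:
            return False                      # q_i'(0) + q_i'(1) must equal b_{i-1}
        r = rng.randrange(p)                  # next random value r_i
        b = evaluate(q, r, p)                 # b_i = q_i'(r_i)
        rs.append(r)
    return b == p_f(clauses, rs, p)           # last test against p_f(r_1, ..., r_n)
```

With an honest prover, an unsatisfiable formula passes every test, while a satisfiable one already fails the first-round check because the true sum is nonzero mod `p` (provided `p` exceeds $2^n$). A cheating prover could be plugged in through the `prover` parameter to experiment with the soundness argument of Exercise 9.10.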

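To obtain such a polynomial, we define the following operations on polynomials. If $p$ is a polynomial and $y$ one of its variables, we define

* $\mathrm{and}_y(p) = p|_{y=0} \cdot p|_{y=1}$
* $\mathrm{or}_y(p) = p|_{y=0} + p|_{y=1} - p|_{y=0} \cdot p|_{y=1}$
* $\mathrm{reduce}_y(p) = p|_{y=0} + y\,(p|_{y=1} - p|_{y=0})$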
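The three operations are easy to experiment with. The sketch below assumes a representation that is not from the text: a polynomial is a dictionary mapping exponent tuples to integer coefficients, and variables are identified by position. All function names are hypothetical.

```python
from collections import defaultdict

def clean(p):
    """Drop zero coefficients and return a plain dict."""
    return {e: c for e, c in p.items() if c != 0}

def add(p, q, scale=1):
    r = defaultdict(int)
    for e, c in p.items():
        r[e] += c
    for e, c in q.items():
        r[e] += scale * c
    return clean(r)

def mul(p, q):
    r = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            r[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return clean(r)

def restrict(p, var, value):
    """p with variable number var replaced by value, where value is 0 or 1."""
    r = defaultdict(int)
    for e, c in p.items():
        if value == 0 and e[var] > 0:
            continue                      # the term vanishes when the variable is 0
        r[e[:var] + (0,) + e[var + 1:]] += c
    return clean(r)

def and_op(p, var):
    return mul(restrict(p, var, 0), restrict(p, var, 1))

def or_op(p, var):
    p0, p1 = restrict(p, var, 0), restrict(p, var, 1)
    return add(add(p0, p1), mul(p0, p1), scale=-1)

def reduce_op(p, var):
    if not p:
        return {}
    nvars = len(next(iter(p)))
    y = {tuple(1 if i == var else 0 for i in range(nvars)): 1}
    p0, p1 = restrict(p, var, 0), restrict(p, var, 1)
    return add(p0, mul(y, add(p1, p0, scale=-1)))

def evaluate(p, point):
    total = 0
    for e, c in p.items():
        term = c
        for x, k in zip(point, e):
            term *= x ** k
        total += term
    return total
```

For example, with variables (x, y) and p = x^3 y + 2y^2, `reduce_op(p, 0)` returns the polynomial xy + 2y^2: the degree in x drops to one, while the value is unchanged whenever x is 0 or 1.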
The first two operations will be used for universal and existential quantifiers, respectively. As its name indicates, the last operation will be used for reducing the degree of a polynomial (at the cost of increasing the number of its terms); under substitution of either $y = 0$ or $y = 1$, it is an identity. The proof of Theorem 9.13 relies on the following lemma (which we shall not prove) about these three operations.

Lemma 9.2 Let $p(y_1, y_2, \ldots, y_n)$ be a polynomial with integer coefficients and denote by test($p$) the problem of deciding, for given values $a_1, a_2, \ldots, a_n$, and $b$, whether $p(a_1, a_2, \ldots, a_n)$ equals $b$. If test($p$) belongs to IP, then so does each of test($\mathrm{and}_{y_i}(p)$), test($\mathrm{or}_{y_i}(p)$), and test($\mathrm{reduce}_{y_i}(p)$).

Now we can encode an instance of the PSPACE-complete problem Q3SAT, say $Q_1 x_1 Q_2 x_2 \ldots Q_n x_n\, f(x_1, x_2, \ldots, x_n)$, as follows:

1. Produce the polynomial $p_f$ corresponding to the formula $f$.
2. For each $i = 1, \ldots, n$ in turn, apply $\mathrm{reduce}_{x_i}$.
3. For each $i = 1, \ldots, n$ in turn, if the quantifier $Q_{n+1-i}$ is universal, then apply $\mathrm{and}_{x_{n+1-i}}$, else apply $\mathrm{or}_{x_{n+1-i}}$, and (in both cases) follow by applications of $\mathrm{reduce}_{x_j}$ for $j = 1, \ldots, n-i$.

No polynomial produced in this process ever has degree greater than that of $f$; after the second step, in fact, no polynomial ever has degree greater than two. Using this low-degree polynomial, we can then proceed to devise an interactive proof protocol similar to that described for 3UNSAT, which completes (our rough sketch of) the proof.

Having characterized the power of interactive proofs, we can turn to related questions. How much power does a benevolent IP prover actually need? One problem with this question derives from the fact that restrictions on the prover can work both ways: the benevolent prover may lose some of its ability to convince the checker of the validity of a true statement, but the malevolent provers also become less able to fool the checker into accepting a false statement. A time-tested method to foil a powerful advisor who may harbor ulterior designs is to use several such advisors and keep each of them in the dark about the others (to prevent them from colluding, which would be worse than having to rely on a single advisor).

The class MIP characterizes decision problems that admit polynomially-bounded interactive proofs with multiple, independent provers. Not surprisingly, this class sits much higher in the hierarchy than IP (at least under standard conjectures): it has been shown that MIP equals NExp, thereby providing a formal justification for a practice used by generations of kings and heads of state.

9.5.2 Zero-Knowledge Proofs

In cryptographic applications, there are many occasions when one party wants to convince the other party of the correctness of some statement without, however, divulging any real information. (This is a common game in international politics or, for that matter, in commerce: convince your opponents or partners that you have a certain capability without divulging anything that might enable them to acquire the same.) For instance, you might want to convince someone that a graph is three-colorable without, however, revealing anything about the three-coloring. At first, the idea seems ludicrous; certainly it does not fit well with our notion of NP, where the certificate typically is the solution. However, consider the following thought experiment.

Both the prover and the checker have a description of the graph with the same indexing for the vertices and have agreed to represent colors by values from $\{1, 2, 3\}$. In a given round of interaction, the prover sends to the checker $n$ locked "boxes," one for each vertex of the graph; each box contains the color assigned to its corresponding vertex. The checker cannot look inside a box without a key; thus the checker selects at random one edge of the graph, say $\{u, v\}$, and asks the prover for the keys to the boxes for $u$ and $v$. The prover sends the two keys to the checker; the checker opens the two boxes. If the colors are distinct and both in the set $\{1, 2, 3\}$, the checker accepts the claim of the prover; otherwise, it rejects the claim.

If the graph is indeed three-colorable, the prover can always persuade the checker to accept: any legal coloring (randomly permuted) sent in the boxes will do. If the graph is not three-colorable, any filling of the $n$ boxes has at least one flaw (that is, it results in at least one same-colored edge or uses at least one illegal color), which the checker discovers with probability at least $|E|^{-1}$. With enough rounds (each round is independent of the previous ones because the prover uses new boxes with new locks and randomly permutes the three colors in the legal coloring and, for graphs with more than one coloring, can even select different colorings in different rounds), the probability of error can be reduced to any desired constant.
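The thought experiment with locked boxes can be simulated directly. The sketch below is illustrative only: the boxes are an ordinary dictionary that the checker code is trusted not to read except through the two requested keys, the prover is assumed to use only colors from {1, 2, 3}, and all names are hypothetical. The helper `rounds_for_error_half` relies on the observation that a cheating prover survives k independent rounds with probability at most (1 - 1/|E|)^k.

```python
import math
import random

def run_round(edges, coloring, rng):
    """One round: the prover permutes the colors and fills the boxes,
    the checker picks a random edge and opens the boxes of its endpoints."""
    perm = [1, 2, 3]
    rng.shuffle(perm)
    boxes = {v: perm[c - 1] for v, c in coloring.items()}  # locked boxes
    u, v = rng.choice(edges)                               # checker's random edge
    cu, cv = boxes[u], boxes[v]                            # prover hands over two keys
    return cu != cv and cu in (1, 2, 3) and cv in (1, 2, 3)

def interactive_proof(edges, coloring, rounds, rng):
    """Accept only if every round succeeds."""
    return all(run_round(edges, coloring, rng) for _ in range(rounds))

def rounds_for_error_half(num_edges):
    """Smallest k with (1 - 1/|E|)^k at most 1/2, for |E| of at least 2."""
    return math.ceil(1 / (math.log2(num_edges) - math.log2(num_edges - 1)))
```

A prover holding a legal coloring of a triangle is accepted in every round, whereas any coloring of the complete graph on four vertices leaves a same-colored edge that a long enough run of rounds exposes with overwhelming probability.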

After $k$ rounds, the probability of error is bounded by $(1 - |E|^{-1})^k$, so that we can guarantee a probability of error not exceeding 1/2 in at most $\lceil (\log |E| - \log(|E| - 1))^{-1} \rceil$ rounds.

It is instructive to contemplate what the checker learns about the coloring in one round. The checker opens two boxes that are expected to contain two distinct values chosen at random from the set $\{1, 2, 3\}$ and, assuming that the graph is indeed colorable, finds exactly that. The checker might as well have picked two colors at random from the set; the result would be indistinguishable from what has been "learned" from the prover! In other words, the correctness of the assertion has been (probabilistically) proved, but absolutely nothing has been communicated about the solution: zero knowledge has been transferred. This experiment motivates us to define a zero-knowledge interactive proof system; we assume Definition 9.11 and simply add one more condition.

Definition 9.13 A prover strategy, $P$, is perfect zero-knowledge for problem $\Pi$ if, for every probabilistic polynomial-time checker strategy, $C$, there exists a probabilistic polynomial-time algorithm $A$ such that, for every instance $x$ of $\Pi$, the output of the prover-checker interaction, $(P, C)(x)$, equals that of the algorithm, $A(x)$.

Thus there is nothing that the checker can compute with the help of the prover that it could not compute on its own! However, since the output of the interaction or of the algorithm is a random variable, asking for strict equality is rather demanding: when dealing with random processes, we can define equality in a number of ways, from strict equality of outcomes at each instantiation to equality of distributions to the yet weaker notion of indistinguishability of distributions. For practical purposes, the last notion suffices: we need only require that the distributions of the two variables be computationally indistinguishable in polynomial time. Formalizing this notion of indistinguishability takes some work, so we omit a formal definition but content ourselves with noting that the resulting form of zero-knowledge proof is called computational zero-knowledge.

Turning our thought experiment into an actual computing interaction must depend on the availability of the boxes with unbreakable locks. In practice, we want the prover to commit itself to a coloring before the checker asks to see the colors of the endpoints of a randomly chosen edge. Encryption offers an obvious solution: the prover encrypts the content of each box, using a different key for each box. With a strong encryption scheme, the checker is then unable to decipher the contents of any box if it is not given the key for that box. Unfortunately, we know of no provably hard encryption scheme that can easily be decoded with a key.
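One common way to approximate a locked box in software is a hash-based commitment. The sketch below swaps this technique in for the per-box encryption described in the text and should be read as an illustration only: SHA-256 is used as a stand-in for a function that is assumed (not proved) to be hard to invert, and the names `lock_box` and `open_box` are hypothetical.

```python
import hashlib
import secrets

def lock_box(color):
    """Commit to a color: the box is a digest, the key is a fresh random string."""
    key = secrets.token_bytes(32)                  # a different key for each box
    box = hashlib.sha256(key + bytes([color])).hexdigest()
    return box, key

def open_box(box, key, color):
    """Checker side: verify that the revealed key and color match the box."""
    return hashlib.sha256(key + bytes([color])).hexdigest() == box
```

The prover sends every `box` up front, which commits it to a coloring, and later reveals the `key` and color for the two endpoints of the chosen edge only.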

Public-key schemes such as the RSA algorithm are conjectured to be hard to decipher, but no proof is available (and schemes based on factoring are now somewhat suspect due to the quantum computing model). The dream of any cryptographer is a one-way function, that is, a function that is P-easy to compute but (at least) NP-hard to invert. Such functions are conjectured to exist, but their existence has not been proved.

Theorem 9.14 Assuming the existence of one-way functions, any problem in PSPACE has a computational zero-knowledge interactive proof.

That such is the case for any problem in NP should now be intuitively clear, since we outlined a zero-knowledge proof for the NP-complete problem Graph Three-Colorability, which, with some care paid to technical details, can be implemented with one-way functions. That it also holds for PSPACE and thus for IP is perhaps no longer quite as surprising, yet the ability to produce zero-knowledge proofs for any problem at all (beyond BPP, that is) is quite astounding in itself. This ability has had profound effects on the development of cryptography.

Going beyond zero-knowledge proofs, we can start asking how much knowledge must be transferred for certain types of interactive proofs, as well as how much efficiency is to be gained by transferring more knowledge than the strict minimum required.

9.5.3 Probabilistically Checkable Proofs

Most mathematical proofs, when in final form, are static entities: they are intended as a single communication to the reader (the checker) by the writer (the prover). This limited interaction is to the advantage of the checker: the prover gets fewer opportunities to misdirect the checker. (If we return to our analogy of Arthur and Merlin, we can imagine Arthur's telling Merlin, on discovering for the nth time that Merlin has tricked him, that, henceforth, he will accept Merlin's advice only in writing, along with a one-time only argument supporting the advice.) In its simplest form, this type of interaction leads to our definition of NP: the prover runs in nondeterministic polynomial time and the checker in deterministic polynomial time. As we have seen, however, the introduction of probabilistic checking makes interactions much more interesting. So what happens when we simply require, as we did in the fully interactive setting, that the checker have a certain minimum probability of accepting true statements and rejecting false ones? Then we no longer need to see the entire proof and it becomes worthwhile to consider how much of the proof needs to be seen.

Definition 9.14 A problem $\Pi$ admits a probabilistically checkable proof if there exists a probabilistic algorithm $C$ (the checker) that runs in polynomial time and can query specific bits of a proof string $\pi$ and a constant $\varepsilon > 0$ such that

* for every "yes" instance $x$ of $\Pi$, there exists a proof string $\pi$ such that $C$ accepts $x$ with probability at least $1/2 + \varepsilon$; and
* for every "no" instance $x$ of $\Pi$ and any proof string $\pi$, $C$ rejects $x$ with probability at least $1/2 + \varepsilon$.

(Again, the reader will also see one-sided definitions, where "no" instances are always rejected; again, it turns out that the definitions are equivalent. We used a one-sided definition in the proof of Theorem 8.23.)

Clearly, the proof string must assume a specific form. Consider the usual certificate for Satisfiability, namely a truth assignment for the $n$ variables. It is not difficult to show that there exist Boolean functions that require the evaluation of every one of their $n$ variables; a simple example is the parity function. No amount of randomization will enable us to determine probabilistically whether a truth assignment satisfies the parity function by checking just a few variables; in fact, even after checking $n - 1$ of the $n$ variables, the two outcomes remain equally likely! Yet the proof string is aptly named: it is indeed an acceptable proof in the usual mathematical sense but is written down in a special way.

Definition 9.15 The class PCP($r$, $q$) consists of all problems that admit a probabilistically checkable proof where, for instance $x$, the checker $C$ uses at most $r(|x|)$ random bits and queries at most $q(|x|)$ bits from the proof string.

From our definition, we have PCP(0, 0) = P and PCP($n^{O(1)}$, 0) = BPP. A bit more work reveals that a small amount of evidence will not greatly help a deterministic checker.

Exercise 9.11 Verify that PCP(0, $q$) is the class of problems that can be decided by a deterministic algorithm running in $O(n^{O(1)} 2^{q(n)})$ time. In particular, we have PCP(0, $O(\log n)$) = P.

More interestingly, PCP(0, $n^{O(1)}$) equals NP, since we have no randomness yet are able to inspect polynomially many bits of the proof string, that is, the entire proof string for a problem in NP. A small number of random bits does not help much when we can see the whole proof: a difficult result states that PCP($O(\log n)$, $n^{O(1)}$) also equals NP. With sufficiently many random bits, however, the power of the system grows quickly; another result that we shall not examine further states that PCP($n^{O(1)}$, $n^{O(1)}$) equals NExp.
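The remark about the parity function can be checked exhaustively for small $n$. The sketch below (with hypothetical names) tests, for a Boolean function given as a Python callable, whether knowing any $n - 1$ of the inputs always leaves both outcomes possible; the conjunction is included only as a contrasting example.

```python
from itertools import product

def undetermined_by_any_n_minus_1(f, n):
    """True if, for every hidden variable and every setting of the other n - 1,
    the two possible values of the hidden variable give different results."""
    for hidden in range(n):
        for seen in product((0, 1), repeat=n - 1):
            values = {f(seen[:hidden] + (b,) + seen[hidden:]) for b in (0, 1)}
            if len(values) != 2:
                return False
    return True

def parity(bits):
    return sum(bits) % 2

def conjunction(bits):
    return int(all(bits))
```

Parity passes the test for every $n$, whereas the conjunction fails it as soon as one of the inspected variables is 0.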

Were these the sum of results about the PCP system, it would be regarded as a fascinating formalism to glean insight into the nature of proofs and into some of the fine structure of certain complexity classes. (Certainly anyone who has reviewed someone else's proof will recognize the notion of looking at selected excerpts of the proof and evaluating it in terms of its probability of being correct!) But, as we saw in Section 8.3, there is a deep connection between probabilistically checkable proofs and approximation, basically due to the fact that really good approximations depend in part on the existence of gap-preserving or gap-creating reductions (where the gap is based on the values of the solutions) and that good probabilistic checkers depend also on such gaps (this time measured in terms of probabilities). This connection motivated intensive research into the exact characterization of NP in terms of PCP models, culminating in this beautiful and surprising theorem.

Theorem 9.15 NP equals PCP($O(\log n)$, $O(1)$).

In other words, every decision problem in NP has a proof that can be checked probabilistically by looking only at a constant number of bits of the proof string, using a logarithmic number of random bits. In many ways, the PCP theorem, as it is commonly known, is the next major step in the study of complexity theory after Cook's and Karp's early results. It ties together randomized approaches, proof systems, and nondeterminism (three of the great themes of complexity theory) and has already proved to be an extremely fruitful tool in the study of approximation problems.

9.6 Complexity and Constructive Mathematics

Our entire discussion so far in this text has been based on existential definitions: a problem belongs to a complexity class if there exists an algorithm that solves the problem and exhibits the desired characteristics. Yet, at the same time, we have implicitly assumed that placing a problem in a class is done by exhibiting such an algorithm, i.e., that it is done constructively. Until the 1980s, there was no reason to think that the gap between existential theories and constructive proofs would cause any trouble. Yet now, as a result of the work of Robertson and Seymour, the existential basis of complexity theory has come back to haunt us. Robertson and Seymour, in a long series of highly technical papers, proved that large families of problems are in P without giving actual algorithms for any of these families. Worse, their results are inherently nonconstructive: they cannot be turned uniformly into methods for generating algorithms.

Finally, the results are nonconstructive on a second level: membership in some graph family is proved without giving any "natural" evidence that the graph at hand indeed has the required properties. Thus, as mentioned in the introduction, we are now faced with the statement that certain problems are in P, yet we will never find a polynomial-time algorithm to solve any of them, nor would we be able to recognize such an algorithm if we were staring at it. How did this happen and what might we be able to do about it?

A celebrated theorem in graph theory, known as Kuratowski's theorem (see Section 2.4), states that every nonplanar graph contains a homeomorphic copy of either the complete graph on five vertices, $K_5$, or the complete bipartite graph on six vertices, $K_{3,3}$. Figure 9.2 shows a nonplanar graph and its embedded $K_{3,3}$.

Figure 9.2 A nonplanar graph and its embedded homeomorphic copy of $K_{3,3}$.

Another family of graphs with a similar property is the family of series-parallel graphs.

Definition 9.16 An undirected graph $G$ with a distinguished source vertex, $s$, and a distinguished (and separate) sink vertex, $t$, is a series-parallel graph, written $(G, s, t)$, if it is one of

* the complete graph on the two vertices $s$ and $t$ (i.e., a single edge);
* the series composition of two series-parallel graphs $(G_1, s_1, t_1)$ and $(G_2, s_2, t_2)$, where the composition is obtained by taking the union of the two graphs, merging $t_1$ with $s_2$, and designating $s_1$ to be the source of the result and $t_2$ its sink; or
* the parallel composition of two series-parallel graphs $(G_1, s_1, t_1)$ and $(G_2, s_2, t_2)$, where the composition is obtained by taking the union of the two graphs, merging $s_1$ with $s_2$ and $t_1$ with $t_2$, and designating the merged $s_1/s_2$ vertex to be the source of the result and the merged $t_1/t_2$ vertex its sink.

A fairly simple argument shows that an undirected graph with two distinct vertices designated as source and sink is series-parallel if and only if it does not contain a homeomorphic copy of the graph illustrated in Figure 9.3.
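The recursive definition of series-parallel graphs translates almost word for word into code. In the sketch below, which uses a representation chosen only for illustration, a series-parallel graph is a triple (edge list, source, sink), vertices are integers drawn from a shared counter so that separately built graphs are disjoint, and parallel edges are allowed; all names are hypothetical.

```python
from itertools import count

_fresh = count()  # supplies new vertex names so that component graphs are disjoint

def single_edge():
    """Base case: the complete graph on two vertices s and t."""
    s, t = next(_fresh), next(_fresh)
    return ([(s, t)], s, t)

def _rename(edges, old, new):
    return [(new if a == old else a, new if b == old else b) for a, b in edges]

def series(g1, g2):
    """Merge t1 with s2; the source is s1 and the sink is t2."""
    e1, s1, t1 = g1
    e2, s2, t2 = g2
    return (e1 + _rename(e2, s2, t1), s1, t2)

def parallel(g1, g2):
    """Merge s1 with s2 and t1 with t2; source and sink are the merged vertices."""
    e1, s1, t1 = g1
    e2, s2, t2 = g2
    return (e1 + _rename(_rename(e2, s2, s1), t2, t1), s1, t1)
```

For instance, `parallel(series(single_edge(), single_edge()), series(single_edge(), single_edge()))` builds a four-vertex cycle whose source and sink are opposite corners. Each component should be used only once, since reusing the same triple twice would violate the disjointness assumed by the composition.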

Figure 9.3 The key subgraph for series-parallel graphs.

Homeomorphism allows us to delete edges or vertices from a graph and to contract an edge (that is, to merge its two endpoints); we restate and generalize these operations in the form of a definition.

Definition 9.17 A graph $H$ is a minor of a graph $G$ if $H$ can be obtained from $G$ by a sequence of (edge or vertex) deletions and edge contractions.

(The generalization is in allowing the contraction of arbitrary edges; homeomorphism allows us to contract only an edge incident upon a vertex of degree 2.) The relation "is a minor of" is easily seen to be reflexive and transitive; in other words, it is a partial order, which justifies writing $H \leq_{\mathrm{minor}} G$ when $H$ is a minor of $G$. Planar graphs are closed under this relation, as any minor of a planar graph is easily seen to be itself planar. Kuratowski's theorem can then be restated as "A graph $G$ is nonplanar if and only if one of the following holds: $K_5 \leq_{\mathrm{minor}} G$ or $K_{3,3} \leq_{\mathrm{minor}} G$." The key viewpoint on this problem is to realize that the property of planarity, a property that is closed under minor ordering, can be tested by checking whether one of a finite number of specific graphs is a minor of the graph at hand. We call this finite set of specific graphs, $\{K_5, K_{3,3}\}$ in the case of planar graphs, an obstruction set. The two great results of Robertson and Seymour can then be stated as follows.

Theorem 9.16
* Families of graphs closed under minor ordering have finite obstruction sets.
* Given graphs $G$ and $H$, each with $O(n)$ vertices, deciding whether $H$ is a minor of $G$ can be done in $O(n^3)$ time.

The first result was conjectured by Wagner as a generalization of Kuratowski's theorem. Its proof by Robertson and Seymour required them to invent and develop an entirely new theory of graph structure; the actual proof of Wagner's conjecture is in the 15th paper in a series of about 20 papers on the topic.
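Definition 9.17 and the obstruction-set view can be tried out on very small graphs. The sketch below is a brute-force test that runs in exponential time and has nothing to do with the cubic-time algorithm of Robertson and Seymour. It relies on the standard equivalent formulation of minors: $H$ is a minor of $G$ exactly when the vertices of $H$ can be mapped to disjoint, nonempty, connected sets of vertices of $G$ such that every edge of $H$ is matched by at least one edge between the corresponding sets. Graphs are assumed simple, with vertices numbered from 0, and all names are hypothetical.

```python
from itertools import product

def is_connected(vertices, adj):
    """Depth-first search restricted to the given nonempty vertex set."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def is_minor(h_n, h_edges, g_n, g_edges):
    """Brute-force test of whether H (h_n vertices) is a minor of G (g_n vertices)."""
    adj = {v: set() for v in range(g_n)}
    for a, b in g_edges:
        adj[a].add(b)
        adj[b].add(a)
    # Each vertex of G is either deleted (-1) or assigned to one vertex of H.
    for assignment in product(range(-1, h_n), repeat=g_n):
        branch = [[v for v in range(g_n) if assignment[v] == i] for i in range(h_n)]
        if any(not part for part in branch):
            continue
        if not all(is_connected(part, adj) for part in branch):
            continue
        if all(any(b in adj[a] for a in branch[i] for b in branch[j])
               for i, j in h_edges):
            return True
    return False

def in_family(g_n, g_edges, obstructions):
    """Membership test for a minor-closed family given its obstruction set."""
    return not any(is_minor(h_n, h_edges, g_n, g_edges) for h_n, h_edges in obstructions)
```

As an example of an obstruction set other than the one for planarity, forests (as simple graphs) form a minor-closed family whose obstruction set consists of the triangle alone: `in_family(n, edges, [(3, [(0, 1), (1, 2), (0, 2)])])` returns True exactly when the graph has no cycle.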

(Thus not only does the proof of Wagner's conjecture stand as one of the greatest achievements in mathematics, it also has the more dubious honor of being one of the most complex.) The power of the two results together is best summarized as follows.

Corollary 9.1 Membership in any family of graphs closed under minor ordering can be decided in $O(n^3)$ time.

Proof. Since the family of graphs is closed under minor ordering, it has a finite obstruction set, say $\{O_1, O_2, \ldots, O_k\}$. A graph $G$ belongs to the family if and only if it does not contain as a minor any of the $O_i$s. But we can test $O_i \leq_{\mathrm{minor}} G$ in cubic time; after at most $k$ such tests, where $k$ is some fixed constant depending only on the family, we can decide membership. Q.E.D.

The catch is that the Robertson-Seymour Theorem states only that a finite obstruction set exists for each family of graphs closed under minor ordering; it does not tell us how to find this set (nor how large it is nor how large its members are). It is thus a purely existential tool for establishing membership in P. (Another, less important, catch is that the constants involved in the cubic-time minor ordering test are gigantic, on the order of $10^{150}$.)

A particularly striking example of the power of these tools is the problem known as Three-Dimensional Knotless Embedding. An instance of this problem is given by a graph; the question is whether this graph can be embedded in three-dimensional space so as to avoid the creation of knots. (A knot is defined much as you would expect; in particular, two interlinked cycles, as in the links of a chain, form a knot.) Clearly, removing vertices or edges of the graph can only make it easier to embed without knots; contracting an edge cannot hurt either, as a few minutes of thought will confirm. Thus the family of graphs that can be embedded without knots in three-dimensional space is closed under minor ordering, and thus there exists a cubic-time algorithm to decide whether an arbitrary graph can be so embedded. Yet we do not know of any recursive test for this problem! Even given a fixed embedding, we do not know how to check it for the presence of knots in polynomial time! Until the Robertson-Seymour theorem, Three-Dimensional Knotless Embedding could be thought of as residing in some extremely high class of complexity, if, indeed, it was even recursive. After the Robertson-Seymour theorem, we know that the problem is in P, yet nothing else has changed: we still have no decision algorithm for the problem.

At this point, we might simply conclude that it is only a matter of time, now that we know that the problem is solvable in polynomial time, until someone comes up with a polynomial-time algorithm for the problem. And, indeed, such has been the case for several other, less imposing, graph families closed under minor ordering.

However, the Robertson-Seymour Theorem is inherently nonconstructive.

Theorem 9.17 There is no algorithm that, given a family of graphs closed under minor ordering, would output the obstruction set for the family.

Proof. Let $\{\phi_x\}$ be an acceptable programming system. Let $\{G_i\}$ be an enumeration of all graphs such that, if $G_i$ is a minor of $G_j$, then $G_i$ is enumerated before $G_j$. (This enumeration always exists, because minor ordering is a partial ordering: a simple topological sort produces an acceptable enumeration.) Define the auxiliary partial function

$$\theta(x) = \mu i\,[\mathrm{step}(x, x, i) \neq 0]$$

Now define the function

$$f(x, t) = \begin{cases} 1 & \text{if } \mathrm{step}(x, x, t) = 0 \text{ or } (t > \theta(x) \text{ and } G_{\theta(x)} \not\leq_{\mathrm{minor}} G_t) \\ 0 & \text{otherwise} \end{cases}$$

Observe that $f$ is total, since we guarded the $\theta$ test with a test for convergence after $t$ steps. For each $x$, the set $S_x = \{G_t \mid f(x, t) = 1\}$ is closed under minor ordering. But now, if $x$ belongs to $K$, then the set $\{G_{\theta(x)}\}$ is an obstruction set for $S_x$, whereas, if $x$ does not belong to $K$, then the obstruction set for $S_x$ is empty. This is a reduction from $K$ to the problem of deciding whether the obstruction set for a family of graphs closed under minor ordering is empty, proving our theorem (since any algorithm able to output the obstruction set can surely decide whether said set is empty). Q.E.D.

Exactly how disturbing is this result? Like all undecidability results, it states only that a single algorithm cannot produce obstruction sets for all graph families of interest. It does not preclude the existence of algorithms that may succeed for some families, nor does it preclude individual attacks for a particular family. Planar graphs and series-parallel graphs remind us that the minor ordering test need not even be the most efficient way of testing membership: we have linear-time algorithms with low coefficients for both problems. So we should view the Robertson-Seymour theorem more as an invitation to the development of new algorithms than as an impractical curiosity or (even worse) as a slap in the face. The undecidability result can be strengthened somewhat to show that there exists at least one fixed family of graphs closed under minor ordering for which the obstruction set is uncomputable; yet, again, this result creates a problem only for those families constructed in the proof (by diagonalization, naturally).

Robertson and Seymour went on to show that other partial orderings on graphs have finite obstruction sets. In particular, they proved the Nash-Williams conjecture: families closed under immersion ordering have finite obstruction sets. Immersion ordering is similar to minor ordering, but where minor ordering uses edge contraction, immersion ordering uses edge lifting. Given vertices $u$, $v$, and $w$, with edges $\{u, v\}$ and $\{v, w\}$, edge lifting removes the edges $\{u, v\}$ and $\{v, w\}$ and replaces them with the edge $\{u, w\}$. Testing for immersion ordering can also be done in polynomial time.

Corollary 9.2 Membership in a family of graphs closed under immersion ordering can be decided in polynomial time.

Indeed, we can completely turn the tables and identify P with the class of problems that have finite obstruction sets with polynomial ordering tests.

Theorem 9.18 A problem $\Pi$ is in P if and only if there exists a partial ordering $\leq_\Pi$ on the instances of $\Pi$ such that: (i) given two instances $I_1$ and $I_2$, testing $I_1 \leq_\Pi I_2$ can be done in polynomial time; and (ii) the set of "yes" instances of $\Pi$ is closed under $\leq_\Pi$ and has a finite obstruction set.

The "if" part of the proof is trivial; it exploits the finite obstruction set in exactly the same manner as described for the Robertson-Seymour theorem. The "only if" part is harder; yet, because we can afford to define any partial order and matching obstruction set, it is not too hard. The basic idea is to define partial computations of the polynomial-time machine on the "yes" instances and use them to define the partial order.

When we turn to complexity theory, however, the inherently nonconstructive nature of the Robertson-Seymour theorem stands as an indictment of a theory based on existential arguments. While the algorithm designer can ignore the fact that some graph families closed under minors may not ever become decidable in practice, the theoretician has no such luxury. The existence of problems in P for which no algorithm can ever be found goes against the very foundations of complexity theory, which has always equated membership in P with tractability. This view was satisfactory as long as this membership was demonstrated constructively; however, the use of the existential quantifier is now "catching up" with the algorithm community.

The second level of nonconstructiveness, the lack of natural evidence, is something we already discussed: how do we trust an algorithm that provides only one-bit answers? Even when the answer is uniformly "yes," natural evidence may be hard to come by. For instance, although we know that all planar graphs are four-colorable and can check planarity in linear time, we remain unable to find a four-coloring in low polynomial time. Robertson and Seymour's results further emphasize the distinction between deciding a problem and obtaining natural evidence for it.
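The edge-lifting operation used by immersion ordering is simple enough to state as code. The sketch below assumes a multigraph stored as a `Counter` of unordered vertex pairs (lifting can create parallel edges, so a multiset is the natural container); the function names are hypothetical.

```python
from collections import Counter

def edge(u, v):
    """An undirected edge as an unordered pair."""
    return frozenset((u, v))

def lift(edges, u, v, w):
    """Replace one copy of {u, v} and one copy of {v, w} by one copy of {u, w}."""
    if len({u, v, w}) != 3:
        raise ValueError("u, v, and w must be distinct")
    if edges[edge(u, v)] == 0 or edges[edge(v, w)] == 0:
        raise ValueError("both edges {u, v} and {v, w} must be present")
    result = edges.copy()
    result[edge(u, v)] -= 1
    result[edge(v, w)] -= 1
    result[edge(u, w)] += 1
    return +result  # unary plus drops the entries whose count fell to zero
```

Lifting the two edges of the path a, b, c at its middle vertex leaves the single edge {a, c} and isolates b.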

These considerations motivate the definition of constructive classes of complexity. We briefly discuss one such definition, based on the relationships among the three aspects of a decision problem: evidence checking, decision, and searching (or construction). In a relational setting, all three versions of a problem admit a simple formulation; for a fixed relation $R \subseteq \Sigma^* \times \Sigma^*$, we can write:

* Checking: Given $(x, y)$, does $(x, y)$ belong to $R$?
* Deciding: Given $x$, does there exist a $y$ such that $(x, y)$ belongs to $R$?
* Searching: Given $x$, find a $y$ such that $(x, y)$ belongs to $R$.

The basic idea is to let each problem be equipped with its own set of allowable proofs; moreover, in deterministic classes, the proof itself should be constructible. As in classical complexity theory, we use checkers and evidence generators (provers), but we now give the checker as part of the problem specification rather than let it be specified as part of the solution.

Definition 9.18 A decision problem is a pair $\Pi = (I, M)$, where $I$ is a set of instances and $M$ is a checker.

The checker $M$ defines a relation between yes instances and acceptable evidence for them. Since only problems for which a checker can be specified are allowed, this definition sidesteps existence problems; it also takes us tantalizingly close to proof systems. Solving a constructive problem entails two steps: generating suitable evidence and then checking the answer with the help of the evidence. Thus the complexity of such problems is simply the complexity of their search and checking components.

Definition 9.19 A constructive complexity class is a pair of classical complexity classes, $(C_1, C_2)$, where $C_1$ denotes the resource bounds within which the evidence generator must run and $C_2$ the bounds for the checker.

Resource bounds are defined with respect to the classical statement of the problem, i.e., with respect to the size of the domain elements.

Definition 9.20 A problem $(I, M)$ belongs to a class $(C_1, C_2)$ if and only if the relation defined by $M$ on $I$ is both $C_1$-searchable and $C_2$-checkable.

For instance, we define the class $P_c$ to be the pair (P, P), thus requiring both generator and checker to run in polynomial time; in other words, $P_c$ is the class of all P-checkable and P-searchable relations. In contrast, we define $NP_c$ simply as the class of all P-checkable relations, placing no constraints on the generation of evidence.
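The three versions of a problem in the relational setting (checking, deciding, searching) can be illustrated on a concrete relation. The sketch below picks Subset Sum purely as an example, which is an assumption of this illustration and not a choice made in the text: an instance $x$ is a pair (numbers, target) and the evidence $y$ is a tuple of distinct indices. The checker runs in polynomial time, whereas the search shown here is plain exhaustive enumeration.

```python
from itertools import combinations

def check(x, y):
    """Checking: does (x, y) belong to R?"""
    numbers, target = x
    valid = len(set(y)) == len(y) and all(i in range(len(numbers)) for i in y)
    return valid and sum(numbers[i] for i in y) == target

def search(x):
    """Searching: find some y with (x, y) in R, or None (exhaustive, exponential time)."""
    numbers, _ = x
    for k in range(len(numbers) + 1):
        for y in combinations(range(len(numbers)), k):
            if check(x, y):
                return y
    return None

def decide(x):
    """Deciding: does some y with (x, y) in R exist?"""
    return search(x) is not None
```

In the terminology of the definitions above, this relation is P-checkable; whether it is also P-searchable is exactly the kind of question that separates the constructive classes.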

whose article contains all of the propositions and theorems following Theorem 9.
403
These general definitions serve only as guidelines in defining interesting constructive complexity classes. Early results in the area indicate that the concept is well founded and can generate new results.

9.7 Bibliography

Complexity cores were introduced by Lynch [1975], who stated and proved Theorem 9.1; they were further studied by Orponen and Schöning [1984], who proved Theorem 9.2. Hartmanis [1983] first used descriptional complexity in the analysis of computational complexity; combining the two approaches to define the complexity of a single instance was proposed by Ko et al. [1986]. Chaitin [1990a, 1990b] introduced and developed the theory of algorithmic information theory, based on what we termed descriptional complexity; Li and Vitányi [1993] give a thorough treatment of descriptional complexity and its various connections to complexity theory.

Our treatment of average-case complexity theory is principally inspired by Wang [1997]. The foundations of the theory, including Definitions 9.4 and 9.8, a partial version of Theorem 9.7, and a proof that a certain tiling problem is NP-complete in average instance (a much harder proof than that of Theorem 9.8), were laid by Levin [1984], in a notorious one-page paper! The term "distributional problem" and the class name DistNP first appeared in the work of Ben-David et al. [1989]. That backtracking for graph coloring runs in constant average time was shown by Wilf [1984]. Blass and Gurevich [1993] used randomized reductions to prove DistNP-completeness for a number of problems, including several problems with matrices. Wang maintains a Web page about average-case complexity theory at URL www.uncg.edu/mat/avg.html.

Cook [1981] surveyed the applications of complexity theory to parallel architectures; in particular, he discussed at some length a number of models of parallel computation, including PRAMs, circuits, conglomerates, and aggregates. The parallel computation thesis is generally attributed to Goldschlager [1982]. Kindervater and Lenstra [1986] survey parallelism as applied to combinatorial algorithms; they briefly discuss theoretical issues but concentrate on the practical implications of the theory. Karp and Ramachandran [1990] give an excellent survey of parallel algorithms and models, while Culler et al. [1993] present an altered version of the standard PRAM model that takes into account the cost of communication. The circuit model has a well-developed theory of its own, for which
see Savage [1976]; its application to modeling parallel machines was suggested by Borodin [1977], who also discussed the role of uniformity. Pippenger [1979] introduced the concept of simultaneous resource bounds. Ruzzo [1981] provided a wealth of information on the role of uniformity and the relationships among various classes defined by simultaneous resource bounds; he also gave a number of equivalent characterizations of the class NC. Theorem 9.9 is from this latter paper (and some of its references). Allender [1986], among others, advocated the use of P-uniformity rather than L-uniformity in defining resource bounds. Cook [1985] gave a detailed survey of the class NC and presented many interesting examples. JáJá [1992] devoted a couple of chapters to parallel complexity theory, while Parberry [1987] devoted an entire textbook to it; both discuss NC and RNC. Díaz et al. [1997] study the use of parallelism in approximating P-hard problems.

Our definition of communication complexity is derived from Yao [1979]. Papadimitriou and Sipser [1984] discussed the power of nondeterminism in communication as well as other issues; Theorems 9.10 through 9.12 are from their article. A thorough treatment of the area can be found in the monograph of Kushilevitz and Nisan [1996].

Interactive proofs were first suggested by Babai [1985] (who proposed Arthur-Merlin games) and Goldwasser et al. [1985], who defined the class IP. The example of graph nonisomorphism is due to Goldreich et al. [1986]. Theorem 9.13 is due to Shamir [1990], who used the arithmetization developed by Lund et al. [1992], where the proof of Lemma 9.2 can be found; our proof sketch follows the simplified proof given by Shen [1992]. The proof that IP^A differs from PSPACE^A with probability 1 with respect to a random oracle A is due to Chang et al. [1994]. Zero-knowledge proofs were introduced in the same article of Goldwasser et al. [1985]; Theorem 9.14, in its version for NP, is from Goldreich et al. [1986], while the extension to PSPACE can be found in Ben-Or et al. [1990]. Goldreich and Oren [1994] give a detailed technical discussion of zero-knowledge proofs. Goldreich [1988] and Goldwasser [1989] have published surveys of the area of interactive proof systems, while Goldreich maintains pointers and several excellent overview papers at his Web site (URL theory.lcs.mit.edu/~oded/pps.html). The PCP theorem (Theorem 9.15) is from Arora et al. [1992], building on the work of Arora and Safra [1992] (who showed NP = PCP(O(log n), O(log n)) and proved several pivotal inapproximability results). Goldreich's "Taxonomy of Proof Systems" (one of the overview papers available on-line) includes a comprehensive history of the development of the PCP theorem.

The theory of graph minors has been developed in a series of over twenty highly technical papers by Robertson and Seymour, most of which
have appeared in J. Combinatorial Theory, Series B, starting in 1983. This series surely stands as one of the greatest achievements in mathematics in the twentieth century; it will continue to influence theoretical computer science well beyond that time. Readable surveys of early results are offered by the same authors (Robertson and Seymour [1985]) and in one of Johnson's columns [1987]. Fellows and Langston [1988, 1989] have pioneered the connections of graph minors to the theory of algorithms and to complexity theory; together with Abrahamson and Moret, they also proposed a framework for a constructive theory of complexity (Abrahamson et al. [1991]). The proof that the theory of graph minors is inherently nonconstructive is due to Friedman et al. [1987].

To explore some of the topics listed in the introduction to this chapter, the reader should start with the text of Papadimitriou [1994]. Structure theory is the subject of the two-volume text of Balcázar et al. [1988, 1990]. Köbler et al. [1993] wrote a monograph on the complexity of the graph isomorphism problem, illustrating the type of work that can be done for problems that appear to be intractable yet easier than NP-hard problems. Downey and Fellows [1997] are completing a monograph on the topic of fixed-parameter complexity and its extension to parameterized complexity theory; an introduction appears in Downey and Fellows [1995]. Computational learning theory is best followed through the proceedings of the COLT conference; a good introduction can be found in the monograph of Kearns and Vazirani [1994].

As we have seen, complexity theory is concerned only with quantities that can be represented on a reasonable model of computation. However, many mathematicians and physicists are accustomed to manipulating real numbers and find the restriction to discrete quantities to be too much of an impediment, while appreciating the power of an approach based on resource usage. Thus many efforts have been made to develop a theory of computational complexity that would apply to the reals and not just to countable sets; interesting articles in this area include Blum et al. [1989] and Ko [1983].