Wednesday, May 31, 2006

Perhaps the very first example one sees of the probabilistic method: show that there exists a graph on n vertices with no clique or independent set of size k = 2 log n. Simply pick the graph at random. For any set S of k vertices, the probability that the graph restricted to S is a clique or an independent set is at most p = 2^(1 - (k choose 2)). By the union bound, the probability that some k-vertex subset S is a clique or independent set is at most p times (n choose k), which is less than one for k = 2 log n. So there must be some graph with no clique or independent set of size k.
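To make the argument concrete, here is a small Python sketch of it in action; the parameters n = 16 and k = 8 = 2 log n are my own choice for a quick illustration, not anything from the original argument.

```python
import itertools
import random

# Numerical illustration of the union-bound argument above,
# with the (hypothetical) parameters n = 16 and k = 2 log2(16) = 8.
n, k = 16, 8

def has_homogeneous_set(adj, k):
    """True if some k vertices induce a clique or an independent set."""
    for S in itertools.combinations(range(len(adj)), k):
        pairs = [adj[u][v] for u, v in itertools.combinations(S, 2)]
        if all(pairs) or not any(pairs):
            return True
    return False

# Pick the graph at random: each edge present independently with probability 1/2.
adj = [[False] * n for _ in range(n)]
for u, v in itertools.combinations(range(n), 2):
    adj[u][v] = adj[v][u] = random.random() < 0.5

# The union bound gives failure probability at most C(16,8) * 2^(1 - C(8,2)),
# about 10^-4, so a single random draw almost always succeeds.
print("size-8 clique or independent set found:", has_homogeneous_set(adj, k))
```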

Actually constructing such a Ramsey graph is another story. You can create the graph in quasipolynomial (2^(polylog n)) time using standard derandomization techniques. In 1981, Frankl and Wilson gave a polynomial-time construction of a graph with no clique or independent set of size 2^((log n log log n)^(1/2)). That bound stood until the recent STOC paper by Barak, Rao, Shaltiel and Wigderson, which constructs a graph with no clique or independent set of size 2^((log n)^(o(1))).

Barak et al. were not trying to construct Ramsey graphs but rather randomness dispersers for two independent weakly random sources. As a bonus they improved the Frankl-Wilson bound, giving yet another example where proving one kind of theorem in complexity yields an exciting result in an apparently unrelated area.

Tuesday, May 30, 2006

Who would have thought the next theoretical computer science powerhouse would come from Atlanta? Georgia Tech this year has hired Santosh Vempala and Adam and Yael Tauman Kalai into an already amazing theory group. In the past few years Georgia Tech has gone from a good theory program to one of the largest and strongest in the country. How did they accomplish that feat? Seeing potential where others haven't, solving two-body problems both inside and outside the department, and most importantly having the resources to go after opportunities when they occur. Alas, two of their recent hires came at the expense of Chicago/TTI, but hey, all's fair in love, war and recruiting.

It's not like other departments have stopped hiring in theory. In the
past two hiring years alone several schools have hired junior theorists, including Carnegie-Mellon, Cornell, Michigan, MIT
Math, Penn State, Rochester, Stanford, Washington and
Wisconsin. Many universities realize that in order to have a strong CS
department one needs a strong theory group and in order to improve or
even maintain strength in theory one needs strong young talent.

Thursday, May 25, 2006

A few publishers showed their wares at STOC. This is a good
opportunity to talk to them about book ideas or about publishers' policies and how they operate. You can also get some good books at
a discounted price; I picked up J. Michael Steele's The
Cauchy-Schwarz Master Class from Cambridge University Press based on a
recommendation.

I had a lengthy conversation with Sweitze Roffel, who took
over Chris Leonard's position as publishing editor of the Elsevier
theory journals. Sweitze is quite aware of the negative perception of
Elsevier in the theory community and wants to talk to the community
about their concerns. So talk to him at conferences or send him email and tell him your concerns about Elsevier's policies. A few nuggets:

Sweitze will move the offices for his journals from Amsterdam to
New York to emphasize the global nature of Elsevier and be closer to editors.

Elsevier has an arrangement with Microsoft's Academic Search but negotiations
with Google Scholar are going slowly because
of Google's "secretive" policies.

Elsevier plans to give contributors (editors, referees, authors)
access to their Scopus
system. Elsevier also has their own free academic search site Scirus.

Wednesday, May 24, 2006

A commenter asked about the Fly America Act
that requires, with few exceptions, that when traveling abroad on US
money, such as an NSF grant, one must use a US-based carrier. This
silly protectionist law just supports American airlines with taxpayer dollars and in the end wastes both money and time. Not only should I be able to fly a foreign carrier to other countries, I should even be able to fly a foreign carrier between US cities.

As a practical matter the Act is more a nuisance than a serious
problem. US carriers have an extensive collection of foreign routes,
you can fly most foreign flights via a US-airline codeshare, and in a
jam one could launder some grant money to a non-grant account and then
use non-grant money to fly the foreign carrier. Not that I ever have,
ever would or ever condone taking the last action.

Tuesday, May 23, 2006

I don't go to many talks at STOC; I prefer hanging out in the hallways and talking with other attendees. But the best talk I saw so far was the first: Atri Rudra gave an easy-to-follow overview of his paper Explicit Capacity-Achieving List-Decodable Codes, joint with Guruswami. Both
student paper award winners also gave nice overviews of their
technical results. Much better than trying to wow us with complicated
formulas.

What were the folks in the hallways talking about? Worry about funding
in the short term but cautious optimism a few years down the
line. Some optimism on employment; many good places hired in theory
this year and most students found postdoc, faculty or industrial
positions. I didn't see many students scrambling for jobs. Last time
STOC was in Seattle (1989) I was one of those scramblers.

Prabhakar Raghavan, a theorist who now heads Yahoo! Research, gave the
first invited talk about some mathematical questions related to
Yahoo. Prabhakar spent a considerable part of his talk on sponsored search, the bidding and ranking mechanisms for those who pay to be listed to the right of the search results. Yahoo currently uses a variant of second-price auctions that is not truth-telling and has some other flaws but is simple enough for people to understand how much they will pay. Google, on the other hand, uses more complicated schemes with nicer properties, but most if not all of its users don't really understand the bidding mechanism.
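For readers who haven't seen these mechanisms, here is a toy generalized-second-price-style sketch: rank bidders by bid and charge each winner the next bid down. The names and numbers are made up, and neither Yahoo's nor Google's actual mechanism is this simple.

```python
def gsp_allocate(bids, slots):
    """Toy generalized-second-price auction: rank bidders by bid and charge
    each winner the bid just below theirs.  Real sponsored-search auctions
    also weight bids by click-through rates, one source of the extra
    complexity (and non-truthfulness) mentioned above."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for i in range(min(slots, len(ranked))):
        bidder, _ = ranked[i]
        # Pay the next bid down (zero if nobody is below you).
        price = ranked[i + 1][1] if i + 1 < len(ranked) else 0.0
        results.append((bidder, price))
    return results

print(gsp_allocate({"alice": 3.00, "bob": 2.50, "carol": 1.00}, slots=2))
# [('alice', 2.5), ('bob', 1.0)]
```

Once slot-dependent click rates enter the picture, a bidder can sometimes do better by shading her bid and taking a cheaper lower slot, which is why such schemes are not truth-telling in general.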

Russell Impagliazzo gave the other invited talk on pseudorandomness,
his title slide containing a joke only my generation would get.

Let's secretly replace Al's coffee cup of random bits with
pseudorandom bits and see if he notices.

Russell's take-home message: Randomness does not help in
algorithms but we can't prove it doesn't help until the circuit
complexity people (like Russell) get on the ball and prove some good lower
bounds.

On a personal note, today I have lived as long as my father. Puts a
real perspective on life.

Monday, May 22, 2006

There were 275 registrants including 109 students. Rooms at the conference hotel ran out slightly before the deadline and the organizers scrambled to find rooms at a nearby hotel. Be sure to reserve your hotel room early in the future.

As of January 2, Bill Steiger has become the new program
director for theoretical computer science at the NSF. He expects to
fund 15-20+ awards from an
estimated 100 theoretical computer science submissions to the Theoretical
Foundations solicitation due this Thursday.

Let n be an odd number with n ≠ r^q for r, q > 1. Fermat's little theorem shows that for any a, 1 < a < n, if a^n ≠ a (mod n) then n is composite. But for a special set of composites called the Carmichael numbers, this test fails for all a. Miller adds the condition that a^((n-1)/2^k) - 1 and n are relatively prime for all 2^k dividing n-1, and shows that, assuming the Extended Riemann Hypothesis, for any composite n there is an a < O(log^2 n) witnessing that n is composite. This gives a polynomial-time (in the number of bits needed to express n) algorithm for primality assuming ERH. Rabin showed that one could instead choose random a's, giving a probabilistic algorithm with no assumption. The resulting test is now called the Miller-Rabin primality test. Solovay and Strassen give a different test based on the Jacobi symbol.
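Here is what the resulting randomized test looks like in practice: a standard textbook sketch of Miller-Rabin in Python, not tuned for production use.

```python
import random

def miller_rabin(n, rounds=40):
    """Rabin's randomized version of Miller's test.  Returns False if n is
    certainly composite and True if n is probably prime; each round lets a
    composite n slip through with probability at most 1/4."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)                  # a^d mod n
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                  # this a witnesses that n is composite
    return True

print(miller_rabin(2**61 - 1))  # True: a Mersenne prime
print(miller_rabin(561))        # False: the smallest Carmichael number
```

Each a that fails to expose a composite n cuts the error probability by a factor of at least four, so a few dozen random rounds are more than enough for any practical purpose.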

Of course now we know
that primality has a polynomial-time algorithm with no unproven
assumptions. So why did I choose these obsolete algorithms for
favorite theorems? They are not so obsolete, as they are considerably more efficient than Agrawal-Kayal-Saxena. But more importantly they
served as the motivation for the study of probabilistic
computation, much like Shor's Quantum Factoring Algorithm motivates
quantum computing today. Without studying probabilistic computation we would
have had no modern cryptography, no derandomization results, no
Toda's theorem, no interactive proofs and no probabilistically
checkable proofs with their hardness of approximation consequences.

Faculty
Salaries. Truly useful when negotiating your pay. An interesting
inversion where the schools ranked 13-24 in CS pay higher salaries
than the schools ranked 1-12. And just who is being paid $400K?

The end of the report sums up the statistics nicely.

As predicted last year, our field is producing Ph.D.s at a record
rate, and the short-term forecast is for continued record
production. While there is no evidence in our employment statistics
that the increased production is resulting in an inability of
Ph.D. graduates to find work, an increasing fraction of new Ph.D.s
appear to be taking positions outside of North America. In the wake of
accelerating globalization of the marketplace, this is not surprising.

Three consecutive years of decreasing numbers of new Ph.D. students,
and a sharply reduced pipeline at the Bachelor's level, will make
it difficult to sustain this production rate in the longer term.
Moreover, it is not yet clear when the decline in our undergraduate
program enrollments will end. The double-digit percent decrease in
bachelor's production observed this year is likely to continue for
the next several years. Coupled with the declining representation of
women in our undergraduate programs, our ability to produce a
workforce that is sufficiently educated technically to meet the needs
of the job market in computing is being severely challenged. The
declining enrollments at the Bachelor's level also will
increasingly challenge the ability of CS/CE departments to grow their
faculty as they desire.

Tuesday, May 16, 2006

How could I not blog about Jonathan Farley's op-ed piece The
NSA's Math Problem? Farley looks at the NSA's use of phone data and argues that using graph theory to analyze the calls won't help find terrorists. But Farley doesn't do a particularly good job of making his case.

First, the "central player" the person with the most spokes
might not be as important as the hub metaphor suggests. For example,
Jafar Adibi, an information scientist at the University of Southern
California, analyzed e-mail traffic among Enron employees before the
company collapsed. He found that if you naively analyzed the resulting
graph, you could conclude that one of the "central" players
was Ken Lay's…secretary.

Somehow if we find the "secretary" of a central US terrorist, that should be considered a major success. But more importantly, if we know just a little about some terrorist activities, the graph can give us a great advantage in finding important sites. Google's PageRank relies primarily on graph theory, very successfully, to rank order search results, and there is no reason similar ideas won't work on phone data as well.
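To make that concrete, here is a minimal power-iteration sketch of the PageRank idea on a made-up calling graph. It is only meant to show that link structure alone already singles out central nodes; it is certainly not what the NSA or Google actually runs.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Basic power-iteration PageRank on a directed graph given as
    {node: [nodes it links to]}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v, out in graph.items():
            if out:
                share = damping * rank[v] / len(out)
                for w in out:
                    new[w] += share
            else:
                # Dangling node: spread its rank over everyone.
                for w in nodes:
                    new[w] += damping * rank[v] / n
        rank = new
    return rank

# A made-up "who phones whom" graph; c receives the most attention.
calls = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(pagerank(calls))
```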

By no means should we accuse someone of being a terrorist solely because of their calling pattern. Will all terrorists be found from phone data alone? Of course not. But the graph can give us important clues as to where to look and narrow the search.

The main objection to the NSA's work is not that the phone data has little value but that it is too valuable. You can use the data to find out much more about Americans than who is a terrorist. We lose our freedom from government intrusion when the NSA has this data, and that's what we need to argue against, not some fake argument that the NSA's algorithms won't work.

Monday, May 15, 2006

Suppose we could create a system where all automobile traffic in the
US would be controlled by a central computer system. Traffic would
flow more smoothly and fuel consumption could be better controlled, but with the catch that failures in the system could cause 10,000 deaths per year.

Keep in mind that we now have over 38,000 automobile fatalities per
year. Even if we leave out alcohol-related deaths that number drops
only to about 24,000. Still the public would find 10,000 deaths
unacceptable and we would junk a system that would actually save lives
as well as time and fuel.

I find that when we frame the debate on the unreliability of computers, we usually measure it against perfection rather than against the status quo. Whenever I read the Inside Risks column at the end of each CACM, I feel they fail to point out how small the risks are compared to the advantages of computing, and instead point out how bad the risks are compared to unachievable perfection.

Consider electronic voting. I noticed that Diebold, the company in the
middle of the electronic voting controversy, also makes the ATMs I use
to withdraw money from my bank. ATMs are not foolproof; thieves have managed to fake ATM cards and discover passcodes to steal money from
these machines. But banks know that the labor cost savings they get
from ATMs greatly outweigh the losses.

Will electronic voting ever completely prevent any kind of fraud? Of
course not. But will it beat out the systems we currently have in
place? That's not that high a bar to pass.

I can vote proxies on my stocks and mutual funds over the
Internet. There are some real money issues involved in the proxies. If
Internet voting works well enough when serious money is involved, why
can't we use it for general elections as well?

Next year both Complexity and EC will be part of the Federated Computing Research
Conference (FCRC) in San Diego along with a plethora of other conferences
including STOC, Computational Learning Theory (COLT), and Parallel Algorithms
and Architecture (SPAA). June 13th is the day of death: STOC (2
tracks), EC, Complexity and COLT all have sessions that day. A
theorist could find him- or herself wanting to see five talks all given at the same time.

Thursday, May 11, 2006

Ehud Kalai proposes
renaming the Game Theory
Society to the Game Science Society and welcomes your
comments. The goal of changing the name of the society is to change
the name of the field to better describe what the field does and
broaden its image.

An important example where the expanded name may get additional support is within a university. For example, it is hard to imagine the creation of a department devoted to the study of a theory. On the other hand, a department devoted to the study of a science seems more plausible. To put things in perspective, think of an analogy within another young field: devoting major resources to a subject called "computing theory" is less likely than devoting major resources to a subject called "computer science."

I've mentioned changing the name of Game Theory before, but then again I never felt Computer Science was a great name. Following Kalai's reasoning, what if we renamed "Theory of Computing" to "The Science of Computing"? Would that make our field sound more noble and generate more funding?

Wednesday, May 10, 2006

Razborov and Rudich's paper Natural Proofs
gives a combinatorial framework for proofs to show that circuits in
a given class cannot achieve a given computational task. The paper
explains the difficulty of extending these proofs to show, for
example, that NP does not have polynomial-size circuits (and thus
P≠NP). But do natural proofs really present a barrier to proving
important circuit lower bounds?

One approach to showing NP does not have polynomial-size circuits: Find
some property C of functions such that SAT is in C. We then show,
using some sort of inductive argument, that no function computable by
polynomial-size circuits can have property C. This would imply that SAT cannot have polynomial-size circuits.

Briefly, a natural proof is such a proof where C has two properties:

Largeness: C contains many functions.

Constructivity: One can efficiently verify that a function f is in C.
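A bit more precisely (my paraphrase of the usual quantitative choices in the Razborov-Rudich definition; see the paper for the exact statement):

```latex
% A property is a sequence C = \{C_n\}, where C_n is a set of Boolean
% functions on n variables.
\begin{itemize}
  \item Largeness: $|C_n| \ge 2^{-O(n)} \cdot 2^{2^n}$, i.e., $C_n$ contains
        at least a $2^{-O(n)}$ fraction of all $n$-variable functions.
  \item Constructivity: given the $2^n$-bit truth table of $f$, membership
        $f \in C_n$ is decidable in time polynomial in $2^n$.
  \item Usefulness: no $f \in C_n$ is computable by circuits of the size
        one is trying to rule out (here, polynomial size).
\end{itemize}
```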

Razborov and Rudich show that such a proof against polynomial-size
circuits would break pseudorandom generators and in particular imply that
the discrete logarithm is not hard. So under reasonable hardness
assumptions, natural proofs cannot be used to prove lower bounds
against polynomial-size circuits. See the paper for more
details.

Sounds bad for proofs against circuits. But let's consider the two
properties. The authors give a good argument why the largeness
condition should hold, however

We do not have any similar formal evidence for constructivity, but
from experience it is plausible to say that we do not yet understand
the mathematics of C_n outside exponential time (as a function of n)
well enough to use them effectively in a combinatorial style proof. We
make this point in Section 3, where we argue that all known lower
bound proofs against nonmonotone circuits are natural by our
definition.

Indeed they do show all known proofs are natural, but in some cases go
through considerable effort to "naturalize" these proofs (as
opposed to relativization where the fact that a theorem relativizes
follows immediately from the proof).

Consider what I call quasinatural proofs, where we only require the
largeness condition. One might say that if discrete logarithm is hard
then a quasinatural proof must prove the nonconstructivity of C. But
really you get a conditional. If there are quasinatural proofs against
polynomial-size circuits then

If C is constructive then the discrete logarithm is easy

which is just a "pigs can fly" theorem that we see often in
complexity.

Avi Wigderson points out that unconditionally you cannot have a
natural proof showing that the discrete logarithm problem is hard. If
we unravel this statement then we get that giving a quasinatural proof
showing discrete logarithm is hard would require proving that
discrete logarithm is hard, hardly a surprise.

I don't have an example of a quasinatural proof not known to be
natural as we have very limited techniques for proving circuit lower
bounds. Natural proofs do give us some insight into what kind of proof
techniques we need for stronger lower bounds, but they do not, in and
of themselves, present a major barrier to finding such proofs.

Tuesday, May 09, 2006

A computer science paper goes through many phases: manuscript, technical report, conference submission and proceedings version, journal submission and published version. As a paper goes through these stages it usually improves, adding more background and intuition, better and more detailed proofs and so on.
If someone wants to read your paper, you'd like them to look at the
latest version. How do you make sure that they even know about the
latest version?

You can't go into everyone's paper proceedings and add a yellow sticky
to your paper saying to check out the new and improved journal
version. But in this electronic age we can, in principle, add these notes.

First of all keep the papers on your webpage up to date. Many people
just go to an author's page to download a paper and often they find
some ancient version.

But after that then what? ECCC allows one to submit a
revised version or add a comment which could point to a revised
version. arXiv allows one to
add journal information to an existing paper. Both of these require
actions by authors that rarely happen.
The digital libraries of proceedings publishers ACM and IEEE-CS
don't have any mechanism to add pointers to later papers.

The field should have some standard mechanism for updating pointers to
future papers. Until then we have to rely on the readers to find the
latest papers on their own and perhaps hope that paper search tools like Citeseer, Google Scholar and Microsoft Academic Search
will point to the latest and greatest version of a paper.

Sunday, May 07, 2006

Rejection hurts. Academics thrive on earning the
respect of their peers and it's tough to think that someone doesn't
want you. So go ahead and be depressed for a day or two and then move
on.

Stanford is my ultimate rejecter, having turned me down for undergrad, grad, junior and senior faculty positions without the least bit of interest. But I got my revenge: I once got a parking ticket at Stanford and I never paid it. Ha!

A few people have complained to me about how rejection letters are written. A
rejection letter contains exactly one bit of useful information. The
rest is irrelevant and you should not let it get to you.

Suppose Alice sends email to her friend Bob asking if Bob's department
would be interested in her. In academics, Bob usually won't give his
real thoughts ("We are looking for strong candidates and you are
not one of them"), instead he'll find some property P such that
Alice has P but they won't hire in P, for example "Unfortunately
we are not looking for any cryptographers this
year." A couple of warnings for Bob:

Be sure P is not illegal,
i.e., based on religion, race, gender, etc.
Even if you don't
discriminate, saying that you do is not a smart thing.

Bob's department might end up hiring a cryptographer. Then Alice will realize that Bob didn't turn her down because she was a cryptographer; rather, he just didn't want Alice.

Thursday, May 04, 2006

As a scientist I should support a project like the proposed International Linear Collider (ILC) at Fermilab, especially one located in the suburbs of Chicago. But at what cost? For perspective, next year's proposed budget for the entire National Science Foundation is just over $6 billion.

The ILC reminds me of the Superconducting Super Collider, a project that spent $2 billion digging a hole in Texas and was killed by Congress in 1993 once the projected costs topped $12 billion.

Putting large dollars into a single basket will take away the
incentive to increase basic research funding in other scientific
endeavors. The main argument for the ILC at Fermilab is not that the research won't get done otherwise; it just won't get done in the US. So let the
Europeans or the Japanese have the flashy expensive collider and let
the US do what it does best—basic research advancing science
over a large range of disciplines.

Wednesday, May 03, 2006

For a talk I wanted to show a map of the United States using four
colors with the usual constraint that every pair of states that share
a common border have different colors. The four-color theorem says
that such a coloring must exist.

So I tried Googling to find such a picture of a four-colored United
States but I couldn't find one. NASA states it as a challenge
but doesn't give the solution. Most maps I found, like this one, use five or more colors, probably because they use a simple greedy algorithm.
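To see why greedy wastes colors, here is a toy sketch (my own example, not the algorithm any particular map site uses): the same two-colorable graph colored greedily in two different vertex orders.

```python
def greedy_coloring(adj, order):
    """Color vertices in the given order, always using the smallest color
    not already taken by a colored neighbor."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# A crown graph: a_i is adjacent to every b_j except b_i.  It is bipartite,
# so two colors suffice, but greedy's answer depends on the vertex order.
adj = {
    "a1": ["b2", "b3", "b4"], "a2": ["b1", "b3", "b4"],
    "a3": ["b1", "b2", "b4"], "a4": ["b1", "b2", "b3"],
    "b1": ["a2", "a3", "a4"], "b2": ["a1", "a3", "a4"],
    "b3": ["a1", "a2", "a4"], "b4": ["a1", "a2", "a3"],
}
good = ["a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4"]
bad  = ["a1", "b1", "a2", "b2", "a3", "b3", "a4", "b4"]
print(len(set(greedy_coloring(adj, good).values())))  # 2 colors
print(len(set(greedy_coloring(adj, bad).values())))   # 4 colors
```

An unlucky ordering of the fifty states presumably pushes greedy past four colors in the same way.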

Four coloring the US is not difficult. I found a
Map Maker utility
that lets you color the states any way you want. Here is my four
coloring.

My independent sets:

AK, AL, AR, CT, DE, HI, IL, ME, MI, MN, MT, NE, NM, NV, SC, VA, WA

AZ, DC, FL, KS, KY, MS, NC, ND, OR, PA, RI, TX, VT, WI, WY

CA, CO, GA, ID, IN, LA, MA, MO, NJ, SD, WV

IA, MD, NH, NY, OH, OK, TN, UT

The United States cannot be three-colored; just consider Nevada and its neighbors.
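You can check this mechanically. Here is a small brute-force sketch covering just that corner of the map; the adjacency list below is restricted to Nevada and its five neighbors.

```python
from itertools import product

# Nevada's neighbors CA, OR, ID, UT, AZ form an odd cycle (which already
# needs three colors), and NV borders all of them, forcing a fourth.
neighbors = {
    "NV": ["CA", "OR", "ID", "UT", "AZ"],
    "CA": ["NV", "OR", "AZ"],
    "OR": ["NV", "CA", "ID"],
    "ID": ["NV", "OR", "UT"],
    "UT": ["NV", "ID", "AZ"],
    "AZ": ["NV", "UT", "CA"],
}

def colorable(adj, k):
    """Brute force: is there any k-coloring with no two neighbors alike?"""
    states = list(adj)
    for assignment in product(range(k), repeat=len(states)):
        color = dict(zip(states, assignment))
        if all(color[u] != color[v] for u in adj for v in adj[u]):
            return True
    return False

print(colorable(neighbors, 3))  # False: three colors are not enough
print(colorable(neighbors, 4))  # True
```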

When writing this post I Googled on "four-color theorem" and
the first link was this
page which features a four-colored United States. Well at least I
got a weblog post from all this work.

Tuesday, May 02, 2006

Bob Soare, who wrote one of the great
textbooks on
recursion theory and then almost
single-handedly changed the name of the field to computability theory,
teaches an intense two-quarter class on the topic every other year in
Chicago. To my surprise, halfway through the second quarter (15 weeks into computability theory) he is only now proving the solution of "Post's Problem", the existence of incomplete degrees, which so excited Gödel in his letter.

When I sat in on Soare's class in the early 90's, by this point he had already covered much more complicated finite injury arguments and was starting the infinite injury constructions like the Sacks Density Theorem: given r.e. sets A and B such that A is reducible to B and B is not reducible to A, there is a set C that lies in between.

I asked Soare about this last week. He isn't dumbing down his class; rather he's acknowledging a change in direction in the field, more
along the lines of looking at the complexity of reals (infinite
sequences of bits), often by examining those defined by infinite
branches of computable trees.

For example, many computability theorists today are studying notions of
"random reals", infinite sequences that share some
properties of randomly chosen numbers. They have shown neat connections to Kolmogorov complexity and to Chaitin's Ω. For any reasonable ordering, Chaitin's Ω is a computably enumerable random real, and Kucera and Slaman show the surprising result that the converse is true as well.
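For reference, Chaitin's Ω is the halting probability of a fixed universal prefix-free machine U, which is how it ends up computably enumerable: the sum can be approximated from below by running more and more programs for more and more steps.

```latex
\Omega \;=\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|}
```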

One used to measure a recursion theorist by the difficulty of their constructions; now we see more of a focus on the beauty of the theorems and their proofs.

Soare is working on a new version of his textbook that will differ in a couple of ways. He is changing terminology (recursive and r.e. become computable and c.e.), but more importantly he is changing the focus to more strongly develop the theory that drives computability today. A good lesson: fields change over time, and it is better to acknowledge and embrace those changes than to fight them.