a personal view of the theory of computation

Checking the Higgs Boson

Dennis Overbye is Deputy Science Editor of the New York Times. He has just written the lead article for Tuesday’s “Science” section of the Times, which is entirely devoted to the recent discovery of the Higgs Boson.

Today Ken and I want to talk about the large-scale human-reliability and software-reliability side of the equation.

As Overbye reports, the corks have popped on the bubbly, the press releases are out, people are buying their tuxedos, the Nobel and other prizes are coming, all is set—the elusive Higgs particle has been discovered. Or has it? How can we know, and when can we know it?

Overbye is a terrific reporter and writer who has written two books, Lonely Hearts of the Cosmos and Einstein in Love. His article, like his books, details the personal stories of the major scientists involved in these quests for discovery, and how they coped with issues along the way. We suggest you read the article yourself, and Overbye’s companion piece, “All Signs Point to Higgs, but Scientific Certainty Is a Waiting Game.”

Checkpoints

Two things have struck us about Overbye’s articles and some surrounding commentary from the web. One is the involvement of over 3,000 people on each of the two major teams, ATLAS and CMS, working on the Higgs detection at CERN’s Large Hadron Collider. Only a few below the team leaders are mentioned in the article—of course only a few can be—and these include several graduate students mentioned by name. But surely many more must have had critical and interconnecting roles vital to the integrity of the results, including the seven million lines of code needed to run ATLAS, for one.

The second is the fundamental skepticism in force at various points in the process, especially regarding the December 2011 pre-announcement of ‘evidence’ which we discussed (see also this). This included the imposition by both teams of “blind” procedures to reduce human bias from January through June, 2012, until the data was felt sufficient for results to be “opened.” Even now there are two discordant notes in the results: the ATLAS team has two different computations of the Higgs mass, 124.3 GeV versus 126.8 GeV, which lie outside each other’s error bars (while CMS says 125.8 GeV), and the data has not yet pinned down that the particle has the spin value of 0 needed to be the Higgs, rather than 2. Of course scientific skepticism is necessary, and the articles show its good side. But all of this still has us wondering:

Have the human and software components been checked as thoroughly as the physics side has?

We are not doubting CERN here—CERN itself is famous as a bastion of the open-source movement and its benefits for reliability. Okay, we are doubting CERN here. There—we said it. Let’s first talk about some doubts and then discuss explicitly a criterion that may help.

Who Checks The Checkers?

CERN’s collider costs a lot. Here is one way to define “a lot”: Google could build only about 25 such colliders based on its current market capitalization of 275 billion dollars. The main goal of the CERN collider is—was?—to discover the Higgs particle. The curious situation is that nearly all physicists believed the particle existed before it was discovered. The physics community “knew” that there must be such a particle, but they needed experimental proof. Very good.

To make the experiments more reliable, they cleverly set up two independent teams to search for the Higgs. But were they really independent? We infer they were not—rather, they were symbiotic. Imagine the following scenario:

One team, say ATLAS, discovers the particle while the other does not. Who would get the credit, the acclaim, the prizes?

Of course the teams were not independent in another sense: they used the same collider. The huge cost of the collider means that right now there is no other place on earth that can run the same experiments.

We are not physicists, although Ken knows a fair bit of physics, but we think there are fundamental issues that Overbye does not address. How do we really know that the Higgs boson has been found?

A Relative Matter

Arthur Eddington famously tested Albert Einstein’s theory of general relativity. Einstein’s theory predicted that light passing near the Sun would bend twice as much as Newtonian mechanics predicted, and Eddington used the 1919 solar eclipse to confirm it. For years there has been discussion that the confirmation could have been tainted by Eddington’s strong belief that general relativity had to be right.

We now know via many other experiments including our daily use of GPS that General Relativity is certainly correct in predicting the almost-double effect of light bending. But in 1919 when the experiments were performed, and for years afterward, there was discussion of how reliable they were. A recent paper by Daniel Kennefick argues that there was no problem with the experiments. The title explains all: “Not Only Because of Theory: Dyson, Eddington and the Competing Myths of the 1919 Eclipse Expedition.”

So preconceived beliefs in science are certainly nothing new. What can we do about them?

Self-Checking in Experiments

The idea of self-checking in programs has been studied in theory for over two decades. One way you can be confident in results obtained via computation is when there is a quicker way to verify the answers after you have them, or when the answers themselves come with a proof of their correctness.
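A classic instance of this idea is Freivalds’ randomized check for matrix multiplication: computing a product of two n-by-n matrices takes roughly n³ steps the naive way, but a claimed product can be spot-checked in about n² steps per random trial. Here is a minimal sketch in Python (the function names are ours, for illustration):

```python
import random

def multiply(A, B):
    """Naive O(n^3) matrix product: the 'slow' computation being checked."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def freivalds_check(A, B, C, trials=20):
    """Test whether A*B == C by testing A*(B*r) == C*r for random 0/1 vectors r.

    Each trial costs only O(n^2); a wrong C escapes detection with
    probability at most 2**-trials.
    """
    n = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # caught a discrepancy: C is certainly wrong
    return True
```

The verifier never redoes the full multiplication; it only spot-checks the claimed answer, which is the essence of self-checking.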

We imagine that most if not all of the LHC’s machinery is self-checking. LHC physicist and software manager David Rousseau wrote a column for the Sep.–Oct. 2012 issue of IEEE Spectrum on the software for ATLAS, and noted:

“Meanwhile, the reconstructed [particle-collision] events are distributed worldwide, and thousands of histograms are filled with well-known quantities and monitored semiautomatically for a significant departure that would indicate a problem with the data. The process lets physicists analyze an event with confidence in less than one week after it’s recorded.”

Further down he mentions “a wealth of self-calibrating algorithms” which ensure that one detector is recording data in a manner consistent with other detectors. He then goes on to discuss how the raw data is refined by a human-guided process that has many heuristics to try to isolate subsets that will have interesting events.
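The kind of semiautomatic monitoring Rousseau describes can be sketched very simply: compare each histogram bin against its expected count and flag bins that depart by many standard deviations. A toy version in Python (the function and the threshold are our illustrative assumptions, not ATLAS code):

```python
import math

def flag_departures(observed, expected, threshold=5.0):
    """Return indices of bins deviating by more than `threshold` sigmas.

    Uses sqrt(expected) as the Poisson uncertainty on each bin count.
    """
    flags = []
    for i, (obs, exp) in enumerate(zip(observed, expected)):
        sigma = math.sqrt(exp) if exp > 0 else 1.0
        if abs(obs - exp) / sigma > threshold:
            flags.append(i)
    return flags
```

A real monitoring chain is of course far richer, but the principle is the same: the data stream continuously checks itself against well-known quantities.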

Still, we wonder whether necessary features of the experiments bring their own vulnerabilities. We’re prompted by the memory of the OPERA experiment’s mistaken report, with high confidence, of faster-than-light speeds for neutrinos. The results were apparently faulty owing to relatively simple unverified features. In fact the cause of the faulty data is still on record as being

“a loose fibre optic cable connecting a GPS receiver to an electronic card in a computer.”

We speculate that the OPERA experiment had less innate self-checkability in general, owing to its very fine tolerances on speed measurements and intrinsic difficulties in working with neutrinos over long distances on Earth. Two distant stations had to be synchronized after subtracting out relativistic effects, in a way that seems not to have come with its own verifiability check.

Happily others were able to carry out similar experiments relatively cheaply, and the other n-1 votes had already upheld Albert Einstein’s light limit by the time OPERA corrected their equipment-data analysis. Outside checkers for CERN are less available, however, and we are not the only ones to worry about possible weak links in the human-data chains.

Open Problems

Is there a sensible “self-checkability metric” for computer-dependent physics experiments?

We also note via Peter Woit’s physics blog two articles posted on the Simons Foundation website. The first, by Natalie Wolchover, is about computer-aided mathematical proofs, while the second is an essay by Barry Mazur on the even murkier subject of what kind of “evidence” can lead you to believe in mathematical results while finding directions for trying to prove them.

Three possibilities:
1/ P!=NP and for this very reason, you can’t prove it.
2/ P=NP but to prove it, you’d have to find an algorithm that defeats human intelligence.
Because 1/ and 2/ are equivalent for all practical purposes, there’s also:
3/ P=NP is undecidable.

First class mathematicians have been trying to prove all three – with no definite result so far.

People, the publishing standards do not allow first-class mathematicians to express their opinions in writing when the work is not a finished product. Give me the intuition: what did you try, why do you think it is not working, what part is missing? Please do not keep it private. I want to know what structures you have used and where the problem lies. There are plenty of smart young people around. I heard a saying, attributed (maybe wrongly) to Einstein, when he was asked how advances happen. The reply was: most people believe it is not possible, but then someone who was not aware it is impossible starts on it and does it. We need to know what is going wrong in order to fix it, and since the problem has not been solved for a long time, we need fresh ideas from fresh heads. Remember that chess programs have far fewer uninteresting draws than first-class chess players, because they are not afraid for their reputations.

I’ll give you another example. In Signal Detection Theory, in a yes/no (publish/not publish) task, one establishes a subjective criterion for deciding signal (result) versus noise. It is not optimal to have a very conservative criterion; the optimal one yields equal numbers of misses and false alarms, unless you bias yourself with a payoff matrix. Let independent actors (although they are highly biased by the need to run their careers) verify or reject. Of course in this situation we lose fame and priority. But as one of my advisers said, “You need to choose either science in you or you in science.”
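For equal-variance Gaussian signal and noise distributions, the criterion that equalizes misses and false alarms sits halfway between the two means (at d′/2). A small Python sketch of that computation (the variable names are ours):

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def miss_and_false_alarm(criterion, d_prime=1.0):
    """Noise ~ N(0,1), signal ~ N(d_prime,1); respond 'yes' above the criterion."""
    miss = norm_cdf(criterion - d_prime)      # signal trials falling below the criterion
    false_alarm = 1.0 - norm_cdf(criterion)   # noise trials falling above it
    return miss, false_alarm
```

At criterion = d_prime / 2 the two error rates coincide; pushing the criterion higher (more conservative publishing) trades false alarms for extra misses, just as the comment argues.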

@Javaid
You have to make a distinction between “algorithm” and “efficient algorithm”. This is what the class P was invented for. It will lead you to another meaningful distinction, that between average programmers, good programmers, and superhuman programmers.
@mkatov
I’m more pessimistic than you about this matter. Most people who’ve tried to settle this conjecture seemed convinced at first that it was an easy problem. They changed their minds *after* trying.

I think the point of pessimism is precisely an unwillingness to believe there are “smarter” people than you. If you tried and failed because one of the pieces is missing from your mind, it does not mean the piece is missing from another mind; meanwhile that other mind may lack the pieces you have. That is why I want to know the intuition, and where it went wrong, even before starting to settle the problem. I have no knowledge of determinantal varieties, and I’ve been told that the following construction may be missing from the mathematical world.

Take a system of polynomial equations, for simplicity just a set of quadratic equations f_i(x) = 0. Homogenize it, with homogenization variable x0. If the system is inconsistent, then 1 is in the ideal of the system and is a Gröbner basis; this is the Nullstellensatz refutation system. Next, form the vector X of all quadratic monomials; the system is then MX = 0, where M is the coefficient matrix. Any solution of the system must lie in the kernel of M, so we can express X as a linear combination of kernel basis vectors, say with coordinates C. Now the interesting part: X is a vector of quadratic monomials, so it has a representation as the rank-one matrix xx^T (guess what I mean, if you are still reading). Every minor is a tautology (identically 0 in the x variables), but not in the C variables; moreover, a true solution in C must satisfy the tautologies. We have two questions here. First, what degree of tautology must be satisfied to guarantee that C represents a true solution? Second, since C has fewer coordinates than the number of quadratic monomials (otherwise it seems the Gröbner basis has been found), how many recursive steps are required to reach the Gröbner basis, and in particular a refutation down to 1 (times a power of x0)? Is this more efficient than the Gröbner basis computation, with whatever convoluted algorithms it may currently use?
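For what it’s worth, the refutation idea in the first step can be made concrete in a few lines: if a polynomial system is inconsistent, some combination of the equations equals 1. A toy Python example with exact rational arithmetic (our own illustrative encoding, representing monomials as exponent tuples):

```python
from fractions import Fraction

# Polynomials as {(x_degree, y_degree): coefficient} dictionaries.
f1 = {(2, 0): Fraction(1), (0, 2): Fraction(1), (0, 0): Fraction(-1)}  # x^2 + y^2 - 1
f2 = {(2, 0): Fraction(1), (0, 2): Fraction(1), (0, 0): Fraction(-4)}  # x^2 + y^2 - 4

def combine(c1, p1, c2, p2):
    """Return the polynomial c1*p1 + c2*p2, dropping zero terms."""
    out = {}
    for c, p in ((c1, p1), (c2, p2)):
        for mono, coeff in p.items():
            out[mono] = out.get(mono, Fraction(0)) + c * coeff
    return {m: v for m, v in out.items() if v != 0}

# (1/3)*f1 - (1/3)*f2 = 1, so 1 lies in the ideal: the two circles cannot
# intersect, and this combination is a Nullstellensatz-style certificate.
certificate = combine(Fraction(1, 3), f1, Fraction(-1, 3), f2)
```

Finding such certificates in general, and bounding their degree, is exactly where the hard algorithmic questions (and Gröbner basis machinery) come in.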

OK, you’ve designed a more efficient way of computing algebraic varieties than the Gröbner basis. Good for you, and thanks for the primer in computational algebraic geometry.
But Luke is right to say that intuitively, NP-complete problems don’t have an efficient algorithm. He says it doesn’t exist. I say it exists but will never be found. You say it will be found someday. Only time will tell…

I strongly disagree. P!=NP is nothing but the more reassuring interpretation of our experience with this problem. Another likely view is that there are algorithms which exist mathematically without being producible by any thinking being.

hi, interesting pov. this is a very deep issue. science is becoming amazingly more complex in our era, fueled to a large degree by computers and other technology [it’s a self-reinforcing feedback loop]. a whole book could easily be written on this, but i’m not sure one has. we’re maybe two centuries past simple experiments like a frog leg that kicks when voltage is applied. the theory is also very complex and specialized. this is especially seen in mathematics and theoretical computer science.

there are several different ways to answer this problem. one is to realize that experts exist, that modern society and culture and science are highly dependent on them. even that is an understatement. massive edifices of our reality are *constructed* and *maintained* by experts.

so your question is really like, “what happens when the experts are wrong?” such cases definitely happen. we cannot hope to self-check large proofs or scientific studies after a certain scale is reached. they have to be seen as “big buildings” that are built by large teams of experts.

this has a lot of good scientific precedent to study. one of the best analyses is by kuhn who came up with a framework to consider such large scale scientific “revolutions” in his book “structure of scientific revolutions”. in a sense a kuhnian revolution is in fact a massive shift in scientific understanding based on *wrong* ideology but which is widely adopted. there have been many such shifts in human history.

another answer is that we have to see “proofs” and “results” as milestones and monuments that can sometimes later be toppled. objective truth and certainty are theoretical constructs, from the platonic realm, that approach becoming illusions over long periods of time. they’re a snapshot in time of human knowledge. we will have a better picture in the future, we will understand the error bars and discrepancies better. this is the ongoing evolution of truth & understanding, going on since the beginning of humanity. =)

There is no chance for an individual or a small group of people to check or challenge the procedures of those big experiments. But their results can still be checked and challenged “intelligently”. The major challenge to the OPERA result was not a procedure check but came from too many known facts (including some principles); in the end, though, the procedure check did the job.

If nature did not adopt the Higgs mechanism, any proof, whether from CERN or elsewhere, will eventually be dropped, although it might take a long time, perhaps decades or even centuries.

And I completely agree. Here in physics there is an objective reality, nature, which does not care about the social acceptance of any theory. Sooner or later, when one incorporates physical theories into a real device, one is challenged to correct all the missing theoretical parts. That is called engineering: filling the gap between the imaginary and the real.

I’m a physicist who works on one of the LHC experiments that discovered the Higgs, and I enjoy your P=NP discussions here. Let me try to give just a few good reasons why we are so convinced that the Higgs is real and our measurements are accurate:
1) ATLAS and CMS (the two detectors where the discovery was jointly made) are extremely independent tests. The fact that they are observing proton collisions at the same accelerator is not an issue. The collisions at ATLAS are completely independent from the collisions at CMS. We use different software, different detectors, and don’t discuss things with each other until the results are public. They are as separate as two telescopes looking at the sky in the Northern and Southern hemispheres. Also, the Tevatron accelerator, near Chicago, has seen 3-sigma evidence for Higgs decays.
2) The experiments and software are continuously tested against the particles of the Standard Model that are well-established, such as the top quark, W, and Z bosons. The decays of these particles give signatures that include the same features as in Higgs decays, such as high-energy photons, electrons, and muons. We understand in great detail the response of the detector to these particles, often at the sub-% level.
3) Evidence has been seen for many different decays of the Higgs boson. In particular, the decays to photons and the decays to Z bosons are extremely “clean”. All decay products of the Higgs are detected in these decays, and the total energy (mass) can be summed up. The Higgs events are grouped together in a cluster at the Higgs mass (125 GeV), whereas background events from other known processes have a smooth distribution over a large mass range. 3a) It is very hard to imagine any process or experimental error which would conspire to form a peak at one particular mass on top of a smooth background. 3b) It is impossible^2 to have a bump appear at the same mass in two different decay channels!

Something you didn’t worry about, but is actually probably the largest concern, is tampering/forgery. A person with a very good understanding of Higgs physics and the LHC experiment, such as a rogue member of the experiment itself, could try to place fake Higgs decay signals into the data. The only time/place this could happen is in the software running on a large computer farm that first records the data from the experiment. This would be extremely difficult to pull off. First, access to the farm is limited to a small subset of the experimenters and not connected directly to the outside world via the internet. The software running is open-source and regularly rebuilt and put on the farm to include enhancements. Second, the code would have to have been running since early 2011, since Higgs events have been showing up continuously. Third, producing a convincing fake signal at just the right rate would be extremely challenging. Simulated events are regularly generated for legitimate reasons, but despite our best efforts to mimic real data as accurately as possible, the simulated events can still be identified by their flaws by experts. Certainly a large amount of code and steady access to large databases of offline information would be needed even to perform the standard simulation – which would be detected quickly. Lastly, this would need to be carried out at both LHC experiments!

There will always be conspiracy theories, but the chance that the Higgs results are wrong is in my opinion somewhere between the chance Obama was born in Kenya and aliens built the pyramids of Egypt. :)

Thanks very much—this is good to see succinctly in public. (First comment from someone is moderated, then comments pass thru freely except the N-th where N == 0 (mod K)—and we don’t know what K is or even if it’s constant.)

Actually Dick and I did talk about your “largest concern”, but we decided to mollify the doubt on the human side down to a few clauses in the “Who Checks the Checkers?” section, so as to emphasize “self-checking” as a positive structural idea.

We also left unsaid that actually the properties found so far are completely consistent with the “garden-variety” Higgs of the Standard Model—the sense seems to be that the discrepancy in the two ATLAS mass readings is not like the 1990’s discrepancy in ages of stars which presaged the non-zero cosmological constant. I’ll still be curious to learn more about it, though—full CMS results on it coming next week?

Ironically—and perhaps relevant to say—I myself submitted (uniquely?) the second correction to the NY Times companion article, which you can now see dated March 8th at the bottom. I noticed because I had mentioned the semi-stability point in my own Dec. 2011 post. Human factors indeed… :-)

It is to my great surprise that tampering/forgery is discussed here. I do not believe that any type of tampering can stand the test of time. I do not doubt that a new particle has been found. But a new particle found during the Higgs search need not be a Higgs. Yet you are saying that the Higgs was discovered. This is the reason I agree with the concerns of this article. Not a single person truly knows the entire enterprise. And at this point, no one truly knows that it is a Higgs, while many people claim it is.

The industry experience is that you get a bug per thousand lines of code. If you work much harder, you get a bug per two thousand lines. The oft-quoted Space Shuttle main code, 500,000 lines, achieved a bug per 20,000 lines of code, though at a cost of about $20,000 per line. This is determined statistically; they don’t actually know any of these bugs. For ATLAS’s seven million lines, these figures give you 7,000, 3,500, and maybe 350 bugs. Not all the bugs have to be important: perhaps there’s a bug in some error-handling code for an error that never comes up. The shuttle didn’t blow up due to a software error, despite perhaps 25 bugs. Seven thousand bugs might give you pause.
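The arithmetic behind those estimates is just division against the seven million lines cited for ATLAS earlier in the post; a quick sanity check in Python (the rate labels are our paraphrase of the comment’s figures):

```python
ATLAS_LOC = 7_000_000  # lines of code cited for ATLAS

# Lines of code per bug under each quality regime mentioned above.
rates = {
    "typical industry": 1_000,
    "hard-working industry": 2_000,
    "Space Shuttle": 20_000,
}

# Estimated residual bugs in an ATLAS-sized codebase at each rate.
estimates = {name: ATLAS_LOC // loc_per_bug for name, loc_per_bug in rates.items()}
```

This yields 7,000, 3,500, and 350 bugs respectively, which is where the comment’s figures come from.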

Because the Coq proof code itself plausibly contains dozens (even hundreds?) of flaws, shall we conclude that the Feit–Thompson theorem is plausibly incorrect?

This skeptical conclusion seems injudicious (to me), because major theorems have a quality of robustness to them. As evidence of this inherent robustness, Coq has found (so far) no irreparable logical flaws in any of the dozens of mathematical theorems that have been formalized in it.

This is wonderfully reassuring with regard to the robust integrity of the human-created mathematical enterprise!

by the way, the recent claim in physics where some italian physicists measured neutrinos going faster than the speed of light is a good case study of very subtle/complex measurements being difficult to verify and turning up erroneous, but eventually straightened out. [it was due to something like a loose fiber-optic cable….]

Dear Alex,
Thank you very much; I’d be happy to have one, and my address is on my webpage. From the blurb of your book it looks like you’d really be more concerned with the possibility in my Dec. 2011 “Higgs Confidence Game” post, which however has nothing to do with humans misleading but rather with the question of whether and where “algorithmic probability” may operate in physics. The question is fraught because at least one of falsifiability and verifiability is forfeited when you try to make and test hypotheses about it, but I’m provoked to mention it from the digital-physics angle which undergirds my “Digital Butterfly Effect” demo.