Saving a good text from a few mistakes November 21, 2008

After spending some time with my family this evening, I found the courage to delve into Lubos Motl’s latest post about the “ghost sample” cross-section issue. I must admit he wrote an overall good post (mainly the first part is good, at least), which however contains quite a few inaccuracies, besides, of course, insisting on his original mistake. I think it is a good idea to spend a few lines pointing out the few mistakes of that text, which can then be read with profit.

Fine: so, 742/pb or 2100/pb?

For me, the interesting part of his post starts with the subtitle “Fine: so, 742/pb or 2100/pb?”, which comes after a few lines of text he could have spared us. The first mistake, unfortunately, comes right away, in the first paragraph:

Of course, the total integrated luminosity, 2,100/pb (two thousand and one hundred inverse picobarns) must be used as the denominator, as Matt Strassler explains in detail […]

There follows, however, a rather clear account of what luminosity is, along with other interesting information. At the end of the section, though, he falters again:

Now, there’s no doubt that the total integrated luminosity (of proton-antiproton beams) used to suggest the “lepton jets” in the recent CDF paper is 2,100/pb: see e.g. the second sentence of the abstract. If you want to keep things simple, the right denominator has always been 2,100/pb and there is nothing to talk about. But still, you may ask: why the hell Tommaso Dorigo is talking about 742/pb? Isn’t he supposed to know at least some basic things here?

The problem is that the CDF paper is not very clear. Lubos is totally correct: the abstract does quote 2100/pb. This in fact alleviates his guilt a bit, because he was deceived by it.

The study first uses 742/pb; only after page 28 does an analysis of the larger dataset, 2100/pb (which includes the initial 742/pb), begin. The reason is that the smaller dataset, collected until 2005, was not subjected to a complicated online selection called a “prescale”, which is basically enabled whenever the rate of proton-antiproton collisions is too high for data acquisition (which can save to disk no more than about 100 events per second).

Whenever the detector gets flooded with too-high rates, prescaling factors are applied to specific triggers, such as the dimuon trigger which collected the 1400/pb used only in the second part of the study. Until 2005 the dimuon trigger did not have a prescale, so it is much easier to use that dataset for cross sections and rates.

This is why the CDF paper uses 742 inverse picobarns of data until page 28, when kinematics is studied with more data (at that point, absolute rates are not important anymore, so CDF includes all the data in one single set).
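The prescale mechanism described above can be sketched in a few lines. This is a toy illustration, not CDF’s actual trigger code; the 400 Hz input rate and the prescale factor of 4 are made-up numbers chosen only to match the ~100 events/second disk budget mentioned above:

```python
# Toy sketch of trigger prescaling (illustrative, not CDF's real code).
# When a trigger fires faster than data acquisition can write to disk
# (~100 events/s), a prescale keeps only 1 event in N.

def prescaled_trigger(events, prescale_factor):
    """Keep every prescale_factor-th event that fires the trigger."""
    kept = []
    for i, event in enumerate(events):
        if i % prescale_factor == 0:
            kept.append(event)
    return kept

# With a prescale of 4, a trigger firing 400 times in a second
# writes only 100 events to disk.
events = list(range(400))
kept = prescaled_trigger(events, 4)
print(len(kept))  # 100
```

The price, of course, is that a prescaled sample corresponds to an effectively reduced luminosity, which is exactly why the unprescaled pre-2005 data are the convenient choice for absolute rates and cross sections.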

Silicon vertex tracking

Then a second subsection, titled “Silicon vertex tracking”, starts. Here Lubos falters again. He discusses the SVT trigger, which is not used to collect “ghost events” by the CDF study, nor by the earlier study of the correlated bb cross section. It is only used for some control samples, but he ignores this fact. It would have been better had he avoided discussing the SVT altogether, because it creates the conditions for another blunder:

“Now, only a subset of the events are picked by the strict SVT criteria: the jets in these events are said to be “b-tagged”. The precise percentage depends on how strict criteria the SVT adopts: it is partly a matter of conventions. In reality, about 24.4% of the events that excite the dimuon triggers also pass the strict SVT filter: this percentage is referred to as the “efficiency” of the (heavy flavor) QCD events. The silicon vertex tracker may also choose the events “loosely”; in that case, the efficiency jumps to 88% or so. However, if you assume that there is no new physics, pretty much all events in which the dimuon trigger “clicks” should be caused by heavy flavors – essentially by the bottom-antibottom initial states.”

Not even wrong! Lubos is confused. He confuses the SVT, which is an online trigger (not used by this analysis), with the offline SVX requirements applied to the muon tracks used to select a sample whose composition is studied in detail. This is a minor mistake, although it shows just how much one can confuse matters by being careless.

Also wrong is the claim that the SVT may select events loosely: again, it is offline selections that can do that; the SVT has fixed thresholds, being an online algorithm implemented on hardware boards. But let’s not blame Lubos for not knowing the CDF detector.

More annoying is his other mistake above, also highlighted in red: by no means does the simple dimuon trigger selection pick only bottom-antibottom events! Indeed, those only account for 30% of the data or so. But there is an even more annoying mistake in the paragraph: he calls bottom-antibottom “initial states“, while those are FINAL states of the hard process. You have a negligible chance of finding (anti)bottom quarks in the (anti)proton, so you only get them as the final product of the collision! Lubos, please use correct terminology if you want to have a chance of being taken seriously!

Unfortunately, inaccuracies pile up. Here is the very next paragraph:

“In these most special 24.4% events, bottom-antibottom pairs “almost certainly” appear at the very beginning. So at the very beginning, it looks like you just collided bottom-antibottom pairs instead of proton-antiproton pairs. If you now interpret the Tevatron as a machine where you effectively collide bottom-antibottom pairs, it has a smaller luminosity because only a small portion of the proton-antiproton collisions included protons and antiprotons that were “ready to make heavy flavor collisions”. Even though the remaining 75.6% dimuon events probably also contained bottom quarks, you discard the collisions as inconclusive.”

Amazingly, Lubos really means it: he thinks bottom-antibottom quark collisions happen at the Tevatron in large numbers. Yes, he means it: “looks like you just collided bottom-antibottom pairs”. This is slightly embarrassing. However, I must give Lubos a few points here for making a serious attempt at explaining things at a layman’s level. Let’s move on.

“You may define the corresponding fraction of all the events and normalize it in the same way as you would do with bottom-antibottom collisions. Assuming that the bottom quarks are there whenever the SVT says “Yes”, the integrated luminosity of this subset is just 742/pb, not 2,100/pb. The collisions up to this day that have passed the intermediate, loose SVX filter, give you the integrated luminosity of 1,426/pb or so.”

Again: not SVT triggering, but offline SVX cuts. Anyway: alas, Lubos, it really is that difficult, isn’t it? This is very, very wrong, as a reader, Dan Riley, explains well in a thread here. HEP experimentalists do not do that: they do not assign integrated luminosity to subsets of a dataset.

Integrated luminosity is a number which applies to a sample of data, and whatever cuts or further selections you make afterwards, that number stays the same. To give an example: you have 1000/pb of integrated luminosity, corresponding to 10,000 events of some rare kind. The cross section of those events is of course sigma = N/L = 10,000/1000 = 10 pb. Now imagine you select 5% of the data by requesting the presence of a high-Et jet. This sample has 500 events (5% of 10,000), but its integrated luminosity is still 1000/pb. Only, when you compute the cross section, you do not just do sigma = N/L, but rather sigma = N/(eps*L), where eps stands for the efficiency of the cut. One may say it is a convention (since eps*L still has units of integrated luminosity), but it in fact avoids the mistake Lubos falls into.
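The arithmetic of that example can be written out explicitly. A minimal sketch, using exactly the numbers from the paragraph above; the function name is mine, not anyone’s actual analysis code:

```python
# Cross section from a counting experiment: sigma = N / (eps * L),
# where L is the integrated luminosity of the *full* dataset (it never
# changes) and eps is the efficiency of any selection applied afterwards.

def cross_section(n_events, efficiency, int_luminosity):
    """Cross section in pb, with int_luminosity given in /pb."""
    return n_events / (efficiency * int_luminosity)

L = 1000.0  # /pb, fixed once and for all for this dataset

sigma_all = cross_section(10000, 1.0, L)   # full sample
sigma_cut = cross_section(500, 0.05, L)    # after the 5% high-Et jet cut
print(sigma_all, sigma_cut)  # both ~10 pb: the cut does not change sigma
```

The point of the exercise: the denominator is always eps*L with the same L; one never pretends that the 5% subsample “has” 50/pb of luminosity.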

The data used for the studies mentioned in the paper correspond to 742/pb. All of the data: the subset selected with tight SVX cuts (143k events), the subset making up the ghost sample (153k events), and the total sample (743k events) which includes both.

As I already mentioned, the CDF publication is not clear about this, since the introduction quotes the larger integrated luminosity used for the later checks of the kinematics, from page 28 on. Here Lubos is utterly confused: he splits the integrated luminosity into different subsets, deceived by the fact that there is a rough proportionality between the two dataset sizes and the two chunks of integrated luminosity collected without prescale until 2005 and with prescale afterwards.

Then, another bad paragraph, unfortunately:

“So is it OK for someone to write 742/pb in the denominator when he calculates the cross section of the “lepton jets” ghost events? The answer is, of course, No. It’s because these “new” events are actually argued not to include bottom quarks as the initial states. For example, Giromino et al. claim that the Higgs is produced and subsequently decays to various h1, h2, and/or h3 pairs (and 16 tau’s at the very end). Nima and Neal use various supersymmetric particles instead. So you can’t normalize the initial states with the assumption that the bottom quarks are there in the initial states because they are not there.”

Confused again. True, the “new” events do not include bottom quarks. But NOT as initial states, for god’s sake!!! Anyway, it is “Giromini”, Paolo Giromini. And of course, the integrated luminosity is the same for all samples considered so far in the paper, and indeed, it always sits in the denominator. Always 742/pb, never an ounce more. Sorry, Lubos. Not your lucky day.

Tables

The third subsection is called “Tables”. It is here that we get a glimpse of the faulty reasoning which got Lubos stuck accusing me of a mistake:

“Open the CDF paper on page 16. The set of all dimuon events – 743,006 – is divided to the 589,111 QCD events and our 153,895 ghost events. In the second column of this Table II, you see that only 143,743 events passed the tight SVX filter, neither of which was a ghost event.

Now, if you switch to page 12 and look at Table I, you may add the entries to get 143,000+ and to see that exactly these tight SVX-positive events correspond to the (smaller) integrated luminosity of 742/pb, as the caption of Table I says. For another “written proof” that the 742/pb luminosity corresponds to tightly SVX-filtered collisions, and not all (unfiltered) collisions as Tommaso seems to think, see page 11/52 of Giromini’s talk.”

What I highlighted in blue this time is the source of Lubos’ confusion: indeed, the 143k events which were used in the past CDF analysis (the measurement of the correlated cross section) belong to a dataset comprising 742/pb. But the rest of the data belongs to it too!

Lubos’ mistake is to not reason like an experimentalist: he believes the integrated luminosity follows the subsets and gets divided up accordingly, while it is a constant. The data (before any selection) amount to 742/pb. Then the tight SVX cuts select 143k events, or the loose cuts select more; but all samples derived from the original one have the same denominator, 742/pb. Only, they get different efficiency factors in the denominator (the efficiency factor used above).

OK, I made this post longer than it needed to be. Sorry to have bored many of you, but I felt there were still quite a few readers around who had no clue yet whom they should believe.

A note to those of you who are still undecided: I built the CMX chambers installed in CDF, with which the data we have been discussing were collected, with my very own hands, between 1999 and 2000. I have worked for CDF since 1992. I have signed the paper on anomalous muons, and I have followed a six-month-long review process before the publication. I am a friend of the main author, Paolo Giromini, and I have discussed Strassler’s paper with him over the phone at length. Do you not think it is a bit arrogant for a retired theorist to believe he can win an argument on such an exquisitely experimental matter with me? I am not boasting: I am just stating a fact. Lubos is arrogant. This time, he got a lesson. Lubos, I still like you, but please, don’t mess with me on these matters.


Comments

At first I had some qualms about TD’s slow crucifixion of LM. It seemed cruel — imagine having your incompetence paraded across the blogosphere in such detail.

But after thinking about it I realised that *somebody* has to do this. The standard response to LM so far has been to ignore him or ban him [as at Cosmic Variance]. But that does not solve the real problem. The real problem is not that he is obnoxious and rude; the real problem is that, outside his very narrow [and now rather outdated] field of expertise [matrix theory] he *just isn’t a very good physicist*. For example, he has made comments about general relativity that [as somebody said] should not be tolerated from an undergraduate; again, he has put forward his own theory about the foundations of thermodynamics, and this theory is on the very brink of outright crackpotism. Yet he is able to project an image of expertise. This is very harmful. It’s true that serious people have mostly given up posting on LM’s blog, but who knows how many students and amateurs read it and are misled by it. [I would however say the same thing about Peter Woit’s blog; interestingly, both of them have a Messiah Complex, thinking that physics needs them to save it from something-or-other.]

It’s about time somebody really slapped LM down and exposed his incompetence in detail. I think we should thank TD for doing this and inevitably exposing himself to abuse. The only thing I can say against it is that, as some people have argued, it is not morally justifiable to laugh at mentally ill people, no matter how unpleasant or childish they may be. But in this case there appears to be no choice. Well done, TD!

When I asked you, in your first posting, for an estimate of the cross section of the anomalous events, you said that the authors refer to 100 nb, which is suspiciously large taking into account the production rate from b-bbar.

From previous postings I understood that the actual number would be about 100 pb (75 nb).

My own calculations would predict by a direct fractal scaling of electropion model that leptopions would have same maximal boost factor gamma =about 1+2*10^(-3) as in heavy ion collisions so that the cross section would be about 1 nb. Maximal boost factor 1+10^(-3) would give .1 nb.

The requirement of 100 nb would give gamma below about 1.5, so that taupions would have an upper limit of energy of about 36 GeV in the rest system, which looks large to me and is inconsistent with fractality.

I am confused! Could you help! What is the correct order of magnitude?

The blogosphere has a tendency to demonize without going into specifics. Blah blah blah, this person said xyz and it’s clearly wrong; he/she is therefore a crackpot.

A minute or two of blackboard work in practice usually clears the mess or confusion up, but alas that’s never the case here, and it is an intrinsic problem of the medium. So in general we usually end up with long, utterly pointless semantic debates that really just make things worse and tend to contribute to fog rather than anything else.

I’m not saying this is the case here, but seriously, it pays to lighten up a little and give people the benefit of the doubt, especially when we are dealing with colleagues or going over completely elementary concepts.

Sorry, Tommaso. I said on TRF that you had deleted one of my comments, which you didn’t do. I was looking in the wrong place.
I have corrected this on TRF, but, as it is in moderation mode, it doesn’t show up just yet.

Dear T, I think you have just destroyed whatever was left of LM’s career.
Maybe that’s a public service, but maybe there might have been better ways of investing the precious moments we have on this earth. One of my colleagues killed himself this week, which makes me stop and think about how we spend our time.

I, Lubos Motl, apologize for the mistake into which I was driven by Strassler’s paper, and by the rather complicated [well, messy] way the CDF preprint on anomalous muons is written. I insisted I was right because I could not spend my time reading the [whole] CDF paper; I have better things to do. I now realize my mistake, and apologize to the CDF collaboration [whose results I misrepresented], to Tommaso Dorigo [whom I accused of not knowing how to compute cross sections, without having full proof of this incompetence], and to all the readers I may have misled.

My current reason to think that this subset of the CDF collaboration meant that the cross section of the new events was over 200 pb is the testimony of two other CDF members that I have no serious reason to doubt at this moment. My statement that it seems they meant the cross section was above 200 pb doesn’t mean that they actually wrote it in v1 of their paper – it seems they wrote it in such a way as to contradict it – and it surely doesn’t mean that I have already been convinced that there exists a new event class with a 200 pb cross section, which I haven’t.

I realize that CDF may have been – and still is – in a nontrivial situation, being driven to conclusions that sound so unusual that many of the members probably didn’t want to express them clearly. Still, I think it is a better idea to write them clearly and supplement future papers (or versions of this one?) with estimated cross sections, while the processes that look sufficiently unbelievable are honestly presented as a result of either new physics, unaccounted standard model processes, statistical fluctuations that go slightly above the expected number of standard deviations, errors in the detectors, triggers or other parts of the technology, or conceptual and methodological errors of the whole CDF team – because all of these options are clearly possible.

[…] is also surprisingly capable of putting aside the bad feelings, anger, and enmity he sometimes displays. The comment he left this morning in my thread (which comes from his private email address and has the right IP) […]

Amos, well in a sense he did, because he made me oscillate between a state of disbelief that he could still not get it, and of attempts at making him finally realize his mistake. I was not pissed off by the insults in his blog. I am rather accustomed to that from him and do not notice any more.

Iphigenia, thanks for your support. I did endure this because I thought it was important to disallow the manipulation of scientific results produced by my own collaboration. If it had been a more futile topic, I would probably have left it alone.

Hi Matti,
the effective cross section for these weird events is of the order of a few tens of picobarns, maybe a hundred. If however one assumes some particular process is responsible for the formation of the multiple muons, one then has to account for branching fractions, identification efficiencies, Pt spectra, etc. This is impossible to estimate without a model. One can do two things: ignore the issue and just quote the effective cross section, or assume that the mechanism is similar to bb quark pairs in muon yield and kinematics. In the latter case, one is led to production cross sections (before branchings to many muons) that can be very large. The bb cross section with decay to muons has been measured as 1549 pb at the Tevatron in the same dataset where the ghost events are isolated…

Haelfix, I may agree, but in this particular case a battle had to be fought.

Tripitaka, sorry to hear that. I was thinking along the same lines yesterday, when I heard similar news. We are visitors here, and we had better use our time well. I am also always reminded of our insignificance in the universe when I look at the stars. Things to keep in mind.