Marcelo Gleiser, Appleton Professor of Natural Philosophy at Dartmouth College, is a theoretical physicist who has worked on a diverse set of topics: cosmology, particle physics, phase transitions, condensed matter physics and biophysics. He is also a well-known author and public science communicator. A couple of months ago Marcelo suggested a guest piece for Cosmic Variance, and I’m delighted to be able to post it below. I hope you enjoy it, and I’ll encourage Marcelo to look in on the comments section and contribute there if he’d like.

—————————————————————

Here are some thoughts on something that has been bothering me for a while. How do we know the world is the way it is? Easy, a pragmatic person would say, just look and measure. We see a tree, a chair, a table; we hear the wind, music, people talking. We feel heat and cold against our skin. Once our brains integrate this sensorial information, we have a conception of what is real that allows us to function in the world. We know where to go, what to eat, what not to touch; we enjoy a good meal, a nice hug. But what happens when we go beyond our senses, using tools to extend our conception of reality? We don’t see galaxies with the naked eye (well, maybe Andromeda on a moonless, dry night) and much less a carbon atom. How do we know they are there, that they exist?

When Galileo showed his telescope to the Venetian senators in 1609, some refused to accept that what they saw was real. More recently, late in the 19th century, physicist and philosopher Ernst Mach refused to accept the existence of atoms, claiming they would never be seen and hence couldn’t be proven to exist. Mach and the Venetian senators were wrong. What we see through telescopes is, of course, perfectly real; we capture photons—particles of light—that a celestial body emits (or reflects, for planets and moons). If the source doesn’t emit in the visible and is so dim that we can’t capture photons between red and violet, we capture photons from radio or infrared radiation, no less real even though our eyes can’t see them. When atomic electrons jump from orbit to orbit, they also emit (or absorb) photons that can be detected by instruments or, in the case of certain transitions, by our eyes. The instruments we use in the study of natural phenomena are an extension of our senses. This amplification of reality is one of the most spectacular feats of science, allowing us to see beyond the visible. So far, so good.

The situation gets complicated when the complexity of the phenomenon forces us to filter the data, and we choose to study only part of what is happening. Our brains, of course, do this all the time; it’s what we call “focus.” Otherwise, we would be flooded with such an absurd amount of sounds and images that we wouldn’t be able to do anything. When we look at a star with the naked eye or with an optical telescope, we only see part of it: what it emits in the visible. A complete view of the star would incorporate all of its emissions: in the infrared, ultraviolet, X-rays, etc. This fact has a simple but, to my mind, profound consequence: our construction of reality, being necessarily filtered, is incomplete. We only know what we can measure.

In the case of elementary particle physics the situation is even more alarming. The Large Hadron Collider, for example, should start working this coming summer or early fall. At full capacity, it should produce around 600 million collisions per second. This translates to about 700 megabytes per second of data, more than 10 petabytes (10^15 bytes) per year. That’s roughly ten million hard drives of a gigabyte each. To make sense of this flood of information, physicists have to filter the data, selecting events deemed “interesting.” This selection, in turn, is based on our current theories that speculate on what’s beyond the standard model of particle physics, that is, theories that speculate on stuff we don’t know is there. Although these theories are mostly pretty solid (the Higgs particle as universal giver of mass; extensions of the standard model using more than one Higgs, supersymmetry, and/or more than three spatial dimensions…), they can only be confirmed through the very same experiments whose outcome they are trying to predict. Given this mechanism, there is a risk that unexpected phenomena, not predicted by any current theory and hence not included in the subset of collisions deemed interesting, will be eliminated by the data filtering process. In this case, and in a paradoxical way, the theories that we construct to amplify our view of physical reality will actually limit what we can know about nature.
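
A quick back-of-envelope check of the numbers quoted above; the continuous-running assumption is mine, and it overestimates real accelerator live time, which is why the figure comes out above the quoted 10 petabytes:

```python
# Back-of-envelope check of the LHC data-rate figures quoted in the essay.
# Assumes continuous year-round running, which overestimates real live time.
bytes_per_sec = 700e6                      # ~700 MB/s surviving the trigger
seconds_per_year = 3600 * 24 * 365
petabyte = 1e15

bytes_per_year = bytes_per_sec * seconds_per_year
print(f"{bytes_per_year / petabyte:.0f} PB/year")  # ~22 PB at continuous running

# Per-collision budget: 700 MB/s spread over ~600 million collisions per second
# leaves about a byte per collision, which is why aggressive triggering is needed.
print(bytes_per_sec / 600e6)
```

Since the machine only collides beams for a fraction of the year, the essay’s “more than 10 petabytes” is the right order of magnitude.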

Well, this is a fundamental issue not only in science but in our whole system of thought. We can only test what we think to test.

The key, then, at least in my mind, is looser filtering procedures and a higher level of data archiving. If the sum total of this data is published, say as a continually updated online database, then anyone able to analyze the data can take a stab at it. Am I being too naive here? Are the cost constraints too large?

I think this sort of transparency could act as a safeguard against the things Dr. Gleiser is talking about.

http://blogs.discovermagazine.com/cosmicvariance/mark/ Mark

The problem is that it is impossible to keep all the data produced by a hadron machine like the LHC – the amount of data is immense, and one has to make real-time decisions about which events to keep.

Faustus

I am a computer scientist, so please forgive me if I am asking something obvious. Is it not possible to encode current models into filters, and filter out events that are compatible with the current models? I.e., keep only the interesting events. That would be an entirely objective measure of “interestingness”.
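
Faustus’s proposal can be sketched in a few lines; the event fields and the model test here are invented placeholders, not real detector quantities:

```python
# Sketch of Faustus's idea: keep only events that no current model accounts for.
# "standard_model_like" and the event fields are invented placeholders.

def standard_model_like(event):
    # Hypothetical criterion: little missing energy and no unexplained tracks.
    return event["missing_energy"] < 10 and event["unknown_tracks"] == 0

MODELS = [standard_model_like]

def is_anomalous(event):
    """Keep an event iff no model in our library explains it."""
    return not any(model(event) for model in MODELS)

events = [
    {"missing_energy": 2.0, "unknown_tracks": 0},    # ordinary event: discarded
    {"missing_energy": 55.0, "unknown_tracks": 0},   # large missing energy: kept
]
kept = [e for e in events if is_anomalous(e)]
print(len(kept))  # 1
```

The catch, as later comments note, is that the test must run in real time at enormous rates, and that some new physics looks event-by-event identical to the Standard Model.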

thales

Faustus said just what I was thinking. This seems so obvious that I’m surprised this isn’t how the data filter is already set up.

I realize that the priority, however, is to detect a supersymmetry event, and I suspect that the sheer amount of data prohibits having any secondary goals just yet. I would guess that a few years down the road – hopefully after the Higgs is found – the data filters can be tweaked to look a little more broadly.

Tod R. Lauer

I am not a particle physicist, so I can only reply based on my experience as an observational astronomer, but if my understanding of the filtering in such experiments is correct, it eliminates only the trivial events, or items that are both “understood” and numerous. The things that are left over are those that should be either very rare or not anticipated. For example, another astronomer and I recently engaged in a project to classify spectra of QSOs – the procedure was explicitly designed to winnow out the commonplace, with minimal expectations for the rare outliers.

A deeper question is not the processing of the data, but the very nature of the space that any experiment can explore and the breadth of the measurements it takes in the first place. This is a very, very old problem in the philosophy of science. You can’t get anywhere with limited time and resources unless you put some level of blinders on and focus very specifically on a particular question. The issue then devolves to what can get through the windows so set up. A good experiment cannot admit all possibilities, but anomalies in the data that it does take should be readily recognized.

mandeep gill

Mark and Marcelo: Nice essay, but I doubt we need to be too worried about the things you close with — particle physicists have dealt with this issue for decades, and there’s a whole subfield (and several subsystems on the LHC detectors) devoted to it: trigger design. It hasn’t prevented unexpected particles from being seen before, and as long as the trigger is broad enough, we generally shouldn’t be cutting out interesting events, but only things we already know would flood the data channels (like minimum-bias elastic hadronic collisions). But others who are more expert than I may have more detailed things to say.

D

In regards to collider experiments, I just don’t think this is a concern. The filtering can be loosely described as based on the energy of reconstructed particles. The lower energy regions we cut out have been deeply explored by other experiments. Further, these experiments generally make sure to randomly collect and store some events regardless of the details of the event – just in case we’re missing anything by our selection.

JoAnne

Indeed, as Mandeep says, the LHC experiments have designed extremely sophisticated triggers which filter through the events and record as many of them as possible. The triggers are designed to record any event that has high-energy particles (with the threshold for “high energy” actually being fairly low). The trigger does not search for specific signatures of, say, supersymmetry or the Higgs, but rather records *everything* with a high-energy particle. The specific searches are carried out when one analyzes the recorded data for a particular signature.

There are, however, two worrisome cases where new physics could be missed. One is if the rate of new physics production is far below the Standard Model background. This could easily happen if the new physics only has hadronic signatures. The second is if the new physics has extremely strange behavior – such as non-continuous tracks as it flies through the detector – and could be misinterpreted as a detector malfunction.

Kevin McCarthy

I’m no longer working on collider experiments, but I did spend a couple of years on them as an undergrad, doing work for CMS and CDF. It’s been a couple of years, and there are certainly people who know a great deal more about the LHC data acquisition, but maybe I can shed a little light.

I don’t believe that the low-level data acquisition filters are of as sophisticated a nature as “only store events consistent with a Standard Model Higgs” or anything like that. Data selection criteria like that will be constructed later, in the data analysis phase. The primary triggers for data acquisition tend to be things like “greater than 5 GeV of energy deposited somewhere in the electromagnetic calorimeter,” i.e., a charged particle or photon with greater than 5 GeV hitting the cylinder around the beam pipe, or a similar cut in the hadron calorimeter, tracker, or a combination of these systems. These kinds of criteria basically just get rid of events where the particles in the beam only glance off each other, selecting out only the events with “hard scattering” processes. Since the particles in the beam initially have very low momenta in the direction transverse to the beam, for particles to leave the beam path and hit the calorimetry with high energy requires a process with fairly large momentum transfer between the quarks/gluons in the beam. This is done both because low-momentum-transfer processes would overwhelm the dataset, and because these are the interesting processes – at low momentum transfer, there’s just not enough energy either to fragment the proton to study QCD processes, or to create new particles that haven’t been seen in previous colliders.
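
The kind of low-level threshold trigger Kevin describes might be sketched like this; the 5 GeV figure comes from his comment, while the event format and the sample numbers are made up for illustration:

```python
# Minimal sketch of a low-level calorimeter trigger: keep an event only if
# some electromagnetic-calorimeter deposit exceeds an energy threshold.
# The 5 GeV threshold is from the comment; the event format is invented.

ECAL_THRESHOLD_GEV = 5.0

def level1_trigger(event):
    """Accept if any ECAL deposit exceeds the threshold."""
    return any(deposit > ECAL_THRESHOLD_GEV for deposit in event["ecal_deposits_gev"])

events = [
    {"ecal_deposits_gev": [0.3, 1.2, 0.7]},   # soft, glancing collision: rejected
    {"ecal_deposits_gev": [0.4, 8.9]},        # hard scatter: accepted
]
accepted = [e for e in events if level1_trigger(e)]
print(len(accepted))  # 1
```

The real triggers are hardware pipelines running at the full collision rate, but the logical shape, a cheap threshold test applied before anything touches disk, is the same.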

I think that the LHC faces a different problem than any previous collider with these preselection criteria, however; I believe most previous colliders have been concerned with allowing as much interesting data as possible, since any single crossing of the beams was expected to produce only one hard scattering event, if even that, so making the cuts efficient was paramount. At the LHC beam crossings, the number I recall hearing was a mean of 7 scattering events in each beam crossing (not sure if this is 7 expected hard scatters, or just 7 scatters), and so there’s the danger of allowing too much garbage in if they try to capture every single possible interesting event. Perhaps someone more familiar with the LHC data acquisition could shed more light on how they compromise between selecting as many events as possible while still managing to filter out uninteresting data.

David M

Doesn’t this touch on what Bohr spoke about so often with Einstein? Namely, that what we can know about any particle (or anything else) can only be specified in relation to an experimental setup/measuring device, and that it is meaningless to speak of a particle’s “actual” attributes except in terms of this relationship. So in this case, the setup of the LHC, the manner in which we measure the data, and which data we choose to measure will absolutely be critical and very much will determine what we find. That’s not to say we can’t be surprised by the results, but, as Gleiser points out, the results may well be very constrained.

Mike

It seems to me this would be a problem if we couldn’t repeat the experiment. But we can, so if we find results that defy explanation, I expect many scientists will turn to explanations that involve additional ingredients that were “filtered out” by the first round of experiments. Likely, scientists will try to do this even if the results conform to expectations. Then you can do the experiment again, changing whatever is necessary, to address these hypotheses.

It’s possible that the results will be consistent with some expectation even though something “important” has been ignored. Yet if history is any guide, scientists are always looking for these opportunities too, so any proposals deemed to have merit can also be tested by future adjustments to the experiment.

It seems to me there is only a fundamental problem if you are given only one opportunity to perform the experiment. I would think this scenario invites a number of deep philosophical issues.

Matt

Truth equals predictivity.

Let me be more precise. We develop greater and greater certainty about the truth of a proposition or statement when it makes large numbers of nontrivial, testable, correct predictions about stuff we don’t yet know, whether that stuff has actually occurred in the past or will occur in the future.

If you stop and think really hard, you realize that everything we know about truth really comes down to this basic concept. And it’s certainly the way things work in science. For example, we know that the Standard Model is a correct description of a layer of reality up to roughly 10^3 GeV because it is the most predictive proposition in all of human history; it makes zillions of precise, numerical predictions that are accurate to many decimal places.

Evolution by natural selection is another such example. We know it’s true because it makes zillions of nontrivial predictions about phenomena in both the past and future that we have not yet observed, but that are almost always correct when we do finally observe them. Just think how nontrivial it was for animal and plant genomes sequenced in the last ten years to fit perfectly the family trees derived from comparative anatomy and paleontology that were laid down a century ago!

By contrast, the proposition of an anthropomorphic deity is one of the least predictive propositions in human history!

Truth equals predictivity. If you can find any counterexamples, I’d love to hear them.

Brian Pratt

It seems to me that data filtering is a problem only if a small subset of people are engaged in that activity: if the data analysis is carried out by a large number of people who conduct independent analyses, then most of the problem goes away. Facts do not have a voice: we give them a voice by developing theories and various stories about them. But theory has another side: it is a searchlight. “Look under that rock over there and you will see something totally new.” Theories of the second kind come from a process whereby individuals or small collections of individuals have the ability to put their wild ideas into the public domain for examination by strangers: by people whom they do not know and will never meet. If a process exists whereby wild ideas can be put into the public domain without censorship, then those ideas will acquire a life of their own; they can be amended, criticized or rejected without mounting ad hominem attacks on the authors. That is the core.

If you look at the larger issue of perception of reality (or cognitive frames of reference), there’s one development in signal acquisition in recent years (2006–…) that shows what we’ve been missing out on for so long: compressed sensing, or compressive sampling.

On the face of it, the basic principles behind compressed sensing seem incredibly counterintuitive: undersampling, rather than sampling the signal at the Nyquist rate or oversampling, and random sampling, rather than, say, rastering. It took pure mathematics to tease this concept out and drive it into our intuitions. In the years ahead, applied mathematicians will most likely devise new and innovative ways to process vast amounts of data so that we extract more useful information from it. There most certainly is a limit to how far our brains can adapt to new perspectives, but this is what makes discovery interesting.
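
As a toy illustration of just how counterintuitive this is, here is a minimal compressed-sensing demo: a sparse signal of length 100 is recovered from only 40 random linear measurements, using orthogonal matching pursuit, one standard recovery algorithm. The dimensions, sparsity, and coefficient values are arbitrary choices for the sketch:

```python
import numpy as np

# Toy compressed sensing: recover a k-sparse signal of length n from m < n
# random linear measurements via orthogonal matching pursuit (OMP).
rng = np.random.default_rng(0)
n, m, k = 100, 40, 3

x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = [1.5, -2.0, 1.0]          # arbitrary sparse coefficients

A = rng.normal(size=(m, n)) / np.sqrt(m)    # random Gaussian sensing matrix
b = A @ x_true                              # m undersampled measurements

# OMP: greedily pick the column most correlated with the residual,
# then re-fit by least squares on the chosen support.
residual, chosen = b.copy(), []
for _ in range(k):
    j = int(np.argmax(np.abs(A.T @ residual)))
    chosen.append(j)
    coef, *_ = np.linalg.lstsq(A[:, chosen], b, rcond=None)
    residual = b - A[:, chosen] @ coef

x_hat = np.zeros(n)
x_hat[chosen] = coef
print(np.max(np.abs(x_hat - x_true)))       # reconstruction error
```

For a signal this sparse, the reconstruction error comes out at machine precision with overwhelming probability, even though we sampled well below the Nyquist count.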

I think this concern, about the LHC filtering system, is a little like being worried that the Keck telescope cannot also see the far infrared. It’s true, it can’t, and there are things we would like to observe that we can’t because it doesn’t have the capability. But we know what those are, and that our observations don’t constrain that part of the spectrum.

In particular, one can think of the 700 MB/s (incidentally, that’s only twenty thousand one-terabyte hard disks per year, which is not totally outrageous compared to the LHC running costs if there were useful data there) as the full electromagnetic spectrum from an object. The filters pick out the parts we know to be interested in, and they discard a well-characterized fraction of the data. Of course it’s possible that we’ll suddenly realize some of that data is interesting (though other commenters have pointed out that this is unlikely), but if that happens we can just change the filters to pull out the newly interesting parts from the data stream, much more easily than most telescopes can change the part of the spectrum they look at.

The key here is that the filters have a well-characterized effect. If you want to do statistics to argue that if theory X were true we would have seen the signature Y of the Z particle, it’s essential that we can say “our current filters do not reject signature Y” (or perhaps even “unless condition A, B, or C is true”). In fact I suspect a limiting factor on the cleverness that can go into filters is statisticians’ need to understand their effect on the data stream.
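
The “well-characterized effect” point boils down to a simple correction: if you know what fraction of a given signature the filter keeps, you can divide it back out of the observed count. A sketch with invented numbers:

```python
# Why a well-characterized filter matters: to turn an observed count into a
# physical rate, divide by the filter's acceptance for that signature.
# All numbers here are invented for illustration.

n_observed = 842    # events passing the filter that match signature Y
acceptance = 0.65   # fraction of true signature-Y events the filter keeps,
                    # measured from simulation or control samples

n_corrected = n_observed / acceptance
print(round(n_corrected))  # 1295

# If the acceptance for some signature were ~0, no correction could recover it:
# that is precisely the blind spot the essay worries about.
```

The correction only works when the acceptance is known and nonzero, which is why statisticians insist on filters whose effect on the data stream they can model.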

http://www.cthisspace.com Claire C Smith

Incredible topic.

Claire

http://rightshift.info Dileep

Say: with so much filtering, how do you account for statistical biasing when calculating probabilities? Or is that done at a later stage, the first stage being a study of possibilities rather than probabilities of events?

Garth

Astronomers call this the “Selection Effect”: in astronomy, when assessing what one can ‘see’, one always has to be mindful of what might be there but not seen, because it is too faint or is emitting outside the observed waveband; i.e., the observation is limited and prescribed by the apparatus.

It is important to apply this acknowledgement of ignorance to the theory as well as the apparatus and never be too sure about the ‘Standard Model’.

Garth

Christian

To answer the question by Faustus: you cannot always filter things out event by event by comparing with established models. Sometimes single events of “new physics” are identical to Standard Model events; only the statistical distribution of many events differs.
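
Christian’s point can be made concrete with a toy example: draw events from two slightly different spectra (both invented here). No single event can be classified, but a two-sample test on the ensembles tells them apart:

```python
import numpy as np
from scipy.stats import ks_2samp

# Toy version of Christian's point: "Standard Model" and "new physics" events
# drawn from two overlapping (invented) spectra. Any single event could come
# from either, but a large sample reveals the difference.
rng = np.random.default_rng(1)
sm_events = rng.exponential(scale=1.0, size=5000)    # nominal energy spectrum
np_events = rng.exponential(scale=1.15, size=5000)   # slightly harder spectrum

stat, p_value = ks_2samp(sm_events, np_events)
print(f"KS statistic: {stat:.3f}, p-value: {p_value:.2e}")
```

With these sample sizes the p-value comes out vanishingly small, even though no individual event carries a label; this is exactly why an event-by-event “compatibility” filter can miss such physics.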

Giotis

Since we are part of the physical world and not outside observers of reality, everything we believe automatically acquires a substance and becomes real. In the end, any experiment must be interpreted by humans according to their beliefs, and thus its result can have a meaning only in the context of their reality; that is all that counts. There is no absolute measure of reality. We are the reality and its measure.

http://www.dartmouth.edu/~mgleiser Marcelo Gleiser

First, I’d like to thank the people who took the time to comment on my text. I learned a lot, which was my main goal, together, of course, with inciting some discussion.

People raised both technical and more philosophical issues. First, a general point, very important: even if data filtering is ultimately biased at some level, this is not WRONG; it is INCOMPLETE. It means that what we measure, as I argued, is only a fraction of what is out there. A star viewed only in the optical is still a star, albeit a fraction of what it is in the full glory of the electromagnetic spectrum. One hundred years back, this optical fraction was taken to be the whole thing. Now we know that’s not the case. Our perception of reality broadened with the development of our tools. This is important to my argument, as captured by Giotis in his comment.

Starting with the more technical points: it is true, as JoAnne mentioned, that at a first pass (and that’s important), triggering does not rely on a specific model. However, it relies on one fundamental assumption (and here I thank my friend Michelangelo Mangano, a theorist at CERN heavily involved with LHC physics, for his comments, although any mistakes are my own): whatever happens during the collision, its final products MUST be Standard Model particles, i.e. electrons, muons, photons, jets and missing energy (e.g. neutrinos). After all, we only know how these “objects” interact with the detector. Can we be sure that this will always be the case? One then tracks these objects and records their related events above a minimum threshold of energy, to avoid filling the hard drives with useless stuff. One can refine this strategy, but at the cost of introducing theoretical bias. For example, for SUSY searches you can combine the request to have several jets with the request of missing energy or of multiple leptons. Other requests can test other theoretical models. However, there are always events that could be missed, if they exist: for example, a very light axion carries away such a small amount of missing energy that it would never trigger the detector. The same goes for strange metastable particles, or solitonic objects with large mass that could slow down and sit inside the detector for a while; I think JoAnne alluded to this possibility at the end of her comment.
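
The SUSY-style refinement Marcelo mentions, several jets plus missing energy, or multiple leptons, might be sketched like this (all thresholds and the event format are invented placeholders), and it exhibits exactly the axion-like blind spot he describes:

```python
# Sketch of a model-motivated trigger: a SUSY-style selection asking for several
# hard jets plus missing energy, OR multiple leptons. Thresholds and the event
# format are invented placeholders, not real experiment settings.

def susy_trigger(event):
    many_jets = sum(1 for pt in event["jet_pts_gev"] if pt > 50) >= 3
    missing_e = event["missing_energy_gev"] > 100
    multi_lep = len(event["lepton_pts_gev"]) >= 2
    return (many_jets and missing_e) or multi_lep

# A light axion-like event: soft jets, almost no missing energy. It never fires
# the trigger, which is the blind spot the essay worries about.
axion_like = {"jet_pts_gev": [12, 8], "missing_energy_gev": 3,
              "lepton_pts_gev": []}
susy_like = {"jet_pts_gev": [140, 90, 70], "missing_energy_gev": 250,
             "lepton_pts_gev": []}

print(susy_trigger(axion_like), susy_trigger(susy_like))  # False True
```

Each added requirement sharpens sensitivity to one hypothesis while narrowing the window on everything else, which is the theoretical bias Marcelo describes.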

So, from a practical perspective, a lot of thinking has gone into optimizing the triggers so as to cover an enormous number of possible situations. But, as with any experimental setup, there is only so much it can do. Again, this doesn’t mean that physicists are doing something wrong; but it does mean that what we do is always INCOMPLETE. Even as we move onward and perfect the search with more refined filtering and triggers, there is always going to be the possibility that we are missing something. An example is the discovery of the J/psi particle, a bound state of charm and anti-charm quarks. First searches missed it because they weren’t triggering on low transverse-momentum muons. Eventually it was found when the search was refined.

We will always be limited by what we can measure. Even as we get more skilled, there is no end in sight. Only improvements. And even as we improve, history tells us that the more we measure, the more there is to measure as new tools open new “vistas” into nature. In this case, what we call reality is based on what we can measure of it. Nothing wrong with this, but it means that our view of the universe is always constrained by our tools. In my opinion, this makes science more human, something I like better than saying that science is the search for some ultimate “truth”.

Ellipsis

Hi Marcelo,

Indeed there are certain types of signatures that do not behave like any physics we know and could potentially avoid detection (e.g. events that contain no high-pT tracks or clusters). However, efforts are made to look for, for example, tracks that look like they don’t come from any known charged particle (see e.g. …), searches that do some of what you mention (perhaps as much as is practically possible — unless you have other suggestions!).

http://www.dorianallworthy.com daisyrose

Ha ha – how can you really know that you know anything? You cannot! You start out doing the possible and before you know it you are doing the impossible.

Little capillary forays into minutiae? It’s the big picture that matters – whatever that is.

Everyone wants to think they live in (are part of) the most interesting times. It’s all vanity!

Nilanjan Ray

It seems there is no end to our findings. The more sophisticated our instruments get, the more we find: from atoms to subatomic particles to Higgs bosons and more to come, surely. We become more conscious in the whole process. Is there really an end to it? Or is it our consciousness which gives us our sense of reality, without an end!

TSFTRTruth

Well, I do meditation for investigation. In the end, it is said, this is all an illusion, and I must prove it within myself. But of course I have to say, within the air, I don’t know; sometimes there is this tear in dimension. I can’t explain it. Our brains or minds could be seeing it, or they could not. The mind sees only what it wants to see. It is biased; I am hoping to see what is there, and yet it isn’t. They say if you want to find it you will. But then again, it is all an illusion. We are made of these tiny, tiny molecules. Where are we? That is my question. Can you put your mind on your finger now? Where are we? If we were in the Nebula axis now, would Earth be a past? Who is the past, who is the present? There isn’t a fixed place of time. When you finish reading this, if you would, it is already a past. It is bad. Very bad.


About Mark Trodden

Mark Trodden holds the Fay R. and Eugene L. Langberg Endowed Chair in Physics and is co-director of the Center for Particle Cosmology at the University of Pennsylvania. He is a theoretical physicist working on particle physics and gravity – in particular on the roles they play in the evolution and structure of the universe. When asked for a short phrase to describe his research area, he says he is a particle cosmologist.