As I write this, it’s December and it’s Munich. I am at the Center for Mathematical Philosophy to attend a conference that promises to answer the question “Why trust a theory?” The meeting is organized by the Austrian philosopher Richard Dawid, whose recent book String Theory and the Scientific Method caused some upset among physicists.

String theory is currently the most popular idea for a unified theory of the [fundamental physics] interactions. It posits that the universe and all its contents are made of small vibrating strings that may be closed back on themselves or have loose ends, may stretch or curl up, may split or merge. And that explains everything: matter, space-time, and, yes, you too. At least that’s the idea. To date, string theory has no experimental evidence speaking for it. Historian Helge Kragh, also at the meeting, has compared it to vortex theory.

Richard Dawid, in his book, used string theory as an example of “non-empirical theory assessment.” By this he means that a theory’s ability to describe observation isn’t the only criterion for selecting a good theory. He claims that certain criteria that are not based on observations are also philosophically sound, and he concludes that the scientific method must be amended so that hypotheses can be evaluated on purely theoretical grounds. Richard’s examples of this non-empirical evaluation, all arguments commonly made by string theorists in favor of their theory, are (1) the absence of alternative explanations, (2) the use of mathematics that has worked before, and (3) the discovery of unexpected connections.

Richard isn’t so much saying that these criteria should be used as simply pointing out that they are being used, and he provides a justification for them. The philosopher’s support has been welcomed by string theorists. By others, less so.

In response to Richard’s proposed change of the scientific method, cosmologists Joe Silk and George Ellis warned of “breaking with centuries of philosophical tradition of defining scientific knowledge as empirical” and, in a widely read comment published in Nature, expressed their fear that “theoretical physics risks becoming a no-man’s-land between mathematics, physics and philosophy that does not truly meet the requirements of any.”

I can top these fears. If we accept a new philosophy that promotes selecting theories based on something other than facts, why stop at physics? I envision a future in which climate scientists choose models according to criteria some philosopher dreamed up. The thought makes me sweat.

But the main reason I am attending this conference is that I want answers to the questions that attracted me to physics. I want to know how the universe began, whether time consists of single moments, and if indeed everything can be explained with math. I don’t expect philosophers to answer these questions. But maybe they are right and the reason we’re not making progress is that our non-empirical theory assessment sucks.

The philosophers are certainly right that we use criteria other than observational adequacy to formulate theories. That science operates by generating and subsequently testing hypotheses is only part of the story. Testing all possible hypotheses is simply infeasible; hence most of the scientific enterprise today—from academic degrees to peer review to guidelines for scientific conduct—is dedicated to identifying good hypotheses to begin with. Community standards differ vastly from one field to the next and each field employs its own quality filters, but we all use some. In our practice, if not in our philosophy, theory assessment to preselect hypotheses has long been part of the scientific method. It doesn’t relieve us from experimental test, but it’s an operational necessity to even get to experimental test.

In the foundations of physics, therefore, we have always chosen theories on grounds other than experimental test. We have to, because often our aim is not to explain existing data but to develop theories that we hope will later be tested—if we can convince someone to do it. But how are we supposed to decide what theory to work on before it’s been tested? And how are experimentalists to decide which theory is worth testing? Of course we use non-empirical assessment. It’s just that, in contrast to Richard, I don’t think the criteria we use are very philosophical. Rather, they’re mostly social and aesthetic. And I doubt they are self-correcting.

Arguments from beauty have failed us in the past, and I worry I am witnessing another failure right now.

“So what?” you may say. “Hasn’t it always worked out in the end?” It has. But leaving aside that we could be further along had scientists not been distracted by beauty, physics has changed—and keeps on changing. In the past, we muddled through because data forced theoretical physicists to revise ill-conceived aesthetic ideals. But increasingly we first need theories to decide which experiments are most likely to reveal new phenomena, experiments that then take decades and billions of dollars to carry out. Data don’t come to us anymore—we have to know where to get them, and we can’t afford to search everywhere. Hence, the more difficult new experiments become, the more care theorists must take to not sleepwalk into a dead end while caught up in a beautiful dream. New demands require new methods. But which methods?

ABOUT THE AUTHOR(S)

Sabine Hossenfelder is a theoretical physicist at the Frankfurt Institute for Advanced Studies in Germany, where she researches physics beyond the Standard Model. She is the author of the physics blog Backreaction and the book Lost in Math: How Beauty Leads Physics Astray (Basic Books, 2018).

Scientific American is part of Springer Nature, which owns or has commercial relations with thousands of scientific publications (many of them can be found at www.springernature.com/us). Scientific American maintains a strict policy of editorial independence in reporting developments in science to our readers.