Wednesday, January 30, 2008

Eight days after Big Ben lowered his defining interest rate by 0.75 percentage points, the Fed subtracted an extra 0.5 points, ending at 3.0%. The sum, 1.25 percentage points in 8 days, is unprecedented. Even after the 9/11/2001 attacks, the interest rates were only lowered by 0.5 percentage points.

The interest rates in the U.S. should have been higher by 3-5 percentage points during the last 5 years or so. Let me sketch several general reasons why it was wrong for the Fed to reduce the rates so rapidly and why it is generally bad for the Fed to maintain low rates and to allow the U.S. currency to weaken.

Regulators should regulate fluctuations

As we have discussed repeatedly, markets have the tendency to amplify various fluctuations. The herd mentality of the investors is one of the reasons. Such economic cycles may lead to crises. These things are natural but if they are excessive, they are unhealthy. If the central banks and federal bodies are supposed to do something, they should try to make the behavior of markets more constant, not more violent.

So they should act as a kind of negative feedback. They should never overreact. They shouldn't try to overcompensate one effect with another, stronger effect, or amplify the overall havoc on the markets.

Now, most slowdowns are preceded by various unsustainable bubbles. In many cases, various equity prices grow faster than certain sustainable rates. While the growth may be trusted in the short run and many people earn cheap money from it, it is very clear that eventually, it must stop or collapse. The dot com bubble and the housing bubble of the last decade are two recent examples.

In my opinion, responsible officials should try to regulate these movements while they are still going up. They might want to say what prices and what time derivatives of prices they consider reasonable and try to influence and calm down the psychology of the markets. Some price dynamics are clearly unsustainable. For example, if housing prices increase by 10 percent every year while wages only grow by 5 percent or less, it is not hard to see that houses are rapidly becoming less affordable. Constant affordability essentially requires the same average growth of housing prices and wages.
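To make the arithmetic explicit, here is a tiny sketch. The growth rates are the illustrative numbers from the text; the function name is my own.

```python
# Illustrative arithmetic only: if house prices compound at 10% per year
# while wages compound at 5%, the price-to-income ratio drifts upward.
def affordability_drift(years, price_growth=0.10, wage_growth=0.05):
    price, wage = 1.0, 1.0
    for _ in range(years):
        price *= 1.0 + price_growth
        wage *= 1.0 + wage_growth
    return price / wage  # >1 means houses became less affordable

# After a decade, houses cost roughly 59% more relative to wages
# than at the start.
decade = affordability_drift(10)
```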

Still, it is not unusual for housing prices to increase by 10 percent a year for a couple of years. However, it is then obvious that these prices must sometimes also drop by comparable fractions. If the authorities didn't act to slow down the excessive increase of prices, they shouldn't act against their drop either. A further drop in housing prices by 20-50% is pretty much unavoidable and responsible people shouldn't pretend that it is not.

Now, a decreasing feeling of wealth surely reduces consumer spending, which might be considered a bad thing by some people. But the very same sentence also holds in the opposite direction: increasing home prices are (or were) artificially boosting consumption above the rate that would exist if housing prices were increasing sustainably. I feel that too many people want to see only one side of this coin (and many other similar coins). If they become government financial officials, they inevitably lead the economy to an unsustainable behavior that must obviously end up in amplified cycles and deeper crises.

Inflation and exchange rates are more robust measures of the proper value of money

All central banks look at inflation because, as Milton Friedman put it, inflation is always and everywhere a monetary phenomenon. The Federal Reserve in particular emphasizes economic growth and uses lower rates to stimulate it when growth slows down. I think it is a wrong perspective.

While lower rates do stimulate the economy, they also lead, to one extent or another, to many other effects: higher inflation, a weakening currency, increased spending, and increased debt. I think that the primary goal of the central banks should be to keep the value of money constant.

In the past, the value of money was determined by the gold standard, by the ultimate "constant" precious metal. However, gold doesn't play such an important role today. Neither does silver, the second candidate for a "prototype" of value. In fact, the gold/silver price ratio has been dramatically fluctuating during the last two centuries. A much more robust definition of the value of money involves all possible products that people buy.

The inflation rate measures how the value of money with respect to the basket of actual consumable things changes every year. This number should be kept more or less constant because price stability defines the equilibrium of supply and demand for money.

The GDP growth depends on many other things - for example the weather in agricultural countries - and there exists no principle that would dictate that this figure should be constant. Also, stock prices are derived quantities that determine the ability of companies to create values under certain (and changing) circumstances. Again, there is no a priori reason why these things should be constant. But a non-constant value of the money - with respect to things that people actually need - is simply a bad thing.

Irresponsible behavior should be punished

We have discussed the issue of moral hazard many times. Once again, irresponsible behavior must be punished. If someone takes a risk and makes a profit, it must also be possible that the risk sometimes works against the person and leads to a loss. If the government or the central banks save the speculators - both the rich as well as the poor ones - in such a way that the sign of the speculators' profit is always positive, it leads to more speculation and to less stable, less efficient markets where people effectively insured by the government earn cheap money for activities that are not useful for anyone (except for the person who makes the money).

The Fed shouldn't be a slave of Wall Street. The decisions of the Federal Reserve influence many other types of people - such as U.S. students who must now pay a lot of money abroad. The bankers should be independent of all pressures from limited subgroups of the population or the economy.

Strong dollar policy is beneficial for the U.S.

The strong dollar policy has been a very good policy for the U.S. and if someone openly or secretly believes that it is not the case, he or she is extremely wrong.

First of all, a strong dollar has been one of the major factors that is (or was) making the American economy, science, and technology superior. A stronger currency means higher salaries - when converted to another currency - and higher salaries attract skilled workers and increase the competition. All these things increase productivity and related observables.

It is an effect that we also know from individual countries. For example, Prague is able to concentrate skillful, hard-working, smart people because it has a richer local economy than the rest of Czechia. The causal relationship goes both ways: the local economy is strong because there are lots of hard-working people who have something to offer, and they are there because the local economy is strong and offers them high salaries.

If the effect of concentrating people worth high salaries diminishes, the comparative strength of the city or the country diminishes, too. What do I want to say? For example, the U.S. may still have about 3 times higher salaries than the Czech Republic if measured by conversion (but 2 times as measured by the PPP). Will this ratio of 3 or 2 persist? I think that the answer is No unless the U.S. restores the strong dollar policy. If it doesn't, the average salaries in both countries will eventually coincide - just like the average IQs (98) and other objective quantities describing the economic environment in both countries coincide.

Once again, the currency strength has a profound impact on the attraction of brains and qualified workers in general. The competitive edge of a country largely depends on these things.

Relationship with trade balance

Moreover, America has a significant trade deficit. While it is true that a weaker currency could reduce it, that takes some time. In the short and medium run, it is much easier to reduce it with a strengthening U.S. dollar, simply because imports become cheaper in U.S. dollars, and imports matter more for the overall balance than exports because they are larger (due to the trade deficit).

I think that a weak currency significantly helps the trade balance only if the country already has a significant surplus (an example is or was China). For countries with a large trade deficit such as the U.S., a weak currency may make the balance even worse and the last 6 years demonstrate this fact pretty clearly.

On the other hand, there is nothing wrong with having a large trade deficit for many decades because the growth of the economy - and population - of different countries may simply differ for whole centuries. There would be nothing surprising about the U.S. economy growing, building, and importing more than the Japanese economy simply because there is more space in the U.S. for people, their houses, and their new companies.

I want to say one more thing: a strategic, political observation. Friends of the U.S. are much more likely to hold the U.S. dollars while the U.S. enemies have a much higher probability to bet against the U.S. currency. By weakening the currency, the Fed effectively helps the enemies of the U.S. financially while it punishes its friends. It is a very bad evolution for the American (and not only American) strategic interests.

Fast rate cuts create the feeling that something really serious is going on

Another observation is so obvious that I will only dedicate two sentences to it. Fast rate cuts create the impression that the U.S. economy is in serious trouble and such an impression has the ability to transform itself into reality. Such dramatic behavior repels all kinds of investors, especially international investors who are influenced not only by the prices of U.S. stocks etc. denominated in U.S. dollars but also by the value of the U.S. dollar itself.

Americans borrow easily and they need higher rates

Finally, America should have higher rates than many other countries simply because the Americans are clearly not shy to borrow money. After all, their self-confidence in borrowing money is one of the driving forces behind the trade deficit. This comment is another reason supporting the thesis that lower interest rates "help" to increase the trade deficit.

If I summarize, I think that the importance of one causal relationship - between interest rates and the stimulation of the economy - is being heavily overestimated because of some flawed, Keynesian thinking while many other, more important relationships and principles are being largely neglected. When you think about all these things, you will see that the bankers are creating at least as much damage as benefit.

Quite a long time for the "speedy" PRL. In the paper that has collected 25 citations so far, Neil Bevis, Mark Hindmarsh, Martin Kunz, and Jon Urrestilla statistically investigate the cosmic microwave background. They try to parameterize it by two models. One of them is based on ordinary inflation - what matters is the scale-invariant spectrum - with an adjustable power law tilt. The other has cosmic strings included.

By looking at the l=10 spherical harmonics, they argue that the relative contribution f_{10} of the cosmic strings is optimized for fitting the data at f_{10} = 0.11 ± 0.05. So it is nonzero and the strength of this statement is approximately two sigma. Well, that's not a terribly strong signal but it is understandable that some people find it intriguing.
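For readers who want the two-sigma statement spelled out, here is the trivial computation; treating the quoted error bar as a Gaussian standard deviation is my simplification, not the paper's.

```python
import math

# Significance of the quoted fit f_10 = 0.11 +- 0.05 relative to zero.
f10, sigma = 0.11, 0.05
z = f10 / sigma                               # 2.2 standard deviations
p_two_sided = math.erfc(z / math.sqrt(2.0))   # chance of a fluctuation this big
# z = 2.2 corresponds to a two-sided p-value of roughly 2.8%:
# noticeable, but far from a discovery-level signal.
```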

I am somewhat skeptical about this kind of argument because it reminds me of various "proofs" of anthropogenic global warming: you can't match the curves with the first naive natural model you write down, and if you add men to the naive model, you do better. Well, it is not too surprising. Two-sigma signals are guaranteed to appear almost everywhere and a model with additional parameters - (almost) whatever they are - is guaranteed to fit the data more accurately than a more robust and simple model. Of course, if this were a 5-sigma signal, I would be more afraid to make such a statement but with 2 sigma, I have enough courage to do it. ;-)

The work is surely interesting but the results so far are uncertain enough to allow me to stick to my subjective and purely theoretical 15% probability estimate that cosmic strings exist and will be reliably observed (or produced) by 2100.

What I would find more convincing would be if a cosmic-string model were able to fit the data better than a cosmic-string-free model with the same number of parameters. For example, if you showed that a model with a fraction of cosmic strings and a fixed tilt is more accurate than a model with an adjustable tilt and its time derivative (or scale derivative) or whatever new additional but "conventional" parameter is useful to reduce the errors.

Couldn't this become a standard technique - in all scientific disciplines - to decide about the relevance of a very new effect previously unused to match the data?

Abdus Salam, the first Muslim and the first Pakistani Nobel prize winner, was born on January 29th, 1926. What was his main goal? Well, let him speak for himself:

See also 35-minute interview with Abdus Salam (start with the last one and continue to the left; Salam superenthusiastically celebrates string theory from around 4:45 of the part 2/4; at the beginning of 3/4, Salam has a funny description of Edward Witten) and dozens of other videos.

As you can see, he had pretty much the same goals as other great and passionate theoretical physicists and he has made a substantial contribution towards this goal.

Monday, January 28, 2008

inform us that a group at University of Illinois has proposed a new method to search for signatures of cosmic strings in the skies. The project is based on the 21-centimeter Hydrogen line.

Recall that the Hydrogen line arises from transitions between two nearly ground states of the neutral Hydrogen atom that are split by the so-called hyperfine structure: its origin is in the interaction between the spins of the electron and the proton. The two states, distinguished by different total spins, differ by a very small energy whose corresponding photon has frequency of 1420 MHz or the wavelength of 21 centimeters.
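The numbers quoted above can be checked in one line, using the standard value of the hyperfine transition frequency:

```python
# Consistency check: wavelength = c / frequency for the Hydrogen line.
c = 299_792_458.0       # speed of light, m/s
nu = 1_420.405751e6     # hyperfine transition frequency, Hz
wavelength = c / nu     # ~0.211 m, i.e. the famous 21 centimeters
```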

A direct transition between these two states is highly suppressed and almost certainly unobservable in terrestrial labs (the rate is less than 3 emissions per 10^{15} seconds). However, there is a lot of Hydrogen in the Cosmos, so this 21-cm line is easily observable. Still, the radiation whose wavelength is 21 cm today is not what the people want to observe.

Instead, they want to focus on the radiation whose wavelength was 21 cm right during the decoupling era. Due to the expansion of the Universe, that wavelength is now closer to 20 meters, so they would need to build a network of powerful radio telescopes and try to see something. I might misunderstand something, but I wouldn't expect this stretched spectral line to be too sharp.
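The stretching itself is just the standard redshift formula. The value z ≈ 90 below is my assumed number, chosen only to reproduce the "closer to 20 meters" figure quoted above; the actual target epoch determines z.

```python
# Redshift stretching of a spectral line: lambda_observed = lambda_emitted * (1 + z).
lambda_emitted = 0.21                          # meters, the 21-cm line at emission
z = 90.0                                       # assumed redshift (illustrative)
lambda_observed = lambda_emitted * (1.0 + z)   # ~19 meters today
```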

Such signals, if observed, could nevertheless not only identify the inhomogeneities caused by cosmic string networks - that are unobservable in the normal CMB spectrum - but even determine the string tension and perhaps some other features of such cosmic strings hypothetically imprinted into this portion of electromagnetic radiation. Note that cosmic strings appear in various unified theories, starting from grand unified theories and ending with superstring theory itself: cosmic strings can literally be fundamental strings from string theory stretched into astronomical distances.

It looks like a rather interesting and unexpected experimental idea that should be looked into very seriously. Such possibilities highlight that creative people may often solve questions that look too difficult at the beginning. They also emphasize how incredibly idiotic the aggressive crackpots' proclamations are that modern theoretical physics in general and string theory in particular are untestable.

The beginning of the year 2008 brought us two new papers defending this concept: see some news from Australia. It is almost certainly becoming a topic of serious discussions between the people who have the power to modify textbooks. I mostly think that the idea is silly. But let us begin with some basic facts describing the eons, summarized in the (hopefully) most transparent way you have ever seen:

Hadean (eon): 4.6-3.8 Gyr BC, named after Hades, a Greek god of the underworld

Note that the Tertiary covers both the Paleogene as well as a part of the Neogene while the Quaternary ("čtvrtohory") roughly coincides with the existence of the broader human race. Now, there are hundreds of other facts that mankind has learned that you might expect me to reproduce here. But I won't. Let me focus on more general facts.

Each geological eon, era, or period is associated with some geological events as well as with some epochs in the evolution of life. But because all of them are geological periods, it should be the rocks that determine the natural boundaries.

Continental drift and the creation of various mountains and other huge structures belong to the defining events of the geological classification. Life is added as a cherry on the cake. Its fossils are confined within the rocks.

If we look at very ancient eras, it is clear that our time resolution diminishes a little bit. For example, the Hadean lasted for nearly one billion years and it has no official subdivisions. The fact that the recent subdivisions are finer has two major kinds of reasons:

subjective ones: I mean our inability to learn the distant past in detail

objective ones: I mean the fact that the events on Earth are speeding up

I guess that the objective aspect dominates in the very recent periods. In principle, we can measure time rather accurately even for events that occurred tens of millions of years ago. But we simply don't divide those events into periods as short as one million years or thousands of years because we are not aware of too many dramatic events that occurred a long time ago.

And it is not just a matter of our awareness: many of us are convinced that the frequency of events worth human attention was indeed limited.

Life appeared rather quickly after the Earth was created. While there has been a lot of rather sophisticated life on Earth in the Phanerozoic, it doesn't mean that there was no life in the previous eons. In fact, you can find life not only in the Proterozoic but even in the Archean.

Consider modern life forms whose cells have internal membranes and a cytoskeleton and usually contain a nucleus. These life forms are called "eukaryotes". Well, what is the evidence for the oldest eukaryote? It wasn't found by your humble correspondent but by Jochen Brocks, a former roommate of your humble correspondent, and it probably lived 2.7 Gyr BC, in the Neoarchean.

That's when the Earth was roughly 50% younger than today.

Only in the Phanerozoic, about the most recent 10% of the Earth's life, could we see abundant life forms around. And only in the Cenozoic, the most recent 1% of the Earth's life, have there been mammals around. In the most recent 0.1% of the Earth's history, we saw some kinds of humans around. The white race as defined by the SLC24A5 mutation has only existed for the last 0.0001% of the Earth's history.
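The percentages are easy to reproduce; the durations in millions of years below are my own round numbers, not taken from the post.

```python
# Rough fractions of the Earth's history (durations in Myr are assumptions).
age_of_earth = 4600.0
spans = {
    "Phanerozoic": 541.0,          # abundant life
    "Cenozoic": 66.0,              # mammals
    "humans (genus Homo)": 2.5,    # some kinds of humans
}
percent = {name: 100.0 * myr / age_of_earth for name, myr in spans.items()}
# Phanerozoic ~12%, Cenozoic ~1.4%, humans ~0.05%: the same orders of
# magnitude as the "10%", "1%", "0.1%" quoted in the text.
```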

Things seem to be speeding up.

Well, it is conceivable but unlikely that a similar acceleration occurred in the past and dinosaurs or other distant cousins were driving their SUVs around - before they were destroyed. Let us assume it is not the case and the Quaternary is the first geological period in which the evolution of intelligent life forms started to speed up exponentially.

Fine.

But now the main point. With all my admiration for the unbelievable progress that life has recently made, I think that we - the mammals, humans, or whomever you want to include - have a very limited impact on the features of our planet that geologists will be able to study in the year 10,000,000.

I think that the notion of an anthropocene is arbitrary, its beginning is ill-defined, its justification is not really based on geology, and one could invent even newer, more recent eras associated with another kind of human progress.

Note that according to the classification above, we already live in the Phanerozoic as well as the Cenozoic as well as the Neogene as well as the Quaternary as well as the Holocene. The Holocene, the shortest period, approximately coincides with the existence of the oldest civilizations as we know them. Do you really want to add the Anthropocene to the list?

I don't see too many qualitative geological events that occurred in the last 200 years but that would distinguish us from the ancient Greeks or Romans. Honestly speaking, I consider myself to be much closer to some old people in Greece or the Roman Empire than most politically correct loons who live in the "Anthropocene".

Will we also have to add Microprocessorocene, Multiculturalismocene, or something else? Please stop this insanity. Create new mountains. If you can't do it, please wait until Mother Nature does it for you. Then you can start a new era. ;-)

Also, he argues that global warming in general and Al Gore's activities in particular are politically driven and caused by the failures of previous left-wing ideologies (he says that the remaining ashes of socialism have turned into a degraded Malthusian outlook, even in Europe). These ideas of Cockburn's sound just like Václav Klaus. But he adds a characteristically left-wing comment that this program can't help realize the leftist dreams because it will be none other than the corporations and evil capitalists who will benefit from the fear. ;-)

Well, I tend to agree that the people who benefit are evil even though their identification with capitalism or corporations depends on your conventions. At any rate, I agree with the liberal pundit that it is wrong for various non-profit organizations to have lunches paid by indulgences and it is wrong to promote nuclear energy by global warming fears even if - and even though - nuclear energy is a great thing.

Cockburn paints peer review in the real world as a method to form biased cliques of friends that fight against the unexpected and undesirable. He writes about the political implication of climate alarmism for the third world (e.g. green ideological problems with the new cheap Indian car) and about the self-righteous intimidation he has been subjected to because of his previous blasphemies.

Let me offer you one of hundreds of similar stories of your humble correspondent. A nice senior ex-colleague of mine was spending some quality time with Naomi Oreskes and he had the interesting idea to put us in touch to discuss the climate. Unfortunately, she first saw my page debunking her paper that had argued that 100% of papers support the "global warming consensus" and she sent me her Senate testimony, expecting that it would settle all my questions.

Joseph Louis, comte de Lagrange, an eminent mathematician and a tragic figure, was born on January 25th, 1736, in Turin as Giuseppe Lodovico Lagrangia. His father was a rich manager of the funds of the Sardinian royal army.

He only began to be interested in maths at the age of 17. However, two years later, he wrote a letter to Euler in which he solved the isoperimetric problem (finding a curve minimizing [thanks, Carl] the perimeter, given a fixed area inside; the solution is clearly a circle but he gave a proof) using variational calculus. Let us say it bluntly: he used a kind of Lagrangian approach. ;-)

Euler and Lagrange: the leaders

Leonhard Euler, a fellow string theorist, instantly understood that Lagrange's methods were important. Euler generously withdrew his own, more primitive paper about similar issues, allowed the young Italian guy to take the full credit for his discovery, and even invented a catchy name for Lagrange's techniques. Lagrange instantly became a celebrated mathematician: only Euler was above him.

Despite Euler's generosity, we still usually talk about the Euler-Lagrange equations that physicists usually derive from "delta S = 0".
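For completeness, here is the one-line derivation that the last sentence alludes to, in standard notation: varying the action and integrating by parts (with variations vanishing at the endpoints) gives

```latex
\delta S = \delta \int_{t_1}^{t_2} L(q,\dot q,t)\,dt
 = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q}
   - \frac{d}{dt}\frac{\partial L}{\partial \dot q}\right)\delta q\,dt = 0
\quad\Longrightarrow\quad
\frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0.
```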

Thursday, January 24, 2008

Given that we agree not to use an assumption of 'typicality', is there any reason to discard a cosmology where the overwhelming majority of brains are Boltzmann brains? (And the majority of stars, planets, galaxies etc. are also Boltzmann-stars, planets, galaxies...)

The short answer is No. Here is my longer answer.

Dear Thomas, I don't understand what one can possibly mean by an "overwhelming majority of something in a cosmology". I only know how to measure majorities of people or brains or other objects or creatures who live or exist at the same moment (a thickened slice of spacetime), who interact with each other, and who have some enforced equality or a similar mechanism that makes counting of majorities relevant.

If you want to compute majorities in spacetime as opposed to space, what is that supposed to mean? You face exactly the same problems as with the problematic notion of "typicality" in spacetime because typicality and membership in a majority are really the same thing.

First, you should know that John Christy and Roy Spencer (UAH MSU) have identified an error in their competitors' data (RSS MSU). You should notice that two climate skeptics have actually made some data look warmer than previously reported. Would the champions of the global warming alarm ever actively identify an error whose correction would cool down the Earth?

The corrected RSS MSU results are approximately 0.1 °C warmer for 2007 than previously reported and they are closer to UAH MSU; articles such as the one about the very cold year 2007 will have to be corrected. See the link at the top of that page.

Wednesday, January 23, 2008

A cold January is not the most likely month for such "anniversaries" but statistics happens:

I bought the BC 500 speedometer in Summer 1995 or so and it has measured kilometers on two continents. If I had been riding my bike along a geodesic instead of those small gravitational quantum loops, I could have arrived at the antipode of Pilsen (1000 miles southeast of New Zealand). But the puzzle is: what will happen with the number above after another kilometer? It will either

display 20,000 even though the space for the digit "2" seems somewhat constrained
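The antipode remark above can be checked with a few lines; the coordinates for Pilsen are my approximate assumption.

```python
import math

# Antipode of Pilsen and the geodesic distance to it.
pilsen_lat, pilsen_lon = 49.75, 13.38      # degrees N, E (approximate)
antipode_lat = -pilsen_lat                 # ~49.75 S
antipode_lon = pilsen_lon - 180.0          # ~166.62 W, south of New Zealand
half_circumference_km = math.pi * 6371.0   # ~20,015 km along a geodesic,
                                           # close to the odometer's 20,000 km
```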

Paul Langevin was born on 1/23/1872 in Paris. Because he was a rather important representative of French science and a science official, great physicists have encountered him at many conferences. Langevin was also a leading figure who promoted relativity in France.

The modern interpretation of diamagnetism and paramagnetism in terms of electron clouds asymmetrically or anisotropically located within atoms is due to Langevin. In statistical physics, he wrote down the Langevin equation describing Langevin dynamics. The simplest example is Brownian motion in a potential: Langevin's equation is then Newton's equation of motion with the classical potential term, a friction term, and a noise term. He also designed some ultrasound-based technology based on the piezoelectric effect (previously demonstrated by the Curie brothers) to locate submarines during the war but when the gadget was ready, the war was over.
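A minimal sketch of the Langevin equation described above: Newton's equation with a harmonic potential V(x) = k x²/2, a friction term, and a noise term, integrated by a naive Euler-Maruyama step. All parameter values and names are my illustrative choices, not anything from Langevin's papers.

```python
import math
import random

def langevin_trajectory(steps=20000, dt=1e-3, m=1.0, k=1.0, gamma=1.0,
                        kT=1.0, seed=0):
    """Brownian motion in a harmonic potential: m dv/dt = -k x - gamma v + noise."""
    rng = random.Random(seed)
    x, v = 0.0, 0.0
    # White-noise strength fixed by the fluctuation-dissipation relation.
    noise_scale = math.sqrt(2.0 * gamma * kT / dt)
    xs = []
    for _ in range(steps):
        force = -k * x - gamma * v + noise_scale * rng.gauss(0.0, 1.0)
        v += (force / m) * dt
        x += v * dt
        xs.append(x)
    return xs
```

At equilibrium such a trajectory should sample <x²> ≈ kT/k; even a short run like this one reproduces the right order of magnitude.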

says so. The effect used to be called erosion and it has been fought against for decades but now it has become a new result of cutting-edge science because urban-based journalists have no idea about the actual environment.

In this case, it seems easier, more meaningful, and cheaper to please Gaia. You don't need to stop using fossil fuels. Just switch to "no-till farming" that used to be called "chemical farming" but the adjective "chemical" recently became politically incorrect because it masks the opinion that chemical farming is more "natural" than the till farming. ;-)

Farmers have been doing it for decades, anyway. It is they who should know what is good for their business, isn't it? I guess that this problem won't be covered in the media because it is not interesting for urban politics.

Recently we wrote about a physicist who died exactly 100 years ago, namely Lord Kelvin, and a physicist who was born exactly 100 years ago, namely Edward Teller.

Lev Davidovich Landau belongs to the second category because he was born on January 22nd, 1908 into a Jewish family in Baku, Azerbaijan, Russian Empire, to become the most stellar Soviet and Russian physicist of his century.

Another ordinary prodigy

As in many other cases, he was a child prodigy. At the age of 13, he completed high school. Because he couldn't yet enroll at a university, he at least attended two schools in Baku simultaneously. In 1924, he moved to the physics department of Leningrad University where he graduated three years later, at the age of 19. He received a doctorate two years later, at the age of 21.

Another two years later, the Rockefeller Foundation teamed up with Stalin's government and allowed him to visit Göttingen, Leipzig, and especially Copenhagen: Landau always considered himself a pupil of Niels Bohr. In 1930, he became a friend of Edward Teller, among others. Landau traveled to Cambridge and Zürich before he returned to the Soviet Union, to lead theoretical physics in Kharkov, Ukraine.

One of his mottos was versatility. His students had to pass the "theoretical minimum" that covered all aspects of theoretical physics and only 43 candidates ever passed. I pretty much agree with this philosophy and I would say that I have passed an equivalent of it. In this way, the students may become more than narrow specialists.

Monday, January 21, 2008

While many places on Earth recently witnessed the coldest days in many decades, many not quite reasonable people continue their crusade to regulate the world's carbon cycle in order to "fight climate change".

While the U.S. Department of Energy argues that the U.S. won't follow Australia and that, regardless of the winner of the 2008 elections, it won't join Kyoto, several more "progressive" regions of the world prefer a less reasonable approach.

A powerful German energy lobby group has calculated that certain new rules proposed by the EU could increase the costs of the carbon trading scheme 18-fold and make things more expensive for Germany by EUR 17 billion. They see the European industry in danger, following their calculation assuming a EUR 30 price per ton.

Meanwhile, the current EU ETS price of carbon emissions is between 1 and 2 eurocents per ton. ;-)

Food fights between the EU members are beginning and Germany expects that one million jobs may be lost as a result of the new scheme. Steelmakers and representatives of other industries argue that if this lunacy continues, they will simply leave Europe.

Moreover, Japan is wise enough to propose 2000 as the new reference year, instead of 1990, to determine future emissions according to a successor of the Kyoto treaty. This fact also makes the situation more difficult for Europe.

One of the main reasons why Europe has been so supportive of these schemes is that they were pegged to 1990 and Europe's CO2 production actually dropped during the 1990-2000 decade. The reasons had nothing to do with global warming - see e.g. Communism, capitalism, and environment - and Europe could simply benefit from being already below the 1990 numbers.

I personally think that if a regulation scheme had to be adopted - which I don't believe to be the case - it would be fairer if a later year were chosen as the reference year. With 1990 as the reference year, many regions of the world are being punished for their growth in the 1990s while Europe is irrationally rewarded.

One sixth of Britons, close to a 10-year record, suffer from fuel poverty (more than 10% of income spent on utility bills). Green energy policy is one of the underlying reasons for this bad trend.

Sunday, January 20, 2008

André-Marie Ampère was born on January 20th, 1775, in Lyon, France, into the family of a rich and smart merchant and a pious mother. He was a child prodigy who lived in a nearby village. André-Marie was able to sum long arithmetic series using pebbles and biscuits before he knew the written figures.

His father wanted to teach him Latin but soon realized that the boy preferred maths and physics. Nevertheless, André-Marie had to learn Latin anyway, in order to read the papers of Euler and Bernoulli. ;-)

Love and faith

In 1796, he fell in love with a very religious working-class woman, Julie Carron, and they married in 1799. However, two years later, he moved to Bourg to earn some money for the family as a professor, leaving his ailing wife and son (later the French philologist Jean-Jacques Ampère) in Lyon. Sadly, she died in 1804 and his heart was broken forever. On the day she died, he copied a touching verse from the Psalms and started to read the Bible and the Fathers of the Church more regularly.

When he was 18, he quoted the following three events as the key to his life: his first holy communion; the reading of "Eulogy of Descartes" by Thomas; and the taking of the Bastille.

Electrodynamics

Jean Baptiste Joseph Delambre was impressed by Ampère's probabilistic analysis showing why gamblers always lose in the long run and he made sure that Ampère stayed in physics. What did Ampère do on September 11th, 1820? No, he didn't fly into the World Trade Center and he didn't defend his PhD thesis. Instead, he heard of Hans Christian Ørsted's strange new phenomenon: a magnetic needle moves when there is a voltaic current nearby.

Saturday, January 19, 2008

The authors have numerically calculated the properties of non-supersymmetric D0-like black holes in the maximally supersymmetric BFSS matrix model. The most difficult part is the strong coupling relevant for the Schwarzschild black holes, which is what the D0-branes become.

They have used powerful computers and flexible algorithms to optimize their calculation and the resulting energy-temperature relation agrees with gravity even at strong coupling and even though the agreement cannot be guaranteed by any supersymmetric non-renormalization theorems because the whole setup breaks supersymmetry. Of course, I have never had any doubts that it would work but it is cool that one can actually do it. They can now literally calculate how the black holes are composed out of stringy objects.

Their calculation is not a lattice gauge theory approach but rather a sophisticated Fourier expansion over time with the Polyakov lines used as the key order parameter whose nonzero vev at all temperatures shows that supersymmetry removes the phase transition, as expected (in contrast to a purely bosonic model).

Half of the USD 500,000 total award goes to Rashid Sunyaev (middle) and one quarter each goes to Maxim Kontsevich (left) and Edward Witten (right). Congratulations. Click the picture for more details.

Friday, January 18, 2008

I have emphasized that the Bayesian probabilities are subjective in character. They depend on the precise evidence that one uses in his reasoning. It is meaningless to calculate Bayesian probabilities too accurately or claim that science has calculated one of them to be 90%. For example, if a report says that the probability that most of the 20th century warming was caused by man-made CO2 emissions was determined by science and equals 90%, it proves that the authors are just parrots who don't know what such probabilities mean. Why?

If someone's probability that the statement is correct really equals 90%, it means that the person thinks that a better scientist who could actually choose and analyze better and more extensive evidence more carefully would end up saying that the probability of that statement is 100% (with probability 90%) or the probability of that statement is 0% (with probability 10%). The precise figure of 90% is just a subjective result and a temporary state of affairs. The only reason why it is not equal to 0% or 100% is that the question is not settled. Some people can't distinguish subjective psychological conclusions from objective science.

Goal of this text

But it turned out that there exists another problem. Other people, and sometimes it is the same ones, also don't know how to look for their own subjective opinions and probabilities rationally. Bayesian inference is a good method to separate assumptions from results and to provide us with a solid methodology to use evidence and arrive at reasonable conclusions about the likelihood of various statements.

So even though nothing changes about my criticism of subjective probabilities, I will dedicate a special positive article to Bayesian inference and shed some light on its relevance for the naturalness problem, anthropic misinterpretations of the landscape, retrodictions, and thermodynamics.

Articles on this blog criticizing the anthropic reasoning are far too numerous to be listed. Try the landscape category.

Bayes' formula: its meaning

Rev. Thomas Bayes (1702-1761) was more than a Presbyterian minister. He was also a mathematician who gave us a useful formula telling us how our psychological probabilities should be refined if we obtain some new evidence. I recommend this Wikipedia article for a pretty clear explanation. Nevertheless, I will give you mine, too.

In this formula, we investigate different, complementary (and mutually incompatible) hypotheses H_i to explain some phenomena. Before we obtained the new evidence, we had some idea about their likelihood (from previous evidence, from the testimonies of our favorite and wise friends, or from some laws of physics such as the Hartle-Hawking state, if you wish). These subjective probabilities P(H_i) are called the priors. If you like to think in terms of physics, the initial conditions of a physical system are the best example of such hypotheses: each initial state is a hypothesis H_i. Bayes' formula is then a method of retrodiction.

If we don't know anything about the validity of the theories at all, the priors should give a chance to every qualitatively or macroscopically different hypothesis to survive (e.g. 1/N). You shouldn't kill a theory by choosing a ridiculously small prior just because it has a low entropy or a small number of extraterrestrial aliens etc.

Suddenly, we observe some evidence E. In the case of retrodiction, E is a particular feature of the final state that we observe - for example the whole macroscopic or approximate description of such a final state. We use this final state to deduce the initial state. In exact, microscopic physics, there would be a one-to-one correspondence between initial and final states. But if we only know some partial (e.g. macroscopic, in the thermodynamic sense, inaccurate, or otherwise incomplete) information about the final state, it is impossible to uniquely deduce the initial state, not even its basic macroscopic properties.

Predictions become irreversible and retrodictions follow different rules, ideally the rules of Bayesian inference. Because this kind of reasoning inherently depends on the priors, there will always be an uncertainty in any kind of retrodiction.

Because rationally thinking people want to avoid the base rate fallacy and they care about the evidence, their guess about the probability of different hypotheses (or initial conditions) is influenced by the evidence. The probability of a hypothesis after we obtain evidence E, namely the so-called posterior probability P(H_i|E), written down as the conditional probability of H_i given the evidence E, will be different from the prior P(H_i). But how much will it differ?

Bayes' formula: a derivation

First, assume that every hypothesis H_i either claims that the evidence should occur, i.e. the conditional probability P(E|H_i)=1, or it shouldn't occur, P(E|H_i)=0. How does the evidence that E has actually occurred influence the probabilities?

Well, it's easy in this case. We simply eliminate the hypotheses that have been falsified, i.e. those H_i with P(E|H_i)=0 that predict that the evidence E shouldn't occur. Note that all your knowledge of dynamics and Feynman diagrams is only used to calculate the conditional probabilities of evidence E in a hypothesis, P(E|H_i). That's where all the dynamics is hidden.

But when E is observed, we shouldn't change the ratios of probabilities of the hypotheses that have survived because all of them passed equally well. So we must only renormalize all of these probabilities by a universal factor independent of the particular hypothesis H_i, a factor chosen to guarantee that the sum of P(H_i|E) over i equals one. The correct formula is thus:

P(H_i|E) = P(H_i) P(E|H_i) / P(E),

where P(E) is the normalization factor equal to the sum of P(H_i)P(E|H_i). Note that with this factor, the sum of the posterior probabilities P(H_i|E) equals one. Also, this number P(E) is equal to the weighted average of the conditional probabilities P(E|H_i) of the evidence E over different hypotheses. It is naturally weighted by the prior probabilities of these hypotheses which is why it is natural to call it simply P(E), the so-called marginal probability of seeing the evidence E.

Check that the formula has all the desired properties. If a hypothesis H_i predicts that E shouldn't occur, i.e. if the conditional probability of E given H_i, namely P(E|H_i)=0, then the posterior probability of H_i will also vanish. The hypothesis has been falsified. For the hypotheses where P(E|H_i)=1, we see that P(H_i) and P(H_i|E), the prior and posterior probabilities, only differ by the universal normalization factor which is what we wanted.

In reality, hypotheses often imply that the evidence E isn't sharply predicted or sharply impossible. Instead, a hypothesis H_i can predict E to have a probability P(E|H_i) between 0 and 1, given e.g. by the expectation value of a projection operator for the evidence E in quantum mechanics. It is natural to make the posterior probabilities P(H_i|E) linear in the conditional probabilities P(E|H_i). In other words, we can use the displayed formula above even if the conditional probabilities are generic numbers in between 0 and 1. Then it is the real Bayes' formula.
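The formula is simple enough to state as code. Here is a minimal sketch of the discrete version described above; the hypothesis count and the numbers are made up for illustration:

```python
# Discrete Bayes' formula: posteriors P(H_i|E) from priors P(H_i)
# and likelihoods P(E|H_i). The numbers below are illustrative only.

def bayes_update(priors, likelihoods):
    """Return the posterior probabilities P(H_i|E)."""
    # unnormalized posteriors: P(H_i) * P(E|H_i)
    unnorm = [p * l for p, l in zip(priors, likelihoods)]
    # marginal probability of the evidence, P(E) = sum_i P(H_i) P(E|H_i)
    p_e = sum(unnorm)
    return [u / p_e for u in unnorm]

# Three hypotheses with uniform priors; the evidence E is certain under H_1,
# has probability 0.5 under H_2, and is impossible under H_3.
posteriors = bayes_update([1/3, 1/3, 1/3], [1.0, 0.5, 0.0])
print(posteriors)  # [2/3, 1/3, 0.0] -- the falsified H_3 is eliminated
```

Note that the surviving hypotheses keep their ratio of probabilities (2:1 here, as 1.0 : 0.5), exactly as the derivation above demands.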

Everything clear?

Avoiding repeated evidence

You might ask why the relationship between P(H_i|E) and P(E|H_i) is linear. Well, the choice is natural because you can imagine that H_i is divided into equally likely "subhypotheses" H_ij, some of which (i.e. for some choices of j) have P(E|H_ij)=0 while others have P(E|H_ij)=1. In this setup, P(E|H_i) is the proportion of the subhypotheses H_ij of H_i with P(E|H_ij)=1. With this interpretation, the continuous Bayes' formula may be derived from the formula where P(E|H_i) is only allowed to be 0 or 1, assuming that you choose the prior probabilities P(H_ij) of the subhypotheses to be independent of j.
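The subhypothesis argument can be checked numerically. A small sketch with made-up numbers: split each hypothesis into ten equally likely subhypotheses that predict E with probability 0 or 1, apply the binary update, and compare with the continuous formula:

```python
# Checking the linearity argument: the binary (0/1) Bayes update over
# equally likely subhypotheses reproduces the continuous Bayes' formula.
# All numbers here are illustrative.

def bayes_update(priors, likelihoods):
    unnorm = [p * l for p, l in zip(priors, likelihoods)]
    p_e = sum(unnorm)
    return [u / p_e for u in unnorm]

# Continuous route: P(E|H_A) = 0.6, P(E|H_B) = 0.2, uniform priors.
continuous = bayes_update([0.5, 0.5], [0.6, 0.2])

# Subhypothesis route: split each H into 10 subhypotheses of prior 0.05;
# 6 of A's (resp. 2 of B's) predict E with certainty, the rest forbid it.
sub_priors = [0.05] * 20
sub_likes = [1.0] * 6 + [0.0] * 4 + [1.0] * 2 + [0.0] * 8
sub_post = bayes_update(sub_priors, sub_likes)
discrete = [sum(sub_post[:10]), sum(sub_post[10:])]

print(continuous, discrete)  # both give [0.75, 0.25]
```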

But the question why the relationship is linear is a good one, anyway. For example, if someone incorrectly uses the same (or not at all independent) evidence E to refine his probabilities of hypotheses twice or thrice, he could end up with a quadratic or cubic relationship.

If someone uses the evidence 2500 times, as the IPCC does, the relationship will be a power law with the exponent equal to 2500. Posterior probabilities calculated in this way will effectively set the probabilities of all the theories H_i except for the one that maximizes P(E|H_i) equal to zero. But such posterior probabilities are, of course, completely wrong. The corresponding logical fallacy - pretending that a likely alternative (or merely the most likely one among many) must occur - is called the appeal to probability and it is the most frequent argument in all kinds of alarmism and paranoia. The correct relationship must be linear and only independent evidence may be used to refine the probabilities of hypotheses in Bayesian reasoning.
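A toy calculation, with hypothetical numbers, shows how badly this double counting fails:

```python
# Reusing the same (non-independent) evidence N times raises the likelihoods
# to the N-th power, wrongly concentrating all posterior probability on the
# single hypothesis maximizing P(E|H_i). Illustrative numbers only.

def bayes_update(priors, likelihoods):
    unnorm = [p * l for p, l in zip(priors, likelihoods)]
    return [u / sum(unnorm) for u in unnorm]

priors = [0.5, 0.5]
likelihoods = [0.6, 0.4]   # E only mildly favors H_1 over H_2

posteriors = list(priors)
for _ in range(2500):      # the same evidence incorrectly reused 2500 times
    posteriors = bayes_update(posteriors, likelihoods)

print(posteriors)  # H_1 gets essentially all the probability:
                   # a mild 60:40 preference inflated into certainty
```

A single honest update would have given [0.6, 0.4]; the 2500-fold reuse manufactures a certainty that the evidence never contained.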

As we have repeatedly discussed in the articles about thermodynamics, H is not a mirror image of E in this framework. Well, let me be more careful. It is true that Bayes' formula can be written in the following H-E symmetric form,

P(H_i|E) / P(H_i) = P(E|H_i) / P(E),

which is why you should believe that its essence kind of remembers the underlying H-E symmetry that becomes the time-reversal symmetry if we use the formula for retrodictions.

However, the interpretation of different objects in the formula above is asymmetric. For example, the conditional probability P(E|H_i) is a sharply calculable prediction of the hypothesis H_i for seeing evidence E, for example the expectation value of a projection operator representing E in quantum mechanics (or a sum of squared amplitudes) with H_i as the initial state. It has an objective and unchanging meaning. On the other hand, the posterior probability P(H_i|E) is a subjective probability of a hypothesis after we have taken some evidence E into account. It has no objective or eternal meaning, especially because it depends on the priors P(H_i).

Most importantly, P(H_i|E) is not equal to P(E|H_i) even though some people incorrectly think that time evolution is time-reversal-symmetric even when the information is incomplete: this assumption would imply that the two quantities should be equal to each other, much like the squared absolute values of the inner products of "evolved initial" and "final" states in both orders. But they are not equal. The mistaken belief that they are approximately or exactly equal is so widespread that it has a name: it is called "the conditional probability fallacy". Mathematician John A. Paulos explains that the mistake is often made by highly-educated non-statisticians such as doctors and lawyers (and cosmologists such as Sean Carroll).
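The textbook numerical illustration of this fallacy is a test for a rare disease; the numbers below are hypothetical:

```python
# The "conditional probability fallacy": P(positive test | sick) can be 99%
# while P(sick | positive test) is only a couple of percent, because the
# prior (the base rate) matters. Hypothetical numbers.

p_sick = 0.001          # prior P(H): 1 in 1000 has the disease
p_pos_sick = 0.99       # P(E|H): test sensitivity
p_pos_healthy = 0.05    # false positive rate

# Bayes' formula: P(H|E) = P(H) P(E|H) / P(E)
p_pos = p_sick * p_pos_sick + (1 - p_sick) * p_pos_healthy  # marginal P(E)
p_sick_pos = p_sick * p_pos_sick / p_pos

print(round(p_sick_pos, 3))  # ~0.019, nowhere near 0.99
```

Equating the 99% with the 2% is precisely the P(E|H) = P(H|E) mistake discussed above.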

Analogously, P(H) and P(E) play a different role, too. P(H) is purely subjective - or it depends on previous data that have nothing to do with the new evidence E - while P(E) depends both on the subjective likelihoods of H_i as well as calculations of the evidence E in the different hypotheses.

The purpose of this short section was to repeat that retrodictions in physical theories are not canonical, unlike predictions. They always depend on priors. Once any kind of incomplete information occurs in your discussion - e.g. if you only study the system at the macroscopic level - retrodictions follow very different rules than predictions. That's why high entropy may be predicted in the future but not in the past. The people who still don't get this basic asymmetry are probably just too zealous or too stupid.

Bayesian inference and naturalness

But I also want to discuss other topics related to Bayes' formula. We will dedicate a few paragraphs to a simple question, namely the interpretation of naturalness. Naturalness in particle physics says that dimensionless parameters in the Lagrangian are expected to be around one. Is it a universal law of physics?

No, it is just a psychological expectation. Consider the QCD theta-angle. We know that a shift by 2π is physically inconsequential which is why the QCD theta-angle is a priori a number between 0 and 2π. If we don't know anything, we should assume a uniform probability distribution for this parameter.

Is such an assumption canonical? Nope. It is just a guess. For example, you could also think that a power of theta, not theta itself, has a uniform distribution which would be equivalent to a different distribution for theta. In this case, a uniform distribution for theta itself sounds more "intuitive" because it measures the volume on a moduli space once theta becomes a modulus but there is no hard proof that it is the correct one. Equally importantly, different sensible distributions that are uniform in simple functions of theta lead to the same qualitative conclusions.

What conclusions? Well, if we assume the prior probability P(H_i) to be uniform - in this case, we must clearly use a continuous setup of Bayes' formula where probabilities P(E|H_i) and P(H_i|E) become densities and sums over i must be turned into integrals over theta - the probability that theta is gonna be smaller than 10^{-9} is clearly smaller than 10^{-9} (over two pi). Even with slightly different distributions, the probability will be very small.
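A quick sketch of the estimate, including one alternative prior to check the robustness claim (the choice of the alternative, uniform in the square root of theta, is mine):

```python
# Prior probability that theta < 1e-9 under a uniform prior on [0, 2*pi),
# and under an alternative prior uniform in sqrt(theta).
import math

theta_bound = 1e-9

# Uniform prior for theta itself: P(theta < b) = b / (2*pi)
p_uniform = theta_bound / (2 * math.pi)
print(p_uniform)  # ~1.6e-10

# Prior uniform in x = sqrt(theta) on [0, sqrt(2*pi)):
# P(theta < b) = P(x < sqrt(b)) = sqrt(b / (2*pi))
p_sqrt = math.sqrt(theta_bound / (2 * math.pi))
print(p_sqrt)  # ~1.3e-5, still minuscule
```

Either way, the observed theta sits in a region to which any simple prior assigns a tiny probability, which is the quantitative content of the surprise.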

So we should be surprised. Of course that there is no contradiction here if theta is measured to be smaller than 10^{-9} as it indeed is. But the surprise strongly suggests that the prior probability is probably unrealistic. In other words, there must be some other, so far unknown and not quite random physical phenomenon or phenomena (for example, a new substructure of the particles and their quantum fields or a new symmetry that implies new cancellations, at least approximate ones) that make small values of theta (or zero) more likely. Once you understand these phenomena (perhaps axions, in this case) more correctly, your expectations for the distribution of theta will obviously change. If you are lucky, the strong CP-problem - the puzzle why theta is so surprisingly small - will evaporate.

Once we understood inflation, the huge size or mass of the observable Universe in Planck units also became less mysterious. There are many examples of this kind. Science is about making surprises less surprising, after all.

Incidentally, in the case of the theta angle, we seem to know that the anthropic principle can't be enough to show that theta is very small because life could probably exist for large theta, too. If someone claims that the anthropic principle clarifies all unnatural puzzles and hierarchies in current particle physics, this observation of mine pretty much falsifies his statement. The anthropic principle is not enough to make all small numbers sound natural. It can constrain others. But is it unexpected that some quantities are constrained by life and others are not? Can the tautological fact that a correct theory of the Cosmos must be compatible with life be used to derive anything non-trivial about the Universe? Which things can be derived and which things can't? How many of them should be derivable (clearly not all of them)? How do you decide in which of them the anthropic explanation is enough and which of them should get a better one? Is there any rationally justifiable answer here?

I am not aware of one. The statement by Nima et al. that the dimensionful parameters of the Lagrangian are more affected by anthropic arguments than the dimensionless ones is the closest thing to a rationally semi-justifiable observation I can think about.

Bayesian inference and the anthropic principle

Bayesian inference allows us to sharply distinguish what is our assumption - the prior probabilities - and what is actually being deduced from some evidence E. Some people use seemingly rational proportionality laws that they present either as results of some evidence E - which they are clearly not - or as justified priors - even though there exists no rational justification for such priors. These mistakes have been pointed out and corrected by many people, for example by Hartle and Srednicki in Phys Rev D, but many people still don't get it.

The first logical fallacy is called the "selection fallacy" by Hartle and Srednicki and it involves counting the more or less intelligent observers, the density of life, or counting the vacua in a class of stringy compactifications in order to find out which kind of background is more likely to describe the real Universe, assuming that we are "typical representatives" of a class. People often say that a class of stringy vacua is likely to be correct because it has many elements. Others say that universes with a high expectation value of intelligent observers or a higher density of life per galaxy are more likely.

Are these things scientific? In other words, do they follow from Bayesian inference?

For example, let us consider the most important example in which the existence of humans is the evidence E from the formula and we use it to refine our subjective probabilities of different stationary points in the landscape. Some people argue that the theories with a (very) large portion of the Universe occupied by intelligent life are (strongly) preferred. Is that true?

To answer this question, we must be very careful about what E actually says. The evidence we have says that at least at one point of this observable Universe, there exists a human civilization. In particular, the available evidence does not say that at a randomly chosen place of this Universe, one finds a human civilization. This is a very important subtlety. ;-)

Why is this subtlety so confusing? Because we may say that our planet is located at a random point of the Universe - a sentence that sounds correct and almost equivalent to the previous one. But the meaning of the word "random" is different than it was in the previous paragraph. The people who don't distinguish the role of the adjective in these two contexts are, in fact, making the very same error as the people who don't distinguish predictions and retrodictions.

When we say that our planet is located at a random place of the Universe, it means that we are not aware of any special properties of our Galaxy or the region occupied by the Solar System. If you pick a random galaxy and a random star in it, you end up with stars that are pretty similar to the Sun. That's why the sentence "We live at a random place of the Cosmos" is kind of correct.

But you don't end up with the Sun itself (you shouldn't forget to randomize your random generator!) which is why the sentence "At a random place of the Universe, you find humans" is obviously wrong. Random stars in random galaxies usually don't have life on them and even if they do, the creatures don't look like us. If there were humans with TV antennas orbiting nearby stars, we would have already detected many of them.

So it is very important to notice that there is no evidence behind the statement "At a random place of the Universe, you find humans" because in this sentence, you allow the other people to run their random generators and look at their random places. They will find no life over there. If you want to use life as evidence to refine your ideas about the validity of different theories, you must formulate E more carefully.

Once again, the justifiable statement is that "At least one star in the Universe is orbited by a planet with humans". Furthermore, you may add another justifiable observation that the lively star looks much like many other stars. If a hypothesis H_i seems to predict that there are humans somewhere in the Universe, it doesn't matter how many civilizations or how high a density it predicts. It simply passes the test.

If you deal with a theory of a multiverse or a class of string compactifications, which involve more or less well-defined sets of stringy backgrounds in both cases, the corresponding hypothesis really says that "Our planet lives in the universe that is correctly described by [at least] one vacuum in the corresponding set of vacua."

Again, this is the hypothesis we are testing. If you use the existence of life to decide which class of compactifications is more likely, the only thing that matters is whether at least one vacuum in the class is good enough to admit life similar to ours. Once a class of vacua passes the test, it just passes.

Whether life of our kind is predicted in a large percentage of the compactifications or a small percentage of compactifications is clearly irrelevant. If you go through the exact formulation of the hypotheses and evidence and use the correct Bayes' formula, you will see that I am right and the anthropic people are simply making a mistake. Their reasoning was too sloppy. There is no mystery here.

So why do I like heterotic vacua?

You might say that if I deny that an observed property of the Cosmos should better be generic in a promising class of vacua, I also undermine a reason why I believe that the heterotic vacua - that can pretty simply and naturally give the Standard Model gauge group within a grand unified framework - are more likely to be correct than the type IIB flux vacua. Some vacua in the type IIB set will have these properties, too, and I just said that one was enough. So isn't it a tie?

OK. So how do I formulate my thinking in the Bayesian framework? In this case, it is all about the priors. I simply believe that there exists a cosmological mechanism that makes "simple" vacua, with a proper definition of the word "simple", more likely to result from a cosmological evolution and more likely to survive various instabilities, inconsistencies, and dualities that will be discovered in the future. Or perhaps, a new theory of initial conditions will assign simple vacua a greater weight. Simple vacua are preferred much like low-lying states of a cool enough harmonic oscillator. Because of this reason, my prior probabilities are concentrated around the "simple" representatives of various compactifications, for example the heterotic compactifications with small Hodge numbers or braneworlds with small numbers of branes or small fluxes.

In the type IIB set, my prior is mostly located at too simple flux compactifications that simply do not give us the correct gauge group or the correct fermion spectrum. With this kind of reasoning, I end up thinking that the heterotic vacua that predict a pretty good physics "without much work" and with "specially looking manifolds" are actually more likely to be true than numerous vacua that are more generically incorrect. But I also realize that this conclusion depends on my prior belief in some kind of simplicity of the world, in Nature's tendency to choose special compactifications.

Constraints from a small cosmological constant

The cosmological constant is observed to be something like 10^{-123} in the Planck units. This observation is the main empirical evidence used to defend the anthropic ideas. Using Bayesian reasoning, does the small cosmological constant actually imply that we must live in a vacuum inside a dense discretuum i.e. a huge landscape of possibilities?

As usual, the answer depends on the priors. First, let us assume that the cosmological constant in any realistic, supersymmetry-breaking vacuum must be a random number whose distribution is peaked somewhere around the Planck density. Then, it is indeed unlikely for a string compactification to generate any region of space where the cosmological constant would be so tiny. The probability that at least one place or bubble has the right cosmological constant approaches one as soon as you consider an ensemble of 10^{123} vacua or more. That's why the anthropic people like the large landscape.
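Under the stated assumption that each vacuum's cosmological constant is an independent draw with probability p ~ 10^{-123} of landing in the observed tiny window, the counting goes as follows (a rough sketch, not a statement about any particular compactification):

```python
# P(at least one of N vacua has a viable cosmological constant)
#   = 1 - (1-p)^N  ~  1 - exp(-N*p)  ~  N*p   for small N*p,
# approaching one only once N ~ 1/p ~ 1e123. Assumed numbers.
import math

p = 1e-123  # assumed chance that one random vacuum lands in the window
results = {}
for n_log10 in (40, 123, 125):
    n = 10.0 ** n_log10
    # -expm1(-n*p) computes 1 - exp(-n*p) without catastrophic rounding
    results[n_log10] = -math.expm1(-n * p)

print(results)  # {40: ~1e-83, 123: ~0.63, 125: ~1.0}
```

So even 10^{40} vacua leave the tiny cosmological constant absurdly improbable; only an ensemble of roughly 10^{123} or more makes at least one success likely, which is the arithmetic behind the anthropic preference for a large landscape.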

Imagine that you only consider a small set of candidate vacua, for example the 10+10+10+10 most beautiful heterotic, Hořava-Witten, G_2 holonomy, and F-theory vacua. What does the observed cosmological constant - the evidence E - tell you about the probabilities? Well, indeed, the uniform priors would imply that the tiny observed cosmological constant would make it unlikely for one of these 40 theories to be correct.

However, my prior is not uniform. I think that there can exist many potential mechanisms such as the cosmological seesaw mechanism that make small values of the cosmological constant pretty natural. I am not certain about the existence of such a mechanism but I assign a non-negligible probability to its existence. This nonzero probability therefore influences my (inaccurately known, in this case) conditional probabilities for various theories to generate various values of the cosmological constant so that a small cosmological constant is simply not astronomically unlikely anymore: there is a finite "tail" near zero. With these assumptions, I don't need a huge set of possibilities.

What I say should have been expected. Whether or not you need a huge landscape depends on your beliefs. If your priors reflect your belief that there can't exist any mechanisms or alternative calculations making small lambda likely, a huge landscape of 10^{123}+ vacua is almost necessary. If you believe that there is a chance that a more detailed calculation can actually show that the cosmological constant likes to be small, the huge landscape is not needed.

If the correct answer is that there are way too many vacua and we live in a rather generic one, it still doesn't tell you much about other questions. For example, even if you know that the cosmological constant (or the number of dimensions of space) has an anthropic explanation, it is no free ticket for the anthropic explanations to spread.

Whether or not the strong CP-problem is explained by having many vacua is a new question, unrelated to the cosmological constant. And the answer to this question is almost certainly that the right explanation is non-anthropic, e.g. axions. These answers - whether the anthropic explanation is relevant for some question - primarily depend on something else than the universal, religious power of the anthropic principle. Quite on the contrary, these answers depend on the existence of deeper and more accurate explanations for the individual features of the Universe. Every physically independent question is a new one.

Life should be likely: but how likely?

You often hear that theories that predict that our life is more likely are more acceptable than the theories that predict that our life is much less likely. Indeed, this is a correct principle. In Bayes' formula, the theories that probably lead to life have a higher value of P(E|H_i), where E stands for life, which increases P(H_i|E), too.

However, once again, you must be very careful about what the probability P(E|H_i) means. It is the probability that life E emerges somewhere - in at least one region - in the observable Universe predicted by the theory H_i. If your theory H_i predicts that a huge fraction of stars have life, it doesn't increase its posterior probability P(H_i|E) simply because the high density doesn't matter since there is no evidence for such a high density! One lively planet is good enough. You can't have a probability P(E|H_i) greater than one.

If you imagine that a theory predicts a spatially infinite Universe, you could protest that such an infinite Universe will inevitably generate humans somewhere and my prescription assigns an unfairly high probability to such a theory. You might think that such a theory should be punished for predicting a very low density of life. I disagree. One planet predicted by such a theory where the phenomena look just like the phenomena observed from the Earth and follow the same patterns and relationships is simply good enough for the theory to pass the test of life.

In this context, you should notice that a theory that produces Boltzmann's brains in a spatially and temporally infinite Universe may also pass the test of life but it fails other tests. Indeed, life can emerge somewhere in a spatially and temporally infinite Universe in the form of Boltzmann's brains. So the conditional probability P(E|H_i) where E is life and H_i is a theory with the infinite Universe is equal to one and the theory is not punished by the observed existence of life at all, whether or not the theory predicts burning stars!

On the other hand, the binary fact about the existence of life is not the only evidence that can be used to refine the probabilities of various theories. Additional evidence implies, among billions of other things, that the observations E_2 from many telescopes are consistent with an ordered Big Bang cosmology. The probability of such an outcome predicted by Boltzmann's brains is something like exp(-s) where s is the number of data points ever measured in science. ;-) It is this evidence - the observed order of the real world that seems to make sense - that effectively rules out Boltzmann's brains as a correct explanation. But the observed existence of life itself is simply not constraining enough to do so!

Some people may correctly want to punish Boltzmann's brain theories, but they fail to identify the correct reason why we know that these theories are very unlikely. The reason is not the known existence of life, nor the low density of life predicted by those theories, but the observed order of our empirical data, an order that every Boltzmann's brain theory predicts to be astronomically unlikely.

Summary

Everyone should learn the formulation and a proof of Bayes' formula and use them carefully whenever there is a controversy about the calculation of some probabilities - especially when people's opinions about the probabilities differ by exponentially huge factors, or when there is a dispute about the difference between assumptions and the insights obtained from the evidence.

Once you do so, many arguments can be shown to be simply wrong, while others turn out to be nothing more than an encoded version of the author's preconceptions - preconceptions supported by no evidence. Technically, the latter mistake amounts to choosing exponentially small priors for sensible (and probably true) theories.
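A hypothetical numerical sketch of this mistake (all numbers invented for illustration): even evidence favoring a theory by a likelihood ratio of a million is erased if one quietly chooses an exponentially small prior against it.

```python
# Hypothetical sketch: smuggling a conclusion in through the prior.
# The evidence favors theory A over theory B by a likelihood ratio of 10**6...
lik_A, lik_B = 1e-3, 1e-9

# ...but an exponentially small prior chosen for A decides everything anyway.
prior_A, prior_B = 1e-30, 1.0 - 1e-30

posterior_A = lik_A * prior_A / (lik_A * prior_A + lik_B * prior_B)
print(posterior_A)  # ~ 1e-24: the "conclusion" came from the prior, not the evidence
```

Whoever reports such a posterior as a result of "the evidence" has really just reported his own prior back to himself.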

Conclusions that the young Universe had to have a high entropy; that a scientific theory predicts we should be Boltzmann's brains; that classes of vacua are better if they produce bigger Universes with denser life, or if the class of compactifications is very numerous - all these conclusions can be sharply identified as results of faulty or sloppy reasoning: incorrect versions of Bayes' formula, misinterpretation of the hypotheses or the available evidence, or illegitimate choices of prior probabilities that suppress the correct answers a priori.

Let's avoid these mistakes, please.

And that's the memo.

Bonus: craziness of Bousso et al.

While Hartle and Srednicki are not only right but also win the citation-count battle in this typicality discipline, there are still people who disagree with their (and Bayes') obviously correct rules.

For example, Bousso, Freivogel, and Yang argue on page 1 of their bizarre paper about Boltzmann babies that Hartle and Srednicki's rule - that we are not allowed to assume our civilization's typicality - implies that we can't deduce anything from our predictions and that science as we know it is impossible. In other words, the anthropic principle is a pillar underlying all of science. Wow. ;-)

In their thought experiment, a theory T1 predicts the electron in your lab to have spin up with probability epsilon (much smaller than one) while T2 predicts spin down with probability epsilon. If you measure the spin to be up, T1 is pretty much falsified while T2 is confirmed.

Bousso et al. claim that one can't make this conclusion in the Hartle-Srednicki setup. Why? Because - hold your breath - we should actually compute the probability that the spin is up in at least one laboratory of the Universe predicted by T1, and this probability is not epsilon but X = 1-(1-epsilon)^L where L is the number of labs in the Universe. This number X is effectively equal to one for very large L, leading to the opposite of the correct conclusion.
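For concreteness, here is the arithmetic behind their claim, with made-up values of epsilon and L:

```python
# The quantity computed by Bousso et al., with hypothetical numbers:
# probability that AT LEAST ONE of L independent labs sees spin up,
# if each lab sees it with probability epsilon.
epsilon = 1e-6
L = 10**9  # number of labs in the Universe (made up)

X = 1.0 - (1.0 - epsilon) ** L
print(X)  # effectively 1, since (1-epsilon)^L ~ exp(-epsilon*L) = exp(-1000)
```

The arithmetic is correct; the problem, as argued below, is that X answers a different question than the experiment asks: the evidence concerns the spin in one particular lab, whose predicted probability remains epsilon regardless of L.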

The conclusion by Bousso et al. is of course complete rubbish. When T1 predicts P(up)=epsilon, it is a probabilistic prediction that applies to every single lab in the Universe with the same initial conditions. It holds for typical labs as well as atypical labs, labs led by men and women, liberals and conservatives. In fact, the free will theorem guarantees that the electrons randomly decide according to the statistical predictions and are not affected by the lab in which they live or by any of the data in its past light cone: you can't really divide the labs into typical and atypical ones because all the electrons are free and their random decisions are unaffected by their environment (e.g. by hidden variables, which are thus forbidden).

By the way, it is useful to have many labs or many copies of the experiment if you want to measure the probabilities more accurately. Bousso et al. argue that according to Bayesian reasoning, having many labs makes things less conclusive, which sounds like complete madness to me. I don't even know what confusion leads them to this conclusion so I can't discuss it. But having many labs is a different topic and I want to talk about the single-lab setup only.

When you say that T1 predicts P(up)=epsilon in your lab, you don't need to be making any assumption about your lab whatsoever, except for the assumptions and initial conditions that were used to calculate the result. The theory has already made the prediction for P(up) and it was epsilon. The statement that "at least one lab in the Universe saw the electron spinning up" is a completely different statement than the statement that "the electron in your lab - or any other one concrete lab - is spinning up". Raphael et al. seem to mix up these two different statements.

What does the evidence in the two cases actually say?

Because I don't genuinely believe that they're so confused that they can't distinguish these two clearly different statements, I think that the reason for their confusion must lie elsewhere. I think that they actually misunderstand which of these statements has been empirically justified in the two situations (spinning electron vs life in the Universe).

If we measure the spin to be up, we have actually proven the statement that "the electron spin in our particular lab is up". More concretely, it is the same lab for which we have defined the initial conditions. Quantum mechanics was able to link the initial state of this particular lab with the measurements of the spin in the same lab. It doesn't matter which lab in the Universe it was. The important thing is that we are still talking about the same lab.

If Raphael et al. use the initial conditions in the lab No. 2008 and apply quantum mechanics to T1 to deduce that at least one lab in the whole Universe will see spin up with probability epsilon, they are just using quantum mechanics incorrectly. If they use some combined average information about all labs in the Universe and deduce something about a particular lab, they are making a similar mistake. If they use the same lab both in the initial and final state but end up with the probability 1-(1-epsilon)^L, they are making a mistake, too. The laws of realistic quantum mechanical theories are local and only allow us to predict the measurements in the same lab whose initial conditions had to be inserted into the machinery that calculates the theoretical predictions. And such a result is independent of other labs and their number.

But the situation with the counting of life on planets (or in the universes) is different. Should an easily acceptable theory predict many planets with life? The answer is a resounding No, as explained above. Where is the difference from the spinning electron thought experiment? The difference is that the possible hypotheses or initial states that we are comparing in the case of life are no longer the initial states of a single lab or a single planet but the possible initial states of the whole Universe.

The whole Universe has no special relationship with any of its planets. So there is a dramatic difference here. In the case of the electron, we have measured the spin to be up in the same, special, marked lab whose initial conditions were used to derive the prediction. T2 correctly predicted the spin to be probably up but the probability was a conditional probability given the assumption that the same lab had certain initial conditions.

On the other hand, in the case of the planets in the Universe, we observe life on at least one arbitrary, unmarked planet of the Universe. This planet is in no way connected with any special region of the Universe included in the initial conditions or in the defining equations of the theory, so the corresponding probability of life is not really a conditional one.

So when we observe life, we observe "life on at least one planet", while when we observe the spin up, we observe "spin up in exactly the same lab whose initial conditions helped to define the very problem". In other words, the quantifiers are different. In the case of life, the empirical evidence only implies that "there exists" at least one planet with life. In the case of the spin, the empirical evidence implies something different and kind of stronger, namely that "in the same particular lab that was talked about when we defined the initial conditions of the problem, the spin was up".

In the case of the electron, the initial state of the same lab was a part of the conditions in the conditional probability P(up|conditions) predicted by T1, and this fact makes a huge difference. When we make the theoretical calculation of the observed existence of life, we mustn't impose any a priori conditions on the planet where the life is going to be observed: the probability is not really conditional.

If you wanted to defend the statement of Raphael et al. about the typicality of life, you would need a different sort of empirical evidence. You would need to show that there is life on every or almost every planet of this Universe. You would need evidence that our Universe has the property that when you start with a planet, you end up with life with a high probability. Or you would at least need to show that the density of lively planets is high. This is tautologically the evidence that you need to argue that a theory should better predict many copies of life in the Universe and we obviously don't have any such evidence because we only know one lively planet so far.

To understand the difference between these two things is a kindergarten problem that a kid should be able to figure out in a few minutes. Nevertheless, Raphael, despite his bright mind, has clearly been struggling with this triviality for years to no avail. It seems kind of amazing and Bayesian reasoning indicates that because he couldn't have figured out these basic things for years, it is unlikely (P < 1/(365 x 5)) that he will do it by tomorrow. But I still hope he will! ;-)

Dennis Overbye had an entertaining article in the New York Times about some cutting-edge cosmology. Its scientific content is bizarre but I think that Overbye faithfully reproduces the actual discussions between theoretical cosmologists - including the most famous ones - and the ideas that they are currently thinking about. Yes, it seems that many of them are losing their minds.

Below, in this essay originally posted on January 16th, I will show that even though the silliness has many forms, most of it is a result of one very concrete flawed assumption about the priors.

Recall that Boltzmann half-jokingly argued a century ago that if the Universe exists for an infinite period of time, all finite-volume configurations of matter are likely to be repeated infinitely many times (unless the matter density decreases too quickly). These configurations include your brain spontaneously emerging, with its usual memory and content, from the middle of some galactic gas. It is unlikely but if you wait for an infinite amount of time, such things do occur.

Some people have seriously claimed that it is thus infinitely more likely that your brain, in the current state you experience, didn't arise as a result of an ordered evolution starting with the Big Bang and continuing with human evolution and the love affairs of your parents. Instead, it is more conceivable for it to be a random fluctuation emerging from complete chaos. Such a brain, lacking its proper context and history, is referred to as a Boltzmann's brain or a freaky observer.

Click the picture for more information and one more picture from the funny rally. These warriors are lucky that they don't have to protest in Russia where temperatures will drop to minus 67 degrees Fahrenheit at some places.

Thursday, January 17, 2008

describes, the Pope eventually cancelled his visit after a protest of 67 professors, led by Emeritus Professor Marcello Cini, who didn't like the fact that in 1990, Cardinal Ratzinger supported the following quote by Feyerabend:

In the age of Galileo the Church showed to be more faithful to reason than Galileo himself. The trial against Galileo was reasonable and just.

Galileo at the trial of the Inquisition

I recommend clicking the picture above and reading a page that tries to defend the Church against Galileo. For example, it mentions the opinion of the Italian journalist Vittorio Messori:

Galileo was not condemned for the things he said, but for the way he said them. He made statements with a sectarian intolerance, like a ‘missionary’ of a new gospel …. Since he did not have objective evidence for what he said, the things he said in his private letters to those men [of the Roman College] made him suspect of dogmatism supporting the new religion of science. One who would not immediately accept the entire Copernican system was ‘an imbecile with his head in the clouds,’ ‘a stain upon mankind,’ ‘a child who never grew up,’ and so on. At depth the certainty of being infallible seemed to belong more to him than to the religious authority.

Also, Galileo is criticized for the fact that the boundaries between religion, philosophy, and science were fuzzy in the 17th century. I doubt it was Galileo's fault and I don't really care whether Galileo's teaching was a new religion, a new philosophy, or a new science. I think it is much more important that he was right and that his wisdom turned out to be essential for the development of our civilization. I happen to care about the fact that Galileo was infallible in the fundamental questions, unlike the religious authority. This fact introduces a certain asymmetry, and the asymmetry might lean to the opposite side from the one that Vittorio Messori would like.

André Bornemann and eight co-authors investigate whether there were ice caps in the Turonian, a warm period roughly 90 million years ago that lasted for 4 million years.

The tropical sea surface temperature constantly exceeded 35 °C, more than 10 °C warmer than current temperatures, and crocodiles used to live in the Arctic region.

However, new data involving the oxygen-18 isotope indicate that during this era, there was a roughly 200,000-year-long period when the Antarctic ice cap did exist and was about 50% of its current size. Contrary to common assumptions, 10 °C of global warming is apparently not enough to prevent ice sheets from growing. See also