maxjmartin.com

Rationality From AI to Zombies by Eliezer Yudkowsky

modified 2018/11/02

Quick review:

Insights, lessons learnt:

Many have already been internalised from reading parts of this over the years

Highlights:

loc 1205-7

Do you believe that elan vital explains the mysterious aliveness of
living beings? Then what does this belief not allow to happen-what would
definitely falsify this belief? A null answer means that your belief
does not constrain experience; it permits anything to happen to you. It
floats.

loc 1470-74

Back in the old days, people actually believed their religions instead
of just believing in them. The biblical archaeologists who went in
search of Noah’s Ark did not think they were wasting their time; they
anticipated they might become famous. Only after failing to find
confirming evidence-and finding disconfirming evidence in its place-did
religionists execute what William Bartley called the retreat to
commitment, “I believe because I believe.”

loc 1499-1505

Not only did religion used to make claims about factual and scientific
matters, religion used to make claims about everything. Religion laid
down a code of law-before legislative bodies; religion laid down
history-before historians and archaeologists; religion laid down the
sexual morals-before Women’s Lib; religion described the forms of
government-before constitutions; and religion answered scientific
questions from biological taxonomy to the formation of stars. The Old
Testament doesn’t talk about a sense of wonder at the complexity of the
universe-it was busy laying down the death penalty for women who wore
men’s clothing, which was solid and satisfying religious content of that
era. The modern concept of religion as purely ethical derives from every
other area’s having been taken over by better institutions. Ethics is
what’s left.

loc 1865-70

To assign more than 50% probability to the correct candidate from a pool
of 100,000,000 possible hypotheses, you need at least 27 bits of
evidence (or thereabouts). You cannot expect to find the correct
candidate without tests that are this strong, because lesser tests will
yield more than one candidate that passes all the tests. If you try to
apply a test that only has a million-to-one chance of a false positive
(~20 bits), you’ll end up with a hundred candidates. Just finding the
right answer, within a large space of possibilities, requires a large
amount of evidence.
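
The arithmetic in that highlight can be checked with a few lines of Python. This is only a rough sketch: it assumes all hypotheses start out equally likely, and the function names are made up for illustration.

```python
import math

def bits_needed(num_hypotheses: int) -> float:
    """Bits of evidence needed to lift one of N equally likely
    hypotheses (an assumed uniform prior) to roughly even odds."""
    return math.log2(num_hypotheses)

def expected_false_positives(num_hypotheses: int, test_bits: float) -> float:
    """Expected number of wrong candidates that still pass a test whose
    false-positive rate is 2 ** -test_bits."""
    return (num_hypotheses - 1) / 2 ** test_bits

pool = 100_000_000
print(bits_needed(pool))                   # a little under 27 bits
print(expected_false_positives(pool, 20))  # on the order of a hundred
```

A 20-bit test is slightly stronger than exactly million-to-one, so the count comes out a bit under the "hundred candidates" in the quote.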

loc 1996-97

Your strength as a rationalist is your ability to be more confused by
fiction than by reality. If you are equally good at explaining any
outcome, you have zero knowledge.

loc 2142-48

Once upon a time, there was an instructor who taught physics students.
One day the instructor called them into the classroom and showed them a
wide, square plate of metal, next to a hot radiator. The students each
put their hand on the plate and found the side next to the radiator
cool, and the distant side warm. And the instructor said, Why do you
think this happens? Some students guessed convection of air currents,
and others guessed strange metals in the plate. They devised many
creative explanations, none stooping so low as to say “I don’t know” or
“This seems impossible.” And the answer was that before the students
entered the room, the instructor turned the plate around.

loc 2265-70

I encounter people who are quite willing to entertain the notion of
dumber-than-human Artificial Intelligence, or even mildly
smarter-than-human Artificial Intelligence. Introduce the notion of
strongly superhuman Artificial Intelligence, and they’ll suddenly decide
it’s “pseudoscience.” It’s not that they think they have a theory of
intelligence which lets them calculate a theoretical upper bound on the
power of an optimization process. Rather, they associate strongly
superhuman AI to the literary genre of apocalyptic literature; whereas
an AI running a small corporation associates to the literary genre of
Wired magazine.

loc 2927-36

How can you realize that you shouldn’t trust your seeming knowledge that
“light is waves”? One test you could apply is asking, “Could I
regenerate this knowledge if it were somehow deleted from my mind?” This
is similar in spirit to scrambling the names of suggestively named LISP
tokens in your AI program, and seeing if someone else can figure out
what they allegedly “refer” to. It’s also similar in spirit to observing
that an Artificial Arithmetician programmed to record and play back
Plus-Of(Seven, Six) = Thirteen can’t regenerate the knowledge if you
delete it from memory, until another human re-enters it in the database.
Just as if you forgot that “light is waves,” you couldn’t get back the
knowledge except the same way you got the knowledge to begin with-by
asking a physicist. You couldn’t generate the knowledge for yourself,
the way that physicists originally generated it.

loc 3388-91

What should I believe? As it turns out, that question has a right
answer. It has a right answer when you’re wracked with uncertainty, not
just when you have a conclusive proof. There is always a correct amount
of confidence to have in a statement, even when it looks like a
“personal belief” and not like an expert-verified “fact.”

loc 3394-99

But, writes Robin Hanson: You are never entitled to your opinion.
Ever! You are not even entitled to “I don’t know.” You are entitled to
your desires, and sometimes to your choices. You might own a choice, and
if you can choose your preferences, you may have the right to do so. But
your beliefs are not about you; beliefs are about the world. Your
beliefs should be your best available estimate of the way things are;
anything else is a lie.

loc 3637-48

Believing in Santa Claus gives children a sense of wonder and encourages
them to behave well in hope of receiving presents. If Santa-belief is
destroyed by truth, the children will lose their sense of wonder and
stop behaving nicely. Therefore, even though Santa-belief is
false-to-fact, it is a Noble Lie whose net benefit should be preserved
for utilitarian reasons. Classically, this is known as a false dilemma,
the fallacy of the excluded middle, or the package-deal fallacy. Even
if we accept the underlying factual and moral premises of the above
argument, it does not carry through. Even supposing that the Santa
policy (encourage children to believe in Santa Claus) is better than the
null policy (do nothing), it does not follow that Santa-ism is the best
of all possible alternatives. Other policies could also supply children
with a sense of wonder, such as taking them to watch a Space Shuttle
launch or supplying them with science fiction novels. Likewise (if I
recall correctly), offering children bribes for good behavior encourages
the children to behave well only when adults are watching, while praise
without bribes leads to unconditional good behavior.

loc 3859-63

In this case, I begin to suspect psychology that is more imperfect than
usual-that someone may have made a devil’s bargain with their own
mistakes, and now refuses to hear of any possibility of improvement.
When someone finds an excuse not to try to do better, they often refuse
to concede that anyone else can try to do better, and every mode of
improvement is thereafter their enemy, and every claim that it is
possible to move forward is an offense against them. And so they say in
one breath proudly, “I’m glad to be gray,” and in the next breath
angrily, “And you’re gray too!”

loc 4119-22

This isn’t the only way of writing probabilities, though. For example,
you can transform probabilities into odds via the transformation O =
(P/(1 - P)). So a probability of 50% would go to odds of 0.5⁄0.5 or 1,
usually written 1:1, while a probability of 0.9 would go to odds of
0.9⁄0.1 or 9, usually written 9:1. To take odds back to probabilities
you use P = (O/(1 + O)), and this is perfectly reversible, so the
transformation is an isomorphism-a two-way reversible mapping. Thus,
probabilities and odds are isomorphic,
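
The two transformations in that highlight are easy to try out. A minimal sketch, with function names of my own choosing, valid for probabilities strictly below 1:

```python
def prob_to_odds(p: float) -> float:
    """O = P / (1 - P); requires 0 <= p < 1."""
    return p / (1.0 - p)

def odds_to_prob(o: float) -> float:
    """P = O / (1 + O); requires o >= 0."""
    return o / (1.0 + o)

for p in (0.5, 0.9):
    o = prob_to_odds(p)
    # Converting back should recover the original probability,
    # up to floating-point rounding.
    print(p, o, odds_to_prob(o))
```

The round trip returning the starting value is what "two-way reversible mapping" means in practice.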

loc 4134-38

A famous proof called Cox’s Theorem (plus various extensions and
refinements thereof) shows that all ways of representing uncertainties
that obey some reasonable-sounding constraints, end up isomorphic to
each other. Why does it matter that odds ratios are just as legitimate
as probabilities? Probabilities as ordinarily written are between 0 and
1, and both 0 and 1 look like they ought to be readily reachable
quantities-it’s easy to see 1 zebra or 0 unicorns. But when you
transform probabilities onto odds ratios, 0 goes to 0, but 1 goes to
positive infinity. Now absolute truth doesn’t look like it should be so
easy to reach.
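
To see how far away certainty looks on the odds scale, here is a small sketch. Returning infinity for a probability of exactly 1 is my own convention for the limiting case, not something from the book.

```python
import math

def odds(p: float) -> float:
    """Odds for probability p; p == 1 is mapped to infinity
    (an illustrative convention for the limit)."""
    if p == 1.0:
        return math.inf
    return p / (1.0 - p)

# Each extra "9" of confidence multiplies the odds by about ten,
# and probability 1 is never reached by any finite odds.
for p in (0.0, 0.9, 0.99, 0.999, 0.999999, 1.0):
    print(p, odds(p))
```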

loc 4263-73

We live in an unfair universe. Like all primates, humans have strong
negative reactions to perceived unfairness; thus we find this fact
stressful. There are two popular methods of dealing with the resulting
cognitive dissonance. First, one may change one’s view of the facts-deny
that the unfair events took place, or edit the history to make it appear
fair. (This is mediated by the affect heuristic and the just-world
fallacy.) Second, one may change one’s morality-deny that the events
are unfair. Some libertarians might say that if you go into a “banned
products shop,” passing clear warning labels that say THINGS IN THIS
STORE MAY KILL YOU, and buy something that kills you, then it’s your own
fault and you deserve it. If that were a moral truth, there would be no
downside to having shops that sell banned products. It wouldn’t just be
a net benefit, it would be a one-sided tradeoff with no drawbacks.
Others argue that regulators can be trained to choose rationally and in
harmony with consumer interests; if those were the facts of the matter
then (in their moral view) there would be no downside to regulation.

loc 4990-91

Not every change is an improvement, but every improvement is necessarily
a change.

loc 5118-22

What is true is already so. Owning up to it doesn’t make it worse. Not
being open about it doesn’t make it go away. And because it’s true, it
is what is there to be interacted with. Anything untrue isn’t there to
be lived. People can stand what is true, for they are already enduring
it. -Eugene Gendlin

loc 7188-90

In the same sense that every thermal differential wants to equalize
itself, and every computer program wants to become a collection of
ad-hoc patches, every Cause wants to be a cult.

loc 9321-26

In a 1989 Canadian study, adults were asked to imagine the death of
children of various ages and estimate which deaths would create the
greatest sense of loss in a parent. The results, plotted on a graph,
show grief growing until just before adolescence and then beginning to
drop. When this curve was compared with a curve showing changes in
reproductive potential over the life cycle (a pattern calculated from
Canadian demographic data), the correlation was fairly strong. But much
stronger-nearly perfect, in fact-was the correlation between the grief
curves of these modern Canadians and the reproductive-potential curve of
a hunter-gatherer people, the !Kung of Africa. In other words, the
pattern of changing grief was almost exactly what a Darwinian would
predict, given demographic realities in the ancestral environment.

loc 9945-49

If you read Judea Pearl’s Probabilistic Reasoning in Intelligent
Systems: Networks of Plausible Inference, then you will see that the
basic insight behind graphical models is indispensable to problems that
require it.

loc 10909-13

But that’s not the a priori irrational part: The a priori irrational
part is where, in the course of the argument, someone pulls out a
dictionary and looks up the definition of “atheism” or “religion.” (And
yes, it’s just as silly whether an atheist or religionist does it.) How
could a dictionary possibly decide whether an empirical cluster of
atheists is really substantially different from an empirical cluster of
theologians? How can reality vary with the meaning of a word? The points
in thingspace don’t move around when we redraw a boundary.

loc 11386-89

When you find yourself in philosophical difficulties, the first line of
defense is not to define your problematic terms, but to see whether you
can think without using those terms at all. Or any of their short
synonyms. And be careful not to let yourself invent a new word to use
instead. Describe outward observables and interior mechanisms; don’t use
a single handle, whatever that handle may be.

loc 11412-13

A word itself can have the destructive force of cliché; a word itself
can carry the poison of a cached thought

loc 11696-99

eyeballing suggests that using the phrase by definition, anywhere
outside of math, is among the most alarming signals of flawed argument
I’ve ever found. It’s right up there with “Hitler,” “God,” “absolutely
certain,” and “can’t prove that.”

loc 13708-10

it sometimes seems to me that at least 20% of the real-world
effectiveness of a skilled rationalist comes from not stopping too
early. If you keep asking questions, you’ll get to your destination
eventually. If you decide too early that you’ve found an answer, you
won’t.

loc 14567-74

In one of the standard fantasy plots, a protagonist from our Earth, a
sympathetic character with lousy grades or a crushing mortgage but still
a good heart, suddenly finds themselves in a world where magic operates
in place of science. The protagonist often goes on to practice magic,
and become in due course a (superpowerful) sorcerer. Now here’s the
question-and yes, it is a little unkind, but I think it needs to be
asked: Presumably most readers of these novels see themselves in the
protagonist’s shoes, fantasizing about their own acquisition of sorcery.
Wishing for magic. And, barring improbable demographics, most readers of
these novels are not scientists. Born into a world of science, they did
not become scientists. What makes them think that, in a world of magic,
they would act any differently?

Yes!

loc 14696-98

Modern science is built on discoveries, built on discoveries, built on
discoveries, and so on, all the way back to people like Archimedes, who
discovered facts like why boats float, that can make sense even if you
don’t know about other discoveries. A good place to start traveling that
road is at the beginning.

loc 14772-78

There is an acid test of attempts at post-theism. The acid test is: “If
religion had never existed among the human species-if we had never made
the original mistake-would this song, this art, this ritual, this way of
thinking, still make sense?” If humanity had never made the original
mistake, there would be no hymns to the nonexistence of God. But there
would still be marriages, so the notion of an atheistic marriage
ceremony makes perfect sense-as long as you don’t suddenly launch into a
lecture on how God doesn’t exist. Because, in a world where religion
never had existed, nobody would interrupt a wedding to talk about the
implausibility of a distant hypothetical concept. They’d talk about
love, children, commitment, honesty, devotion, but who the heck would
mention God?

loc 14833-35

As Cialdini remarks, a chief sign of this malfunction is that you dream
of possessing something, rather than using it. (Timothy Ferriss offers
similar advice on planning your life: ask which ongoing experiences
would make you happy, rather than which possessions or status-changes.)

loc 14968-74

The problem is that, in the present world, very few people bother to
study science in the first place. Science cannot be the true Secret
Knowledge, because just anyone is allowed to know it-even though, in
fact, they don’t. If the Great Secret of Natural Selection, passed down
from Darwin Who Is Not Forgotten, was only ever imparted to you after
you paid $2,000 and went through a ceremony involving torches and robes
and masks and sacrificing an ox, then when you were shown the fossils,
and shown the optic cable going through the retina under a microscope,
and finally told the Truth, you would say “That’s the most brilliant
thing ever!” and be satisfied.

loc 15567-71

The sound of these words is probably represented in your auditory
cortex, as though you’d heard someone else say it. (Why do I think this?
Because native Chinese speakers can remember longer digit sequences than
English-speakers. Chinese digits are all single syllables, and so
Chinese speakers can remember around ten digits, versus the famous
“seven plus or minus two” for English speakers. There appears to be a
loop of repeating sounds back to yourself, a size limit on working
memory in the auditory cortex, which is genuinely phoneme-based.)

loc 17301-7

But the conjunction rule does give us a rule of monotonic decrease in
probability: as you tack more details onto a story, and each additional
detail can potentially be true or false, the story’s probability goes
down monotonically. Think of probability as a conserved quantity:
there’s only so much to go around. As the number of details in a story
goes up, the number of possible stories increases exponentially, but the
sum over their probabilities can never be greater than 1. For every
story “X and Y,” there is a story “X and ¬Y.” When you just tell the
story “X,” you get to sum over the possibilities Y and ¬Y. If you add
ten details to X, each of which could potentially be true or false, then
that story must compete with 2^10 - 1 other equally detailed stories for
precious probability.
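
A quick way to convince yourself of the "conserved quantity" picture is to enumerate every equally detailed story. This sketch assumes the ten details are independent and each 80% likely. Both are my simplifications for illustration, not part of the quote.

```python
from itertools import product

def story_probabilities(detail_probs):
    """Map each true/false assignment of the details to its probability,
    assuming the details are independent (an illustrative assumption)."""
    stories = {}
    for assignment in product([True, False], repeat=len(detail_probs)):
        p = 1.0
        for holds, q in zip(assignment, detail_probs):
            p *= q if holds else (1.0 - q)
        stories[assignment] = p
    return stories

stories = story_probabilities([0.8] * 10)
print(len(stories))           # 1024 stories, i.e. 2**10
print(sum(stories.values()))  # totals 1, up to rounding
print(stories[(True,) * 10])  # the fully detailed story gets about 0.107
```

Even with every detail individually quite likely, the story with all ten details ends up with only about a tenth of the probability.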

loc 19622-27

Tell people to choose an important problem and they will choose the
first cache hit for “important problem” that pops into their heads, like
“global warming” or “string theory.” The truly important problems are
often the ones you’re not even considering, because they appear to be
impossible, or, um, actually difficult, or worst of all, not clear how
to solve. If you worked on them for years, they might not seem so
impossible … but this is an extra and unusual insight; naive realism
will tell you that solvable problems look solvable, and
impossible-looking problems are impossible.

loc 20008-14

This parable helps illustrate why Bayesians must think about prior
probabilities. There is a branch of statistics, sometimes called
“orthodox” or “classical” statistics, which insists on paying attention
only to likelihoods. But if you only pay attention to likelihoods, then
eventually some fixed-coin hypothesis will always defeat the fair coin
hypothesis, a phenomenon known as “overfitting” the theory to the data.
After thirty flips, the likelihood is a billion times as great for the
fixed-coin hypothesis with that sequence, as for the fair coin
hypothesis. Only if the fixed-coin hypothesis (or rather, that specific
fixed-coin hypothesis) is a billion times less probable a priori can the
fixed-coin hypothesis possibly lose to the fair coin hypothesis.
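
The "billion times" figure is just 2 to the power 30. A short sketch of the bookkeeping, where the variable and function names are mine:

```python
FLIPS = 30

# A fair coin gives any one specific 30-flip sequence probability 0.5 ** 30.
likelihood_fair = 0.5 ** FLIPS
# The hypothesis "the coin was fixed to produce exactly this sequence"
# predicts the observed data with certainty.
likelihood_fixed = 1.0

likelihood_ratio = likelihood_fixed / likelihood_fair
print(likelihood_ratio)  # 2 ** 30, roughly a billion

def posterior_odds(prior_odds: float, ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * ratio

# With a prior of one in 2 ** 30 for that specific fixed sequence,
# the two hypotheses end up level after the data.
print(posterior_odds(1 / 2 ** FLIPS, likelihood_ratio))
```

Ignoring the prior means keeping only `likelihood_ratio`, which is why the fixed-coin story always wins on likelihood alone.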

loc 20105-8

Therefore do I advocate that Bayesian probability theory should be
taught in high school. Bayesian probability theory is the sole piece of
math I know that is accessible at the high school level, and that
permits a technical understanding of a subject matter-the dynamics of
belief-that is an everyday real-world domain and has emotionally
meaningful consequences. Studying Bayesian probability would give
students a referent for what it means to “explain” something.

loc 20186-95

Once there was a conflict between nineteenth century physics and
nineteenth century evolutionism. According to the best physical models
then in use, the Sun could not have been burning very long. Three
thousand years on chemical energy, or 40 million years on gravitational
energy. There was no energy source known to nineteenth century physics
that would permit longer burning. Nineteenth century physics was not
quite as powerful as modern physics-it did not have predictions accurate
to within one part in 10^14. But nineteenth century physics still had the
mathematical character of modern physics, a discipline whose models
produced detailed, precise, quantitative predictions. Nineteenth century
evolutionary theory was wholly semitechnical, without a scrap of
quantitative modeling. Not even Mendel’s experiments with peas were then
known. And yet it did seem likely that evolution would require longer
than a paltry 40 million years in which to operate-hundreds of millions,
even billions of years. The antiquity of the Earth was a vague and
semitechnical prediction, of a vague and semitechnical theory. In
contrast, the nineteenth century physicists had a precise and
quantitative model, which through formal calculation produced the
precise and quantitative dictum that the Sun simply could not have
burned that long.

loc 20259-62

Fascinating words have no power, nor yet any meaning, without the math.
The fascinating words are not knowledge but the illusion of knowledge,
which is why it brings so little satisfaction to know that “gravity
results from the curvature of spacetime.” Science is not in the
fascinating words, though it’s all you’ll ever read as breaking news.

loc 20552-57

There is a shattering truth, so surprising and terrifying that people
resist the implications with all their strength. Yet there are a lonely
few with the courage to accept this satori. Here is wisdom, if you would
be wise: Since the beginning Not one unusual thing Has ever happened.
Alas for those who turn their eyes from zebras and dream of dragons! If
we cannot learn to take joy in the merely real, our lives shall be empty
indeed.

loc 21702-4

Anthony, only those who have moralities worry over whether or not they
have them. If your metaethic tells you to kill people, why should you
even listen? Maybe that which you would do even if there were no
morality, is your morality.

loc 21827-35

So all this suggests that you should be willing to accept that you might
know a little about morality. Nothing unquestionable, perhaps, but an
initial state with which to start questioning yourself. Baked into your
brain but not explicitly known to you, perhaps; but still, that which
your brain would recognize as right is what you are talking about. You
will accept at least enough of the way you respond to moral arguments as
a starting point to identify “morality” as something to think about. But
that’s a rather large step. It implies accepting your own mind as
identifying a moral frame of reference, rather than all morality being a
great light shining from beyond (that in principle you might not be able
to perceive at all). It implies accepting that even if there were a
light and your brain decided to recognize it as “morality,” it would
still be your own brain that recognized it, and you would not have
evaded causal responsibility-or evaded moral responsibility either, on
my view.

loc 22400-22401

The experimental rule is that losing a desideratum-$50, a coffee mug,
whatever-hurts between 2 and 2.5 times as much as the equivalent gain.

loc 22740-44

Once upon a time, three groups of subjects were asked how much they
would pay to save 2,000 / 20,000 / 200,000 migrating birds from drowning
in uncovered oil ponds. The groups respectively answered $80, $78, and
$88. This is scope insensitivity or scope neglect: the number of
birds saved-the scope of the altruistic action-had little effect on
willingness to pay.

loc 22747-51

People visualize “a single exhausted bird, its feathers soaked in black
oil, unable to escape.” This image, or prototype, calls forth some
level of emotional arousal that is primarily responsible for
willingness-to-pay-and the image is the same in all cases. As for scope,
it gets tossed out the window-no human can visualize 2,000 birds at
once, let alone 200,000. The usual finding is that exponential increases
in scope create linear increases in willingness-to-pay-perhaps

loc 22941-42

I believe that one experiment showed that a shift from 100% probability
to 99% probability weighed larger in people’s minds than a shift from
80% probability to 20% probability.

loc 22954-56

Whenever you try to price a probability shift from 24% to 23% as being
less important than a shift from ~1 to 99%-every time you try to make
an increment of probability have more value when it’s near an end of the
scale-you open yourself up to this kind of exploitation.

loc 23013-23

Save 400 lives, with certainty. Save 500 lives, with 90% probability;
save no lives, 10% probability. Most people choose option 1. Which, I
think, is foolish; because if you multiply 500 lives by 90% probability,
you get an expected value of 450 lives, which exceeds the 400-life value
of option 1. (Lives saved don’t diminish in marginal utility, so this is
an appropriate calculation.) “What!” you cry, incensed. “How can you
gamble with human lives? How can you think about numbers when so much is
at stake? What if that 10% probability strikes, and everyone dies? So
much for your damned logic! You’re following your rationality off a
cliff!” Ah, but here’s the interesting thing. If you present the options
this way: 100 people die, with certainty. 90% chance no one dies; 10%
chance 500 people die. Then a majority choose option 2. Even though it’s
the same gamble. You see, just as a certainty of saving 400 lives seems
to feel so much more comfortable than an unsure gain, so too, a certain
loss feels worse than an uncertain one.
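
The claim that both framings describe the same gamble can be checked directly. This sketch assumes 500 people are at risk in total, which is what the "500 people die" wording implies. The names are illustrative.

```python
def expected_value(outcomes):
    """Expected value of a list of (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)

AT_RISK = 500  # assumed total number of people at risk

# Framing 1: lives saved.
saved_certain = expected_value([(1.0, 400)])
saved_gamble = expected_value([(0.9, 500), (0.1, 0)])

# Framing 2: deaths.
deaths_certain = expected_value([(1.0, 100)])
deaths_gamble = expected_value([(0.9, 0), (0.1, 500)])

print(saved_certain, saved_gamble)  # 400 vs 450 expected lives saved
# Converting deaths to survivors gives the same two numbers.
print(AT_RISK - deaths_certain, AT_RISK - deaths_gamble)
```

Only the wording differs between the two framings, yet the majority preference flips.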

loc 23067-69

Occasionally people object to any discussion of morality on the grounds
that morality doesn’t exist, and in lieu of explaining that “exist” is
not the right term to use here, I generally say, “But what do you do
anyway?” and take the discussion back down to the object level.

loc 23156-57

you see people insisting that no amount of money whatsoever is worth a
single human life, and then driving an extra mile to save $10;

loc 23244-45

But any human legal system does embody some answer to the question “How
many innocent people can we put in jail to get the guilty ones?,” even
if the number isn’t written down.

loc 23486-88

Being the lone voice of dissent in the crowd and having everyone look at
you funny is much scarier than a mere threat to your life, according to
the revealed preferences of teenagers who drink at parties and then
drive home.

loc 24187-90

Particularly damaging, I think, was the bad example set by the
pretenders to Deep Wisdom trying to stake out a middle way; smiling
condescendingly at technophiles and technophobes alike, and calling them
both immature. Truly this is a wrong way; and in fact, the notion of
trying to stake out a middle way generally, is usually wrong. The Right
Way is not a compromise with anything; it is the clean manifestation of
its own criteria.

loc 24678-85

I once lent Xiaoguang “Mike” Li my copy of Probability Theory: The Logic
of Science. Mike Li read some of it, and then came back and said: Wow...
it’s like Jaynes is a thousand-year-old vampire. Then Mike said,
“No, wait, let me explain that-” and I said, “No, I know exactly what
you mean.” It’s a convention in fantasy literature that the older a
vampire gets, the more powerful they become. I’d enjoyed math proofs
before I encountered Jaynes. But E. T. Jaynes was the first time I
picked up a sense of formidability from mathematical arguments. Maybe
because Jaynes was lining up “paradoxes” that had been used to object to
Bayesianism, and then blasting them to pieces with overwhelming
firepower-power being used to overcome others.