October 20, 2007

I’m working on a paper on this topic, but I figured I would post about it now and clear up some confusion.

Let’s say we have an agent interacting with an environment. The agent sends the environment some number (an action), and the environment sends the agent some number in response (a perception).

When we have an infinite set of hypotheses, the expected utility of some action is determined by an infinite series. It’s important for that series to converge, or else we won’t (in general) be able to compare the expected utilities of two different actions.
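To make the convergence worry concrete, here is a toy numerical sketch (my own illustration, not a construction from the paper): a hypothetical prior p(h_n) = 2^-n over a countable list of hypotheses, paired first with a bounded utility function and then with one that grows faster than the prior shrinks.

```python
# Toy illustration (my notation, not the paper's): expected utility as the
# series sum_{n>=1} p(h_n) * U(h_n), with a hypothetical prior p(h_n) = 2^-n.

def partial_expected_utility(utility, n_terms):
    """Partial sum of sum_{n=1}^{n_terms} 2^-n * utility(n)."""
    return sum(2.0 ** -n * utility(n) for n in range(1, n_terms + 1))

# Bounded utility (U = 1 everywhere): partial sums converge, here toward 1.
bounded = [partial_expected_utility(lambda n: 1.0, k) for k in (10, 20, 40)]

# Utility growing faster than the prior shrinks (U(h_n) = 4^n): each term is
# 2^n, so the partial sums grow without bound, and expected utilities of
# different actions cannot be compared.
divergent = [partial_expected_utility(lambda n: 4.0 ** n, k) for k in (10, 20, 40)]
```
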

Given the following assumptions:

We have some finite set of data about the environment.

We assign a nonzero probability to all computable hypotheses consistent with our data.

Either this probability distribution is computable, or it is bounded below by some positive computable function.

Our utility function is determined by our perceptions, and is unbounded.

Either this utility function is computable, or it is bounded below in absolute value by some unbounded computable function.

…then it follows that our expected utilities do not converge.

This seems like a pretty strong reason to stick with bounded utility functions. It certainly seems much more plausible to me now than it did before that my utility function is bounded (to the extent that I can be said to have a utility function).

But we shouldn’t jump to conclusions. Here are some reasons why the above might not apply:

You might have an uncomputable utility function.

You might have a utility function which is not determined solely by your perceptions.

You might assign a probability of zero to some hypotheses which are consistent with your data.

Makes sense, now that it’s pointed out. One intuitive way of saying this: no matter what your past observations are, there’s always a non-zero probability that:

1. you’re living in a simulation, and

2. someone will replace the simulation with “Heaven or Hell” (give arbitrarily large negative or positive utility).

Given your assumptions, it seems reasonable that the absolute value of the expected utility, in the general case, grows faster than the probability shrinks (at least for non-edge-case examples) because of Rice’s Theorem-type considerations.

Are you saying it also holds for bizarre edge cases? Example that comes to mind:

1. construct a bizarre utility function where the utility of X always equals the negative of the utility of ~X (the bits-complement of X where 0’s and 1’s are flipped)

2. then, construct an equally bizarre Turing-equivalent machine (or a bizarre variant of Solomonoff Induction) where X and ~X are always equally probable. (For example, if the first bit of the program P is 0 the bit is ignored, but if the first bit of P is 1, the ultimate bits of the output are reversed.)

Let me quickly eat my words: the problem given is the rigorous “whether it converges,” not the fuzzy “whether tiny probabilities might be intuitively reasonable to ignore.” My example doesn’t actually converge; saying a series converges is obviously different from saying that adjacent terms cancel out in one particular ordering.

The scenario that Rolf Nelson provides leads to a divergent expected utility, but not in a way that is relevant to decisions – the infinities “exactly cancel out.” It reminds me of renormalization. But I imagine that there are other problems.

Douglas: infinities don’t cancel out. I think you’re imagining a situation where the terms of the series remain bounded; that is, they oscillate back and forth but don’t grow arbitrarily large. This can’t actually happen.

Imagine we could construct such an example (the terms are bounded). Then replace the probability distribution with one that goes to zero more quickly than the original distribution – say, exponentially faster. The original terms were bounded, so the terms of the new series go to zero exponentially fast, so the series converges. But by the original theorem, your expected utility over this new distribution is also divergent. Contradiction.
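The damping step in that contradiction can be illustrated numerically (a sketch under toy assumptions of my own): take any bounded, oscillating sequence of terms and multiply the n-th term by 2^-n; the damped series converges absolutely, which is exactly what the theorem forbids.

```python
# Sketch of the damping argument: t_n is bounded (|t_n| <= 1) and oscillates;
# damping it by 2^-n makes the series converge absolutely, since
# sum |t_n| * 2^-n <= sum 2^-n < infinity.

def damped_partial_sum(n_terms):
    s = 0.0
    for n in range(1, n_terms + 1):
        t = (-1.0) ** n      # bounded, oscillating "expected utility" term
        s += t * 2.0 ** -n   # exponentially damped term
    return s

# The partial sums settle down (here toward -1/3), so the damped series
# converges -- which is the contradiction, since the theorem says it diverges.
```
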

Here’s a quick idea of how the proof of the theorem goes:

We index all of the computable functions, and use this index to construct a sequence analogous to the busy beaver numbers. Then (as Rolf says), we construct heavens or hells which yield outcomes with utilities exceeding the numbers in this sequence. The probabilities for these outcomes go to zero computably fast (the probabilities themselves are not computable, but they shrink no faster than some computable sequence). The busy beaver numbers (and our analogous sequence) grow faster than any computable function. So the utilities go to infinity faster than the probabilities go to zero.
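In symbols, the sketch amounts to this (my notation, not the paper’s): let f be a computable function with p(h_n) ≥ 2^{-f(n)}, and let BB(n) be the busy-beaver-like sequence with U(h_n) ≥ BB(n). Then

```latex
% Notation mine: f computable, p(h_n) >= 2^{-f(n)}, |U(h_n)| >= BB(n).
\sum_n p(h_n)\,\lvert U(h_n)\rvert \;\ge\; \sum_n 2^{-f(n)}\,\mathrm{BB}(n),
\qquad 2^{-f(n)}\,\mathrm{BB}(n) \longrightarrow \infty,
```

since BB(n) eventually exceeds 2^{f(n)} (or any other computable sequence). The terms do not even go to zero, so the series diverges.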

(I’m not posting the full paper here because I want to publish it in an academic journal, and they seem to prefer that you don’t publish elsewhere beforehand. Contact me if you want to look at a draft.)

If I understand your proof correctly, there appears to be a loophole: an unbounded utility function can converge as long as the utility assigned to any state does not increase faster than C^N, where N is the number of bits of entropy/Kolmogorov complexity. Essentially, you arbitrarily decree that any function with a sufficiently high utility must have a sufficiently low probability. To give a simple example, define U(x) over an n-bit system x to be 2*n. The expected utility sums to 2 * 1/2 + 4 * 1/4 + 6 * 1/8 + 8 * 1/16 + … = 4.
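The arithmetic in that example checks out numerically (a quick sketch, using the comment’s U(x) = 2n with total probability weight 2^-n per n-bit level):

```python
# Toy check of the commenter's example: utility 2n at complexity level n,
# probability weight 2^-n per level, so the expected utility is
# sum_{n>=1} 2n * 2^-n = 2 * (1/2) + 4 * (1/4) + 6 * (1/8) + ... = 4.
total = sum(2 * n * 2.0 ** -n for n in range(1, 60))
```
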

Sorry if this is a bit off-topic, but why assume that utility is determined solely by perceptions? Real people obviously do not follow this assumption, since they do things whose effects they will never observe, like writing wills. Is it just to make the math easier?

I know what “perception-determined” sounds like it means, but I don’t see what the expression is meant to rule out. Wouldn’t it apply to any utility function? What would a non-perception-determined utility function look like?

Nick, Tom’s scenario is a good example of a non-perception-determined utility function because it’s expressed in terms of the *underlying algorithm that produces the perceptions* rather than being expressed in terms of the perceptions themselves. For the theorem to hold, you’re only allowed to base your theory on perceptions, or (equivalently) on things that can be computably determined from those perceptions. (In contrast, Wei Dai’s example of “signing a will” misses the point, since you would arguably try to infer the future well-being of your heirs from the perception that you currently have a signed will.)

For example, suppose that Tom’s algorithm perceives a random-looking 10^100 digit number, p = 21346…19872 (with the middle digits omitted for brevity.) Tom’s algorithm has to assign this perception a low utility if its Kolmogorov complexity is low. Can we compute the Kolmogorov complexity based on the perception? In the general case, no, it’s uncomputable. If it turns out that p is “the last 10^100 digits of the imaginary part of the first number that violates the Riemann hypothesis”, then Tom’s algorithm has to magically know that p had low Kolmogorov complexity and needed to be assigned a low utility.

Rolf: good, you understand. Although now that I think about it, I wouldn’t call Tom’s example non-perception-determined. Instead, I would just say that it’s not computable as a perception-determined utility function (It’s not bounded below by a divergent computable function either – if it were, you could use it to build a halting oracle! Do you see how?). I’ll give you an example of a utility function which is not perception-determined, even uncomputably.

Above I wrote the formula for the expected utility under a perception-determined utility function. For a non-perception-determined utility function, we can write the expected utility of an action k as:

Σ_{h ∈ H} U(h, k) p(h)

Here we’ll take the example U(h, k) = k. That is, your utility is equal to your action, always. Now, you can’t figure out exactly what your action was just by looking at your perceptions. Actually, you can’t even get a probability distribution! By Bayes’ Theorem, to get a probability for a given action, we would need:

1. A perception (got it)
2. The likelihood of that perception given any action (got it, although not computably)
3. A prior probability for each action (whoops! We don’t have this!)
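For the record, the missing ingredient is visible in the standard Bayes’ theorem form (standard notation, not from the post):

```latex
% p(k | x): probability that the action was k, given perception x.
p(k \mid x) \;=\; \frac{p(x \mid k)\, p(k)}{\sum_{k'} p(x \mid k')\, p(k')}
```

Item 3 is the prior p(k) in the numerator; without it, the posterior distribution over actions is simply undefined.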

If the “signing a will” example doesn’t work because you can see yourself signing a will, then what if you can’t? Suppose just before you’re about to sign the will, all of your sensory nerves die, but your motor nerves are left intact. (You know your motor nerves are still functioning because you already knew that you have a disease that affects only sensory nerves.) If you have a perception-determined utility function, do you go through with the signature?

“If you have a perception-determined utility function, do you go through with the signature?”

This is the technical meaning of perception, referring to “data that is directly perceived”, not “data that you personally have perceived”, as in common usage. My loophole is closed because the Kolmogorov complexity of data is noncomputable, and for the utility function to converge, you have to hack it so that the Kolmogorov complexity of the data sets an upper bound on the utility.

So let me restate my original question. Do people work with perception-determined utility functions because they think either it’s a good model of how people actually make decisions or it’s an ideal that people (and AIs) should actually try to approximate, or do they consider it a simplification that makes the math easy enough to allow some theorems to be derived?

I think the latter is perfectly fine, but some of the literature in this field is written as if the author(s) believe in the former.

“I think the latter is perfectly fine, but some of the literature in this field is written as if the author(s) believe in the former.”

In contrast, Yudkowsky has advocated basing the utility function on “external reference semantics”, which is another example of a non-perception-based utility function.

Note that an algorithm that cares about external reality, but believes that (with 100% certainty) it can “deduce” external reality from its past, current, and future perceptions, is technically still perception-based for the purposes of the theorem.

“Suppose just before you’re about to sign the will, all of your sensory nerves die, but your motor nerves are left intact.”

I like this example better, in this case it’s clearer that the agent cannot be using a perception-based utility function.

This part I might be able to (awkwardly) prove, along the lines of: “If you had a computable divergent function f that bounds the Kolmogorov complexity from below, you could construct a computable total function g such that the Kolmogorov complexity of g(x) is always greater than or equal to x, and that’s just plain impossible.” But:

“…if it were, you could use it to build a halting oracle! Do you see how?”

You mean a full-fledged member of the 0′ Turing Degree? Congratulations, you have stumped me, I cannot figure out how you would do this.

“I like this example better, in this case it’s clearer that the agent cannot be using a perception-based utility function.”

Methinks we need to define “perception” better – there’s nothing in the theorem that *requires* the utility function to refer to internal states instead of external states, as long as the improbability of the bitstring remains uncomputable (if I understand correctly).

[...] Suppose that you extended linear addition to self-addition: define U(s1 + s1) = U(s1) + U(s1), or more generally, define U(s1 * R) = R * U(s1) for some integer or scalar R. This seems fairly obvious – why can’t you add a state to itself using the same rules as two different states? – but it has a number of annoying consequences. R now can become arbitrarily large, which means that the expected utilities of actions over U will not converge, given a few assorted extra conditions. Trying to calculate preferences under that condition is a mess – the expected utilities end up being dominated by tiny probabilities of huge utilities. [...]

[...] Peter de Blanc, 2006 summer intern at the Singularity Institute and now with the Temple University Department of Mathematics, has posted an article titled “Convergence of Expected Utilities with Algorithmic Probability Distributions.” [...]

An interesting article. While no-one seems to have mentioned it, this is closely related to the “Pascal’s Mugger” problem, where Kolmogorov-based priors mean that the probability of a threat being true seems to diminish too slowly in comparison to the size of the threat (since 3^^^3 has complexity much less than itself, or the log of itself). Thus for a certain-sized threat – ‘Your money or 3^^^3 people’s lives’ – your credence will be swamped by the stakes and you will have to pay up. Indeed, did you design the proof specifically to formalize this problem?

I’m not quite sure what to conclude from it all (and haven’t checked your mathematics), but on the face of it, I’m not sure which assumption should be questioned. For example, the Kolmogorov-based priors might be better to drop than the unbounded utility function (there are some independent reasons to be skeptical of the former as well).
