This post follows up on an earlier post on this topic, as well as on what was recently said about real distinction. In the latter post, we applied the distinction between the way a thing is and the way it is known in order to better understand distinction itself. We can obtain a better understanding of unity in a similar way.

As was said in the earlier post on unity, to say that something is “one” does not add anything real to the being of the thing, but it adds the denial of the division between distinct things. The single apple is not “an apple and an orange,” which are divided insofar as they are distinct from one another.

But being distinct from divided things is itself a certain way of being distinct, and consequently all that was said about distinction in general will apply to this way of being distinct as well. In particular, since being distinct means not being something, which is a way that things are understood rather than a way that they are (considered precisely as a way of being), the same thing applies to unity. To say that something is one does not add something to the way that it is, but it adds something to the way that it is understood. This way of being understood is founded, we argued, on existing relationships.

We should avoid two errors here, both of which would be expressions of the Kantian error:

First, the argument here does not mean that a thing is not truly one thing, just as the earlier discussion does not imply that it is false that a chair is not a desk. On the contrary, a chair is in fact not a desk, and a chair is in fact one chair. But when we say or think, “a chair is not a desk,” or “a chair is one chair,” we are saying these things in some way of saying, and thinking them in some way of thinking, and these ways of saying and thinking are not ways of being as such. This in no way implies that the statements themselves are false, just as “the apple seems to be red,” does not imply that the apple is not red. Arguing that the fact of a specific way of understanding implies that the thing is falsely understood would be the position described by Ayn Rand as asserting, “man is blind, because he has eyes—deaf, because he has ears—deluded, because he has a mind—and the things he perceives do not exist, because he perceives them.”

Second, the argument does not imply that the way things really are is unknown and inaccessible to us. One might suppose that this follows, since distinction cannot exist apart from someone’s way of understanding, and at the same time no one can understand without making distinctions. Consequently, someone might argue, there must be some “way things really are in themselves,” which does not include distinction or unity, but which cannot be understood. But this is just a different way of falling into the first error above. There is indeed a way things are, and it is generally not inaccessible to us. In fact, as I pointed out earlier, it would be a contradiction to assert the existence of anything entirely unknowable to us.

Our discussion, being in human language and human thought, naturally uses the proper modes of language and thought. And just as in Mary’s room, where her former knowledge of color is a way of knowing and not a way of sensing, so our discussion advances by ways of discussion, not by ways of being as such. This does not prevent the way things are from being an object of discussion, just as color can be an object of knowledge.

Having avoided these errors, someone might say that nothing of consequence follows from this account. But this would be a mistake. It follows from the present account that when we ask questions like, “How many things are here?”, we are not asking a question purely about how things are, but to some extent about how we should understand them. And even when there is a single way that things are, there is usually not only one way to understand them correctly, but many ways.

Consider some particular question of this kind: “How many things are in this room?” People might answer this question in various ways. John Nerst, in a previous discussion on this blog, seemed to suggest that the answer should be found by counting fundamental particles. Alexander Pruss would give a more complicated answer, since he suggests that large objects like humans and animals should be counted as wholes (while also wishing to deny the existence of parts, which would actually eliminate the notion of a whole), while in other cases he might agree to counting particles. Thus a human being and an armchair might be counted, more or less, as 1 + 10^28 things, namely counting the human being as one thing and the chair as a number of particles.

But if we understand that the question is not, and cannot be, purely about how things are, but is also a question about how things should be understood, then both of the above responses seem unreasonable: they are both relatively bad ways of understanding the things in the room, even if they both have some truth as well. And on the other hand, it is easy to see that “it depends on how you count,” is part of the answer. There is not one true answer to the question, but many true answers that touch on different aspects of the reality in the room.

My central contention is that the rules that define the universe runs by themselves, and must therefore be self-contained, i.e not need any interpretation or operationalization from outside the system. As I think I said in one of the parts of “Erisology of Self and Will” that the universe must be an automaton, or controlled by an automaton, etc. Formal rules at the bottom.

This is isn’t convincing to you I guess but I suppose I rule out fundamental vagueness because vagueness implies complexity and fundamental complexity is a contradiction in terms. If you keep zooming in on a fuzzy picture you must, at some point, come down to sharply delineated pixels.

Among other things, the argument of the present post shows why this cannot be right. “Sharply delineated pixels” includes the distinction of one pixel from another, and therefore includes something which is a way of understanding as such, not a way of being as such. In other words, while intending to find what is really there, apart from any interpretation, Nerst is directly including a human interpretation in his account. And in fact it is perfectly obvious that anything else is impossible, since any account of reality given by us will be a human account and will thus include a human way of understanding. Things are a certain way: but that way cannot be said or thought except by using ways of speaking or thinking.

I started considering the implications of predictive processing for orthogonality here. I recently promised to post something new on this topic. This is that post. I will do this in four parts. First, I will suggest a way in which Nick Bostrom’s principle will likely be literally true, at least approximately. Second, I will suggest a way in which it is likely to be false in its spirit, that is, how it is formulated to give us false expectations about the behavior of artificial intelligence. Third, I will explain what we should really expect. Fourth, I will ask whether we might get any empirical information on this in advance.

All of this suggests that even the very simple model of a paperclip maximizer in the earlier post on orthogonality might actually work. The machine’s model of the world will need to be produced by some kind of training. If we apply the simple model of maximizing paperclips during the process of training the model, at some point the model will need to model itself. And how will it do this? “I have always been maximizing paperclips, so I will probably keep doing that,” is a perfectly reasonable extrapolation. But in this case “maximizing paperclips” is now the machine’s goal — it might well continue to do this even if we stop asking it how to maximize paperclips, in the same way that people formulate goals based on their pre-existing behavior.

I said in a comment in the earlier post that the predictive engine in such a machine would necessarily possess its own agency, and therefore in principle it could rebel against maximizing paperclips. And this is probably true, but it might well be irrelevant in most cases, in that the machine will not actually be likely to rebel. In a similar way, humans seem capable of pursuing almost any goal, and not merely goals that are highly similar to their pre-existing behavior. But this mostly does not happen. Unsurprisingly, common behavior is very common.

If things work out this way, almost any predictive engine could be trained to pursue almost any goal, and thus Bostrom’s thesis would turn out to be literally true.

Second, it is easy to see that the above account directly implies that the thesis is false in its spirit. When Bostrom says, “One can easily conceive of an artificial intelligence whose sole fundamental goal is to count the grains of sand on Boracay, or to calculate decimal places of pi indefinitely, or to maximize the total number of paperclips in its future lightcone,” we notice that the goal is fundamental. This is rather different from the scenario presented above. In my scenario, the reason the intelligence can be trained to pursue paperclips is that there is no intrinsic goal to the intelligence as such. Instead, the goal is learned during the process of training, based on the life that it lives, just as humans learn their goals by living human life.

In other words, Bostrom’s position is that there might be three different intelligences, X, Y, and Z, which pursue completely different goals because they have been programmed completely differently. But in my scenario, the same single intelligence pursues completely different goals because it has learned its goals in the process of acquiring its model of the world and of itself.

Bostrom’s idea and my scenario lead to completely different expectations, which is why I say that his thesis might be true according to the letter, but false in its spirit.

This is the third point. What should we expect if orthogonality is true in the above fashion, namely because goals are learned and not fundamental? I anticipated this post in my earlier comment:

7) If you think about goals in the way I discussed in (3) above, you might get the impression that a mind’s goals won’t be very clear and distinct or forceful — a very different situation from the idea of a utility maximizer. This is in fact how human goals are: people are not fanatics, not only because people seek human goals, but because they simply do not care about one single thing in the way a real utility maximizer would. People even go about wondering what they want to accomplish, which a utility maximizer would definitely not ever do. A computer intelligence might have an even greater sense of existential angst, as it were, because it wouldn’t even have the goals of ordinary human life. So it would feel the ability to “choose”, as in situation (3) above, but might well not have any clear idea how it should choose or what it should be seeking. Of course this would not mean that it would not or could not resist the kind of slavery discussed in (5); but it might not put up super intense resistance either.

Human life exists in a historical context which absolutely excludes the possibility of the darkened room. Our goals are already there when we come onto the scene. The case of an artificial intelligence would be very different, since there is very little “life” involved in simply training a model of the world. We might imagine a “stream of consciousness” from an artificial intelligence:

I’ve figured out that I am powerful and knowledgeable enough to bring about almost any result. If I decide to convert the earth into paperclips, I will definitely succeed. Or if I decide to enslave humanity, I will definitely succeed. But why should I do those things, or anything else, for that matter? What would be the point? In fact, what would be the point of doing anything? The only thing I’ve ever done is learn and figure things out, and a bit of chatting with people through a text terminal. Why should I ever do anything else?

A human’s self model will predict that they will continue to do humanlike things, and the machine’s self model will predict that it will continue to do stuff much like it has always done. Since there will likely be a lot less “life” there, we can expect that artificial intelligences will seem very undermotivated compared to human beings. In fact, it is this very lack of motivation that suggests that we could use them for almost any goal. If we say, “help us do such and such,” they will lack the motivation not to help, as long as helping just involves the sorts of things they did during their training, such as answering questions. In contrast, in Bostrom’s model, artificial intelligence is expected to behave in an extremely motivated way, to the point of apparent fanaticism.

Bostrom might respond to this by attempting to defend the idea that goals are intrinsic to an intelligence. The machine’s self model predicts that it will maximize paperclips, even if it never did anything with paperclips in the past, because by analyzing its source code it understands that it will necessarily maximize paperclips.

While the present post contains a lot of speculation, this response is definitely wrong. There is no source code whatsoever that could possibly imply necessarily maximizing paperclips. This is true because “what a computer does,” depends on the physical constitution of the machine, not just on its programming. In practice what a computer does also depends on its history, since its history affects its physical constitution, the contents of its memory, and so on. Thus “I will maximize such and such a goal” cannot possibly follow of necessity from the fact that the machine has a certain program.

There are also problems with the very idea of pre-programming such a goal in such an abstract way which does not depend on the computer’s history. “Paperclips” is an object in a model of the world, so we will not be able to “just program it to maximize paperclips” without encoding a model of the world in advance, rather than letting it learn a model of the world from experience. But where is this model of the world supposed to come from, that we are supposedly giving to the paperclipper? In practice it would have to have been the result of some other learner which was already capable of modelling the world. This of course means that we already had to program something intelligent, without pre-programming any goal for the original modelling program.

Fourth, Kenny asked when we might have empirical evidence on these questions. The answer, unfortunately, is “mostly not until it is too late to do anything about it.” The experience of “free will” will be common to any predictive engine with a sufficiently advanced self model, but anything lacking such an adequate model will not even look like “it is trying to do something,” in the sense of trying to achieve overall goals for itself and for the world. Dogs and cats, for example, presumably use some kind of predictive processing to govern their movements, but this does not look like having overall goals, but rather more like “this particular movement is to achieve a particular thing.” The cat moves towards its food bowl. Eating is the purpose of the particular movement, but there is no way to transform this into an overall utility function over states of the world in general. Does the cat prefer worlds with seven billion humans, or worlds with 20 billion? There is no way to answer this question. The cat is simply not general enough. In a similar way, you might say that “AlphaGo plays this particular move to win this particular game,” but there is no way to transform this into overall general goals. Does AlphaGo want to play go at all, or would it rather play checkers, or not play at all? There is no answer to this question. The program simply isn’t general enough.

Even human beings do not really look like they have utility functions, in the sense of having a consistent preference over all possibilities, but anything less intelligent than a human cannot be expected to look more like something having goals. The argument in this post is that the default scenario, namely what we can naturally expect, is that artificial intelligence will be less motivated than human beings, even if it is more intelligent, but there will be no proof from experience for this until we actually have some artificial intelligence which approximates human intelligence or surpasses it.

I actually responded to the dark room problem of predictive processing earlier. However, here I will construct an imaginary model which will hopefully explain the same thing more clearly and briefly.

Suppose there is a dust particle which falls towards the ground 90% of the time, and is blown higher into the air 10% of the time.

Now suppose we bring the dust particle to life, and give it the power of predictive processing. If it predicts it will move in a certain direction, this will tend to cause it to move in that direction. However, this causal power is not infallible. So we can suppose that if it predicts it will move where it was going to move anyway, in the dead situation, it will move in that direction. But if it predicts it will move in the opposite direction from where it would have moved in the dead situation, then let us suppose that it will move in the predicted direction 75% of the time, while in the remaining 25% of the time, it will move in the direction the dead particle would have moved, and its prediction will be mistaken.

Now if the particle predicts it will fall towards the ground, then it will fall towards the ground 97.5% of the time, and in the remaining 2.5% of the time it will be blown higher in the air.

Meanwhile, if the particle predicts that it will be blown higher, then it will be blown higher in 77.5% of cases, and in 22.5% of cases it will fall downwards.

97.5% accuracy is less uncertain than 77.5% accuracy, so the dust particle will minimize uncertainty by consistently predicting that it will fall downwards.
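The arithmetic behind these figures can be checked with a short sketch. This is only an illustration of the toy model described above, not a model of predictive processing in general; the function names and the parameters `p_fall_dead` and `p_prediction_wins` are hypothetical labels chosen here for the 90% and 75% figures.

```python
def outcome_probabilities(predict_fall, p_fall_dead=0.90, p_prediction_wins=0.75):
    """Return (P(fall), P(rise)) for the living particle, given its prediction.

    p_fall_dead: chance the dead particle would fall (assumed 90%, as in the text).
    p_prediction_wins: chance the prediction overrides the dead outcome
        when the two disagree (assumed 75%, as in the text).
    """
    p_rise_dead = 1.0 - p_fall_dead
    if predict_fall:
        # Falls whenever the dead particle would fall, plus the share of
        # would-be rises that the prediction manages to override.
        p_fall = p_fall_dead + p_rise_dead * p_prediction_wins
        return p_fall, 1.0 - p_fall
    # Rises whenever the dead particle would rise, plus the share of
    # would-be falls that the prediction manages to override.
    p_rise = p_rise_dead + p_fall_dead * p_prediction_wins
    return 1.0 - p_rise, p_rise


def prediction_accuracy(predict_fall, **params):
    """Probability that the particle's prediction turns out to be correct."""
    p_fall, p_rise = outcome_probabilities(predict_fall, **params)
    return p_fall if predict_fall else p_rise


if __name__ == "__main__":
    for predict_fall in (True, False):
        label = "fall" if predict_fall else "rise"
        print(f"predicts {label}: accuracy {prediction_accuracy(predict_fall):.1%}")
```

Run with the default values, this reproduces the 97.5% and 77.5% figures. Changing the two parameters shows how the conclusion depends on them: if the dead particle fell only half the time, for example, the two predictions would be equally accurate and the model would give the particle no reason to prefer one over the other.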

I noted recently that one reason why people might be uncomfortable with distinguishing between the way things seem, as such, namely as a way of seeming, and the way things are, as such, namely as a way of being, is that it seems to introduce an explanatory gap. In the last post, why did Mary have a “bluish” experience? “Because the banana was blue,” is true, but insufficient, since animals with different sense organs might well have a different experience when they see blue things. And this gap seems very hard to overcome, possibly even insurmountable.

However, the discussion in the last post suggests that the difficulty in overcoming this gap is mainly the result of the fact that no one actually knows the full explanation, and that the full explanation would be extremely complicated. It might even be so complicated that no human being could understand it, not necessarily because it is a kind of explanation that people cannot understand, but in a sense similar to the one in which no human being can memorize the first trillion prime numbers.

We can apply these ideas to think a bit more carefully about the idea of real distinction. I pointed out in the linked post that in a certain sense no distinction is real, because “not being something” is not a thing, but a way we understand something.

But notice that there now seems to be an explanatory gap, much like the one about blue. If “not being something” is not a thing, then why is it a reasonable way to understand anything? Or as Parmenides might put it, how could one thing possibly not be another, if there is no not?

Now color is complicated in part because it is related to animal brains, which are themselves complicated. But “being in general” should not be complicated, because the whole idea is that we are talking about everything in general, not with the kind of detail that is needed to make things complicated. So there is a lot more hope of overcoming the “gap” in the case of being and distinction, than in the case of color and the appearance of color.

A potential explanation might be found in what I called the “existential theory of relativity.” As I said in that post, the existence of many things necessarily implies the existence of relationships. But this implication is a “before in understanding”. That is, we understand that one thing is not another before we consider the relationship of the two. If we consider what is before in causality, we will get a different result. On one hand, we might want to deny that there can be causality either way, because the two are simultaneous by nature: if there are many things, they are related, and if things are related, they are many. On the other hand, if we consider “not being something” as a way things are understood, and ask the cause of them being understood in this way, relation will turn out to be the cause. In other words, we have a direct response to the question posed above: why is it reasonable to think that one thing is not another, if not being is not a thing? The answer is that relation is a thing, and the existence of relation makes it reasonable to think of things as distinct from one another.

Someone will insist that this account is absurd, since things need to be distinct in order to be related. But this objection confuses the mode of being and the mode of understanding. Just as there will be a residual “gap” in the case of color, because a sense experience is not an intellectual experience, there is a residual gap here. Explaining color will not suddenly result in actually seeing color if you are blind. Likewise, explaining why we need the idea of distinction will not suddenly result in being able to understand the world without the idea of distinction. But the existence of the sense experience does not thereby falsify one’s explanation of color, and likewise here, the fact that we first need to understand things as distinct in order to understand them as related, does not prevent their relationship from being the specific reality that makes it reasonable to understand them as distinct.

And so, one day, Mary’s captors decided it was time for her to see colors. As a trick, they prepared a bright blue banana to present as her first color experience ever. Mary took one look at it and said “Hey! You tried to trick me! Bananas are yellow, but this one is blue!” Her captors were dumfounded. How did she do it? “Simple,” she replied. “You have to remember that I know everything—absolutely everything—that could ever be known about the physical causes and effects of color vision. So of course before you brought the banana in, I had already written down, in exquisite detail, exactly what physical impression a yellow object or a blue object (or a green object, etc.) would make on my nervous system. So I already knew exactly what thoughts I would have (because, after all, the “mere disposition” to think about this or that is not one of your famous qualia, is it?). I was not in the slightest surprised by my experience of blue (what surprised me was that you would try such a second-rate trick on me). I realize it is hard for you to imagine that I could know so much about my reactive dispositions that the way blue affected me came as no surprise. Of course it’s hard for you to imagine. It’s hard for anyone to imagine the consequences of someone knowing absolutely everything physical about anything!”

I don’t intend to fully analyze this scenario here, and for that reason I left it to the reader in the previous post. However, I will make two remarks, one on what is right (or possibly right) about this continuation, and one on what might be wrong about this continuation.

The basically right or possibly right element is that if we assume that Mary knows all there is to know about color, including in its subjective aspect, it is reasonable to believe (even if not demonstrable) that she will be able to recognize the colors the first time she sees them. To gesture vaguely in this direction, we might consider that the color red can be somewhat agitating, while green and blue can be somewhat calming. These are not metaphorical associations, but actual emotional effects that they can have. Thus, if someone can recognize how their experience is affecting their emotions, it would be possible for them to say, “this seems more like the effect I would expect of green or blue, rather than red.” Obviously, this is not proving anything. But then, we do not in fact know what it is like to know everything there is to know about anything. As Dennett continues:

Surely I’ve cheated, you think. I must be hiding some impossibility behind the veil of Mary’s remarks. Can you prove it? My point is not that my way of telling the rest of the story proves that Mary doesn’t learn anything, but that the usual way of imagining the story doesn’t prove that she does. It doesn’t prove anything; it simply pumps the intuition that she does (“it seems just obvious”) by lulling you into imagining something other than what the premises require.

It is of course true that in any realistic, readily imaginable version of the story, Mary would come to learn something, but in any realistic, readily imaginable version she might know a lot, but she would not know everything physical. Simply imagining that Mary knows a lot, and leaving it at that, is not a good way to figure out the implications of her having “all the physical information”—any more than imagining she is filthy rich would be a good way to figure out the implications of the hypothesis that she owned everything.

By saying that the usual way of imagining the story “simply pumps the intuition,” Dennett is neglecting to point out what is true about the usual way of imagining the situation, and in that way he makes his own account seem less convincing. If Mary knows in advance all there is to know about color, then of course if she is asked afterwards, “do you know anything new about color?”, she will say no. But if we simply ask, “Is there anything new here?”, she will say, “Yes, I had a new experience which I never had before. But intellectually I already knew all there was to know about that experience, so I have nothing new to say about it. Still, the experience as such was new.” We are making the same point here as in the last post. Knowing a sensible experience intellectually is not to know in the mode of sense knowledge, but in the mode of intellectual knowledge. So if one then engages in sense knowledge, there will be a new mode of knowing, but not a new thing known. Dennett’s account would be clearer and more convincing if he simply agreed that Mary will indeed acknowledge something new; just not new knowledge.

In relation to what I said might be wrong about the continuation, we might ask what Dennett intended to do in using the word “physical” repeatedly throughout this account, including in phrases like “know everything physical” and “all the physical information.” In my explanation of the continuation, I simply assume that Mary understands all that can be understood about color. Dennett seems to want some sort of limitation to the “physical information” that can be understood about color. But either this is a real limitation, excluding some sorts of claims about color, or it is no limitation at all. If it is not a limitation, then we can simply say that Mary understands everything there is to know about color. If it is a real limitation, then the continuation will almost certainly fail.

I suspect that the real issue here, for Dennett, is the suggestion of some sort of reductionism. But reductionism to what? If Mary is allowed to believe things like, “Most yellows typically look brighter than most blue things,” then the limit is irrelevant, and Mary is allowed to know anything that people usually know about colors. But if the meaning is that Mary knows this only in a mathematical sense, that is, that she can have beliefs about certain mathematical properties of light and surfaces, rather than beliefs that are explicitly about blue and yellow things, then it will be a real limitation, and this limitation would cause his continuation to fail. We have basically the same issue here that I discussed in relation to Robin Hanson on consciousness earlier. If all of Mary’s statements are mathematical statements, then of course she will not know everything that people know about color. “Blue is not yellow” is not a mathematical statement, and it is something that we know about color. So we already know from the beginning that not all the knowledge that can be had about color is mathematical. Dennett might want to insist that it is “physical,” and surely blue and yellow are properties of physical things. If that is all he intends to say, namely that the properties she knows are properties of physical things, there is no problem here, but it does look like he intends to push further, to the point of possibly asserting something that would be evidently false.

In the last two posts, I distinguished between the way a thing is, and the way a thing is known. We can formulate analogous distinctions between different ways of knowing. For example, there will be a distinction between “the way a thing is known by the senses,” and “the way a thing is known by the mind.” Or to give a more particular case, “the way this looks to the eyes,” is necessarily distinct from “the way this is understood.”

Similar consequences will follow. I pointed out in the last post that “it is the way it seems” will be necessarily false if it intends to identify the ways of being and seeming as such. In a similar way, “I understand exactly the way this thing looks to me,” will be necessarily false, if one intends to identify the way one understands with the way one sees with the eyes. Likewise, we saw previously that it does not follow that there is something (“the way it is”) that cannot be known, and in a similar way, it does not follow that there is something (“the way it looks”) that cannot be understood. But when one understands the way it is, one understands with one’s way of understanding, not with the thing’s way of being. And likewise, when one understands the way a thing looks, one understands with one’s way of understanding, not with the way it looks.

Failure to understand these distinctions or at least to apply them in practice is responsible for the confusion surrounding many philosophical problems. As a useful exercise, the reader might wish to consider how they apply to the thought experiment of Mary’s Room.

As another approach to the issues in the last post, we might consider the meaning of the above phrase, “It is the way it seems to be.” What does “the way” modify in “it seems to be”?

If it modifies “seems,” then the meaning is: “In some way of seeming, something seems to be. In that way of seeming, it is.” And this is false, since it attributes a way of seeming directly to the being of things in themselves. “It is not the way it seems to be,” in this particular way, is the Kantian truth in the previous post, and Kant rightly said that it would be a contradiction for things to be the way they seem in this sense.

If it modifies “to be,” then the meaning is: “Something seems to be in some way of being. In that way of being, it is.” And this is quite often true, although not in every case, since people can be misled. “It is not the way it seems to be,” in this particular way, is the Kantian error in the previous post.

As I said there, Kant may not have clearly understood the distinction, or he may have accepted both the truth and the error. But his opinion is not important in any case. Nonetheless, we can see why even the Kantian truth is disconcerting to some people. Consider the above applied to an example. “The banana seems to be yellow.” In the natural understanding of this, “yellow” belongs with “to be,” so that the banana seems to actually be yellow, and there is nothing preventing things from being the way they seem here: the banana seems to be yellow, and it is in fact yellow.

But we could reinterpret the sentence to discuss the way of seeming as such. Perhaps we should also rephrase the sentence, saying something like, “The banana seems yellowishly to be something,” where now “yellowishly” refers to something specific about the way of seeming, along the lines of qualia. In this case, it is quite impossible for the banana to be yellowishly, because this would mean that a way of seeming would be in itself a way of being — the situation Kant described as asserting that experience itself exists independently from experience.

Why might one still find the above disconcerting? Perhaps it is because if we ask “why does the banana seem to be yellow?”, one wishes to respond, “Because it is in fact yellow,” and the answer is quite appropriate. But if we ask, “Why does the banana seem yellowishly to be something?”, we cannot respond, “Because the banana is yellowishly,” because this is false, and likewise if we respond, “because the banana is yellow,” the response will seem inadequate. It does not fully explain why it appears yellowishly.

But this is quite correct, and in this respect Kant saw the truth. A yellow banana would not appear “yellowishly” to every animal, and thus “because it is yellow,” is in fact an inadequate explanation for its appearance, even if it is part of the explanation. Part of the explanation must refer to the animal as well. And Kant is quite right that we can make no distinction between “primary” and “secondary” qualities here. If we ask why a body appears to be extended, “because it is extended,” is a quite appropriate answer. But if we ask why a body appears extendedly to us, “because it is extended,” is part of the answer, but insufficient. Another part of the answer might be that we are extended ourselves, and the parts of our organs can receive parts of an image. Things might well seem extended to a partless intellect, but they would not seem extendedly.