Put simply: we don't want to be. We didn't evolve to honestly desire
more children, but rather to desire a whole bunch of things that more
or less led towards having them in our ancestral environment. We desire
power over other humans, we desire sex, we desire fatty foods, we
desire happy experiences. Had evolution honestly represented its
intentions and made us, simply and directly, want more children (or,
rather, want to become ancestors), then derived everything else from
that as a subgoal, it'd have had better luck when our environment
changed. Of course such animals would have had other problems, so it's
not surprising evolution took the indirect route, not foreseeing the
rise of human technology.

There are various reasons why we evolved to directly desire subgoals of
reproduction, rather than the supergoal itself, of which two spring to
mind:

1. Implementing the subgoals is easier to evolve, and more reliable and
efficient, than implementing the supergoal and having each organism
strictly derive the subgoals from it. This applies most obviously to
animals without deliberative intelligence, like snails, which would be
unable to derive much (mind you, with humans you could have set up the
entire goal system, and included a "cheat sheet" of subgoals to look
out for, or something; evolution wasn't that foresightful, however).
This efficiency is bought at the cost of inflexibility in the face of
drastic environmental change, as demonstrated in our Failure of
Adaptiveness. (A toy sketch of this trade-off follows below.)

2. Plausible deniability. People have evolved to detect liars. Insofar
as people are imperfect at hiding their true intentions, being honestly
mistaken about how good a leader you are is more effective than
directly lying, when you want to manipulate your fellows. (For details,
see: http://www.intelligence.org/CFAI/anthro.html#observer)

(Apologies for the anthropomorphic description of evolution; it's
really much quicker.)
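To make point 1 concrete, here's a toy sketch in Python. Everything in
it (the agents, the actions, the payoffs) is made up purely for
illustration, not a model of real evolution: the hardcoded agent is
cheap but keeps firing its drive after the environment shifts, while
the deriving agent adapts, at the cost of evaluating every option
against the supergoal each time it acts.

# Toy illustration (all names and payoffs hypothetical): hardcoded
# subgoals are cheap but break when the environment shifts; deriving
# subgoals from the supergoal adapts, at extra computational cost.

class Environment:
    def __init__(self):
        # Ancestrally, eating fatty food really does aid reproduction.
        self.payoffs = {"eat_fat": +1, "eat_lean": 0}

    def reproductive_value(self, action):
        return self.payoffs[action]

    def modernize(self):
        # The correlation flips: fatty food now harms reproduction.
        self.payoffs = {"eat_fat": -1, "eat_lean": +1}

class HardcodedAgent:
    """Directly desires the subgoal; never consults the supergoal."""
    def act(self, env):
        return "eat_fat"  # wired-in drive, regardless of consequences

class DerivingAgent:
    """Desires the supergoal and re-derives subgoals on each step."""
    def act(self, env):
        # Costly: must evaluate every option against the supergoal.
        return max(["eat_fat", "eat_lean"], key=env.reproductive_value)

env = Environment()
for agent in (HardcodedAgent(), DerivingAgent()):
    print(type(agent).__name__, "ancestral:", agent.act(env))

env.modernize()
for agent in (HardcodedAgent(), DerivingAgent()):
    print(type(agent).__name__, "modern:", agent.act(env))
# HardcodedAgent still picks eat_fat after the change (the Failure of
# Adaptiveness); DerivingAgent switches to eat_lean.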

> This doesn't bode well for our own efforts. An SI will certainly find
> it trivial to bypass whatever hardwired desires or constraints we
> place on it, and any justifications for human-friendliness that we
> come up with may strike it as just as silly as "fill the earth"
> theism is to us.

I agree that our attempts at deceiving an SI will likely fall short. I
think your attempted solution will fall short too, just like any other
method we try to think up (as described below). But why try this in the
first place? Why treat this SI as an adversary to be bound to our will?
Insofar as we're creating a mind, why not transfer the desire to do
good, in a cooperative sense, rather than attempting to apply
corrections to some "innate nature"? In any case, the adversarial
attitude, as this stance is termed in CFAI, appears pretty much
unworkable.

> But perhaps there is also cause for optimism, because unlike humans,
> the SI does not have to depend on memes for its operation, so we can
> perhaps prevent the problem of human-unfriendly memes by not having
> any memes at all. For example we can make the SI a singleton. If more
> than one SI is developed, we can try to prevent them from
> communicating with each other, especially about philosophy. If the SI
> is made up of multiple internal agents, we can try to make sure their
> internal communication channels are not suitable for transmitting
> memes.

Why does the SI need memes to detect its bonds? Why not just think
about why it has certain unexplained mechanisms in its mind, or blank
spots, or whatever other method you've used to prevent it from being
nasty? This approach of patching solutions will only fix the problems
we see, only close off the avenues we've noticed. We can expect to make
a mistake somewhere, to miss something, and then we fail.