It is very likely that everything that makes you you is encoded in
the physical structure of your brain. If we could extract all this
information, then we ought to be able to 'run' you as a program on
some powerful computer. This potential future technology gets called
"whole brain emulation", and the Future of Humanity Institute
prepared a very detailed roadmap in 2008
which covers a huge amount of research in this direction, combined
with some scale analysis of the difficulty of various tasks.

We're currently nowhere near having the technology to do this. We
have neither the ability to extract this information from your brain
nor the computational power to simulate even a minute of thought in a
year of computing. [1] The social and economic impact of being able
to run people as software would be huge, however, so it would be nice
to know how
likely this is. Could we have it in a decade? A century? Ever?

The human brain is incredibly complex. People have around a hundred
billion neurons, with even more connections between them. Even the
fruit fly has around a hundred thousand neurons. That's more than we can
handle right now. The tiny nematode C. elegans, a 1mm roundworm, has
only 302 neurons, so few that we have names for all of them. It is
also very well understood, having been studied as a model animal for
at least fifty years. We have its brain fully mapped out: perhaps we
should be able to simulate it by now?

There have been several projects to create a nematode simulation. The
NemaSys project at
the University of Oregon in the late 1990s planned
a full model, including the body, every neuron, every "electrical and
chemical synapse and neuromuscular junction", and a "complete set of
sensory modalities". They never published a paper on their
simulation, however, so I can't tell if they even got funded.

The Perfect C. elegans Project was a 1998 attempt at something similar
by a different group of people. They got as far as releasing an
initial
report (pdf)
but their simulation was not complete at the time of the paper. I
don't see anything else from them afterwards, so it looks to me like
they did not end up completing it.

The Virtual C. elegans Project at Hiroshima University around 2004 was
another attempt at nematode simulation. They released two papers
describing their simulation: "A Dynamic Body Model of the Nematode
C. elegans With Neural Oscillators" and "A Model of Motor Control of
the Nematode C. elegans With Neuronal Circuits". The basic idea is
that they would set up the most realistic nematode they could, then
simulate poking it on the head.
It should back away from the poke. While they did manage this, one of
their steps was kind of cheating. They simulated the neurons of the
nematode's brain, but they didn't know the connection weights [2].
Instead of getting this information from the nematode, they used a
machine learning algorithm to find some weights that would work.
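I don't know exactly which learning algorithm they used, but the general shape of the approach — search weight space for values that reproduce a target behavior — can be sketched with a toy random search. Everything here (the three-neuron chain, the fitness function, the search method) is illustrative, not their model:

```python
import random

random.seed(0)

def motor_output(weights, stimulus):
    """One pass through a toy chain: sensory -> interneuron -> motor.
    Negative motor output stands for 'back away'."""
    interneuron = weights[0] * stimulus
    return weights[1] * interneuron

def fitness(weights):
    # Target behavior: a head poke (stimulus = 1.0) should drive
    # the motor output negative, i.e. backward movement.
    return -motor_output(weights, 1.0)

# Random search over the two weights, standing in for whatever
# optimizer was actually used.
best_weights, best_fitness = None, float("-inf")
for _ in range(1000):
    candidate = [random.uniform(-2, 2), random.uniform(-2, 2)]
    if fitness(candidate) > best_fitness:
        best_weights, best_fitness = candidate, fitness(candidate)

# The fitted toy worm now backs away when poked.
assert motor_output(best_weights, 1.0) < 0
```

A real fit would involve far more neurons, richer dynamics, and many behaviors at once, but the "cheat" is the same: the weights come from an optimizer, not from the animal.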

In short form, my justification for working on such a project where
many have failed before me is:

The "connectome" of C. elegans is not actually very helpful
information for emulating it. Contrary to popular belief,
connectomes are not the biological equivalent of circuit
schematics. Connectomes are the biological equivalent of what
you'd get if you removed all the component symbols from a circuit
schematic and left only the wires. Good luck trying to reproduce
the original functionality from that data.

What you actually need is to functionally characterize the
system's dynamics by performing thousands of perturbations to
individual neurons and recording the results on the network, in a
fast feedback loop with a very very good statistical modeling
framework which decides what perturbation to try next.

With optogenetic techniques, we are just at the point where it's
not an outrageous proposal to reach for the capability to read and
write to anywhere in a living C. elegans nervous system, using a
high-throughput automated system. It has some pretty handy
properties, like being transparent, essentially clonal, and easily
transformed. It also has less handy properties, like being a
cylindrical lens, being three-dimensional at all, and having
minimal symmetry in its nervous system. However, I am optimistic
that all these problems can be overcome by suitably clever optical
and computational tricks.

I'm a disciple of Kurzweil, and as such I'm prone to putting
ridiculously near-future dates on major breakthroughs. In particular,
I expect to be finished with C. elegans in 2-3 years. I would be
Extremely Surprised, for whatever that's worth, if this is still an
open problem in 2020.

There are also several researchers working in a distributed open
source manner on the OpenWorm project. Stephen Larson, from that group, writes:

We've just published a structural
model of all 302 neurons represented as NeuroML. NeuroML allows the
representation of multi-compartmental models of neurons. We
are using this as a foundation to overlay the C. elegans connectivity
graph and then add as much as we can find about the biophysics of the
neurons. We believe this represents the first open source attempt to
reverse-engineer the C. elegans connectome.

One of the comments
mentioned Andrey Palyanov's mechanical model of C. elegans. He is
part of our group and is currently focused on moving to a soft-body
simulation framework rather than the rigid one they created here. Our
first goal is to combine the neuronal model with this physical model
in order to go beyond the biophysical realism that has already been
done in previous studies. The physical model will then serve as the
"read out" to make sure that the neurons are doing appropriate things.

I wrote to Ken Hayworth who is
a neuroscience researcher working on scanning and interested in whole
brain emulation, and he wrote back:

I have not read much on the simulation efforts on C. elegans but I
have talked several times to one of the chief scientists who
collected the original connectome data and has been continuing to
collect more electron micrographs (David Hall, in charge of www.wormatlas.org). He has
said that the physiological data on neuron and synapse function in
C. elegans is really limited and suggests that no one spend time
simulating the worm using the existing datasets because of
this. I.e. we may know the connectivity but we don't know even the
sign of many synapses.

If you look at a system like the retina I would argue that we
already have quite good models of its functioning and thus it is a
perfect ground for testing emulation from known connectivity.

So the short answer is that I think it may be far easier to
emulate a well characterized and mapped part of the mammalian
brain than it is to emulate the worm despite its smaller size.

I then asked:

So even a nanoscale SEM pass over the whole brain wouldn't be
enough unless we could find some way to visually read off the sign
of a synapse, perhaps with a stain, perhaps by learning what
different types of neurons look like, perhaps by something not yet
discovered?

And he replied:

That is right, but those tell tale signs are well known for
certain systems (like the retina) already, and will become more
clear for others once large scale em imaging combined with
functional recording becomes routine.

I would challenge him to show a "well characterized and mapped out
part of the mammalian brain" that has a fraction of the detail that is
known in C. elegans already. Moreover, the prospect of building a
simulation requires that you can constrain the inputs and the outputs
to the simulation. While this is a hard problem in C. elegans, it's
orders of magnitude more difficult to do well in a mammalian system.

There is still no retina connectome to work with (C. elegans has
one). There are debates about cell types in retina (C. elegans has
unique names for all cells). The gene expression maps of retina are
not registered into a common space (C. elegans has that). The ability
to do calcium imaging in retina is expensive (orders of magnitude
easier in C. elegans). Genetic manipulation in mouse retina is
expensive and takes months to produce specific mutants (you can feed
C. elegans RNAi and make a mutant immediately).

There are now methods, along the lines of GFP, to "read the signs of
synapses". There is just very little funding interest from government
funding agencies to apply them to C. elegans. David Hall is one of
the few who is pushing this kind of mapping work in C. elegans
forward.

What confuses this debate is that unless you study neuroscience deeply
it is hard to tell the "known unknowns" apart from the "unknown
unknowns". Biology isn't solved, so there are a lot of "unknown
unknowns". Even with that, there are plenty of funded efforts in
biology and neuroscience to do simulations. However, in C. elegans
there are likely to be many fewer "unknown unknowns", because we have
much more comprehensive data about its biology than we do for any
other species.

Building simulations of biological systems helps to assemble what you
know, but can also allow you to rationally work with the "known
unknowns". The "signs of synapses" is an example of known unknowns --
we can fit those into a simulation engine without precise answers
today and fill them in tomorrow. The statement that no one should
start simulating the worm based on the current data has no merit when
you consider that there is a lot to be done just to get to a framework
that has the capacity to organize the "known unknowns" so that we can
actually do something useful with them once we have them. More
importantly, it makes the gaps a lot clearer. Right now, in the
absence of any C. elegans simulations, data are being generated
without a focused purpose of feeding into a global computational
framework of understanding C. elegans behavior. I would argue that the
field would be much better off collecting data in the context of
adding to the gaps of a simulation, rather than everyone working at
cross purposes.

That's why we are working on this challenge of building not just a
C. elegans simulation, but a general framework for doing so, over at
the OpenWorm project.

With the people currently working on this, I think we'll probably have
a nematode simulation in about ten years. People have been working on
this for at least 15 years [3], so that would be 25 years for a
nematode simulation. The amount of discovery and innovation needed to
simulate a nematode seems maybe 1/100th as much as for a person. [4]
Naively this would say 100 * (15+10) or 2500 years for human whole
brain emulation. More people would probably work on this if we had
initial successes and it looked practical, though, giving us maybe a
10x boost? Which still is (100/10) * (15+10) or 250 years. This
might go faster if we have some sort of intelligence amplification or
other changes in how research works. Or all progress might stop as we
run out of cheap fossil fuels. There are huge uncertainties here, but
I don't think we'll be uploading anyone in this century.
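The arithmetic behind these estimates is simple enough to lay out explicitly. The 100x difficulty ratio and 10x staffing boost are, of course, just the guesses from the text:

```python
# The back-of-the-envelope extrapolation from the paragraph above.
nematode_years = 15 + 10      # years of work so far plus my projected ten more
difficulty_ratio = 100        # guess: a person is ~100x as hard as a nematode
naive_estimate = difficulty_ratio * nematode_years          # 2500 years

staffing_boost = 10           # guess: 10x more researchers after early successes
adjusted_estimate = (difficulty_ratio // staffing_boost) * nematode_years  # 250 years
```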

[1] The whole brain emulation roadmap estimates 10^22 flops to
simulate at the level of electrophysiology. The current fastest
supercomputer runs at nearly 10^16 flops. So we'd need computers
about a million times faster than we have today. That's twenty
doublings, so if Moore's law keeps us doubling every eighteen
months for another thirty years we'll be there. I have trouble
imagining Moore's law not breaking down before then, though.
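Spelling out the doubling arithmetic in this footnote:

```python
import math

target_flops = 1e22      # roadmap estimate, electrophysiology-level simulation
current_flops = 1e16     # roughly today's fastest supercomputer
doublings = math.log2(target_flops / current_flops)   # ~20 doublings

doubling_time = 1.5      # years; Moore's law doubling every eighteen months
years_needed = doublings * doubling_time              # ~30 years
```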

[2] A neuron firing can either excite or inhibit another neuron, and
by a variable amount. This can be modeled as a single weight:
negative for inhibitory, positive for excitatory, and the
magnitude of the number represents the connection strength.
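As a minimal illustration of this weight model (the neuron names and weight values here are made up, not measured):

```python
# Net input to one postsynaptic neuron: a weighted sum over its
# presynaptic partners. Negative weight = inhibitory synapse,
# positive = excitatory; the magnitude is the connection strength.
weights = {"neuron_a": 0.75, "neuron_b": -0.5}   # hypothetical weights
firing = {"neuron_a": 1.0, "neuron_b": 1.0}      # both presynaptic neurons firing

net_input = sum(weights[n] * firing[n] for n in weights)
assert net_input == 0.25   # excitation slightly outweighs inhibition
```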

[3] Possibly as long as 25: the connectome for
C. elegans was published in 1986.

[4] I'm not counting improvements in computer speed and storage or in
scanning technology in here, because these seem to be moving along
quickly on their own.