I miss the days I studied medicine in college. (I stopped because you
can't even become a _nurse_ without having to give people injections, and
I have a problem with needles.)

So about ten years ago, we figured out how to bring people back from the
dead (at least in a very limited way), and it barely even made news.

For years people weren't considered dead until they were "warm and dead".
Drowning victims in cold water were revived without obvious ill effects
after being underwater for _hours_. Great scientific discoveries don't
start with a shout of "eureka", they start with "that's odd", and this
was very odd. In the early 2000's, we finally started to figure out
what was going on. Unfortunately, the current medical establishment is
horribly set up when it comes to actually applying it.

If you google "cold therapy for heart attack" you'll find a decade's
worth of articles about how it turns out brain cells aren't really dead after
4 minutes, or 8 minutes, or even an hour without oxygen. They last about as
long as any other tissue, you just have to restore circulation before gangrene
sets in (I.E. before the cells die). How long is that? Well, detached limbs
can generally be reattached after 6 hours without circulation, longer if you keep
the part cold. Kidney transplants are packed in ice (kept very cold but not
quite allowed to freeze) and they try to stay under 30 hours from harvest
to implantation, which is more than a day. You can sometimes reattach a
limb a couple days later
if it was kept cold enough. (Even hibernating mammals don't quite go to the
level of the mushrooms and onions still alive after a week in a plastic bag
in my refrigerator, but cutting off circulation to your arm for an hour
while sleeping just makes it numb and then pins and needles when you
restore circulation. It's not "oh no, all your nerves are dead".)

So why is the brain so different? What happens is brain cells literally
self-destruct when oxygen is restored, because the cell runs a self-diagnostic
on reboot, notices it's running _way_ out of spec, and detonates on the
assumption it's gone cancerous or has a virus or something. (This is called
"autolysis" and is fairly common, the cells in higher organisms are there to
serve the body as a whole, not to keep themselves alive. Skin and hair cells
serve their purpose while dead. The digestive tract is lined with cells
that _get_digested_ along with the food. The placenta in mammals is made
from the exact same cells as the fetus, it's just forming scaffolding rather
than the part you keep, and the fetus itself kills more cells than it
keeps while positioning tissues into organs and such.)

Cells self-destructing when they notice they're defective
is our body's first line of defense against cancer and viruses, and it's
especially important in "immunoprivileged" areas like the brain, which are
mostly isolated from the rest of the body where the normal immune
system can't easily reach: self-policing is the main option there.
That's why your central nervous system and peripheral nervous system
respond so differently here: the blood/brain barrier enforces different
immunoprivilege domains. This filter membrane keeps most infections out,
but also prevents white blood cells from spotting and fixing problems.
Some viruses are so small you can't keep 'em out without also excluding
oxygen and nutrients, and cancer is existing cells going nuts (generally due
to some sort of damage, although botched division is a possibility),
so these cells must self-police much more aggressively.

So brain cells are on a self-destruct hair trigger to ward off brain
cancer and viruses (or at least keep them down to a dull roar), and in the case
of oxygen deprivation the self-destruct mechanism gets triggered incorrectly,
which is more or less an autoimmune problem. All that "keep CPR going" stuff
is to prevent the oxygen levels in the brain from ever getting far enough out
of spec to trigger cell autolysis, and thus brain damage, when oxygen
is restored.

But the self-diagnostic can't run when the cell has no power, I.E. has
run completely out of oxygen. The cells die when oxygen is _restored_. And
it turns out if you cool the brain way down to not-quite-freezing you can
re-oxygenate those cells before the self-diagnostic can run, so that by the
time the cell wakes up it's operating within sufficiently normal parameters
that it considers itself worth salvaging. So if you can cool down a fresh
enough corpse, re-oxygenate all the cells, and then carefully warm it
up again (juggling the normal problems of hypothermia; if the body's too
cold it can't generate heat metabolically, and the heart won't beat
either)... people can turn out to be less dead than you expected.

And this
turns out to work pretty well,
except that it's fighting the body's normal response to cold (our cells
have a higher heat tolerance than cold tolerance, only a few degrees below
normal and we pass out, so the body actually cuts circulation to the
extremities to keep blood and the remaining oxygen and warmth going through
the heart, lungs and brain... exactly where we _don't_ want it if we're trying
to cool it down while keeping it oxygen-depleted until we're ready). So just
dumping ice on a person may not lower their core temperature enough to make
a difference, and will give the extremities frostbite as those cells cool
down to freezing and the ice crystals forming in them tear the cells
apart. And then if you _do_ lower the core temperature enough that the
cells aren't functioning, that means the heart's too cold to beat, so you've
then gotta treat the hypothermia...

So what you do is hook your victim up to a heart-lung machine
and cool their blood way down _before_ you start oxygenating it.
Then very slowly warm them back up until their heart starts beating again
on its own, and hope their brain reboots without smoke coming out. The
heart-lung machine takes care of hypothermia. It's also hugely invasive and
expensive, but it's a technology we happened to have lying around already.
(You can also do stuff with tubs of ice water if that's what
you've got, but it's much harder to control. Still, that's something
_everybody_ has.)

How cold do you want to get people? Water's maximum density is around
4 degrees Celsius; above that it behaves like normal materials and shrinks as
the molecules it's made of vibrate less, below that it starts to form persistent
hydrogen bonds which keep the molecules at arm's length from each other.
Enough hydrogen bonds and you've got ice crystals, which are sharp and bigger
than the water was, and expanding sharp crystals tend to shred things they
form in, whether it's concrete or cells. So 4 degrees Celsius is a decent
target. But it turns out cells stop _functioning_
much warmer: normal body
temperature is 98.6 degrees Fahrenheit, at 89 degrees you lose the ability to
generate heat (even by shivering), at 86 you lose consciousness. Heartbeat
stops being reliable below 82 degrees and stops completely around 65
degrees (all Fahrenheit). So it's quite possible _room_temperature_ is enough to
stop neural autolysis, we don't know. (It's kind of hard to find volunteers
to experiment on for this sort of thing.)
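As an aside for metric readers, those Fahrenheit thresholds convert like so. This is just the standard (F - 32) * 5/9 formula run through awk, nothing medical about the code itself:

```shell
# Convert the Fahrenheit thresholds above to Celsius.
for f in 98.6 89 86 82 65 ; do
  awk -v f="$f" 'BEGIN { printf "%g F = %.1f C\n", f, (f - 32) * 5 / 9 }'
done
```

So the interesting range runs from normal body temperature at 37 C down to a stopped heart somewhere around 18 C.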

The weird part is that at a colder temperature chemical reactions
run more slowly, so a cold cell actually consumes very little oxygen, meaning
oxygen can actually build back up by diffusion even without the circulatory
system running. (Even a big cactus doesn't need a heart or lungs; I'd say
"tree" but a tree is mostly dead wood surrounded by a thin layer of living
tissue that oxygen can diffuse into.)

So you get persistent reports
of dead people "coming back to life" at funerals and such in third world
countries, which _might_ be because the cells weren't actually dead yet,
the brain cooled down below the point it could run a self-diagnostic when
oxygen was restored, oxygen diffused into the mostly
inert cells, the body slowly warmed up to the point where the heart
started beating again (people do recover from hypothermia on their own
sometimes)... and the "dead" person recovered. Maybe.

It's VERY UNLIKELY for all those things to happen just right, but one of the
open secrets in medicine is that defining "dead" is
really
hard, because people just don't cooperate. Our culture has the phrase
"left for dead" for a reason, and the elaborate historical attempts to
avoid being "buried alive" because of this are well documented.

In fact, modern mortuary practice is built around making sure nobody
mistakenly declared dead is ever proven wrong around here
anymore, by "killing the corpse", I.E. sticking the body in a freezer, draining
their blood, and replacing it with embalming fluid to provide CERTAINTY that
they're dead _now_. But back when The Wizard of Oz movie came out (in 1939),
"not only merely dead, really most sincerely dead" was still saying something,
and it apparently still happens in countries with different funeral practices
than ours.

Of course less-modern countries are even less likely than we are
to go "what happened here medically" instead of dismissing it as a miracle.
And when _we_ hear about it we naturally assume they're idiots who missed
a heartbeat, as opposed to "somebody's heart stopped for 4 hours, then started
up again, and their brain cells re-oxygenated without triggering widespread
autolysis. I would _love_ to know all the temperature thresholds and diffusion
rates involved in that".

In _both_ cases our preconceptions about what must be happening blind us to
asking the right questions. Of course we're familiar with _that_ story from
"do bacteria cause ulcers" and "how does cholesterol actually work" and a
thousand similar things.
(Imagine if every new computer processor
generation had to go through FDA clinical trials, meaning you had to convince
a panel of octogenarians your new idea was worth ten years of funding to
see if it _might_ produce good results a decade or two from now.)

And another fun thing here is that this is a therapy, not a drug, so the
drug companies aren't interested. The most effective treatment for cystic fibrosis
(chest
percussion) is a therapy, I.E. a technique a trained person performs.
A drug company can't charge per dose for that, so they've been desperately
trying to replace it with some sort of pill or inhaler for decades now.
(This is an oversimplification, but our medical system being horked
beyond imagining is not news. We literally have
multiple cartels cornering
the market on health care services without even bringing the drug companies
or HMOs into it.)

And of course we're sitting on some known variants such as
ApoA-1 Milano,
where somebody got a patent (on a natural mutation) and is sitting on it
until it expires, and _then_ maybe progress might happen.

But this sort of "we know how to improve stuff, but it's not profitable"
problem isn't limited to medicine. It shows up any place enough money
accumulates to attract parasites. 3 of the 4 largest companies in the Fortune
500 right now are oil companies; let's ignore the whole "hire the tobacco
institute to discredit global warming" thing, and go back to basic physics.
If you mix water and gasoline with a detergent, the resulting mix has the
same miles per gallon as pure gasoline (because the waste heat turns the water
to steam, and you get an internal combustion steam engine). Any high school
student with a lawn mower engine can demonstrate this for themselves over a
weekend. So why aren't we making use of it commercially?

In the 1960's there was a device called "the bubbler" you could retrofit
your car with. I first heard about this
from a CNN piece in 1992: some college professor had rediscovered it,
mixing water and gasoline 50/50 with a simple detergent. My electronics
lab professor back in college told me about the 1960's retrofit, which mixed
the water/gas/detergent as you used them (to avoid settling back into layers
like salad dressing in the tank, with the detergents of the day). In the late 90's
Caterpillar (the construction equipment people) had a patent on some sort
of microscopic sponges that formed a colloid sludge that wouldn't settle out,
and which let you go up to 80% water 20% fuel. And of course there's
ultrasonic emulsification and so on. People keep
rediscovering this but it never goes anywhere.

Expecting the US medical complex to seriously start using something as simple
as refrigeration to keep people's brains alive is a bit like expecting NASA to
put a colony on Mars: we won't live to see it because they suck so badly they
were actually _better_ at a lot of things 40 years ago than they are today.
And for most of that time the alternatives were homeopathic quacks and
astrologers, respectively, so looking outside the market the guilds have
cornered is a wasteland of scum and villainy.

At least NASA was founded with a vision (although once it put a man on the
moon and returned him safely to earth before that decade was out, it went
on to coast for 40 years of sheer bureaucratic inertia), and more recently
the X-prize kicked some life into 'em (SpaceX, Dragon, woo!).

Back in college, I'd hoped the biotech companies might do some of the same
sort of cutting-edge research with medicine,
but there's actually not a lot of money in cures. The money is in _treatments_
for chronic conditions you can milk for decades, actually _curing_ people
kills your cash cow. So instead they do cargo cult programming on the genomes
of food, and wind up with cattle feed grass that produces clouds
of poison gas. Because that's where the money is. Wheee.

Oh, and another fun little dysfunction: of course nothing in this
article should be construed as medical advice, you lawsuit-happy bastards.
(If you were wondering why Rome fell...)

I'm really looking forward to the baby
boomers dying off and ceasing to have a disproportionate impact on the
country. The current republican party is the penance we pay for the 1960's.
They were teenagers when we went to the moon and invented the internet,
now they're dried up old fogies trying very hard to
take it (the country) with them and make us all pray them into an afterlife
of their choosing. (This is apparently how you make progress, old people die
and the new people use the ideas they learned before they knew everything.)

Next time, why my old Yahoo email handle back in the 90's was
"telomerase". :)

In case it isn't obvious, I went back to my other blog when the Parallels gig ended. I learned a bunch and Kir is a great guy (running the OpenVZ booth with him at SCALE was the highlight of the whole experience), but looking back I have a few observations:

1) Telecommuting for a company on the other side of the planet is challenging even when you're _not_ their only telecommuter. Being on the other side of the planet from a bunch of guys all in the same building is not a happy thing.

2) The technologies pitched in an interview (containers!) and the technologies you wind up working exclusively on the entire time (NFS) aren't always on the same continent either.

3) Culture clashes can be a thing. (The Russian consulate had a brochure warning me about the not smiling thing. It should have had one about "If someone has a complaint, they will stop talking/replying to you for months until the source of the complaint goes away. When this is your immediate supervisor, it can be a problem.")

I am sad that two round trips to the other side of the planet don't add up to enough Delta frequent flyer miles to get one domestic flight. I am happy I don't have to go there again. (When your trip preparation instructions explain the amount of cash you should carry to pay the standard police officer bribe from the random shakedowns, when the water cooler conversation is explaining _why_ the government stole an oil company, when you have to assure relatives that the bomb in the airport was a week after your trip and anyway it was the _other_ airport in the capital city, when boingboing coincidentally posts more than one long article about the murder and kidnapping of foreign entrepreneurs impacting investment in that country... Not a place I felt a huge _need_ for a third visit to.)

My todo list has once again exploded to the point where everything is distracting me from everything else and I'm forgetting what my todo items ARE, so it's time to write it down and prioritize again. (And this doesn't even include long-term stuff like containerizing the 9p filesystem or fuse, or testing LXC on non-x86 hardware platforms.)

Ok, found a workaround for the linux-2.6.39-rc1 hang that Jens Axboe's been distracted from solving for a couple weeks: disable preemption. So I can go back to testing/developing against linus's current -git tree instead of 2.6.38, which is good.
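For the record, "disable preemption" here means flipping the kernel config, something like this sketch (the option names are the standard 2.6-era Kconfig symbols; run it from the top of a configured kernel source tree):

```shell
# Hypothetical workaround sketch: flip an existing .config from preemptible
# to non-preemptible, then let kbuild re-resolve the defaults.
sed -i -e 's/^CONFIG_PREEMPT=y/# CONFIG_PREEMPT is not set/' \
       -e 's/^CONFIG_PREEMPT_VOLUNTARY=y/# CONFIG_PREEMPT_VOLUNTARY is not set/' \
       .config
grep -q '^CONFIG_PREEMPT_NONE=y' .config || echo 'CONFIG_PREEMPT_NONE=y' >> .config
yes '' | make oldconfig    # accept defaults for anything this invalidated
```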

My Ottawa Linux Symposium paper submission 'Why Containers Are Awesome' has been approved, meaning I need to actually _write_ the paper now. I'm collecting a file full of links, might take a stab at it this weekend...

On the NFS front I'm pulling back from my ongoing battle with lockd/statd (which are a horrible mess from a design level and apparently always have been), because I managed to poke some people into reviewing my NFS patches (yay!), and Serge Hallyn raised a good point about lifetime rules in the third patch. Which means I have to reevaluate the lifetime rules, which are always the hard part with kernel stuff. Sigh.

So I got my NFSv3 containerization patches submitted. There are three of them for the basic network namespace support for NFSv3, in what's probably the correct approach. So far, nobody's cared to comment on them (even the people who were interested in the topic before, who I cc'd on the submission), although I tracked some down on Skype and they promised to bump review up on their todo list.

Today, in theory I'm containerizing lockd (and statd, both of which are a horrible incestuous nightmare which is probably going to require one instance per container). In practice, I got distracted reading the LXC source code. It's... very verbose for what it does.

I've either run into a weird subtle bug in the kernel, or a weird subtle bug in kvm, and I can't tell which it is.

When I set up the "two meanings for the same IP" routing, mounting NFS inside the container (via tun/tap eth1) makes that address say "no route to host" outside the container (via -net user eth0). The two should be orthogonal, but something's getting interfered with.

I can reproduce it with a kvm "-net user" interface and a tun/tap interface, but I can't reproduce it with two tun/tap interfaces attached to kvm. I can reproduce it with nfs access in the emulated kernel, but not from userspace.

It's hard to debug a problem with the -net user interface because I can't ping or tracepath through it, so when it's failing to connect I dunno _why_. Which is why I switched eth0 to be another tun/tap interface and tried to replicate the bug there, but so far I can't. Except if the bug _is_ in qemu's -net user, I should be able to reproduce it from userspace in the emulated kernel. KVM has no way of knowing if packets come from userspace or from kernel space inside the emulated system.
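For reference, the two-interface invocation I keep describing looks roughly like this (the file names, memory size, and tap interface name are made up; this is the old pre-netdev "-net ...,vlan=" syntax kvm used at the time):

```shell
# Hypothetical sketch: eth0 on qemu's built-in masquerading LAN (-net user),
# eth1 on a host tun/tap device that the container's traffic goes through.
kvm -m 512 -kernel bzImage -append "root=/dev/vda console=ttyS0" \
    -drive file=rootfs.img,if=virtio \
    -net nic,vlan=0 -net user,vlan=0 \
    -net nic,vlan=1 -net tap,vlan=1,ifname=kvm-tap0,script=no
```

Swapping vlan 0's "user" for a second tap device is the variant I used to try (and fail) to reproduce the bug without -net user.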

Grrr. The worst kind of debugging issue is "I changed something irrelevant and the problem went away". THAT'S NOT HOW DEBUGGING WORKS. You find out what was wrong and fix it, or it resurfaces to bite you again later.

Hmmm, maybe I can upgrade qemu (switch from kvm to qemu and build from source via current qemu-git repository) and see if _that_ makes the user+tap problem go away. Fixing it via upgrading qemu is reasonably strong evidence it was a bug in qemu, and if so it's orthogonal to the NFS patch (and probably fixed upstream already anyway, ubuntu's kvm is a bit old and ubuntu has a history of breaking qemu anyway)...

To test NFS containerization, I need to set up conflicting network routing. I need to come up with something the container can access but the host can't, and vice versa. To do this, I used my three layer setup (laptop/kvm/container, described here) so I can set up routings on the laptop which are a couple hops away from the point of view of the containers and the container host (I.E. the kvm system).

Initially, to get a routing the container could access and the host couldn't, I set up a 10.0.2.15 alias on the laptop and ran an NFS server on it. The KVM's eth0 interface is using the default QEMU masquerading LAN and thus gets the address 10.0.2.15, so it can't see any _other_ 10.0.2.15, and life is good.

Then to set up a routing that the KVM system could see but the container couldn't, I ran an NFS server on the laptop's 127.0.0.1. The KVM system could access that via the 10.0.2.2 alias in the virtual LAN its eth0 is plugged into, but the tap device the container uses has no way to route out to 127.0.0.1, the container would see its own loopback interface instead.
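Concretely, the laptop side of that first setup amounts to something like this sketch (the exact commands and paths are invented; the NFS servers themselves are whatever userspace server you can bind to a specific address):

```shell
# Hypothetical sketch, run on the laptop (as root).
# Alias the same address the kvm guest's eth0 claims: the guest can never
# route here (its own 10.0.2.15 shadows it), but the container can, via tap.
ip addr add 10.0.2.15/32 dev lo
# Then bind one NFS server instance to 10.0.2.15 (container-only), and a
# second instance to 127.0.0.1, which the guest reaches as qemu's 10.0.2.2
# gateway alias and the container can't reach at all.
```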

Then right before SCALE I changed my test setup so that the mount command in the container and the mount command on the kvm system were identical, both using 10.0.2.2. On the laptop I set up a 10.0.2.2 alias, and ran the NFS server on that and another instance on 127.0.0.1. So the container and the container's host were both mounting NFS from 10.0.2.2, but should be connecting to DIFFERENT servers when they did this.
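The second setup collapses that to one address with two meanings; a sketch (export path and options invented):

```shell
# Hypothetical sketch: one alias, two server instances, identical mounts.
ip addr add 10.0.2.2/32 dev lo     # laptop-side 10.0.2.2, reached via tap
# NFS instance A binds to the laptop's 10.0.2.2 (what the container gets),
# NFS instance B binds to 127.0.0.1 (what the guest gets, since qemu's
# -net user LAN maps its 10.0.2.2 gateway alias to host loopback).
# Then, run identically on the kvm system and inside the container:
mount -t nfs -o nolock 10.0.2.2:/srv/test /mnt
```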

The failure mode for the first test setup is "server not found", because if it uses the wrong network context, it'll route to an address that isn't running an NFS server. (The local address hides the remote address.) The failure mode for the second setup is accessing the wrong server: it's always going to route out to a remote address, the question is whether or not it gets the right one. (Side note: unfortunately you can't tell the NFS server "export this path as this path" because NFS servers are primitive horrible things configured via animal sacrifice and the smoke signals from burnt offerings. What I should really do is run one of the two instances in a chroot to get different contents for the different addresses. Generally I just killed one to see if that NFS mount became inaccessible or not.)

Over the past couple months I've made the first test setup work (more by adding lots of "don't do that" options to the mount -o list than by patching the kernel, but at least I got it to _work_). This second test setup spectacularly _did_not_work_, and it failed in WEIRD WAYS. Not only does the NFS infrastructure inappropriately cache data and re-use structures it thought referred to the same address (because its caching comparisons didn't take network context into account), but the network layer itself doesn't seem entirely happy routing to two different versions of the same IPv4 address at the same time. (After doing an NFS mount in the container, the host can't access that address anymore. Can't do an NFS mount, can't do a wget... an ssh server bound to that address can't take incoming connections. UNTIL, that is, the container opens a normal userspace connection to that address, such as running wget in the container. I have no idea what's going wrong there, but it's easily reproducible.)

The problem is, right as I started debugging this second test setup I was pulled away for several intense days of working the OpenVZ booth at the SCALE conference, and then I got sick for most of a week afterwards with the flu, and by the time I got back to working on NFS I'd forgotten I'd changed my test setup.

So now I was dealing with VERY DIFFERENT SYMPTOMS, and all sorts of strange new breakage, and I couldn't reproduce the mostly working setup I'd had before, and I couldn't figure out WHY. At first I blamed my "git pull" and tried to bisect it, but a vanilla 2.6.37 was doing this too and I KNEW that used to work. Sticking lots and lots and lots of printk() statements in the kernel wasn't entirely illuminating either. (Once you're more than a dozen calls deep in the NFS and sunrpc code, it's hard to keep it all in your head and remember what you were trying to _do_ in the first place.)

And of course the merge window opened, so I wanted to submit the patches I'd gotten working so far, but I always retest patches before submitting them and when I did that they DID NOT WORK so obviously I couldn't submit them until I figured out WHY...

It wasn't until today that I worked out why what I had USED to work, and where I'd opened a new can of worms that broke everything again. The code hadn't changed, my test sequence had. (It's a perfectly valid test sequence that _should_ work. The kernel _is_ broken. But it's not _new_ breakage, and my patch does fix something _else_ and make it work where it didn't work before, and thus I can give it a good description to help it get upstream.)