June 22, 2007

One might conclude that faster processors, higher capacity disk sub-systems, bigger pipes, etc., are certain to improve an organization’s ability to make sense of the world.

Unfortunately, for the most part, I think the converse is true – organizations may in fact be getting dumber.

Let me explain.

Earlier this year I blogged about enterprise amnesia. This unfortunate condition occurs when an organization makes a decision without taking into account all other related facts the enterprise already knows. For example, hiring employees who had previously been arrested for stealing from you.

Enterprise amnesia is embarrassing because after the fact it becomes so obvious something was missed – sometimes so much so, an organization may appear grossly negligent. This is bad. When this negligence relates to financial matters, bad might involve Sarbanes-Oxley violations. When such negligence relates to health care, bad might mean surgical procedures on the wrong body part, and in a national security context, bad might mean death that could have been averted.

An inability to recall and act upon what one knows is simply not a good thing.

This problem – an inability to locate and act upon what one knows – is getting worse because faster systems are producing information substantially faster than traditional sensemaking algorithms (using these same fast systems) can keep up with.

Therefore consider this: if enterprise intelligence were measured based on how well one can take advantage of what one knows … if an organization were to capture progressively more information than it could make sense of … I would argue that this organization is becoming progressively less intelligent (as compared to its potential).

The bigger the distance between what is knowable and the sense one can make of it … the greater the enterprise amnesia index. Here is a picture I drew that depicts this.

Not to say fast systems are bad. Rather, I am suggesting that better sensemaking algorithms will be essential to fully leverage one’s ever-growing enterprise information assets.
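To put rough numbers on that gap, here is a minimal sketch. The function name `amnesia_gap` and all the figures are hypothetical, chosen only to illustrate capture growing faster than sensemaking; the post does not define a formula for the index.

```python
def amnesia_gap(captured, understood):
    """Per-period gap between what was captured and what was made sense of."""
    return [c - u for c, u in zip(captured, understood)]

# Hypothetical figures: capture doubles each period, sensemaking grows linearly.
captured = [100, 200, 400, 800]
understood = [90, 120, 150, 180]

gaps = amnesia_gap(captured, understood)
print(gaps)  # the gap widens every period
```

In this toy case the organization understands more each period in absolute terms, yet falls further behind what it could know.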

This is the most comprehensive collection of my thoughts about privacy to date. And as you might expect, I covered all the usual suspects like responsible innovation, designing in support of the Universal Declaration of Human Rights (UDHR), limits of predictive data mining for counterterrorism, anonymization, immutable audit logs, watch list redress and so on.

But I also covered a number of new ideas that I have not yet had the chance to blog about including:

The one-way watch list – A special watch list exists where you can put yourself on it, but by design, you cannot take yourself off

More death in the future, cheaper – the cost and execution risk of wreaking havoc is dropping (e.g., the killer 1918 Spanish Influenza has been brought back from the dead)

Pinhole vision caused by lens crafters – as we continue to look to technology to triage data, we are really calling for custom crafted lenses to intentionally narrow our perceptions. That is an important fact to remember.

June 03, 2007

There is a high transaction cost associated with performing Perpetual Analytics with Sequence Neutrality. Essentially, what these systems are doing is this … as every record (observation) is received (perceived) one is trying to answer the question "How does what we are observing relate to what we already know, does this change any earlier assertions, have we learned something that matters and if so, who needs to know?" I think this might be an example of what could be called an Incremental Learning System. Maybe a better term is an Aware Incremental Learning System in that such a system has the ability to publish insight as relevance is noticed – not simply wait for a user to make a query.

Incremental Learning Systems, over time, grow what they know. And what is growing is context. And when this is stored in a database I have been calling this Persistent Context.

Notably, persistent context must be constructed. One cannot take historical non-contextual data and simply bulk load such information into a database, without an assembly process, and expect to get information in context. Context must be carefully assembled.
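The per-record loop described above can be sketched in a few lines. This is a toy illustration under loud assumptions, not the actual system: the class name `PersistentContext`, the rule that two records sharing any attribute value belong to the same entity, and the `on_insight` callback are all made up for the example.

```python
from collections import defaultdict

class PersistentContext:
    """Toy context store: records sharing any attribute value form one entity."""

    def __init__(self, on_insight):
        self.parent = {}              # union-find forest over record ids
        self.by_value = {}            # attribute value -> first record seen with it
        self.on_insight = on_insight  # hypothetical "who needs to know" hook

    def _find(self, rid):
        while self.parent[rid] != rid:
            self.parent[rid] = self.parent[self.parent[rid]]  # path halving
            rid = self.parent[rid]
        return rid

    def observe(self, rid, values):
        """Relate one arriving record to what is already known."""
        self.parent[rid] = rid
        for value in values:
            known = self.by_value.setdefault(value, rid)
            a, b = self._find(rid), self._find(known)
            if a != b:
                self.parent[a] = b    # an earlier assertion changed: two entities are one
                self.on_insight(rid, known, value)

    def entities(self):
        groups = defaultdict(set)
        for rid in self.parent:
            groups[self._find(rid)].add(rid)
        return {frozenset(group) for group in groups.values()}

insights = []
ctx = PersistentContext(lambda new, known, value: insights.append((new, known, value)))
ctx.observe("r1", ["555-0100"])
ctx.observe("r2", ["joe@example.com"])
ctx.observe("r3", ["555-0100", "joe@example.com"])  # ties r1 and r2 together
print(ctx.entities(), insights)
```

Because the grouping depends only on which values are shared, feeding the same three records in any order ends with the same single entity, which is the flavor of sequence neutrality. Insight is pushed through the callback as each record arrives rather than waiting for a query.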

Yeah, I know. Sounds great and all, but how fast are these systems?

Well, they are getting faster, that is for sure.

A few months ago, while on the road, I received a very exciting call. Our performance engineering team had just broken an all time throughput record.

In part this was made possible by an internal project we started back in 2002 or 2003 called the "small database footprint project." The notion being that at the end of the day, the pinch-point was going to be the database engine itself. Once you tip the database engine over, well, then you have reached your limit.

Our small database footprint project had the goal of externalizing as much computation as possible off the database engine – pushing this processing into shared-nothing parallelizable pipelines. So we also did such things as externalize serialization (no more using the database engine to dole out unique record IDs) and eliminate virtually all stored procedures and triggers – placing more computational weight on these "n" wide pipeline processes instead. By the way … no "table scans" (duh!) and a pretty cool strategy to make sure that SELECT result sets were small.
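One common way to hand out unique record IDs without asking the database is to give each pipeline its own interleaved slice of the ID space. The sketch below assumes that scheme; the post does not say how serialization was actually externalized, and `PipelineIdAllocator` and `route` are hypothetical names.

```python
import itertools
import zlib

class PipelineIdAllocator:
    """Issues record IDs for one pipeline with no shared counter or database call."""

    def __init__(self, pipeline_index, pipeline_count):
        # Pipeline k of n yields k, k + n, k + 2n, ... so ranges never collide.
        self._ids = itertools.count(pipeline_index, pipeline_count)

    def next_id(self):
        return next(self._ids)

def route(key, pipeline_count):
    """Pick a pipeline for a record key using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % pipeline_count

PIPELINES = 4
allocators = [PipelineIdAllocator(i, PIPELINES) for i in range(PIPELINES)]
issued = [[alloc.next_id() for _ in range(3)] for alloc in allocators]
print(issued)
```

Since no pipeline ever needs to know what another has issued, the ID step stays off the database engine and adds no coordination between the "n" wide processes.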

Since large multi-terabyte systems are not going to live solely in memory and remain sustainable, much attention must also be placed on disk layout: for example, many smaller capacity disk drives operating at 15,000 RPM so the data can be spread out. RAID 5? Not for read/write tables. For all the fast growing read/write tables we used RAID 10. More expense? Yep. Faster? Yep.

How fast? In short, 800 million records were loaded in four days. To boot, the performance team felt like with some more tuning they might be able to cut that down to just over two days. They were using 32 CPUs (half of what was possible on that box).
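For a sense of scale, the figures above convert to a sustained rate as follows. This is plain arithmetic on the numbers quoted in this post; the two-day figure is an upper bound, since the estimate was "just over" two days.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def records_per_second(records, days):
    """Average sustained load rate over the whole run."""
    return records / (days * SECONDS_PER_DAY)

rate_four_days = records_per_second(800_000_000, 4)
rate_two_days = records_per_second(800_000_000, 2)
per_cpu = rate_four_days / 32  # 32 CPUs were in use

print(round(rate_four_days), round(rate_two_days), round(per_cpu, 1))
# prints: 2315 4630 72.3
```

That is roughly 2,300 records per second sustained for four days, or about 72 per second per CPU.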

If you are turning on one of these systems or plan on implementing one, we have a great white paper with all the configuration specifics. Drop me a note and I’ll get you a copy.

June 02, 2007

Most of my Ironman-related posts are simply meant to be funny and contain very little useful information – this post is meant to be both funny and useful to a few other athletes, especially those trying to hack their way through Ironman triathlons – which consist of a 2.4 mile swim, 112 mile bike and a 26.2 mile run.

When your job is your hobby, it is hard to find time to train. So following the New Zealand Ironman I did in March, I was only able to do four long hard rides (averaging about 110 miles and 8,000 feet of climbing), a few runs of 10 miles or less and one long run (22 miles the Sunday before this race), and absolutely no swimming. Upon hearing my strategy, my best friend Joe, who really trains for these events, told me I had never trained less and that my strategy was in fact a tragedy. Joe knows best: he beat me by almost two hours in the France Ironman last year.

When we arrived in Florianopolis four days before the race my girlfriend and I found Joe and began to execute on a key element of my tragedy strategy. Tequila. Over dinner our table drank two bottles of wine, then over 13 shots of tequila (lost count), and then we successfully convinced Joe to top all this off with a vodka and Red Bull. Joe has never trained for this, I on the other hand …

In any case, Joe puked (no picture here) at say about 2am. I did not.

Because I had not ridden my bike at all in three weeks, Friday morning I decided to get out on the bike course for a three hour solo ride. After riding toward the airport for about 50 minutes and with all kinds of funny muscle pains, I decided to cut this training ride short and save whatever I had left for race day. One more thing about doing training rides in Florianopolis: NEVER EVER DO TRAINING RIDES IN FLORIANOPOLIS! Some of these roads are no different than riding on a two-lane freeway without a shoulder. Another athlete had already been hit earlier in the week and was in the hospital … I must not have received the memo. This was the scariest cycling I have ever done.

Well then. Maybe I’ll take a test swim in the ocean. The plan was 30 minutes down the coastline and 30 minutes back. After swimming down 20 minutes and then half way back, I felt just defeated enough to walk back to the hotel. Saturday arrives and I am now very sore from the short little swim the day before.

Nervous? Oh Yeah! What am I thinking? Can hacking an Ironman result in death?

Sunday morning the race starts at 7am. Over 1,100 athletes run for the water at the same time to embark on the 2.4 mile swim. Many athletes, especially the hackers like me, are swimming all over the place due to the unanticipated current that is moving from right to left. At the half way point, we get out of the water, run a few hundred feet on the beach and then get back in the ocean to swim another 1.2 miles. Some athletes were so worn out from the first half, they thought they were done – so they began taking their wetsuits off at the half way point. Obviously, they did not get that memo. Also worth noting is that on the last leg of the swim towards shore, there are exposed rocks on the right side. The event staff had a very busy job keeping the swimmers from being pushed into the rocks by the current. I survived the swim with only one elbow to the left eye. My swim time: 1 hour 23 minutes.

The 112 mile bike course was interesting. Don’t hit the traffic cones that seem to be in the cycling lane at times. Joe hit one (and did not fall) going about 20 miles an hour – he discovered they were made of metal! I saw no bathrooms on the cycling course. Some said they saw a few. The dudes were pulling over on busy streets, in front of police and passers-by, whipping it out and doing their thing. Note to others: Bring your own toilet paper. My bike time: 5 hours 57 minutes.

On the first loop of the run there is one super steep hill. It appeared to be 90 degrees, but I am guessing it is probably less. Do not run up or down such monsters if you are a hacker like me. People were blowing their legs out on this. One guy coming down evidently lost his ability to control his speed. I was too focused on my own suffering to wait for the likely spectacular outcome and thus just pressed on.

In these races, I am always keeping tabs on whom I think I can beat. Some are ahead of you, but you think you will eventually pass them. Others are behind you and you hope to smoke them. There was this one dude, number 777 … he looked like a powerful warrior and athlete to me – and I had my mind set on beating him. I had a good feeling about this until TRAGEDY struck. There I was about 15 miles into the 26.2 mile run. On these long endurance events, I just get dumber and dumber as the day goes on (example here). On this fine day, I find myself running in one direction and all the athletes are going the other direction. It takes me a while to notice this – like 10 minutes! Suddenly, I realize that as far as I can see, no one is running in my direction. I turn to look behind me in hopes of seeing other athletes following me. No such luck! Do you have any idea how defeating it is to find yourself in this situation – having to run an extra mile or two? A helpful local offered to give me a ride on his motorcycle back to where I took the wrong turn. As tempting as that was, the idea of trying to explain why an athlete was whipping around on the back of a motorcycle to a race official who probably spoke no English – just not worth being disqualified over! Over the next 45 minutes, I had one word racing through my head – and appropriately so I think – it was the "F" word. How did this happen? I was an idiot. And, in hindsight I bet all those people yelling at me in Portuguese were only trying to help. My run time: 5 hours 10 minutes. Needless to say, Mr. Warrior Athlete Dude passed me.

As night fell during the run, one must think about the movie "Turistas" – no worries here though as the police were everywhere. (Although two athletes a couple days earlier decided to walk on the beach at night and got mugged. In case you have not seen that memo, don’t do that.)

Miraculously, despite the run diversion for bonus miles, I finished this race in about 12 hours and 31 minutes. I beat my previous record of 12 hours and 55 minutes in the Western Australia Ironman! To boot, I was only about 45 minutes slower than Joe. Crazy!

Here are a few key logistics related items to keep in mind if you do this race.

Of the eight Ironman distance triathlons I have done, the Brazil Ironman is the most organized Ironman I have ever seen.

You must get a visa from the Brazilian consulate. These things take something like 10 days and are no fun. On the visa form they ask if you are in a competitive event that has cash prizes. Saying yes means more paperwork. I said yes, did the paperwork, and got the visa. Most everyone else I spoke with (including a fast athlete who won prize money) chose to lie on his/her visa application.

The airlines down in Brazil (e.g., TAM Airlines) were late – at least all of our flights were late. I’d leave at least four hours for connecting flights in the future. And the airports are a mess, maybe because so many bikes were being rolled into the baggage systems.

This is the first race I have done where I signed up with a triathlon-oriented travel company. I used Endurance Sports Travel. They are unbelievably helpful, with everything from a team of mechanics to a hospitality suite on the race-course near the finish. Use them, they are fantastic.

More generally, I learned a few other important tricks for future races. Real triathletes already know these things, because they have trainers or read books on this stuff. I, on the other hand, am a hacker.

When eating or drinking new foods during the race, only have a small amount to make sure it is compatible with your stomach. They had some kind of weird soup (not broth), so I sampled a tiny bit a few times before I chugged any down.

I did a bad job shaving under my chin. I even knew I did a bad job when I did it; I just did not care. Bad idea. A nearly six hour bike ride with stubble here, a bike helmet strap and salt made for a nice rash.

As I passed through the various aid stations for runners, I kept thinking the word for water in Portuguese sounded like vodka. How funny I thought. Then about halfway through the run a bilingual fellow who passed me explained they were saying "Vaca" which actually means cow – as I was being addressed based on my shirt, which was white with large black spots. Later, I realized they were also asking me if I wanted "leche" (Spanish for milk). Funny, real funny. Ha Ha!

Never try new clothes on race day. I always do – even though I know it is a bad idea. My shorts had two problems. For starters, I chafed in new places (don’t ask, don’t tell, and especially don’t touch). And, maybe another reason so many people were hooting at me in Portuguese was that my shorts turned out to be somewhat see-through from the back. It was kind of my girlfriend and kids to wait until I was done with the race before they told me … you think?

And finally, when I was recounting my day with my girlfriend, I mentioned Mr. Warrior Athlete Dude and how I wanted to stay ahead of him. She said she had noticed him too as he came across the finish line before me ... in fact, it crossed her mind that had hacking an Ironman resulted in my death, he would be the perfect replacement. No joke! :O