Perspective on the Business of Event Processing

June 2009

June 25, 2009

This week my company issued what I thought was an innocuous announcement
- our software now allows applications to process Twitter tweets and analyze
the contents of the messages in real-time. A firestorm of debate ensued, and
the issue gained mainstream media coverage in the Financial Times, the Telegraph, CNBC, and dozens of others. The debate gives us pause to consider the
sweeping adoption of Twitter as a real-time, global information source, and how
that information might be used beyond the mostly social nature of Twitter
today.

One of the examples we cited for this technology is
automated trading on Wall Street, where any of the 16 million tweets a day can be analyzed in real time as
a means to inform trading decisions. Wall Street tends to be an innovator with new technologies, so it’s
useful to use it as a petri dish to examine some of the opportunities and
issues around Twitter in a business context.

Trading on rumor is a time-honored trading technique. “Buy on rumor; sell on
news” is one of the oldest mantras on Wall Street and Main Street alike. Any tool that improves the speed or
quality of transmitting information is of interest to traders. Twitter does that. But be careful what you ask for –
many tweets are garbage, posted by scam artists and manipulators. On the other hand, there are thousands
of credible sources of information, from the SEC (@SEC_Investor_Ed), to the New York
Times (@NYTimes), and even exchanges like CME Group (@CMEGroup), which has over 500,000 people subscribed to its tweets about the futures and options markets.

And beyond any individual source, and probably more
uniquely, Twitter allows instant access to the sentiment of the masses.

In the Wall
Street & Technology article, Todd C. Mirabella, chief investment
officer and principal of New York hedge fund QAT, says one of the uses he has
found for Twitter is to help look at market volatility and compare it. "So
how we get our information, that's the trick. Twitter can give us information
on retail." "[Tweeters] are the herd," he says.

For example, anything new that Apple does throws Apple’s
stock into volatile gyrations. The
recent launch of the iPhone 3GS set off a rash of tweets about what the herd thinks – taken together, they can reveal sentiment – good and bad. Getting
that feedback a few seconds faster, or a little better, can make a big difference
in trading returns.
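To make "taken together, they can reveal sentiment" a little more concrete, here is a minimal Python sketch of a running sentiment score over a sliding time window of tweets. Everything in it is an assumption for illustration: the word lists, the `score` function, and the `SlidingSentiment` class are hypothetical, it is not how our product or any real trading system works, and a production system would use a proper sentiment model rather than keyword counts.

```python
from collections import deque

# Hypothetical word lists, for illustration only.
POSITIVE = {"love", "great", "fast", "awesome"}
NEGATIVE = {"hate", "slow", "broken", "overpriced"}


def score(text):
    """Count +1 for each positive word and -1 for each negative word."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


class SlidingSentiment:
    """Running sentiment total over the last `window_seconds` of tweets."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # (timestamp, score) pairs, oldest first
        self.total = 0

    def on_tweet(self, timestamp, text):
        """Add one tweet, evict expired ones, and return the current total."""
        s = score(text)
        self.events.append((timestamp, s))
        self.total += s
        # Drop tweets that have fallen out of the time window.
        while self.events and self.events[0][0] <= timestamp - self.window:
            _, old = self.events.popleft()
            self.total -= old
        return self.total


herd = SlidingSentiment(window_seconds=60)
print(herd.on_tweet(0, "I love the new iPhone, great camera!"))
print(herd.on_tweet(30, "Battery is broken and slow"))
```

The point of the sketch is only that the score is updated as each tweet arrives, rather than computed later from a stored archive.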

Business leaders, from
CEOs to inventors, use Twitter. For example, Richard Branson (@richardbranson), the CEO of
Virgin, tweets regularly. Now I’m
pretty sure the news in the first tweet about Jimmy Fallon isn’t actionable for
trading Virgin stock, but the fact that he’s ordering $2.1 billion worth of new
planes might be.

Nasir Zubairi, former
product manager for algorithmic trading and FX E-commerce at RBS, put it this way
in the Wall Street & Technology article: "[Twitter can be] the fastest source of news,
particularly in reference to trading and algorithmic trading," he says.
"News feeds are gaining a lot of popularity. Twitter provides the medium
in that people, who are there, experiencing things, are able to broadcast their
sentiments to their user base via tweets. This provides lots of competitive
advantage to firms with trading strategies."

News about world events
moves the foreign currency markets, as Mr. Mirabella also pointed out. He’s not yet trading on Twitter but
said he sees real use for Twitter in currency trading: "We like to trade
in ranges," he explains. "That helps us characterize the extreme
edges in any specific currency. We've been trying to follow more individualized
Twitter sites to see if we can get extreme views [of Tweeters]. It does give us
flavor for what types of ranges we're looking at."

But real use of Twitter has some serious issues. Technically, it wasn’t built for the
kind of quality of service that a robust news source, like Reuters (@Reuters), provides – what happens when
Twitter is unavailable? There are
questions about the latency of transmissions – is it fast enough? Can it be faster? Authentication and forgery are
challenges – there’s a big difference between the real Richard Branson (@richardbranson) and the fake
Richard Branson (@sirdickbranson). Who is responsible for the quality
and reliability of the information transmitted via Twitter? Should it be regulated?

And regulatory issues abound, as well. Trading decisions are moving faster, so
regulators have to move more quickly, too. In December the unfortunate trading run on United Airlines
(UAL) stock (read the story from the Wall Street Journal), which caused the stock
to drop from $12 to $3 in just 15 minutes, only served to remind us how quickly
conditions can change in an era of sweeping automation. The chain
reaction was started by an erroneous trading signal from an old 2002 story that the
Google news crawler picked up. How can we guard against these unintended technology-triggered mistakes?

The adoption of Twitter in a social context has been swift,
massive, and global. As
technology like event processing makes
the content contained in tweets readily available to business applications, the
questions of what to do with this power and responsibility loom large.

Postscript and continued debate (July 1, 2009)

Since this writing, there has been a flurry of response - positive, negative, and vitriolic.

My favorite coverage was the excellent, balanced story by Melanie Rodier at Wall Street & Technology: Algo Traders Connect to Twitter. She did her homework, including comments by some well-respected people from the trading industry. Read it first.

One blogger, who used to work for one of our competitors, even took the opportunity to predict that StreamBase will be trounced by the "big boys" as they enter our market! To that, I say: "bring it on." Dr. Michael Stonebraker is one of the inventors and thought leaders in event processing, and StreamBase is known as one of the leading businesses in the CEP market, renowned for its performance and ease of use. Are we wary of their deep pockets? Yes. But the big companies have yet to field compelling products - in fact, we rarely compete with them. Furthermore, the "big boys" have not shown signs of innovation and intrapreneurship, as I wrote in When Microsoft enters your market: cause for worry or cause for celebration? And in Innovation: why size doesn't matter, I countered the argument that big companies are the most effective at innovating in a market like CEP. Richard Tibbetts, StreamBase CTO, explained why IBM's recent claims of innovation are misleading in On IBM, unstructured data, and CEP. Finally, and most importantly, our customer endorsements from CME Group, BNY ConvergEx, BlueCrest, PhaseCapital, and many more say it best - directly from our customers' mouths.

June 23, 2009

Recently, I spent a day in Chicago with CME Group, the
world’s largest and most
diverse derivatives exchange, to talk about their approach to innovation. As the CEO of an innovative software company founded by forward-looking MIT techies, I’m lucky
enough to work with a lot of innovative people. But as we talked about the size and scale of CME Group’s
business, it struck me that this may be among the most technically challenging
environments on planet earth, and the numbers tell a compelling story of
challenge and opportunity for a firm that many know of, but few have seen
inside.

But I wasn’t visiting CME Group to check out their museum; I was there to dig
into the business and technical motivation behind the news that they have selected a new development paradigm (press release, coverage in Wall Street & Technology),
complex event processing (CEP), on which to build future innovations. This is a watershed moment for the CEP
industry, and an important case study that gives us pause to wonder about the
future of automated business, the future of real-time, mission-critical
infrastructure, and the role of event processing as the platform for that
innovation.

So let’s look at innovation, CEP, and the challenges faced
by CME Group:

...1.2 quadrillion dollars ($1,200,000,000,000,000)

...The underlying value of the contracts traded by CME Group in 2008

How can a firm stay
on the innovation edge while simultaneously supporting the most mission
critical operations?

The decision to process 1.2 quadrillion dollars of value was
not taken lightly at CME Group. It’s been said that “nobody gets fired for
buying IBM” – that is, that big companies are the safe choice when a decision
is most critical.

But when innovation must ride beside reliability, decisions
get trickier. A company like CME
Group must balance innovation with mission-critical capability. And, in the case of CEP, CME Group’s
decision illustrates that CEP has come of age.

And from the customer’s point of view, I asked Steve Goldman, CME
Group’s Director of Enterprise Architecture, if he had looked at IBM and
Microsoft, and his response was casual:
“The decision was very difficult but we felt that based on our
environment and needs that StreamBase was a proven solution.”

11,000,000...

...The current average daily volume of futures contracts traded by CME Group...

“Our data feeds are like fire hoses,” Goldman said. “Sometimes, we process a few hundred
market data changes a second. But
when the markets get volatile, we process tens of thousands of market data
changes per second. And, to make it even more challenging, our customers demand
the fastest response time – in milliseconds – when volatility is at its
peak.”

Real-time response is becoming more and more important in
mainstream business. Any
environment that deals with events (stock market price changes, web site click
data, and so on) can benefit from event processing. And the technological challenge of managing events at high
speed, and with low latency, requires a host of arcane tooling and expertise. These geeky areas will also repay
deeper exploration, including specialized messaging hardware and software,
clever use of modern computing models like multi-core CPUs, grids, and clouds,
FPGAs, and efficient management techniques to keep it all running smoothly.

But modern CEP engines are becoming an essential tool for
managing enterprise-class streaming IT environments – a better way to manage
information that comes through a fire hose.

Hundreds....

...The number of products offered by CME Group...

While most businesses might have a few products, or a big
business might have a dozen products, CME Group has hundreds of products. But CME Group’s products aren’t made of
physical materials; they are built with software development tools. So the speed and effectiveness of
software development at the CME Group dictates the speed and growth of their
business.

So the news that CME Group has chosen a new enterprise-wide
development paradigm is important news. CEP, which has been rising to prominence in the past 5
years, flips traditional database-oriented computing on its head. CEP allows an application developer to
design logic that examines what’s happening now instead of what happened
in the past. Options
pricing and settlement consume vast amounts of streaming information, so the
CEP model fits perfectly, and allows CME Group to quickly create new products
that process those streams.
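Here is a rough illustration of what "examining what's happening now" means in code. This is a minimal Python sketch of a continuous query, not StreamBase's API or language: the `MovingAverageQuery` class and the `on_event` method are hypothetical names, and the symbols and prices are made up. Instead of storing every price and querying the table later, the logic updates its answer incrementally each time an event arrives.

```python
from collections import defaultdict, deque


class MovingAverageQuery:
    """Continuous query: average of the last `size` prices for each symbol."""

    def __init__(self, size):
        self.size = size
        self.windows = defaultdict(deque)  # symbol -> recent prices
        self.sums = defaultdict(float)     # symbol -> sum of the window

    def on_event(self, symbol, price):
        """Process one price event and return the up-to-date average."""
        window = self.windows[symbol]
        window.append(price)
        self.sums[symbol] += price
        # Keep only the most recent `size` prices.
        if len(window) > self.size:
            self.sums[symbol] -= window.popleft()
        return self.sums[symbol] / len(window)


query = MovingAverageQuery(size=2)
for symbol, price in [("ES", 10.0), ("ES", 20.0), ("ES", 40.0), ("CL", 5.0)]:
    print(symbol, query.on_event(symbol, price))
```

The design choice worth noticing is that the result is available the moment each event is processed, and the work per event stays constant no matter how much history has flowed past.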

Any business that has streams of data – from web
clickstreams to point-of-sale transactions to financial market data - can apply
CEP to more quickly and easily bring new products to market. When the development methodology
matches the form of the data, innovation approaches the speed of human thought.

Although everyone talks of the inevitability of a global
economy, few firms understand its
practical challenges better than CME Group. CME Group offers electronic, regulated
exchange access in over 80 countries.

The technologies that enable 7x24 global
operations are many. But CEP, by
making streams of data from all over the globe more intelligent, can make
network event processing smarter and faster, and is becoming an important
element in enterprise IT infrastructure.

Which leads to a critical innovation issue: how to strike the right balance between
a new technology and a proven technology?
A full answer to that question is outside the scope of this article, but
the key elements for enterprise software products include well-designed
threading architectures, bulletproof fault tolerance systems, robust
management tools, comprehensive testing and debugging tools, and profiling
tools. Consider that CME Group, which traces its origin to 1848 and has NEVER had a
default or a loss of customer funds resulting from failure, today announced its
enterprise deployment of CEP technology. This is the clearest signal yet that event processing
is coming of age as it continues through its rapid growth as a software market.

A few milliseconds....

...the response time necessary at the most heavily loaded times....

To see the manual forms used at the exchange over 50 years ago
puts the innovation in the capital markets in perspective. It took many seconds to fill out the
information about trades back then; it takes 400 milliseconds (a millisecond is 1/1000 of a
second) for the human eye to blink;
today, CME Group’s trading decisions take place much faster than the blink of an eye – in just a few
milliseconds. How can this kind of
latency be achieved in the face of massive volumes and massive global
distribution?

We discussed the role of CEP in turning the traditional
database-oriented model on its head.
Databases, of course, rely on writing data to disk. The typical seek time for a disk is
about 9 milliseconds. CME Group’s
data can change tens of thousands of times per second; in order to make decisions in a few
milliseconds

Exchanges settling contracts, brokers splitting up and
routing client trades, electronic liquidity providers making markets, hedge
funds finding and executing arbitrage trades, banks delivering currency
rates…all in less than one hundredth of a human blink! Applications are measured in
microseconds, and the tools that do the measuring have resolution in
nanoseconds, that’s one billionth of a second.
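
The numbers quoted above are easy to sanity-check with a few lines of Python. The 9 millisecond seek time and the 400 millisecond blink come from the text; the 20,000 events per second is my own assumption, picked as one value for "tens of thousands," and the names are only for illustration.

```python
DISK_SEEK_MS = 9.0          # typical disk seek time quoted in the text
BLINK_MS = 400.0            # duration of a human blink quoted in the text
EVENTS_PER_SECOND = 20_000  # assumption: one value for "tens of thousands"


def events_during(ms, events_per_second):
    """Number of events that arrive during an interval of `ms` milliseconds."""
    return events_per_second * ms / 1000.0


# Market data changes that arrive while a disk performs a single seek.
events_per_seek = events_during(DISK_SEEK_MS, EVENTS_PER_SECOND)

# "One hundredth of a human blink", expressed in milliseconds.
hundredth_of_blink_ms = BLINK_MS / 100

# Upper bound on seeks per second if every decision waited on one seek.
max_seeks_per_second = 1000.0 / DISK_SEEK_MS

print(events_per_seek, hundredth_of_blink_ms, round(max_seeks_per_second))
```

Under that assumption, roughly 180 market data changes arrive during one disk seek, and a single seek alone takes more than twice the 4 milliseconds that one hundredth of a blink allows.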

StreamBase has been emerging as the top CEP company for some
time now due to its proven rapid development environment, standards-based
approach, high performance, low latency, and mission critical, fault tolerant
architecture. CME Group’s
selection of StreamBase is yet another proof point that demonstrates that CEP
is ready for the big time, and that StreamBase is leading the way.

That’s why CME Group made this important decision for their
real-time, low latency operations – the need for speed, and the need for CEP,
is critical at CME Group.

Massive scale.
Instantaneous response.
Constant change. Nonstop
operations. Taken individually, these are big technical challenges; taken
simultaneously, they are an extreme business risk. But risk for one is opportunity for another, and as CME
Group shows us, by keeping an eye on the three key “abilities” – usability,
reliability, and scalability – firms can keep innovating as business conditions
continue to get more complicated.

June 17, 2009

(Richard Tibbetts is Chief Technology Officer at StreamBase Systems. Follow him on Twitter as @tibbetts)

When it comes to algorithmic trading, everyone is looking for alpha, outsized returns which exceed your risk. In an efficient market, there would never be any alpha, so seeking alpha means seeking inefficiencies in the market. The problem with market inefficiencies is that they don't last. The mere act of using them depletes the available revenue - when others join in, the opportunity can go away very quickly.

So not only do you have to find alpha, you have to find sustainable alpha. Since a lot of customers use StreamBase and complex event processing for algorithmic trading, I see a lot of alpha seekers. Three attributes seem to consistently contribute to sustainable alpha: being smart, being fast, or being dirty. To create a sustainable strategy, mix in at least one, and preferably more of these components.

Smart - The best-known way to create sustainable alpha is to be smarter than the competition. Unfortunately, on Wall Street, genius is a commodity, so trading techniques must be both creative and evolving. Industry analyst AITE Group found that some algorithms have a shelf life of only a few weeks. And, since market conditions are constantly changing, “smart” is never “static.” Finally, if you can discover something based on easily available public information, the other geniuses won't be far behind. So smart, the best way to achieve alpha, must also be creative, and must constantly evolve and get smarter as market conditions evolve.

Fast - Another way of securing alpha is to be faster than the competition. This means making decisions quickly, then executing your trades before competitors can execute theirs. Efficiency and speed are important for any system, but if you are betting on being the fastest, ultra-low-latency execution management will be keeping you up at night. The challenge with fast is that it's easy to spend money to get faster. If you are a small company, larger competitors will always be able to outspend you, so you must take advantage of your size. In 8 Lessons for Entrepreneurs – A Case Study Based on Phase Capital, the lessons from this small, innovative firm included leveraging automation, building incrementally and quickly, and choosing the right partner. On the other hand, large companies can fight off more nimble competitors with their scale, reach, and ability to choose and deploy bleeding edge technology that smaller IT departments cannot support. In either case, you must exploit your natural advantages to stay fast.

Dirty - Finally, you can be dirtier than the competition. I don't mean being ethically questionable. Instead, I mean working with markets, information, and technology that others avoid. Developers and quants both like systems that make sense, clean systems that have been developed by like-minded engineers and are easier to work with. Getting your hands dirty with legacy systems, systems that haven't yet been polished to industry standards, or systems which were never designed for trading exploitation is one way to distinguish yourself. For example, trading in emerging markets often requires getting your hands dirty. But it also lets you trade securities that others aren't following as closely. Trading uncommon or unknown products is one strategy for finding alpha. So is trading based on non-financial information. This has famously been done with weather data and Amazon sales information. Steve Steinberg gave a great presentation on this at O'Reilly's Money:Tech. Working with these data sources may require checking your engineering sensibilities at the door, but the rewards can be substantial.

Lots of trading firms use complex event processing as their platform to create alpha-seeking strategies. But not all CEP platforms are the same. So, when selecting an alpha-generating technology platform, it is important to know what kind of strategy you are building, and what key technical elements you’ll need to meet your requirements.

To build smart strategies, a development platform must have a flexible programming language, good development tools (debugger, profiling, testing), and a good backtesting environment so that you can design, test, deploy, and evolve smart alpha-seeking strategies. There is no one more frustrated than a smart quant who is told by developers that his new strategy can't be built in their system. For example, can your platform handle multi-security strategies? Cross-asset strategies? Plug in analytics written in your quants' favorite programming languages? All these features and more may be required to support smart strategies.

To build fast strategies, you will need a platform and deployment environment that supports ultra-low latency. This will generally require collocation at the site of your favorite exchange. If your strategy requires interaction with multiple exchanges, then your deployment architecture must take into account the wide area network latencies involved. Your development platform should also support low-latency data transports like InfiniBand, and have a multi-threaded architecture that optimizes the processing of individual events. And for some fast strategies, throughput and scalability will also be important, so multi-core technologies, clustering, and cloud computing techniques may be required.

If you are planning to find alpha, know what your goals are. When selecting a platform, make sure it supports the kind of alpha you will generate. If you choose correctly, the platform will help secure your alpha against the competition.

June 15, 2009

I’m a CEO, and I use Twitter. I started “tweeting,” or using Twitter, as a test a few
months ago. When I told one of my board
members, he was skeptical, and wondered about my focus. So as I experimented with Twitter, I kept a list
of the pros and cons. At first, Twitter seemed like a waste of time - but
slowly, my list of pros grew, and my list of cons shrank. Now, I’m convinced that
Twitter is an essential business
tool.

I looked, but I couldn't find a list of guidelines for CEOs from other CEOs. So, even though I
was loath to spend time writing Yet-Another-Article-About-Twitter, I thought my list
might be helpful for someone else. And best of all, it was essentially done. So here's my list of Twitter lessons for CEOs:

1. The business case
for Twitter is compelling. Twitter, when combined with other social
networking tools, delivers measurable business value. Lots of it.

Since my companies started using social networking a few years
ago, (blogging, LinkedIn, Twitter, and so on), we've reduced marketing spend
by over 50% and roughly tripled marketing effectiveness (measured by lead
generation volume, cost per lead, press coverage, and other quantitative
metrics) - at the same time. We showed the chart at the right at my
last board meeting. Marketing
spend is the green area chart in back, and leads are in the bar chart.

Although these numbers don't lie, they only tell part of the story. Twitter's great, but not if you don't use it right. The rest of these Twitter lessons are about using Twitter effectively.

2. Do use Twitter to transmit values, standards, and ethics. Peter
Drucker said an essential job of the CEO is “to set the values, the
standards, and the ethics of an organization.” Social networking helps me
communicate my values, standards, and ethics to hundreds of people at
once. It’s like having hundreds of
water cooler chats in real-time, from wherever I am - while I’m visiting a customer,
speaking at a conference, or meeting with my board.

Twitter and blogging are unique in that they are intensely personal. With care and attention, it's the perfect medium for transmitting our ethics.

3. Do use Twitter for press relations – As I wrote in press releases: dead or reborn?, I think the old ways of communicating are forever changed. Traditional PR agencies have a role, but need to evolve. Recently I evaluated almost 20 press agencies. They all paid lip service to social media, but in my view only two had it deeply ingrained in their culture. PR firms must evolve, or they will die (I didn't hire any of them and brought it in house).

4. Do make the mundane funny and constructive. Great tweets can be about mundane moments. Zappos CEO Tony Hsieh (@zappos) is a master at it - here are three bad tweets I recently received, and three tweets from Tony about the same topics:

BAD TWEET: “Having coffee”

GOOD TWEET: “Had coffee w/ my mom, told her I gave a presentation @ Tony Robbins conference. She was really excited. In her words: "Wow! Robin Williams!"”

BAD TWEET: “Going into a meeting now”

GOOD TWEET: “Visited meetup.com office, met w/ CEO. Thankfully the meetup at meetup did not cause a rift in the space-time continuum”

BAD TWEET: “My allergies are killing me!”

GOOD TWEET: “Taking allergy pills is like having Snow White multiple personality disorder. You go from Sneezy/Grumpy to Sleepy/Dopey/Happy.”

6. Do use Twitter for employee communication – Social networking is like a virtual water cooler, creating a steady, informal stream of communication about market events, customer stories, and the day-to-day developments in the company.

7. Do use Twitter to augment investor and analyst relations. Industry and financial analysts like getting a view direct from the CEO. Blogging is especially good for this kind of communication, and Twitter alerts them that you've got something new to say.

8. Do use Twitter for peer CEO learning. I follow
other CEOs to see their observations - the good ones help shine a light on my own day. I made a Twitter list of the top 50 CEOs here: @mrkwpalmer/top50twitterCEOs, as ranked by Twellow.

9. Do use Twitter to promote rising stars. We recently
promoted one of our founders, Richard Tibbetts (@tibbetts),
to CTO. He’s an amazing
technologist, and getting him out and speaking is great for us, and great for
the industry. Matt Fowles, one of our top engineers, has written some great technical blog entries. Twitter helps raise their visibility, and educate the market.

10. Define your Twitter goals carefully. Signing up for Twitter is easy. Writing tweets is easy. But figuring out your goals for tweeting is hard. I already blog, write magazine articles, and speak at conferences, so it took me a while to figure out my list of goals for Twitter. My main goal, of course, was lesson #2 - to transmit StreamBase's values, standards, and ethics.

11. Do use Twitter to make you a better communicator. Strunk and White's rule, “omit needless words,” is a great one, and Twitter forces you to omit needless words by its very structure - the discipline imposed by 140 characters is a good discipline.

12. Don’t use Twitter for traditional marketing. Some companies seem to think Twitter’s just another vehicle for traditional marketing. H&R Block’s Stacey Gratz, marketing manager, explained: "[On Twitter,] we soon realized that we needed to listen and share, rather than pushing out marketing messages." Read about the H&R Block Twitter case study here.

13. Do use Twitter for thought leadership. My company, StreamBase, is a leading
visionary in an enterprise software market called event processing. The market is still less than $100M, but it's growing, and controversial. In a new area such as ours, education is important, and
social networking provides another vehicle to educate prospects and customers.

14. Do use LinkedIn and Facebook to connect. I have links to my Facebook, LinkedIn, and Twitter profiles on my business cards. (If you liked this article, feel free to connect with me: on Facebook, on LinkedIn, or on Twitter)

15. Do use Twitter to transmit your brand's conscience. Habit #1 of Bruce Philip's 5 Habits of successful executives is: “[executives are] their brand’s conscience.” Michael Hyatt (@MichaelHyatt) of Thomas Nelson Publishers and Richard Branson (@richardbranson) of Virgin do this well. I think all leaders should establish this direct link to transmit the essence of their corporate brand and culture.

16. Do listen to Twitter for feedback. Great leaders feed on feedback. But a lot of people shy away from saying too much to "the boss." On Twitter, I'm just like everyone else, so I get unfiltered feedback whenever I want.

17. Do use Twitter to respond to rumors quickly. Recently, we had to fire a
well-known employee, and our competitor started spreading false rumors that we were
downsizing. Twitter helped me stop the rumor mill in real-time.

18. Don’t use Twitter to sell. Tempted to tweet: “Buy my product now for $19.99!” Don't.

19. Don’t say too much. Sometimes basic stuff like where I am is sensitive. For example, I
couldn’t tweet for a whole week because if the competition knew I was in
Chicago, I would have jeopardized a sales situation.

20. Remember that content is still king on Twitter. Even though Twitter
messages are only 140 characters, the quality of those 140 characters is still
king. The link between
content quality and business value (lesson #1) is direct.

21. Turn your communications
approach upside down. If you've gotten this far and any of this stuff shocks you, start reading. The classic, and first place to start, is: Cluetrain
Manifesto (about the book,
short form, the entire ebook). Cluetrain was published over 10 years ago, and boldly declared that a “powerful global conversation had begun.” And this was before Facebook and Twitter.

22. Don’t compete with SpongeBob. My 8-year-old son, Jack, asked what I was doing one day. I told him I was tweeting, and told him what Twitter is. He asked: “How many people follow you?”

I said: “About 1,200.”

Jack asked: “How many people follow SpongeBob?”

I looked it up. “28,000.”

Jack said: “Dad - if you tell more jokes, maybe you’ll be more popular.”

I don’t want to compete with SpongeBob. I don’t care how many followers I have. Neither should you!

23. Don’t tweet too
much; don’t tweet too little. I think it’s important
not to tweet too much - I recently saw someone who had tweeted 26,000 times –
yikes! My rule of thumb is to only put something on Twitter that I find truly remarkable. That happens between 0-5 times a day.

24. Twitter helps
keep me informed in real-time, personally, with less effort, and dynamically. Today one of our engineers
asked: “Doesn’t Twitter just give
you too much information?” My
response was that it’s the opposite – it helps me cull the information deluge down
to the stream I care about. Currently, I watch TED Talks, Seth Godin, and David Armano (search on TweetDeck), my favorite industry analysts (@bmichelson), my competitors (you know who you are), my customers (CME
Group, and many more), my cousin (@marissapage, who’s really, really funny), and my list of other CEOs.

25. Don’t start using Twitter unless you love
to communicate, and are willing to stay committed. It’s hard to get in the habit of tweeting, and CEOs are
busy! Moreover, tweeting well is an art, and takes some thinking. You have to want to communicate, and be
willing to stay committed to creating ideas on your own; you can’t outsource the
job, either.

26. Learn how to Twublish. I used to write byline articles for industry journals. Traditional publishing is static, formal, unidirectional, and controlled by an external editor. Then I started to blog, connect via LinkedIn, share photos with Flickr, and message with Twitter. Taken together - and they must be used together - these tools form a new way of publishing - I think of it as "Twublishing."

To publish with Twitter, or Twublish, involves writing to our blog. Then I tweet about points in the article on Twitter. Comments in the blog and direct messages on Twitter help me get feedback on the quality of my thinking. Often, this leads me to new places, and adjacent topics. Or I more fully develop the original idea, and more deeply hyperlink the original to new sources I discover. The feedback deepens the original thought and creates articles that evolve into portals for other information on the same topic. This post, for example, has links to 30 other information sources about the use of Twitter from a CEO's perspective.

Twublishing is dynamic, informal, bi-directional, and unconstrained by an editor. I can mix in video and images. It's hyperlinked so my thoughts can become portals to other people's ideas that were the building blocks of my own. And it's temporal - that is, as information changes over time and others respond, I can update or augment the original to refer to the new information.

Other than the whopping $250 I used to occasionally get for a magazine article, publishing with social networking tools is better in every way.

27. I’m more aware of
why I love my job because of Twitter. As I was returning from a sales trip, I tweeted: “Anyone who thinks the capital markets
are collapsing should go out and spend a day with the traders – lots of
innovation going on out here!” The act of paying attention to the ironic, funny, shocking, and curious
things that happen from day to day makes me more aware of why I love my job.

June 08, 2009

(Richard Tibbetts is Chief Technology Officer at StreamBase Systems. Follow him on Twitter as @tibbetts)

Recently IBM System S (aka InfoSphere Streams) has been getting a lot of attention in the press. As is the custom in the news business, writers try to focus on what is new or novel in the technology. Some have talked about the automatic query generation or the massive parallel scalability, which are some of the contributions of the System S research project. Unfortunately, others have picked up on the distinction between structured and unstructured data processing and identified that as unique to System S. Philip Howard writing at IT Director is one example, and another comes in the comments on a StreamBase blog post.

The reality is that CEP engines have been processing unstructured data for a long time, and doing it well. For example, unstructured data processing is a major aspect of applications in the defense and intelligence space. As one architect at an intelligence agency put it, you can’t standardize messaging protocols when the other side doesn’t want you to listen.

Unstructured data processing means processing data that won’t fit into a standard relational model. The most unstructured data is media, such as text, audio, or video. Other kinds of unstructured or semi-structured data include XML or packet capture data. These data formats may come as raw streams, or as part of the data payload in events that also contain structured data.
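As a rough illustration of an event that carries structured fields alongside a semi-structured payload, here is a minimal Python sketch. The `Event` class, its field names, and the XML layout are all assumptions made up for this example; they are not taken from StreamBase, System S, or any other product.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Event:
    # Structured fields: fixed names and types that fit a relational schema.
    source: str
    timestamp: float
    # Semi-structured payload carried in the same event
    # (hypothetical XML layout, invented for this sketch).
    payload: str


def extract_symbols(event):
    """Pull ticker symbols out of the XML payload with a simple path query."""
    root = ET.fromstring(event.payload)
    return [node.text for node in root.findall("./mention/symbol")]


event = Event(
    source="newswire",
    timestamp=1244419200.0,
    payload=(
        "<story>"
        "<mention><symbol>IBM</symbol></mention>"
        "<mention><symbol>CME</symbol></mention>"
        "</story>"
    ),
)
symbols = extract_symbols(event)
```

The structured fields can be filtered and joined like any other columns, while the payload needs a format-specific step before its contents become usable.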

Specialized algorithms and libraries exist to extract meaning from this data, ranging from XPath queries and protocol analyzers to natural language processing (NLP) and machine vision. Different kinds of analysis require different algorithms, but they all share some characteristics. A system for real-time unstructured data processing requires four things:

Unstructured Data Objects — Data processing begins with data representation. The platform must have an efficient mechanism for representing large data objects, integrated with memory management, persistence, and messaging systems.

Extensible Language — Most unstructured data processing involves specialized algorithms. The language must be extensible so that these domain-specific algorithms can coexist with built-in functionality. Since the algorithms are often already available as libraries, it is important that these APIs support mainstream languages.

Unstructured-Data Aware Authoring Tools — Authoring tools are a critical part of any modern platform, and it is important that the tools be designed for real-time processing and for both structured and unstructured data.

Clustering Capabilities — Unstructured data processing applications are often resource intensive. Spreading the load across large numbers of compute servers is a critical scaling technique. Being able to scale via multi-threading across a single large server is also required.

Many commercial CEP systems already have some or all of these capabilities; StreamBase, for example, has all of them. Unstructured data objects can be ingested over the network at very high speed. Advanced text processing plugins can be written in Java or C++, and can be developed and debugged in the CEP development, debugging, and testing tools (i.e., StreamBase Studio). And event processing applications that analyze structured data can be deployed and scaled across large clusters.

One reasonably infers that System S has these capabilities, though for the moment it is difficult to find information on authoring tools.

Now Philip Howard suggests that System S is more capable in unstructured data processing because it does not use SQL. This is a non sequitur. The additional power afforded StreamBase by using SQL does not harm developers of unstructured data processing applications. Instead, it increases the flexibility of their systems, and the speed with which they can learn the system. One system, developed using StreamBase, first extracts meaning from unstructured text, and then uses that meaning data to identify which analysts need to be alerted to this data. The conditional alerting is expressed best in SQL, while the text processing uses unstructured facilities. Developers can build alerting logic using StreamSQL EventFlow and integrate the text processing algorithms without learning an entirely new language.
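The two-stage shape of such a system can be sketched in a few lines. This is only an illustration in plain Python, not StreamSQL and not the system described above: the keyword table, the subscription table, and every function name are hypothetical, and a simple keyword match stands in for a real text-processing library.

```python
import re

# Hypothetical keyword-to-topic table standing in for a real text-analysis step.
TOPIC_KEYWORDS = {
    "merger": "m&a",
    "acquisition": "m&a",
    "earnings": "earnings",
    "outage": "operations",
}

# Hypothetical analyst subscriptions: analyst name -> topics of interest.
SUBSCRIPTIONS = {
    "alice": {"m&a"},
    "bob": {"earnings", "operations"},
}


def extract_topics(text):
    """Stage 1: turn unstructured text into structured 'meaning' (a set of topics)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {TOPIC_KEYWORDS[w] for w in words if w in TOPIC_KEYWORDS}


def analysts_to_alert(topics):
    """Stage 2: conditional alerting over the structured output of stage 1."""
    return sorted(name for name, wanted in SUBSCRIPTIONS.items() if wanted & topics)


def process(text):
    """Run both stages on one incoming message."""
    return analysts_to_alert(extract_topics(text))


alerts = process("Rumored acquisition ahead of quarterly earnings call")
```

Once the first stage has produced structured output, the second stage is an ordinary filter and join, which is the kind of logic a SQL-style language expresses compactly.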

The IBM SPADE language is not based on SQL or any other language familiar to developers. This may have given IBM researchers a lot of freedom to experiment when designing the system, but it won’t help enterprise programmers learn the language quickly, and it certainly is not a silver bullet for unstructured data processing.