We've been doing a lot of Webinars lately! This week I will be hosting one that will provide a technical overview of
IBM products now available in the AWS cloud on a pay-as-you-go basis, including
IBM DB2, IBM Informix, IBM WebSphere sMash, IBM Lotus Web Content Management,
and IBM WebSphere Portal Server.

SugarCRM software architect Majed Itani will be
on hand to demo how his company uses WebSphere sMash to make it easy for
developers to deploy CRM on Amazon EC2.

We are also pleased to announce a new central clearinghouse that lists all of the upcoming AWS Webinars and other virtual events. aws.amazon.com/resources/webinars/ even lists past events -- which is handy if you were unable to attend.

I just spent some time going through the case studies. There's a lot of really good
information and advice inside for anyone using or contemplating the use of
Mechanical Turk.

In the first interview you will learn how and why CastingWords was founded,
and how they use the Mechanical Turk workforce to transcribe audio and to
grade the resulting work.

From there you can read about how SnapMyLife uses the same workforce to
moderate uploaded pictures within 3 minutes, 24 hours a day, 7 days a week,
at an estimated cost savings of 50% over other outsourcing methods.

Next, continue on to the
Channel Intelligence interview
and read about how they have integrated Mechanical Turk into a product classification
system at a cost savings of 85%. I really like this quote from the interview:

We estimated that it would have taken us 600 man hours to categorize the
73,000 products correctly. The Turkers ended up doing it in four days.
It was nonstop. We kept logging in to see if the work was being done
and the HITs kept getting worked. Needless to say, it was a pretty
exciting moment for us.

Moving right along, the CEO of Knewton talks about how they used
Mechanical Turk to accelerate their business processes and reduce
their time to market. They needed a database of colleges:

We wanted to build a database about colleges, so we just had
students at the college collect the data for us and send it
over through MTurk. Boom, instant database. So, I was
once in a business that was developing a database for
stores in New York City. We had a team that we paid about
$10 an hour. They ran around and just knocked on every
single door. That cost us maybe $100,000 and took a year's time.

Knewton also used Turk to help them choose their company logo
and the accompanying tagline. They test their questions, have
their web site proofread and checked for link errors, and even
paid people to spend over three hours taking one of their sample
tests. He notes that they had earmarked $70,000 for site QA, but
ended up spending less than $10K (via Mechanical Turk) on QA,
market research, product enhancements, and so forth.

In the last interview, Stanford University Ph.D. candidate Rion Snow
talks about how he used Mechanical Turk to compare natural language
annotations made by experts against similar work by the Turk workforce.
As he says:

What we found was, in general if you just asked a single annotator on Turk to
label all of the samples, the quality wasn't going to be nearly as good as
if you went and asked a linguist or graduate student to do the same thing.
However, if you were able to break the work up among a large group of
Turkers and ask them to perform multiple independent annotations per
question you can actually do quite a bit better than experts.
Say that we get 10 separate independent annotations for each question,
and then aggregate the responses by voting or averaging; we found this
data to be better than that of expert annotators.

Rion says that Mechanical Turk allows them to collect about 10,000 annotations
per day. In previous projects he spent a lot of time simply finding and organizing
a workforce, so this is a big step forward for him. He sees the
Turk workforce as an "always-on, always-available army of workers."
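The aggregation scheme Rion describes (collect several independent annotations per question, then take the majority vote) is simple to implement. Here's a minimal sketch; the labels and question IDs are made up for illustration:

```python
from collections import Counter

def aggregate_by_voting(annotations):
    """Pick the label most annotators agreed on for each question.

    annotations: dict mapping question id -> list of independent labels
    collected from different Workers.
    Returns: dict mapping question id -> majority label.
    """
    return {q: Counter(labels).most_common(1)[0][0]
            for q, labels in annotations.items()}

# Ten independent annotations per question, aggregated by voting:
votes = {
    "q1": ["pos", "pos", "neg", "pos", "pos", "pos", "neg", "pos", "pos", "pos"],
    "q2": ["neg", "neg", "neg", "pos", "neg", "neg", "neg", "neg", "pos", "neg"],
}
print(aggregate_by_voting(votes))  # {'q1': 'pos', 'q2': 'neg'}
```

For numeric labels you would average instead of vote; either way, the redundancy is what lets a crowd of non-experts match or beat a single expert annotator.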

New and existing Mechanical Turk Requesters will find a lot of helpful
information in the new
Amazon Mechanical Turk Best Practices Guide. This 12-page document provides guidelines for planning, designing, and testing HITs. For example, it advises the use of a comment box on each HIT. This box gives Workers the ability to provide their thoughts on the tasks that they complete.
The guide also recommends that Requester organizations designate a specific person as the Mechanical Turk administrator. This person can design the HITs, interact with the community, receive and respond to feedback, and process the results. There's information about designing HITs and the associated instructions, setting an equitable
reward for successful completion of a HIT, and lots more.
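In QuestionForm terms, the comment box the guide recommends is just one more optional free-text question appended to the HIT. A minimal sketch of adding one programmatically (the identifier and wording below are our own, not from the guide):

```python
# An optional free-text question that invites Worker feedback, in
# Mechanical Turk's QuestionForm XML. Identifier and prompt text are
# illustrative placeholders.
COMMENT_QUESTION = """
  <Question>
    <QuestionIdentifier>comments</QuestionIdentifier>
    <IsRequired>false</IsRequired>
    <QuestionContent><Text>Any thoughts on this task?</Text></QuestionContent>
    <AnswerSpecification><FreeTextAnswer/></AnswerSpecification>
  </Question>"""

def with_comment_box(question_form_xml: str) -> str:
    """Append the comment question just before the form's closing tag."""
    return question_form_xml.replace(
        "</QuestionForm>", COMMENT_QUESTION + "\n</QuestionForm>")
```

You would run your existing QuestionForm through `with_comment_box` before creating the HIT, so every task automatically carries the feedback field.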

On a slightly lighter note, Aaron Koblin recently emailed me to make sure
that I was aware of his newest Turk-powered project, the Bicycle Built
for 2,000. Aaron and his colleague Daniel Massey collected 2,088 voice
recordings (at a cost of 6 cents per recording), which were then assembled
to form the finished product. Here's a video of the process in action:

If you are doing or have done something unique, cool, and valuable using
Mechanical Turk, we'd love to hear from you. Leave a comment or send us some
email.

I spent the last two days at the International Informix User Group Conference in Overland Park, Kansas. The conference was packed with 240+ Informix architects, DBAs, customers, system integrators, and IBM employees, despite bad weather including thunderstorms and tornado warnings.

Guy Bowerman, IDS Architect, and Suma Vimod, IDS Data Expert, created a few demos for sessions that leverage IDS functionality in the cloud. Guy also blogged about the conference at the IBM DeveloperWorks Blog.

As a part of the conference, we built several demos:

Getting Started with IDS and Amazon Web Services
This demo shows how you can run the Informix Dynamic Server 11.50 UC3 version on Amazon EC2. It shows you, step by step, how to use the AWS Management Console, launch one of the public IDS 32-bit AMIs, configure the instance, configure IDS, use the OpenAdmin tool, and so forth. This demo is a must-see for those who are interested in running Informix in the cloud.
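The demo walks through the AWS Management Console, but the same launch can be scripted against the EC2 API. A sketch of the request parameters for a single IDS instance (the AMI ID and key-pair name below are placeholders, not the real public AMI):

```python
def ids_launch_params(ami_id: str, key_name: str,
                      instance_type: str = "m1.small") -> dict:
    """Build the arguments for an EC2 RunInstances call that starts
    one IDS server.

    Pass the result to your EC2 client library, e.g. with boto:
        ec2.run_instances(**ids_launch_params("ami-xxxxxxxx", "my-key"))
    Look up the actual public IDS 11.50 32-bit AMI ID first; the
    values here are placeholders.
    """
    return {
        "ImageId": ami_id,              # public IDS 32-bit AMI (placeholder)
        "InstanceType": instance_type,  # a 32-bit instance type
        "KeyName": key_name,            # your EC2 key pair (placeholder)
        "MinCount": 1,
        "MaxCount": 1,
    }
```

After the instance boots you would connect over SSH and configure IDS and the OpenAdmin tool, exactly as the console-based demo shows.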

High Availability Demo
This demo shows off some of the advanced features of Informix in the cloud. It shows how an IDS MACH11 cluster containing an IDS Primary server instance and an IDS High-availability Data Replication (HDR) Secondary server instance can run in the Amazon EC2 environment to create a highly available IDS cluster. The IDS Connection Manager runs on a client instance (in the cloud or locally), allowing applications to connect to the Primary or Secondary server as needed. The demo also shows how the Connection Manager can manage automatic failover to another Amazon Availability Zone for applications connected to a Primary server, should it become unavailable.

Informix Spatial Data in the Cloud
This demo shows how the IBM Informix Spatial DataBlade can implement business logic that involves geographic data. It also shows how the Informix Web Feature Service (WFS) DataBlade can provide access to geographic web services, all in the context of multiple IDS instances running in the cloud and sharing data through replication.


It's really cool to see the latest version of IDS running in
the cloud, along with some of the advanced features of IDS (HA and
HDR) leveraging the cloud. We are also investigating how On-Bar and
ontape (the Informix tape backup tools) can upload archives and
storage spaces directly to Amazon S3, either serially or in parallel.
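The parallel case is mostly a matter of fanning the archive files out across concurrent uploads. A minimal sketch of that pattern; the uploader callable and key names are illustrative stand-ins, not part of On-Bar or ontape:

```python
from concurrent.futures import ThreadPoolExecutor

def upload_archives(paths, upload_one, max_workers=4):
    """Upload a set of archive files concurrently.

    upload_one is any callable that takes a local file path and
    returns the S3 key it was stored under -- in practice a thin
    wrapper around your S3 client's upload call. Results come back
    in the same order as the input paths.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(upload_one, paths))

# Example with a stand-in uploader (a real one would call S3):
def fake_upload(path):
    return "informix-backups/" + path.rsplit("/", 1)[-1]

keys = upload_archives(["/backups/level0_1", "/backups/level0_2"], fake_upload)
print(keys)
```

Serial upload is just the `max_workers=1` case; the parallel version matters when a full level-0 archive spans many large storage spaces.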

One of the aspects that I really like about Amazon EC2 is that it not only gives developers the flexibility and control to put any database in the cloud, but also provides the choice of running different commercial-grade databases (Oracle, DB2, Informix, MySQL) and taking advantage of their advanced features.

If you are currently using Informix and would like to take advantage of the cloud, we would love to know more.