Crowd Research Blog

AutoMan: Programming with People

People can often perform tasks such as natural language understanding, vision, and motion planning with greater accuracy and speed than the best algorithms available to us. Computers are good at repetitive, mechanical tasks, but many AI-style tasks remain elusive. By combining humans and computers, crowdsourcing has the potential to create a new class of applications that combines the best qualities of both.

However, unlike traditional computer programs, working with people introduces a number of complications:

• People don’t work for free. How much should you pay them?
• Compared to programs, people are slow. How should you write your program to minimize latency?
• People make mistakes. Between spammers and well-intentioned but mistaken workers, how do you know that your answers are correct?

We developed the AutoMan system with these concerns in mind. AutoMan abstracts away the issues of payment, scheduling, and quality control so that programmers can focus on the purpose of their applications. Formerly difficult crowdsourcing tasks become simple, declarative programs:

A simple classification task definition using AutoMan.

AutoMan allows programmers to combine off-the-shelf code written for the Java Virtual Machine with quality-controlled, high-performance human subroutines. We have focused our research primarily on the Mechanical Turk platform, but the system was designed to be platform-agnostic, requiring implementers only to provide a backend driver.
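To give a sense of what "providing a backend driver" might involve, here is a minimal sketch of such an interface. This is an illustration only: the class and method names below are hypothetical and are not AutoMan's actual API, and it is written in Python rather than Scala for brevity.

```python
from abc import ABC, abstractmethod

class CrowdBackend(ABC):
    """Hypothetical driver interface; the names here are
    illustrative, not AutoMan's actual API."""

    @abstractmethod
    def post(self, task) -> str:
        """Post a task to the platform and return its ID."""

    @abstractmethod
    def retrieve(self, task_id: str) -> list:
        """Fetch any answers submitted so far."""

    @abstractmethod
    def pay(self, worker_id: str, amount_cents: int) -> None:
        """Compensate a worker for an accepted answer."""

class InMemoryBackend(CrowdBackend):
    """Toy backend for local testing: 'workers' are canned answers."""

    def __init__(self):
        self.tasks, self.payments = {}, []

    def post(self, task) -> str:
        task_id = f"task-{len(self.tasks)}"
        self.tasks[task_id] = task
        return task_id

    def retrieve(self, task_id: str) -> list:
        return ["A", "A", "B"]  # canned responses for testing

    def pay(self, worker_id: str, amount_cents: int) -> None:
        self.payments.append((worker_id, amount_cents))
```

The point of the abstraction is that everything above the driver (scheduling, pricing, quality control) stays the same when you swap Mechanical Turk for another platform.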

The rendered task on Mechanical Turk.

AutoMan provides answers with a statistical confidence guarantee. There is often a direct trade-off between the quality the programmer requires and the cost of a task. Task wages are determined dynamically, freeing the programmer from having to determine a fair wage.
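To get a feel for how such a guarantee can work, consider a deliberately simplified model (this is an illustration, not AutoMan's actual algorithm): if a question has c options and n workers answer independently at random, the chance that they all agree on some one option is c · (1/c)^n = (1/c)^(n−1). Requiring enough unanimous answers drives that chance below 1 − confidence:

```python
def votes_needed(options: int, confidence: float) -> int:
    """Smallest n such that the chance of n random guessers all
    agreeing on one of `options` choices, (1/options)**(n-1),
    falls below 1 - confidence. A simplified model for intuition,
    not AutoMan's actual statistical test."""
    n = 1
    while options ** -(n - 1) >= 1.0 - confidence:
        n += 1
    return n
```

For example, under this model a 4-option question at 95% confidence needs 4 agreeing answers, since (1/4)^3 ≈ 0.016 < 0.05, while a 2-option question needs 6, since (1/2)^5 ≈ 0.031 < 0.05. Notice the quality/cost trade-off directly: fewer options, or a higher confidence level, means more (paid) answers.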

Handling these concerns in the language’s runtime means that programs which would otherwise use no quality control, or only ad hoc schemes, and would normally need a trusted supervisor to periodically watch over them, are now completely automatic. Freed from the need for constant supervision, programmers can integrate human judgment into large-scale, real-world applications.

Using AutoMan, we’ve explored a variety of tasks, ranging from image recognition and categorization tasks to complex, real-world tasks like automatic license plate identification. We’re continuing to explore what is possible with AutoMan while enhancing the simplicity, reliability, and performance of the system.

AutoMan is available on our GitHub page. Give it a try and tell us what you think!

About Daniel Barowy

My research focuses on developing technology to make it easy to write safe programs and hard to write unsafe programs. Currently, I am working with crowdsourcing systems, bug-finding in spreadsheets, and end-user programming. I am a third year MS/PhD student in the Computer Science department at the University of Massachusetts Amherst in the PLASMA research group. My advisor is Prof. Emery Berger.

Hi Anand! I’m not sure we can say that we’ve “solved” autopricing, but here’s how AutoMan does it: whenever there is a timeout, we double the task’s price. We reason that timeouts are likely to occur for two reasons: 1) the task does not pay enough, or 2) we did not give the worker enough time. Either way, we double both the time allotted to do the task (e.g., from 30 seconds to 1 minute) and the pay (e.g., from $0.06 to $0.12). Assuming the worker needed that extra time, the wage stays the same; but if the worker was able to do the work in the originally-allocated amount of time, their effective wage goes up.
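The schedule described above is easy to sketch (in illustrative Python, using the figures from the example):

```python
def reprice(price_cents: int, timeout_s: int, timed_out: bool):
    """On a timeout, double both the pay and the time allotted,
    per the doubling strategy described above; otherwise leave
    the task's parameters unchanged."""
    if timed_out:
        return price_cents * 2, timeout_s * 2
    return price_cents, timeout_s

# Starting at $0.06 / 30 s, one timeout gives $0.12 / 60 s.
# Note the hourly wage is unchanged (0.2 cents/second either way);
# it only rises for workers who finish in the original time.
price, timeout = reprice(6, 30, timed_out=True)
```

Doubling both quantities together is what keeps the implied wage constant for a worker who genuinely needed the extra time.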

Our paper outlines why doubling is a good strategy when the likelihood of a task being accepted at a particular step is unknown. It turns out to be optimal (i.e., it does not incentivize workers to wait for the task’s price to increase) if workers have no knowledge about the likelihood of the task sticking around long enough for us to double the wage. Pricing is something we plan to continue exploring in future research.

There are, of course, other reasons why workers might not accept your task. Particulars of the platform may also be factors: users may be unable to find your task, may habitually look in the wrong places, or may prefer not to take “one-off” tasks. So the above derivation relies on a somewhat idealized crowdsourcing platform. That said, we’ve found that it works rather well.