Computational Complexity and other fun stuff in math and computer science from Lance Fortnow and Bill Gasarch

Monday, August 23, 2010

Is Scheduling a Solved Problem? (Guest Post)

(Guest Post by Ben Fulton.)

"At first glance, scheduling does not seem like a topic
that requires much attention from computer scientists."

This was how I wanted to start my review of a book on
scheduling, but Bill Gasarch called me out on it. "Really?" he
said. "I don't know any computer scientists who think scheduling
isn't important." It's true, but computer scientists aren't the ones
taking first glances at scheduling problems.

They're curriculum designers trying to determine which
instructors are needed to teach all of the classes for the fall semester. Shop
managers wondering how the widgets could be sent through the assembly line more
efficiently. Kids on the playground choosing up sides for a soccer game (a
problem identical to partitioning a set of jobs with known running times
between two processors). These are the people with scheduling problems. They'll
think about their problem for a few minutes, and come up with a good solution
in each case.

The curriculum designer - possibly Michael
Mitzenmacher - might decide that the most important goal is to keep all
instructors at around the same number of teaching hours. In that case, each
next available assignment would be given to the least busy instructor. In doing
so, he'll choose the List scheduling algorithm, first proposed by Graham
in 1966 and known to produce a schedule that takes no more than twice as long
as an optimal one.
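
As a rough illustration of the "give the next assignment to the least busy instructor" rule, here is a minimal Python sketch. The function name `list_schedule` and the sample hours are made up for this example; it assumes every instructor can teach every class and that assignments arrive in a fixed order.

```python
import heapq

def list_schedule(job_times, num_machines):
    """Assign each job, in the order given, to the currently least-loaded machine.

    Returns (assignment, makespan), where assignment[m] lists the job indices
    given to machine m and makespan is the largest total load.
    """
    # Min-heap of (current_load, machine_index); ties go to the lower index.
    loads = [(0, m) for m in range(num_machines)]
    heapq.heapify(loads)
    assignment = [[] for _ in range(num_machines)]
    for job, t in enumerate(job_times):
        load, m = heapq.heappop(loads)        # least busy machine
        assignment[m].append(job)
        heapq.heappush(loads, (load + t, m))  # put it back with its new load
    makespan = max(load for load, _ in loads)
    return assignment, makespan

# Six classes (teaching hours) split among three instructors.
assignment, makespan = list_schedule([2, 3, 4, 6, 2, 2], 3)
print(assignment, makespan)
```

On this input the busiest instructor ends up with 8 hours, while a hand-tuned split ({6}, {4, 3}, {2, 2, 2}) gets that down to 7: not optimal, but comfortably within the factor-of-two guarantee.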

The kids will choose Greedy. They'll choose a couple
of captains and alternate picking the "best remaining" player, in the
same way that a scheduler would choose the "largest remaining" job. Greedy
is a heuristic that runs in polynomial time. It's not guaranteed to find the
best solution (it's The Easiest Hard Problem), but the kids are interested in
having competitive
teams, not making sure that all possible sets of children can be perfectly divided
in polynomial time.
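
Here is a small Python sketch of the scheduler's version of this heuristic. It is a variant of the playground rule, not an exact copy: instead of strictly alternating picks, each "largest remaining" item goes to whichever side currently has the smaller total. The name `greedy_partition` and the skill numbers are invented for illustration.

```python
def greedy_partition(values):
    """Split values into two groups, largest first, always adding to the lighter group."""
    teams = ([], [])
    totals = [0, 0]
    for v in sorted(values, reverse=True):
        # Pick the side with the smaller running total (ties go to the first side).
        i = 0 if totals[0] <= totals[1] else 1
        teams[i].append(v)
        totals[i] += v
    return teams, totals

teams, totals = greedy_partition([8, 7, 6, 5, 4])
print(teams, totals)
```

With these five values the heuristic ends at totals of 17 and 13, even though a perfect 15/15 split exists (8 + 7 against 6 + 5 + 4). That gap is the price of a fast, simple rule.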

The shop manager is likely to have a lot of different
constraints to take into account, but she might notice that a station can stop
working on one widget and start on another one, if the second is likely to be
finished more quickly. She probably won't realize it, but the Shortest
Remaining Processing Time algorithm is known to minimize the average time
to completion of the widgets.
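
To make the rule concrete, here is a minimal Python simulation of Shortest Remaining Processing Time. It rests on assumptions the shop floor may not satisfy: a single station, free and instant switching between widgets, and known processing times. The function name and the sample widgets are hypothetical.

```python
import heapq

def srpt_average_completion(jobs):
    """Simulate SRPT on one station.

    jobs: list of (release_time, processing_time) pairs, processing_time > 0.
    Returns the mean completion time.
    """
    pending = sorted(jobs)   # ordered by release time
    ready = []               # min-heap of remaining processing times
    t = 0
    i = 0
    n = len(pending)
    total_completion = 0
    while i < n or ready:
        if not ready:
            t = max(t, pending[i][0])  # station is idle: jump to the next arrival
        while i < n and pending[i][0] <= t:
            heapq.heappush(ready, pending[i][1])
            i += 1
        remaining = heapq.heappop(ready)  # job with the least work left
        next_release = pending[i][0] if i < n else float("inf")
        run = min(remaining, next_release - t)  # run until done or next arrival
        t += run
        if run < remaining:
            heapq.heappush(ready, remaining - run)  # preempted: re-queue the rest
        else:
            total_completion += t
    return total_completion / n

# Three widgets: (arrival time, work needed).
print(srpt_average_completion([(0, 5), (1, 2), (2, 1)]))
```

For these three widgets the average completion time is 5.0. Running them to completion in arrival order would finish them at times 5, 7, and 8, for an average of about 6.67.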

In all three cases, the algorithms they choose are simple,
easy to describe, and should work fairly well in their situations. The
schedulers aren't computer scientists - just people with problems to solve. They'll
take a first glance, and they'll solve them with a minimal amount of effort. Even
if you showed them a way to solve their problems that was twice as efficient,
but also much more difficult to understand, they'd probably reject it.

So what's the point of studying scheduling? The practical problems
are solved.

That's the first glance. You've got to dig a little deeper to
find the interesting problems in scheduling. For example, the complexity of
simply describing scheduling problems is a subject that hasn't fully
been explored yet. Problems are typically broken down three ways: the number
of processors available; the constraints on when jobs can be run; and the
criteria for determining whether one schedule is better than another. Even if
the first and third items are fairly simple, setting up a job precedence graph,
lag times, and perhaps a few rules involving a specific processor needing to
run a specific job, will likely generate a description so complex that an
engineer trying to solve the problem might not even recognize it.

Scheduling gurus Anne Benoit, Loris Marchal, and Yves Robert
also ask this question. In response, they outline some areas of study involving
distributed-memory parallel computing that could give rise to some interesting
practical improvements.

That's where you go when you're past the first glance. And
that's why computer scientists need to pay attention to scheduling.

3 comments:

One characteristic of all the problems you describe is that they are relatively small: in all your examples a single person can handle all the input. But what about larger scheduling problems: think FedEx, airlines, major online retailers, Google. I think they all have scheduling problems where, if someone could show them an algorithm that gives a 2% better solution they'd be very happy, because that 2% translates into X millions of dollars per year.

I'd say that when you introduce elements like uncertainty, the problems become more difficult to solve. Look no further than the Washington Redskins and their attempt to fill a roster. The greedy algorithm says to choose the 53 most talented players, but suppose a few of those top 53 are injury prone (à la Donovan McNabb). How does that injury risk affect whether you choose that player?

Josh: No doubt, unless they're using an algorithm that's already been shown to be optimal. In which case they'd have to do some more work to make their model more precise.

Thought: Scheduling doesn't really involve itself with the evaluation routine. It's a bit like calculating possible moves in chess - you can find all the possible moves, but whether one board is better than another is still a bit subjective.