Does anybody have real-life examples where they regularly solve NP-complete or NP-hard problems (by heuristics, by chasing a suboptimal solution, or whatever) in their job? I know they occur in scheduling, planning, VLSI design, etc., but I am trying to get an idea of the major industries employing programmers or engineers today that regularly do this. If one were to develop expertise, or a library, in, say, combinatorial optimization, where could one use that as part of a programming job?

@Conrad, well, I guess it's a subjective measure. I'd say maybe more than 5-10% of the effort is focused on solving NP-complete or NP-hard problems.
– highBandWidth Apr 26 '11 at 19:31

AI programming in games has the potential to involve NP-complete problems, I believe.
– Michael K Apr 26 '11 at 19:35

There are lots of NP-hard problems out there (scheduling and planning with finite resources are usually NP-hard). However, combinatorial optimization is the wrong way to go. Being able to generate 100! combinations as fast as possible is much less useful than being able to apply domain-specific heuristics.
– David Thornley Apr 26 '11 at 21:10

@David, I didn't mean generating combinations by combinatorial optimization. I was referring to a class of problems, like k-SAT, the Travelling Salesman Problem, etc.
– highBandWidth Apr 26 '11 at 21:17

So are there programmers employed by logistics companies who are actually devoted to solving these optimization problems, or are most of them solved once, with the solutions just reused by most companies? +1 for a number of examples. Are you/have you been involved in any of these?
– highBandWidth Apr 26 '11 at 19:54

The first two I've written tools for; the third is something colleagues work on. I would expect that large logistics companies actively do research in this area, since a couple of percent of extra performance from some new algorithm can save them millions of dollars :)
– Deckard Apr 26 '11 at 20:03

I've interviewed for a travelling-salesman role. The large parent company had a room full of PhDs slaving away in the hope of getting a few tenths of a percent improvement in their routing, which would be worth a few million dollars to them... each day. So those places do exist. Routing and scheduling are the two biggies - imagine you have 1000 people and a factory that runs two or three shifts. Now schedule everyone to work for the next month, keeping in mind these 200 rules and everyone's preferences...
– Мסž Apr 26 '11 at 22:29

I have used time-constrained simulated annealing to solve a travelling-salesman-like problem in touch panel manufacturing. Every millisecond we could shave from the cycle time of the laser etching of each panel would increase the throughput, utilisation and thus profitability of the machine, so I put a lot of effort into minimising the dead time (non-scribing moves) between scribing paths (which obviously couldn't be optimised away).

I used a time-bounded algorithm to get around the NP-hardness of the problem, as we couldn't afford the risk that the optimisation calculation might take longer than the time saved by the better path. While the machine was moving the panel from the loading position to the position where the laser head was over the closest corner, I had time to run some simulations. The algorithm almost never ran to completion within the few hundred milliseconds of the move, but it almost always returned a better scribe path than any of the simple, non-adaptive patterns we had always used before (such as spiral or snake paths).
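For illustration, a minimal sketch of what such a time-bounded annealing loop can look like (the distance model, the 2-opt move and all parameters here are illustrative assumptions, not the actual production code):

```python
import math
import random
import time

def dead_time(order, dist):
    """Total non-scribing travel between consecutive scribe paths."""
    return sum(dist[order[i]][order[i + 1]] for i in range(len(order) - 1))

def anneal(order, dist, budget_s=0.2, temp=1.0, cooling=0.999):
    """Improve the path order until the wall-clock budget expires."""
    deadline = time.monotonic() + budget_s
    current = list(order)
    current_cost = dead_time(current, dist)
    best, best_cost = list(current), current_cost
    while time.monotonic() < deadline:
        # Propose a 2-opt move: reverse a random segment of the order.
        i, j = sorted(random.sample(range(len(current)), 2))
        candidate = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
        cand_cost = dead_time(candidate, dist)
        delta = cand_cost - current_cost
        # Always accept improvements; accept regressions with
        # probability exp(-delta / temp), which shrinks as we cool.
        if delta < 0 or random.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling
    return best, best_cost
```

The key property is the hard deadline: the loop always returns the best ordering found so far within `budget_s` seconds, so the optimisation can never cost more time than the panel move it hides behind.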

That's cool. But I thought every panel would have the same pattern, and you'd just solve the problem once rather than over and over again for every widget. Why did you have to solve it every time?
– highBandWidth Apr 26 '11 at 20:52


The ideal pattern was the same for each panel, but the mechanical alignment of the panel, the position of previous layers in the process and the tiled nature of the laser scribing head meant that a dynamic set of sub-patterns had to be calculated for each panel individually and then optimised. It was an interesting problem to work on, especially given the time constraints.
– Mark Booth Apr 26 '11 at 21:04

I'm working (right now, actually) on the bioinformatics problem of multiple local DNA sequence alignment. The point is that if a lot of sequences from genes with some common property (a similar expression profile, or the same transcription factor binding in a ChIP-chip experiment) align strongly at some point, then you've probably found the reason for their common property. Then again, I'm more interested in the statistical aspects of the problem. Even though it's NP-hard, you don't lose much by using heuristics in practice. The interesting part of the problem, IMHO, is a signal-to-noise issue.
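To give a flavour of the heuristics in this area, here is a minimal sketch of a Gibbs-sampling motif search, one standard statistical approach to multiple local alignment (purely illustrative; it is not the specific method used in this research):

```python
import math
import random

BASES = "ACGT"

def pwm(motifs, w, pseudo=1.0):
    """Position weight matrix (with pseudocounts) from the current picks."""
    counts = [{b: pseudo for b in BASES} for _ in range(w)]
    for m in motifs:
        for i, b in enumerate(m):
            counts[i][b] += 1
    total = len(motifs) + 4 * pseudo
    return [{b: col[b] / total for b in BASES} for col in counts]

def gibbs_motif_search(seqs, w, iters=2000):
    """Pick one motif start per sequence and resample them one at a time."""
    starts = [random.randrange(len(s) - w + 1) for s in seqs]
    best, best_score = list(starts), float("-inf")
    for _ in range(iters):
        k = random.randrange(len(seqs))  # hold one sequence out
        others = [s[p:p + w] for i, (s, p) in enumerate(zip(seqs, starts)) if i != k]
        profile = pwm(others, w)
        # Resample the held-out start, weighted by likelihood under the profile.
        weights = [
            math.prod(profile[i][b] for i, b in enumerate(seqs[k][p:p + w]))
            for p in range(len(seqs[k]) - w + 1)
        ]
        starts[k] = random.choices(range(len(weights)), weights=weights)[0]
        # Score the full alignment and keep the best set of positions seen.
        motifs = [s[p:p + w] for s, p in zip(seqs, starts)]
        full = pwm(motifs, w)
        score = sum(math.log(full[i][b]) for m in motifs for i, b in enumerate(m))
        if score > best_score:
            best, best_score = list(starts), score
    return best, best_score
```

Each iteration holds one sequence out, builds a position weight matrix from the current picks in the others, and resamples the held-out position in proportion to its likelihood under that matrix; real tools add restarts and column shifts on top.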

Are you using classical combinatorial/AI approaches or statistical ones? In a way, all of modern NLP, clustering and machine learning deals with NP-complete problems, but usually attacked from a statistical perspective. It's interesting and relevant nevertheless. Is this in academia or industry?
– highBandWidth Apr 26 '11 at 19:42

@highBandWidth: My approach is statistical. I'm in academia. The whole point of the research I'm doing is that if you ignore the statistical issues and just focus on the combinatorial problem, Bad Things Happen.
– dsimcha Apr 26 '11 at 20:09

after brewing, some materials must be added (leaven?), and it must rest for 10-15 days; then you get 15-20 kinds of level-2 stuff;

finally, when it's ready, some more materials are added; this is the level-3 stuff, called drinkable beer, and there are ca. 30 kinds of beers;

the beer can be bottled as 3 dl or 5 dl, and sometimes it gets special necklacing (level 4); then it can be packed as a 5x4 box or a 6-pack (level 5).

There are machine "lines" for each operation, from brewing to packaging. A machine can perform several operations; say, some packing machines can make both 6-packs and 3-packs, but others can only do 6-packs. There are constraints, e.g. speed, or that the big brewing kettle must brew min. 6000, max. 8000 l of beer (but if the beer type is light, then the min. is 5000 l and the max. is 7000 l). And so on, at every level.

The task: as I mentioned, there's a demand plan for the 100 kinds of level-5 products (the bottled, packaged stuff). Make an optimal manufacturing plan across all 5 levels and all machines. Minimize machine switches (e.g. bottling .5, .5, .5, .3, .3, .3 is better than .3, .5, .3, .5, .3, .5; there are fewer switches and less dead time for the bottling machines). Prioritize by customer: some customers require the beer to be shipped only with more than 50% of its shelf life remaining. Etc., etc.
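To make the minimize-switches objective concrete, a toy sketch (the job list and format values are made up; the real problem layers due dates, shelf life, machine eligibility and kettle capacities on top of this single objective, which is where it gets NP-hard):

```python
from itertools import groupby

def switches(schedule):
    """Number of format changeovers in a bottling sequence."""
    return sum(1 for _ in groupby(schedule)) - 1

# Alternating formats forces a changeover before almost every job...
jobs = [0.3, 0.5, 0.3, 0.5, 0.3, 0.5]
print(switches(jobs))          # -> 5

# ...while batching identical formats needs only one.
print(switches(sorted(jobs)))  # -> 1
```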

Discover the bottlenecks (eh), make alternative plans that add not-yet-existing machines at those points, and then the best virtual scenario can be used as a suggestion to buy new machines.

Is it hard enough, or should I tell ya how a textile factory works?

(Personal remark: the web, banking and logistics are challenging areas, but they're baby toys compared to manufacturing problems.)

Disclaimer: the numbers are distorted for security reasons; the orders of magnitude are real.

Are you working on something like this, or on a tool to solve such problems, for your employer?
– highBandWidth Apr 26 '11 at 19:58


Well, manufacturing is logistics writ large. Definitely harder than finance in that respect. But at least it deals with defined problems, not random equations and loosely defined orders of operation!
– Michael K Apr 26 '11 at 20:00


Any kind of scheduling algorithm with best-fit of resources is probably equivalent to the knapsack problem, which is NP-complete.
– Scott Whitlock Apr 26 '11 at 20:15
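For reference, a minimal sketch of the classic 0/1 knapsack dynamic program mentioned above; it is exact but pseudo-polynomial, i.e. its running time scales with the numeric capacity, which is why the problem is still NP-complete:

```python
def knapsack(items, capacity):
    """items is a list of (value, weight) pairs; capacity is an integer.

    Classic 0/1 knapsack DP, O(len(items) * capacity): exact, but
    pseudo-polynomial because the table grows with the capacity itself.
    """
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Sweep capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]

# Toy resource-allocation instance: three tasks with (value, cost).
print(knapsack([(60, 10), (100, 20), (120, 30)], 50))  # -> 220
```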

A friend of mine created a DP/SP system in Excel+VB years ago. It does not contain autoplanning; the app is too fat for Excel. So we made a MySQL/PHP/AJAX-based, collaborative, expandable (see: dataflow, aka flow-based programming, approach) spreadsheet framework (me), and adopted the biz logic from the XLS version (friend). We implemented autoplanning, too (friend). It was a crazy idea to write a spreadsheet, but it works. The best part: the XLS->SQL switch is somewhat wonderful! We can do anything with the data (e.g. autoplan), using any tool/platform (PHP, Java, whatever we wish).
– ern0 Apr 26 '11 at 20:15

@ern0, NP-complete/NP-hard basically refers to how few short-cuts you can even assume to be able to take instead of trying all possibilities one by one. Theoreticians spend a lot of effort figuring out short-cuts which, e.g., say that if we know the path A-B-C will always be longer than A-C directly, we can make the search faster and still prove the result is within 50% of the optimal value. Etc.
– user1249 Apr 26 '11 at 20:20
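As a concrete example of such a provable short-cut: when distances obey the triangle inequality, building a minimum spanning tree and walking it in preorder while skipping repeated vertices gives a tour at most twice the optimal length (Christofides' refinement tightens this to 1.5x). A minimal sketch, assuming a symmetric distance matrix that satisfies the triangle inequality:

```python
def mst_tour(dist):
    """Metric-TSP 2-approximation: Prim's MST plus a preorder walk."""
    n = len(dist)
    parent = [0] * n
    in_tree = [False] * n
    cost = [float("inf")] * n
    cost[0] = 0.0
    for _ in range(n):  # Prim's algorithm: grow the MST one vertex at a time
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: cost[v])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and dist[u][v] < cost[v]:
                cost[v], parent[v] = dist[u][v], u
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)
    tour, stack = [], [0]
    while stack:  # preorder walk of the tree, skipping repeated vertices
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour + [0]  # close the cycle

# Four points on a unit square (Euclidean, so the triangle inequality holds).
r2 = 2 ** 0.5
dist = [[0, 1, r2, 1],
        [1, 0, 1, r2],
        [r2, 1, 0, 1],
        [1, r2, 1, 0]]
print(mst_tour(dist))  # -> [0, 1, 2, 3, 0], here actually optimal
```

The triangle inequality is exactly what makes skipping repeated vertices safe: each skip can only shorten the walk, which is what the factor-2 guarantee relies on.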