Dynamic Eager Haskell

I was looking at some memory profiles of a misbehaving Haskell program a while back, and the electrical engineer in me couldn't help but think that the graphs looked very much like step responses. That got me thinking about how a lazy program with a space leak might be thought of as an unstable dynamic system (a pole in the RHP). You deal with problems like that with control theory. But how do you apply it to a Haskell program?

Maybe with something like Eager Haskell. My understanding of Eager Haskell is that it tries to execute your program eagerly, and when it starts to exhaust memory (from trying to evaluate an infinite list, etc.) it falls back to laziness. That doesn't seem very sophisticated. But what if the "eagerness" knob were available to the program itself, or maybe to another agent? You could then start to think about building better controllers.

What variables would we want to control? I don't know, and it would probably depend on your algorithm, but interesting candidates might be: the rate of change of memory usage, the speed at which characters are written to your output file, the ratio of memory usage to CPU usage (i.e. we're creating a lot of thunks that aren't being evaluated), etc.

I could also imagine that applying a dose of control theory to the behavior of lazy programs might make it easier to reason about their resource usage. Heck, even if it didn't lead to anything useful, it might be interesting to put a PID controller into your runtime system, just to see what happens.

Honestly, I don't know enough about this particular setting to say whether such an idea makes sense, but I have an electrical engineering background, so I think I know what you're saying.

Control theory (at least in this instance) is concerned with managing the stability of a system, so perhaps a measure of the system's responsiveness (for tasks with heavy UI) or its consumption of various resources could be used as the feedback in deciding whether to JIT, or do a GC cycle, or run some other optimization/allocation cycle. Maybe people have already applied this sort of thing. I mean, a PID controller is pretty simple to implement. Heck, that may be what some people do without realizing it.

For those unfamiliar with EE terminology, a PID controller is an algorithm which takes feedback from a system and passes it through a second order differential equation (a Proportional term, an Integral term, and a Derivative term) to numerically solve for what the system output should be. The "implementation" of said algorithm is a for-next loop with a couple of multiplies, additions and subtractions, and it is a remarkably successful control system, despite (or because of) its utter simplicity.
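To make the "for-next loop" concrete, here is a minimal sketch of a textbook discrete PID loop. Everything specific in it is an assumption made for illustration: the `PID` class, the gains, the time step, and the toy plant `dy/dt = u - y` are all hypothetical and have nothing to do with any real runtime system.

```python
from dataclasses import dataclass


@dataclass
class PID:
    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain
    integral: float = 0.0    # running sum of error * dt
    prev_error: float = 0.0  # last error, used for the finite-difference slope

    def step(self, setpoint: float, measured: float, dt: float) -> float:
        error = setpoint - measured
        self.integral += error * dt                  # accumulate past error
        derivative = (error - self.prev_error) / dt  # slope of the error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps: int = 200, dt: float = 0.1) -> float:
    """Drive a toy plant dy/dt = u - y toward 1.0 and return the final y."""
    pid = PID(kp=2.0, ki=1.0, kd=0.1)  # made-up gains, not tuned for anything real
    y = 0.0
    for _ in range(steps):
        u = pid.step(setpoint=1.0, measured=y, dt=dt)
        y += dt * (u - y)  # forward-Euler update of the plant
    return y
```

With these particular numbers `simulate()` settles very close to the setpoint of 1.0. Change the gains or the time step and it may well oscillate or diverge, which is where the real design work starts.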

I didn't study EE but here's my .02:

I always think of a PID controller as a present/past/future system; i.e. adjust the input w.r.t. the current output error (proportional term), the measured past errors (integral term) and the predicted future error (derivative term).

As I said, I didn't study EE but I don't have the feeling PID controllers are that good in systems which don't have some stable optimum you want to 'steer' to.

Would the $-notation introduced by Chris Okasaki come close to what you are thinking of here? He used this to add lazy evaluation to Standard ML (which is strict) in order to be able to reason about both amortized and worst-case data structures in one language.

Or are you more interested in a way to control the evaluation behavior at runtime?

I'm more interested in controlling the evaluation process that's currently in progress, rather than adding constructs to our language in order to enable better reasoning about our program. When we're writing a Haskell program we only think about the pure, timeless aspects of our problem. That's good, and that's precisely why we want to use a language like Haskell. But somewhere, somehow, the rubber eventually meets the road, and our program has to interact with the real stateful CPU and memory system. The question becomes: what's the best way to handle it? Do we try to figure out what happens in advance, or do we deal with whatever comes up at the time it comes up?

Here's a not-too-good analogy. When you decommission a satellite, you first do a lot of calculation to figure out when to fire your thrusters for the last time. You then fire the thrusters for a short time, gravity takes over, and if everything went right, the satellite burns up over the Pacific. Contrast this with landing a plane. The pilot knows where he wants to land, but he doesn't really concern himself much with whatever crosswinds he meets, because he will actively control the plane to make sure it lands in the correct area.

As an electrical engineer myself, I'm rather surprised to hear about PID controller design being called "easy." They're only easy in the most sterile, pure-mathematical environment, and then only if they're linear and symmetrical in both directions.

In the real world, PID design is very hairy. You have a huge number of things that go wrong. Integral windup is a biggie... even a very small DC offset causes the integrator to rapidly saturate. The derivative step greatly magnifies noise, especially higher-frequency noise. You have to be really careful about any noise, inputs or outputs that may reach the limits of your amplifiers or actuators. If they do, you're sunk. Pure PID controllers certainly don't work well under those conditions.
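Integral windup is easy to reproduce in a few lines. The sketch below is a toy under stated assumptions: a PI controller, a made-up plant `dy/dt = u - y`, an actuator clamped to `[0, 1]`, and a setpoint of 2.0 that the plant cannot reach, followed by a reachable setpoint of 0.5. The `anti_windup` flag turns on conditional integration (freeze the integrator while the actuator is saturated and the error pushes further into saturation), which is only one of several common fixes.

```python
KP, KI = 1.0, 2.0        # made-up gains
U_MIN, U_MAX = 0.0, 1.0  # actuator limits
DT = 0.1                 # time step


def run(anti_windup: bool, steps_per_phase: int = 100) -> float:
    """Return the plant output at the end of the second phase."""
    y = 0.0
    integral = 0.0
    for k in range(2 * steps_per_phase):
        # Phase 1 asks for 2.0 (unreachable), phase 2 asks for 0.5.
        setpoint = 2.0 if k < steps_per_phase else 0.5
        error = setpoint - y
        candidate = integral + error * DT
        raw = KP * error + KI * candidate
        u = min(max(raw, U_MIN), U_MAX)  # actuator saturation
        pushing_further = (raw > U_MAX and error > 0) or (raw < U_MIN and error < 0)
        if not (anti_windup and pushing_further):
            integral = candidate  # commit the integration step
        y += DT * (u - y)  # forward-Euler update of the plant
    return y
```

In this setup `run(False)` is still pinned near 1.0 at the end, because the integrator spent the whole first phase accumulating error it could never act on, while `run(True)` ends up close to the requested 0.5.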

If you're implementing a PID controller digitally, you usually find that you have quantization effects, sampling rate effects, and numerical error. These manifest themselves with integrator windup, with small (but significant) values being lost in the repeated addition of small numbers to big ones, with noise that doesn't happen to sum to zero once it's sampled and quantized, and so on.
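The "small values being lost" effect can be shown directly with double-precision floats. This is a minimal sketch with numbers picked to make the effect obvious: near `1e16` adjacent doubles are 2 apart, so adding `1.0` is rounded away every single time. `math.fsum` is used here only as a convenient stand-in for a compensated accumulator.

```python
import math

big = 1e16    # large value already sitting in the accumulator
small = 1.0   # small per-step contribution

# Naive accumulation: every individual addition rounds back to `big`.
acc = big
for _ in range(1000):
    acc += small
naive_gain = acc - big

# Exactly rounded summation keeps the contributions.
exact_gain = math.fsum([big] + [small] * 1000) - big

print(naive_gain)  # 0.0
print(exact_gain)  # 1000.0
```

An integrator fed a small but persistent error behaves like the naive loop: the error is real, but the accumulator never moves.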

And, of course, if your process is nonlinear, or your actuators move more easily in one direction than in another, PID controllers become orders of magnitude harder to build.

None of these errors are insurmountable, but a practical, stable, real-world implementation of a PID controller rapidly becomes something much more than pure PID control.

Almost all computer systems of interest will have these sorts of non-linear responses. A program may run really well and scale very linearly up until the point that it becomes large enough that it must start swapping to disk, or hits memory or process limits, at which point it exhibits strong step effects in its performance, and then PID control is inappropriate.

On the other hand, self-tuning computer systems are interesting, particularly in learning systems like artificial neural networks, genetic programs, and genetic algorithms where factors like the optimal mutation rates, crossover rates, momentum terms, and the like are unknown. But these are often non-linear, in fact, they are often non-continuous (step) functions themselves, which don't lend themselves to PID control at all.

The work of John Koza et al. on genetic programming solutions to control problems of this sort is interesting. They show that near-optimal solutions usually tend to drive the actuators to their limits. This, of course, is rarely if ever good in PID control, but proves to work well in providing optimal solutions.

Yeah, I mostly threw the PID controller thing out there because it sounds sexy, and it gets people thinking in the right direction. Nonlinearities might present the biggest problem, but AFAIK there is currently *no* general way to reason about the space/time performance of Haskell programs. So going from nearly impossible to merely hard (but well understood) would be a big step in the right direction. The whole point of this thinking-out-loud exercise is that I'm wondering if maybe the problem doesn't have to be impossible if we can apply just the right amount of engineering effort. And we can always hope that the nonlinearities that arise could be transferred to a new domain (e.g. phase detectors in a phase-locked loop are highly nonlinear in the voltage domain, but are exactly linear in the phase domain). Of course this just might be me wanting to use my EE hammer to beat any problem into submission ;-)

I wouldn't be so quick to make that assumption! Anyone who can make him- or her-self conversant with type theory can probably learn control theory, but it's not going to be an easy walk in the park. Type theory is basically a peculiar corner of algebra, and control theory (with all its Lagrangians and Hamiltonians) lives in analysis-land.

Mostly, I was just trying to give the 30-second explanation, and I left out all the numerical analysis and DSP stuff. But I second the comment that, for the typical reader of this board, it's still probably not that hard :).

In essence you got the strengths of Eager Haskell without a huge change in behaviour. The approach was to execute rather than thunk, up to a configurable thunk depth but never across IO or other places where optimistic evaluation would change semantics. Speculative evaluation also allowed for a procedural-style debugger that worked as you would expect.

Sadly, the internal changes to GHC were drastic, so the changes weren't folded into GHC proper. Speculative evaluation is the best approach I've seen that keeps the beauty and elegance of non-strict evaluation while increasing speed and predictability.
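As a rough illustration of "execute rather than thunk, up to a configurable depth", here is a toy model. It is not how the GHC work was implemented: `Thunk`, `Speculator`, `max_depth` and `nats_from` are all hypothetical names, the depth counter is the simplest budget I could think of, and the sketch says nothing about IO or about aborting a speculation that runs too long.

```python
class Thunk:
    """A suspended computation that caches its result once forced."""

    def __init__(self, fn):
        self.fn = fn
        self.evaluated = False
        self.value = None

    def force(self):
        if not self.evaluated:
            self.value = self.fn()
            self.evaluated = True
            self.fn = None  # drop the closure once it has run
        return self.value


class Speculator:
    """Evaluates deferred work eagerly while nesting stays under max_depth."""

    def __init__(self, max_depth: int):
        self.max_depth = max_depth
        self.depth = 0

    def defer(self, fn) -> Thunk:
        thunk = Thunk(fn)
        if self.depth < self.max_depth:  # still within budget: run it now
            self.depth += 1
            try:
                thunk.force()
            finally:
                self.depth -= 1
        return thunk  # over budget: left as a lazy thunk


def nats_from(spec: Speculator, n: int):
    """Infinite list n, n+1, ... as (head, tail-thunk) cells."""
    return (n, spec.defer(lambda: nats_from(spec, n + 1)))


def evaluated_prefix(cell) -> int:
    """Count how many tails were already evaluated without forcing anything."""
    count = 0
    _, tail = cell
    while tail.evaluated:
        count += 1
        _, tail = tail.value
    return count


def take(cell, k: int):
    """First k elements, forcing tails on demand."""
    out = []
    while len(out) < k:
        head, tail = cell
        out.append(head)
        cell = tail.force()
    return out
```

With `Speculator(3)` the infinite list gets three tails evaluated up front and then stops; with `Speculator(0)` it is fully lazy. Either way `take` returns the same elements, so the knob only changes when the work happens, not the result.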