Description

Microsoft Research recently announced the availability, under
Academic Licensing, of
Dryad, an infrastructure which allows a programmer to use the resources of a computer cluster or a data center for running data-parallel programs.

A Dryad programmer can use thousands of machines, each of them with multiple processors or cores, without knowing anything about concurrent programming.

That's a pretty heady statement. What does Dryad do, exactly, to enable this level of abstraction, shielding programmers from the incredibly complex world of distributed parallel computing? Does the level of abstraction impact the degree to which sophisticated
programmers can interact with and control some of the low level mechanisms of the Dryad runtime? What is it about LINQ that made it the no-brainer managed programming abstraction for Dryad?

Simply put, how does Dryad work? This is the core question that Erik and I had after
our conversation with Roger Barga (part one of this E2E mini-series on Dryad and DryadLINQ). Perhaps we should focus just on DryadLINQ next time, but for now, the information in this conversation is certain to keep you very busy and answer many questions
you may have after learning about Dryad in part one...
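To give a feel for the abstraction under discussion: the programming model DryadLINQ exposes is ordinary LINQ. You write a declarative query, and the runtime turns it into a distributed dataflow graph of vertices running across the cluster. As a rough sketch only (this is plain LINQ-to-Objects over a local array standing in for a partitioned cluster table; the names here are illustrative, not the DryadLINQ API), a word count might look like:

```csharp
using System;
using System.Linq;

class WordCount
{
    static void Main()
    {
        // In DryadLINQ the input would be a table partitioned across the
        // cluster; a local array stands in for it in this sketch.
        string[] lines = { "the quick brown fox", "the lazy dog" };

        var counts =
            from line in lines
            from word in line.Split(' ')
            group word by word into g
            select new { Word = g.Key, Count = g.Count() };

        // A query of this shape, pointed at distributed data, is what
        // the runtime would compile into a job graph of parallel vertices.
        foreach (var c in counts.OrderByDescending(x => x.Count))
            Console.WriteLine("{0}: {1}", c.Word, c.Count);
    }
}
```

The point is that the query itself says nothing about machines, threads, or data placement; that is the level of abstraction the interview digs into.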

The Discussion

I really, really like the idea of Dryad. Any idea when we can get our hands on the bits? Not just the client but the whole thing. In academia we are starting to test Hadoop, Eucalyptus, and the like. I would love to add Dryad to the mix. Any chance of this
happening in the near future?

The requirement is Windows HPC Server 2008. This is a framework and runtime for use in an HPC cluster. It's a fair question whether a plain Windows Server cluster would suffice, but it won't.

@softwarewarrior. "So close and yet so far, implementation wise, vanilla Windows boxes would have more traction. Everyone has Windows boxes around."

I was thinking the same thing. Installing multiple HPC nodes is a pretty high bar just to kick the tires, or even for research purposes. I'm not even sure why that's required. Why does it have a dependency on HPC? IMO, it should be able to work on any Windows system (XP
and above) by just installing a listener/worker process on each machine (even dynamically via rexec.exe) that waits for directions. The high-road solution would be to enlist any or all Windows computers in your org and bring them up and down as needed, creating virtual compute
farms.

Damn right, and license it for non-academic use while you're at it. I really think it's incredibly lame to discriminate like that. Fabulous job on completely alienating commercial developers who need clustering and would love access to something like this.
You know, the kind of people who actually make you money in the form of increased Windows sales.

Yes, it does piss me off. What reaction do you expect when you show me something interesting and then make it illegal for me to use? Gratification? Hell no. This is really lame.

I knew I should have stuck with Java. I'd actually have a real clustering solution today (Hadoop). Really, .NET could be so much better if you all stopped with your overzealous software-hoarding mindset. But Java is looking better every day.

We are already working with the Dryad team on a version that will not require Windows HPCS. We had to make a decision early in the project: either leverage HPCS for the initial implementation, or base it on Windows and build the necessary scheduler, monitoring
utilities, and file metadata management ourselves. The decision to start with the HPCS implementation was made to get Dryad released as soon as possible, and we can deploy it on the HPCS clusters that we have set up at universities worldwide. And since Windows HPCS is free
through the MSDN Academic Alliance, this does not cost our academic partners a dime. All in all, we thought this would be an effective way to get started.

As for the choice of license, our group in Microsoft Research is responsible for university relations and partnerships with academic researchers. Hence this is the community our small team can support during this initial release. We are working on a broader
release under a more permissive license, with the goal of an open source release of Dryad, in collaboration with the MSR-SV team.

Thinking about it, something like Mesh could be a good model for how to add computers into your compute "circle of trust". Also, Mesh already has a host process you could hook onto somehow. Now just tell the Oz-man his world is about to change again.

Thank you for addressing my concern. I'm sorry if I sounded a bit rude, it's just something I am
very interested in for my own (non-academic) projects.

Hopefully it won't be long before I can use this EXTREMELY USEFUL (!!!!) technology in my own projects. The confines of a single computer are becoming really limiting, and I really do not want to switch to Java or have to re-implement my own Dryad/Hadoop
instead of focusing on the real problem. So it's quite frustrating.

Very exciting, guys. Thanks for your increasing commitment to the academic and open source communities. Hopefully, we'll see an MS-PL license for this work someday, too.

I do have (an admittedly biased) comment on the recent C9 "many-core" discussions. I really appreciate your encouraging discussion on many-core, Charles, in this interview and others. There seems, however, to be a supposition in the interviews that "one
day in the future" we'll have many-core at our disposal. We obviously already have extremely powerful many-core processors in our everyday gaming rigs - NVIDIA and AMD have revolutionized a large set of scientific computing problems, and a cross-platform
stream computing language has been ratified and adopted by many of the major players. Stream computing seems to be almost
intentionally absent from the discussion. It's especially poignant to me in these two videos when discussing Dryad's capabilities for re-structuring expression tree nodes or pipelining computations based on the skew of the data - couldn't these strategies
also help when targeting architectures such as the GPU (or Cell...I guess)?

I know you've covered the Microsoft Accelerator project in the distant past, and I know Dryad was built to target systems without shared memory, and I know parts of PFX are built specifically with CLR architectures in mind, and I know current GPU architectures
have PCI-Express bandwidth limitations... but Map-Reduce has been shown to map to the GPU quite readily for some applications, the Brahma and C$ languages have shown you can efficiently map high-level lambdas in managed code to GPU code, and DX Compute (though unmanaged)
is here...

So, in the end, I guess my question is: Can we have a C9 discussion about how I might soon write something as "simple" as: