PASS 2008, Friday Keynote: Parallel Scale<br>Adam Machanic, Fri, 21 Nov 2008 18:36:00 GMT<br>http://www2.sqlblog.com/blogs/adam_machanic/archive/2008/11/21/pass-2008-friday-keynote-parallel-scale.aspx<br><br>Friday Morning. Alarm sounding around six hours earlier than I would prefer. A quick shower and I find myself literally running through the convention center. Settling down at the media tables, my laptop boots just in time for the opening strains of the official PASS theme song.<br><br>Guitars blaring.<br><br>Born. To. Be. Wild!<br><br>Screens filled with images of choppers manned by guys with too much facial hair. Screens, thankfully, cleared before too much frontal lobe damage is done.<br><br>Bill Graziano, PASS VP of Marketing, makes a picture-perfect PASS entry -- on a tricycle -- and gives us some important announcements:<br><br>The next PASS Summit will once again take place in sunny Seattle. November 3-6, 2009. (More Steppenwolf goodness? Can we expect the board members to emerge from the darkness of backstage riding a magic carpet? Only time will tell.)<br><br>After a brief interlude during which Bill shared with us the joys of serving on the board, he served up the election results. Congratulations to Douglas McDowell, Lynda Rab, and Andy Warren.<br><br>Next Ed Lehman and David Reed sauntered onstage to deliver the results of the SQL Heroes Contest. No huge surprises here; BIDS Helper, Extended Events Manager, ssisUnit, CDCHelper, and QPee Tools won.
Congrats to all, and thanks for giving us some quality community samples!<br><br>Next we had a handful of words from Patrick Ortiz, Global SQL Server Solution Architect, Dell. How to address HA and business continuity within SSAS. I have to admit that I found this piece a bit tough to follow, so I won’t go into further detail.<br><br>Finally, the main event. David DeWitt, Technical Fellow for Microsoft at the Jim Gray Center in Madison, WI walks on stage and makes an immediate connection with the audience. A promise that there will be "no slick demos ... this will be a power lecture." (Apparently we've all had more than our fill of marketing this week.)<br><br>We're told that the topic of the talk will be key ideas behind parallel database systems. I found this to be a bit heady in my coffee-starved state, but what follows is what I was able to capture. Luckily, Mr. DeWitt is a skilled lecturer and I had no problem following along.<br><br>To begin with, a couple of key metrics that define how parallel scale should behave:<br><br><ul><li>Linear Speedup: Add twice as much hardware, get twice as much performance</li><li>Linear Scaleup: Add twice as much hardware, and you can scale the database to twice as big while maintaining the same performance characteristics.</li></ul>The end goal is, of course, to grow your hardware incrementally and scale appropriately. Project Madison, we are told, will enable the Microsoft database platform to do exactly that. So what’s the real challenge? How do we architect for a petabyte?<br><br>Some background on today's standard offering, Shared Memory. Spindles are all attached to one machine, and use the same memory. Mr. DeWitt commented that this scheme is "pretty simple ... all of the logs and all the data are accessible by all of the CPUs ... but it doesn’t scale very well."<br><br>Another technique that we've seen is Shared Disk. 
In this scheme commodity nodes are attached to "very expensive" shared storage. This, according to Mr. DeWitt, ends up giving us limited scalability. It requires a "complicated distributed lock manager." The primary system that uses the architecture? None other than Oracle RAC. (Tell us how you really feel about the competition!)<br><br>Yet another technique: Shared Nothing. This one involves commodity servers, commodity disks, and commodity interconnect. Mr. DeWitt put it concisely: "A bunch of CPUs, a bunch of memory, and a bunch of storage". Simple enough, and apparently it scales "essentially indefinitely; limited only by your pocket." Well sure, but "the hard part is making this work."<br><br>A bit of history on the idea of shared nothing: DB2 and Informix experimented with the idea in the mid-'90s. MSN Live, Yahoo, and Google use similar architectures. And today "around 6" database vendors are working on this problem. Mr. DeWitt was quick to point out that "no, Google did not invent clusters." He proved this with an image from 1985, showing 20 clustered VAXen.<br><br>So what are the pros of such a scheme? Commodity hardware; incremental, linear scale; fault tolerance. The primary con? Manageability. (Think anti-consolidation; huge farms of servers have interesting administrative challenges.)<br><br>Next Mr. DeWitt got into how to actually accomplish all of this from an architectural perspective and showed us a few basic techniques. Horizontal partitioning involves distributing rows from every table evenly across all of the nodes and disks. (Think RAID 0 taken to the next level.) This can be slightly modified using a round-robin partitioning scheme, where rows are assigned to disks in the order they are inserted, moving from disk to disk and spreading the data love. This ensures that every disk, on every node, ends up with the same number of tuples. (Note, for what it’s worth: Mr.
DeWitt apparently prefers the "too-ple" pronunciation over "tuh-ple").<br><br>Another technique is range partitioning. This one is pretty straightforward; every node has a range of tuples assigned to it. The system knows where to go look for any given piece of data. That can be done based on IDs, or the system can be modified a bit to use hash partitioning. Instead of dividing things into ranges based on ID, use a hash function to do it. One problem with this: "Partition skew". If your hash function isn't very good, you’ll end up with some nodes holding a lot more tuples than others. This leads to a less-than-ideal circumstance called "execution skew" in which one node is a lot slower than the rest, so the other nodes need to wait for it in a parallel query. (Distributed CXPACKET waits? No thank you!)<br><br>The next interesting point was in regard to partitioned parallelism. It is necessary to use pipeline logic and avoid serializing intermediate results. (As a potentially interesting aside, this is one of the basic ideas behind LINQ.)<br><br>Mr. DeWitt went on to share various examples explaining how a parallel query can be processed using a hash partitioning scheme. He covered indexes partitioned along with the data, various replication schemes, and how to deal with hardware failure, skew, and other issues. I won’t go into detail here as it's rather complex to type in a blog post and you can find plenty of papers on these topics online.<br><br>Final takeaway message: "We intend to become the premier supplier of scalable warehouse technology." Sounds good; I’m looking forward to seeing how this all plays out.<br><br>And that's that. A slightly anticlimactic end but definitely a more interesting -- and intense -- talk than I expected walking into the keynote.
Hopefully PASS will do more like this in the future; it’s certainly much better than the standard marketing content we get subjected to at these kinds of events.
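<br><br>For anyone who wants to play with the partitioning ideas from the talk, here is a minimal sketch of round-robin, range, and hash partitioning, plus a crude measure of partition skew. To be clear, none of this comes from Mr. DeWitt's slides or from Project Madison: the function names, the use of CRC32 as the default hash, and the "largest partition divided by ideal size" skew measure are all my own assumptions for illustration.

```python
import zlib
from bisect import bisect_right
from collections import Counter


def round_robin_partition(rows, num_nodes):
    """Assign rows to nodes in insertion order, cycling through the nodes."""
    return [i % num_nodes for i, _ in enumerate(rows)]


def range_partition(keys, boundaries):
    """Assign each key to a node by range.

    `boundaries` is a sorted list of split points; node i holds keys in
    [boundaries[i-1], boundaries[i]), so there are len(boundaries) + 1 nodes.
    """
    return [bisect_right(boundaries, k) for k in keys]


def hash_partition(keys, num_nodes, hash_fn=None):
    """Assign each key to a node by hashing it (CRC32 unless hash_fn is given)."""
    if hash_fn is None:
        # Illustrative default only; a real system would pick its own hash.
        hash_fn = lambda k: zlib.crc32(str(k).encode("utf-8"))
    return [hash_fn(k) % num_nodes for k in keys]


def skew(assignment, num_nodes):
    """Largest partition size divided by the ideal (perfectly even) size.

    1.0 means perfectly balanced; num_nodes means everything is on one node.
    """
    counts = Counter(assignment)
    largest = max(counts.get(n, 0) for n in range(num_nodes))
    return largest / (len(assignment) / num_nodes)


if __name__ == "__main__":
    keys = list(range(0, 4000, 4))  # 1000 keys, all multiples of 4
    nodes = 4

    print("round robin:", skew(round_robin_partition(keys, nodes), nodes))
    print("range:      ", skew(range_partition(keys, [1000, 2000, 3000]), nodes))
    print("crc32 hash: ", skew(hash_partition(keys, nodes), nodes))
    # A poor hash: using the key itself sends every multiple of 4 to node 0.
    print("bad hash:   ", skew(hash_partition(keys, nodes, hash_fn=lambda k: k), nodes))
```

The last line is the "partition skew" case in miniature: with the key used as its own hash, all 1000 tuples land on one node of four, so that node would do all the work while the other three wait.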