Rant: One-Sided Whitepapers

I like reading technical whitepapers, and not just related to SQL Server, but anything IT that expands my knowledge and/or view of something I generally find interesting. I was reviewing my blog reader today and noticed the following blog post on the Data Platform Insider Blog:

I happen to work in a mixed SQL Server and Oracle environment, and I always get jokes like "When are you going to get into real databases?" and other prods from the Oracle guys. It's all in good humor, but I am also always looking for ways to prod back, so I thought I would give this whitepaper a read. So what's my problem?

If you are going to write a "whitepaper" that "explains some of the common myths and misunderstandings about" a competitor's product, you should really make sure that you aren't propagating "myths and misunderstandings about" your own product. This paper is really not much more than Marketing/PR gibberish; it isn't technical at all. I agree with some of what the whitepaper claims: SQL Server is cheaper, dollar for dollar, to implement than Oracle, especially once you factor in licensing. But I think the coverage of scale out grossly misrepresents how SQL Server "actually" works in the real world, and it overstates the complexity of an Oracle implementation while oversimplifying the SQL Server concepts.

Creating a scalable SQL Server database implementation requires no less forethought and preplanning than an Oracle RAC OLTP implementation. Now, I am not arguing for Oracle RAC by any means here, but at least show both sides of this argument equally. This whitepaper boils down to nothing more than a bunch of Marketing crap that isn't a true representation of either product. Don't believe me? Read the paper on the link above, and then take a look at the following whitepaper:

SQL Server MVP Aaron Bertrand understands the intricacies of creating scalable SQL Server solutions that meet the short and long term needs of high volume applications. He recently blogged about it on the following post:

If you really want to have a scalable design, you have to plan for it up front. You aren't going to scale a database designed like AdventureWorks across multiple servers without some significant reworking of your database structures. Identity fields are going to be just one of the problems that have to be addressed for the scale out design.
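To make the identity problem concrete, here is a minimal sketch in Python rather than T-SQL. It assumes one common workaround, not necessarily the one any particular design uses: give every server the same increment (the number of servers) and a different seed, much like declaring IDENTITY(seed, increment) differently on each node. The names identity_stream and make_server_streams are made up for illustration.

```python
from itertools import islice


def identity_stream(seed, increment):
    """Yield seed, seed + increment, ... like an IDENTITY(seed, increment) column."""
    value = seed
    while True:
        yield value
        value += increment


def make_server_streams(server_count):
    """Hypothetical helper: server k (1-based) starts at k and steps by server_count."""
    return [identity_stream(k, server_count) for k in range(1, server_count + 1)]


# Three servers each hand out four keys independently, with no coordination.
streams = make_server_streams(3)
ids_per_server = [list(islice(stream, 4)) for stream in streams]
all_ids = [key for ids in ids_per_server for key in ids]

for number, ids in enumerate(ids_per_server, start=1):
    print(f"server {number}: {ids}")
```

The keys never collide, but the cost shows up later: adding a fourth server changes the increment, so every existing node has to be reseeded. That is exactly the kind of decision that has to be made up front.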

So what are your thoughts on this? Am I off kilter today, or is this whitepaper a piece of marketing propaganda for Microsoft?

Comments

Every company has really stupid propaganda masquerading as whitepapers. Some even hire third parties to author the paper. I will read the WhyNot paper later, but I do agree scaling up with RAC is not a trivial matter.

Essentially, the argument for RAC was Larry saying: instead of paying a huge amount of money for big-iron hardware of questionable scalability, buy common high-volume servers, pay huge amounts of money for Oracle RAC licenses, and reap the same questionable scalability.

At the time of RAC's introduction, Larry did have a reasonable case. Big-iron systems were really expensive, and often lagged in the latest processor technology. And the OS/DBMS stack had questionable behavior in big NUMA systems.

But now going forward, we can expect big Intel systems on QPI and AMD systems on HT with decent scaling.

There is no way RAC interconnects will have the same bandwidth and latency as big systems connected directly via HT or QPI. The most recent InfiniBand bandwidth is 40Gbit/s, or 5GB/sec. QPI is 12.8GB/sec and the HT3 (in the ProLiant DL585G6) is listed at 8.8GB/sec.

Scaling out, on any database engine, has serious implications far beyond the skill of almost all DBAs.

Whatever the validity of RAC, I am saying it will be a moot point, and big iron will return (per my blog).

I had a similar situation recently reading Oracle whitepapers about the Oracle DBMS implementation of OLAP. The comparisons to SQL Server and SSAS were really far-reaching for the most part. I did see the merits of a few of the arguments made for Oracle, but for the most part it was a steady stream of hardcore propaganda. I agree that a lot of these whitepapers (the ones written by marketing staff instead of technical people) are extremely misleading.

I used to work with Oracle, DB2, and Sybase folks in the same database engineering group, and was able to cross check various claims by vendors and cut the crap quickly. It's always good to have a vendor coming in to do a presentation with attendees who are experts in competitive products. With that, we don't have to spend too much time on marketing stuff.

Joe,

Even with more powerful chips from Intel/AMD, one could still argue for plugging multiple boxes together to make a more powerful system as long as these boxes are still considered commodity, and there is a way to extract more power out of multiple boxes.

I don't think the big iron will return. Rather, scaling on commodity will continue.

Actually, I am thinking big boxes will become more of a commodity, at least 8-way and possibly 16-way. Intel tried this back in 1999 with the ProFusion 8-way Pentium III. After that, they screwed up on chipsets royally; they couldn't even do a 4-way, and they were working with band-aided processors instead of processors designed for a big box.

But I digress.

My argument is twofold. One: there will be less of a price premium for an 8-way over two 4-way systems, and for a 16-way over four 4-ways. Two: nodes in the big box connect via QPI for Intel and HT for AMD, with much lower latency and higher bandwidth than a RAC-type solution using InfiniBand or 10GbE. Think about it. The big box might employ a crossbar chip to connect nodes. In the RAC-type solution, the interconnect involves the IO hub going to an adapter, over a cable, possibly through a switch or two, to the other adapter, to the other IO hub, and then back again. Do you really think this will win over the fat-pipe one-hop?

Note there is only one published Oracle RAC TPC-C (check this), and several for TPC-H. The absence of reports definitely means the news is not good, and probably bad.

There has to be a technical basis for the argument about what will scale better: RAC-style (a distinct OS image and db engine process on each box) or big box (a single OS image and a single db engine process).

My argument is that the big box will connect nodes via QPI or HT, with very high bandwidth, low latency, and no RDMA overhead. RAC-style will go through InfiniBand (much better than Ethernet) at higher latency and lower bandwidth.

So the remaining argument is cost: high-volume 2-way or 4-way plus InfiniBand, versus 8-way to 16-way. HP has already brought the 8-way DL785 to a relatively tolerable premium over a pair of DL585s. Let's see what happens with Nehalem EX.