Search results matching tags 'Cloud', 'Developer', and 'Azure Use Cases'http://sqlblog.com/search/SearchResults.aspx?o=DateDescending&tag=Cloud,Developer,Azure+Use+Cases&orTags=0Big Data and the Cloud - More Hype or a Real Workload?http://sqlblog.com/blogs/buck_woody/archive/2011/10/18/big-data-and-the-cloud-more-hype-or-a-real-workload.aspxTue, 18 Oct 2011 13:57:36 GMTBuckWoody<p>Last week Microsoft announced several new offerings for “Big Data” - and since I’m a stickler for definitions, I wanted to make sure I understood what that really means. What is “Big Data”? What size hard drive is that? After all, my laptop has 1TB of storage - is my laptop “Big Data”?</p> <p>There are actually a few definitions for this term, most notably those involving the <a href="http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data" target="_blank">“Four V’s”: Volume, Velocity, Variety and Variability</a>. Others <a href="http://nosql.mypopescu.com/post/10120087314/big-data-and-the-4-vs-volume-velocity-variety" target="_blank">disagree with this definition</a>. I tend to boil things down to their simplest form, so I’m using this definition for myself:</p> <p align="center"><font color="#c0504d" size="3">Big data is defined as a <em>large set</em> of <em>computationally expensive</em> data that is <em>worked on simultaneously</em>.</font> </p> <p>Let me flesh that out a little. To be sure, “Big Data” is larger than, say, a few megabytes. The size matters because it takes special hardware to move a large set of data around, store it, and process it. (<font color="#c0504d">large set</font>)</p> <p>If you store a LOT of data but only use a small portion of it at a time, that really isn’t hard to do - it’s mainly a storage issue at that point. 
But if you do need to work with a large portion of the data at one time, then the memory, CPU and transfer components of the system have to adapt to stay responsive - new ways to work with that data (game theory, knot-algorithms, map-reduce, etc.) need to be brought into play. (<font color="#c0504d">computationally expensive</font>)</p> <p>Once that data is loaded into the processing area (memory or whatever other mechanism is used), it must be worked on in parallel to come back in a reasonable time. You have two options here: you can scale the system up with more internal hardware (CPUs, memory and so on), or you can scale it out to have multiple systems work on it at the same time using paradigms such as map/reduce. Interestingly, when you lay this out in an architecture diagram, scale up or out doesn’t change the logical structure of the process - in scale out, the network becomes the bus, and the nodes become more RAM and computing power. Of course, there are changes in code for how you stitch the workload back together. (<font color="#c0504d">worked on simultaneously</font>)</p> <p>So back to the original question. Is Big Data, as I have defined it here, a workload for Windows and SQL Azure? Absolutely! In fact, it’s probably one of the main workloads, and I believe it represents the latest - and perhaps also the earliest - frontier of computing. <a href="http://research.microsoft.com/en-us/um/people/gray/" target="_blank">Jim Gray</a>, a former researcher here at Microsoft and a hero of mine, was working on this very topic. I believe as he did - all computing is simply an interface over data. </p> <p>Microsoft has multiple offerings on the topic of Big Data. In the posts that follow from my co-workers and me, we’ll explore when and where you use each one. 
Whether you are a data professional or a developer, this is the new frontier - <a href="http://www.straightpathsql.com/archives/2011/10/microsoft-loves-your-big-data/" target="_blank">don’t wait to educate yourself</a> on how to leverage Big Data for your organization. </p> <p><strong>Hadoop on Windows Azure and SQL Server</strong> - Microsoft’s <a href="http://www.hortonworks.com/the-whys-behind-the-microsoft-and-hortonworks-partnership/" target="_blank">partnership to include Hadoop workloads on Windows Azure</a> and <a href="http://www.microsoft.com/download/en/details.aspx?id=27584" target="_blank">SQL Server/Parallel Data Warehouse (PDW)</a></p> <p><strong>LINQ to HPC</strong> - Microsoft’s High-Performance Computing offering, <a href="http://blogs.technet.com/b/windowshpc/archive/2011/05/20/dryad-becomes-linq-to-hpc.aspx" target="_blank">formerly known as Dryad, now available in Azure</a></p> <p><strong>Windows Azure Table Storage</strong> - A <a href="http://msdn.microsoft.com/en-us/library/windowsazure/hh508997.aspx" target="_blank">partitioned key/value store</a> that is immediately consistent, handles huge volumes of data, and works with any REST-capable language</p> <p><strong>Other offerings</strong> - Including the new <a href="http://www.microsoft.com/en-us/sqlazurelabs/default.aspx" target="_blank">Data Explorer</a>, <a href="http://research.microsoft.com/en-us/news/headlines/daytona-071811.aspx" target="_blank">Project Daytona (with a Big Data toolkit for scientists and researchers)</a>, <a href="http://www.microsoft.com/sqlserver/en/us/future-editions/SQL-Server-2012-breakthrough-insight.aspx" target="_blank">Power View</a> and more. </p> <p>The era of Big Data is here, and you can use Windows and SQL Azure to bring it to your organization. 
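</p><p>To make the map/reduce paradigm above concrete, here is a minimal, toy sketch in Python - a word count using only the standard library. This is not any of the Microsoft offerings listed here; it just shows the map, shuffle and reduce phases that platforms like Hadoop distribute across many nodes:</p>

```python
from collections import defaultdict

docs = [
    "big data is a large set of data",
    "data worked on simultaneously",
    "a large set worked on in parallel",
]

# Map phase: each document is turned into (key, value) pairs independently,
# so this step can run on many nodes at once.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle phase: group all values by key (on a cluster, this is the
# network transfer between the mappers and the reducers).
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: each key's values are combined independently - again
# parallelizable across nodes.
word_counts = {word: sum(counts) for word, counts in groups.items()}

print(word_counts["data"])  # "data" appears three times across the documents
```

<p>On a real cluster, each mapper and reducer runs on a different machine and the shuffle moves data across the network - but as noted earlier, the logical structure of the process stays the same.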
</p>Rip and Replace or Extend and Embrace?http://sqlblog.com/blogs/buck_woody/archive/2011/09/13/rip-and-replace-or-extend-and-embrace.aspxTue, 13 Sep 2011 11:20:05 GMTBuckWoody<p>As most of you know, I don&rsquo;t like the term &ldquo;cloud&rdquo; very much. It isn&rsquo;t well defined, which means it can mean anything. I prefer &ldquo;distributed computing&rdquo;, which is more technically accurate and describes what you&rsquo;re doing in more concrete terms.</p>
<p>So when you think about Windows and SQL Azure, you don&rsquo;t have to think about an entire product &ndash; you can use parts of the system together or independently to accomplish what you need to do. You can use the computing functions, storage, and more; increasingly, I see folks leverage the Service Bus to enable current applications to expose things to the web.</p>
<p>And that brings up the point of this post. Once you decide that a distributed architecture solves a problem, you&rsquo;re faced with a decision: should you completely re-write your application to take advantage of the new platform, or should you just fold in new code that makes the data or functionality available to the web?</p>
<p>Of course, the answer is always &ldquo;it depends&rdquo; on the situation &ndash; and it does. But unless you&rsquo;re fixing a problem with the current code, I usually advocate a migration approach. That means retaining, at the very least, the business logic (again, unless it&rsquo;s not currently working) and as much of the code as you can. In fact, if you follow this paradigm, you&rsquo;re on your way to making a service bus out of the functions you currently have: you expose the results of a system rather than opening the system itself up. Let&rsquo;s take an example.</p>
<p>Assume for a moment that you have an order-taking system on-premise. That system performs many functions, one of which might be creating a Purchase Order. Your system might be enclosed, meaning that it has an application that talks to a middle tier, and from there to a database system. A query is generated from a screen and passed along to eventually compute, store and return a Purchase Order (PO) number, along with other information. Imagine now that you wire up the code not only to return the PO number to the client, but also to make that number available on an endpoint &ndash; actually not that hard to do.</p>
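<p>As a rough sketch of that idea, here is a minimal HTTP endpoint in Python (standard library only). Everything here - the ORDERS data, the lookup_po function, the /po/ path and the port - is hypothetical, standing in for calls into your existing middle tier; in practice you would host this behind Windows Azure rather than on a local socket:</p>

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the existing order-taking system; a real
# implementation would call into the current middle tier instead.
ORDERS = {"A-1001": {"po_number": "PO-77812", "status": "approved"}}

def lookup_po(order_id):
    """Return only the computed result (the PO number), not the record."""
    order = ORDERS.get(order_id)
    return {"po_number": order["po_number"]} if order else None

class POHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /po/A-1001 returns {"po_number": "PO-77812"}
        result = lookup_po(self.path.rsplit("/", 1)[-1])
        self.send_response(200 if result else 404)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(result or {"error": "not found"}).encode())

# To serve it locally for testing (blocks until interrupted):
# HTTPServer(("localhost", 8080), POHandler).serve_forever()
```

<p>Note that the handler never hands out the order record or a database connection - only the computed PO number crosses the boundary, which is exactly the &ldquo;expose the results of a system rather than opening the system up&rdquo; approach.</p>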
<p>Now you can make that PO number available to the web using Azure. You could restrict who can make that call to the system, or open it up to a broader audience. Or, instead of the PO number, you could make a product list available. And you can go further than that &ndash; eBay, for instance, uses the OData protocol (very cool in and of itself), which you can query from the web. You could compare your company&rsquo;s product catalog to what is on eBay, and list the items you have there if there are no competitors in that space. And on and on it goes.</p>
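<p>OData queries are just URLs, so a comparison like the one above can start with nothing more than building a filtered query string. The sketch below does that in Python with the standard library; the service root is a made-up placeholder, and the entity set and property names (Items, Title) are assumptions you would replace with the real service&rsquo;s metadata:</p>

```python
from urllib.parse import urlencode

# Hypothetical OData service root - substitute the real service you target.
SERVICE_ROOT = "https://odata.example.com/catalog"

def odata_query(entity_set, filter_expr, top=10):
    """Build an OData query URL: $filter narrows the rows, $top caps the
    result count, and $format=json requests JSON instead of the Atom default."""
    params = urlencode({"$filter": filter_expr, "$top": top, "$format": "json"})
    return f"{SERVICE_ROOT}/{entity_set}?{params}"

# Look for listings whose title mentions one of your catalog's products:
url = odata_query("Items", "substringof('widget', Title) eq true", top=5)
print(url)
```

<p>Fetching that URL with any HTTP client, in any language, returns the matching rows - which you could then compare against your own catalog.</p>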
<p>So the point is this &ndash; where you can, retain what works.<br />Fold in systems like Azure where they make sense. Extend and Embrace.</p>