Thursday, August 4

If you’re not familiar with it, it’s a nationwide (and NZ) competition to build some sort of useful and/or fun product using open government data, of which there’s a massive and interesting amount these days.

We worked with IP Australia’s IPGOD dataset, which contains over 100 years of Australian patent, trade mark, plant breeder’s rights and design applications and registrations.

We integrated and cleaned the data and built our visualisations using Power BI.

Wednesday, June 1

Inexplicably, SSIS doesn't natively include a method for generating new GUIDs in the Derived Column Data Flow Transformation.

The workaround is to create a custom Script Component. In SSIS 2008 we have a choice of Visual Basic or C# for the script code; I'm going to demonstrate a C# script.

Let's say we have a Data Flow Source and a Data Flow Destination:

For the sake of this demo, the only difference between the source and the destination is that the destination has a GUID column with no default constraint (which would otherwise add its own NEWID() on insert). So we need to generate the GUID in SSIS. The usual way we'd add a new column's worth of data is with the Derived Column transformation, but as I've mentioned (and as you've probably found if you're reading this) there isn't a new GUID option.

Double-click the Script Component, go to "Inputs and Outputs", then expand Output 0 and click Add Column. Call the column whatever you like, but I'm going to call it "SQLNinjaGUID" so you can see how it's referenced in the C# code. Change the data type of the column to unique identifier [DT_GUID].

Jump back to the Script tab, make sure Microsoft Visual C# 2008 is selected as the ScriptLanguage, click Edit Script, and replace all of the code with the following.
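The original listing isn't reproduced here, but the core of the script is tiny. A minimal sketch, assuming the default ScriptMain scaffolding that SSIS 2008 generates and the "SQLNinjaGUID" output column from above (the Input0Buffer class and the Row properties are auto-generated by SSIS from your inputs and outputs, so this code only compiles inside the Script Component, not as a standalone program):

```csharp
using System;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    // Called once for every row flowing through the component.
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Assign a freshly generated GUID to the new output column.
        // "SQLNinjaGUID" is the column name added under Output 0 above;
        // if you named yours differently, the generated property name changes to match.
        Row.SQLNinjaGUID = Guid.NewGuid();
    }
}
```

Guid.NewGuid() plays the same role NEWID() would on the database side, so every row gets a unique value without any default constraint on the destination table.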

Now when you drag the Script Component's green arrow onto your Data Flow Destination, you should see SQLNinjaGUID (or whatever you called your new column) pop up as an available Input Column in the destination's Mapping tab.

Thursday, May 26

The idea is that you can upload a SQL .trc (Profiler trace) file, and Project Lucy will do some analysis for you - analysing CPU time, durations, IO etc. There are histograms of statement durations, and the ability to filter the trace easily by picking from drop-downs.

All this is of course possible by uploading your trace file to a database table (or saving the trace output directly to a table) and writing your own SQL queries. However, Project Lucy's aim is to start mining the crowd sourced data to provide comparison between your trace and similar workloads uploaded by the rest of the community.

It's an interesting project, and like SQLServerPedia it needs community support to succeed. It's useful now, but to reach its potential it needs feeding with data. I've started uploading traces over the last couple of days, and I'm finding the analyses available thus far useful enough to keep you interested while Quest works on extending the functionality. Each trace file you upload becomes its own "analysis", and all your analyses are saved for later appraisal.

Also, they're giving away a $50 Amazon voucher each day (to US residents only unfortunately), so that's a pretty good incentive to give it a try.

Friday, February 25

A reader of this blog asked me this morning: "If you wanted to design a query to make a report for management that would show which AD groups or users have access to which reports, how would you go about it?"

Wednesday, February 9

The MCM program is aimed at the uber-uber-uber SQL gurus, and from what I'm reading on various blogs it's no walk in the park for them.

The awesome thing that's popped up out of the newly revised program is that there are a bunch of free training material videos available.

While I'm not personally anywhere near thinking about attempting the MCM certification, I'm interested in hearing about what lies under the covers of SQL Server from people who know it inside out. The absolute nitty gritty: like learning about DNA and mitochondria as opposed to the more systemic/anatomical view that the MCITP-type courses cover.

The first video in the series covers database structures: the ways that data is stored and managed on the disk. It's presented by Paul Randal, who spent 9 years working on the storage engine for SQL Server, so the info is really from the horse's mouth.

To paraphrase the summary of the video:

Records - the rows, or "slots", which make up our tables
Pages - the 8kB chunks where the records live
Extents - collections of 8 contiguous pages
Allocation bitmaps - keep an eye on the extents, and
IAM chains and allocation units - keep track of what's living where