I will periodically be sharing my thoughts and observations on information management here in the blog. I am passionate about the effective creation, management and distribution of information for the benefit of company goals, and I'm thrilled to be a part of my clients' growth plans and connect what the industry provides to those goals. I have played many roles, but the perspective I come from is benefit to the end client. I hope the entries can be of some modest benefit to that goal. Please share your thoughts and input on the topics.

William is the president of McKnight Consulting Group, a firm focused on delivering business value and solving business challenges utilizing proven, streamlined approaches in data warehousing, master data management and business intelligence, all with a focus on data quality and scalable architectures. William functions as strategist, information architect and program manager for complex, high-volume, full life-cycle implementations worldwide. William is a Southwest Entrepreneur of the Year finalist, a frequent best-practices judge, has authored hundreds of articles and white papers, and given hundreds of international keynotes and public seminars. His team's implementations from both IT and consultant positions have won Best Practices awards. He is a former IT Vice President of a Fortune company, a former software engineer, and holds an MBA. William is author of the book 90 Days to Success in Consulting. Contact William at wmcknight@mcknightcg.com.

April 2006 Archives

What is the difference between an outsourcer, a partner and a group of people from a common third-party source? Sorry, no joke here.

Let me explain:

Data warehousing, business intelligence, master data management, or other programs are only useful when they produce ROI, either directly or, more likely, indirectly. Knowing and enabling the chain of business events that must occur to produce that ROI requires business knowledge. The direction of the program and the future business and architecture targets are business functions, most likely rightfully maintained by end-client personnel - though augmented, and often stimulated, by a consulting partner.

It is important for end clients to define precisely the roles and responsibilities of their vendors. It is more than simply labeling a vendor as your outsourcer or partner. In fact, that label is nearly meaningless without an apportionment of roles. Those terms do not carry universal definition. To think that any vendor will take care of everything your program needs is fallacious, and for the end client (employees) to not seek education on the technology and the processes involved in the program is neglectful.

Over time, consulting partners prove their ability to lead, plan and do the necessary DW/BI work for their clients. The key to success for a program manager is making sure all required tasks, including the strategic ones, are done in an efficient manner (i.e., by those competent to do so) within time expectations. The tasks should add up to the program that meets the longevity requirements and deadline constraints. DW/BI programs, with or without consulting partners, spend a lot of time handing off expectations for the strategic tasks involved. To the degree that a consulting partner can participate in the strategic tasks, great. To the degree they can't, augment those functions and scale the consultancy back to the delivery tasks. Expect to do this unless you are working with those select consultancies with real strategists (on your account!).

Remember that fellow college student who ran the college computer system? He could see everyone's class schedules, grades, ratings, etc. if he wanted to. (I was that guy, by the way.) Everyone else with that access had titles like Dean, Professor or President. IT staff always had - and still have - special privileges and access.

With privileged access comes responsibility, and sometimes that privilege is abused. Who has the highest privileged access to data for non-business meetings other than the data warehouse team?

Consider the following: A group inside ERCOT, the Electric Reliability Council here in Texas, allegedly created bogus companies that charged more than $2 million for completing fake work. A guilty plea deal was arranged last week with a person in the scheme. His title was - you guessed it - data warehouse manager.

The Sarbanes-Oxley Act requires the CFO to sign off on company numbers or face severe consequences. Is it a stretch to think that responsibility would be shifted intra-company - or even at the legal level - to whoever manages the data, the CIO? And then, perhaps, on to those individuals with privileged access to a wide range of data before anybody else sees it, such as the data warehouse team?

Privileged access requires responsibility and accountability. CIOs need education on laws affecting corporate information and need to stay alert to regulations about historical data, vendor data and all company data. The data warehouse manager is in a position to help, but clearly he or she, too, needs to be controlled.

If you try to comprehend everything there is to read and hear on MDM these days, you will find yourself going in circles. Therefore, I'll say that whatever you believe MDM is, justification is going to be necessary to establish it as a discrete project, over and above the master data that you will establish out of necessity for various other projects. I view MDM as not really optional; we all need to manage our master data. However, making it a discrete project is currently under consideration for many.

So why have a centralized MDM strategy instead of just establishing individual data marts/databases for each need, as would be the case without any aspects of a top-down MDM strategy? There are three basic reasons. The first two have to do with efficiency/TCO and the last has to do with ROI.

1. Systems Impact:
Impact: The inability to do the next project that needs the operational system data or the removal of an extract stream currently in operation supporting a project. In other words, numerous overlapping extracts take their toll on the systems you extract master data from. Perform the extract once and reuse it many times.

2. MDM methodology and tools competence:
Impact: The carrying costs for additional headcount and tools to support multiple groups with MDM competencies. As previously said, however, it is going to be a challenge sorting through the approaches out there. Look for help from a non-software vendor who has an adaptable, experience-based approach.

3. Enterprise Subject Areas:
Impact: Having one set, as opposed to many sets, of master data for important, widespread subject areas of the business provides enormous efficiency advantages for all other system development. Furthermore, it enables ROI-producing programs that would otherwise be unobtainable - as in customer-specific, information-based cross-selling.

Always a tour-de-force of data warehousing and business intelligence education, TDWI will be in Chicago May 14-19. The emergence of master data management is evident in TDWI's first full-day session on the topic, which will be given by yours truly on Thursday.

I'll also be at the CSI booth on Tuesday and Wednesday in the exhibit hall, from 11:15 - 2:15. Stop by and say hello. If you want to talk more in-depth, just contact me or sign up for one of my Guru sessions to be held Tuesday from 2:30 - 4:30 (1/2 hour each).

You can learn a lot about a data warehouse program by analyzing how it uses nulls. As most of you know, null means "unknown" or "irrelevant". The nullability, or ability to take on the null value, is a dimension of every column in the data warehouse - or any database for that matter. Most columns should not be nullable, but a few should be.
Nulls add extra storage to the column so that the DBMS can record whether the value is null or not. If the null bit/byte is "on", then whatever value may reside in the actual field is irrelevant and will not display. Of course, the other factor is the proper assignment of the null value to those columns that are nullable.

Nulls do not equal zero or spaces. They actually have an entirely different meaning than either of those. There may be actual values like "not supplied" or "invalid value" that should be used in place of null (or zeros or spaces). These descriptive terms are actually more explanatory about the field than null. So nulls get overused sometimes.

But mostly nulls are underused, taking a backseat to zeros and spaces. Nulls don't equal other nulls. And the manner in which nulls participate in aggregate functions like SUM, AVG and COUNT is very logical, but can be tricky. You also can't join on nulls. These basic facts discourage many from using nulls in data warehouses at all. But a little investment in knowledge of nulls can go a long way and afford your program the power that nulls bring.
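
For those who want to see these basic facts in action, here is a minimal sketch using Python's built-in sqlite3 module. The sales and customers tables, their columns and the null_demo function are hypothetical names made up for illustration, and SQLite is only a convenient stand-in; check how your own DBMS handles each case before relying on it.

```python
import sqlite3


def null_demo():
    """Run a few small queries showing how nulls compare, aggregate and join."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical tables; Python's None is stored as SQL NULL.
    cur.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
    cur.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(1, 100.0), (2, None), (None, 50.0)],
    )
    cur.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
    cur.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme"), (None, "Unknown")],
    )

    results = {}

    # Comparing null to anything, even another null, yields null (unknown).
    results["null_equals_null"] = cur.execute("SELECT NULL = NULL").fetchone()[0]
    results["null_equals_zero"] = cur.execute("SELECT NULL = 0").fetchone()[0]

    # COUNT(*) counts rows; COUNT(col), SUM and AVG skip null values.
    row = cur.execute(
        "SELECT COUNT(*), COUNT(amount), SUM(amount), AVG(amount) FROM sales"
    ).fetchone()
    (
        results["count_rows"],
        results["count_amount"],
        results["sum_amount"],
        results["avg_amount"],
    ) = row

    # An equality join never matches a null key to another null key.
    results["joined"] = cur.execute(
        "SELECT s.customer_id, c.name, s.amount "
        "FROM sales s JOIN customers c ON s.customer_id = c.customer_id"
    ).fetchall()

    conn.close()
    return results


if __name__ == "__main__":
    for key, value in null_demo().items():
        print(key, "=", value)
```

Note that the average is taken over the two known amounts only. Had the missing amount been stored as zero instead of null, the average would have been pulled down by a value nobody actually supplied.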

Thinking about nulls makes me think about... country music of course.

Here are some of my favorite songs when looking at the effective use of nulls in data warehousing:

Stand by Your Null
If My Heart Were Nullable
Kentucky Null
Don't be Null
The Null Road
At the Gas Station of Love, I Got the Self-Service Null
Her Cheatin' Heart Made A Null Out Of Me
You Turned my Lullaby into a Nullaby