Archives

Great Links

Buy In

Everyone who knows me knows that I absolutely love Data Warehousing. I’m one of those fortunate few who finds their work fascinating, and I tend to dive head first into new projects.

I was recruited to join a team on a multi-terabyte data warehousing project late last summer. I was leery to take the position because project had already been attempted by numerous other teams, which had failed, and subsequently no longer worked for the company. However, the Architect in charge was very persuasive and the lure of a ground breaking project, using some of the best toys out there, was something that I couldn’t resist. I knew going in that this might be a short term situation, but I calculated that the benefits outweighed the negatives.

After I started, it became obvious that the team and project was in a state of flux. The Architect, who I previously mentioned, left shortly after I started, leaving the team to push forward anyway. The executive in charge of the project was fond of saying, “Teams pick their own captains,” and that’s what happened here as I was shortly placed in charge of the project as the new Architect.

I was running a team of a dozen in-house data professionals, as well as directing the efforts of world class consultants in SSIS and Dimensional Modeling. Though I already knew these consultants personally, working directly with them were some of the highlights of my career so far. While I’ve already performed almost every individual role in the complete data warehouse life cycle, being in the position of Architect (and project leader) was particularly enlightening since it gave me a deeper appreciation of the entire process.

It was an amazing, rewarding, and exhausting experience. I got the chance to prove my skills to some of the industry’s best; finding solutions to problems that stymied some of the biggest names in our industry. However, it was not all rainbows and butterflies.

Stress and Burnout

In many data warehouse projects, the technical challenges (though formidable) can pale in comparison to the political and social challenges. As many of us have experienced, business decisions aren’t always based on technical reality. I worked 70 to 80 hours a week for more than 6 months to try to fulfill very aggressive insane deadlines; with no end in sight. I’ve always been a bit of a workaholic, but I had completely lost my work/life balance. Being the geek that I am, and all of the learning that was taking place, I didn’t see the situation clearly. It was taking a toll on all of the other aspects of my life. I wasn’t able to sleep, wasn’t eating properly. I fell way behind in my personal and professional correspondence.

Moments of Clarity

My sister, Regan, and I in 2011

My sister became critically ill the third week of March, and I caught the first flight back home to Michigan to be with family. I stayed a week, but the doctors said the crisis would be long term, and they wouldn’t be able to provide a prognosis for weeks or months. I did not want to leave, but I had a multi-million dollar project with a large team that was counting on me in DFW. This incident put a spotlight on what I was doing to myself and my family due to overwork and lack of balance. After a good deal of quiet reflection, I gave two weeks notice to my previous employer near the end of March. Tragically, my sister died early Easter morning. She was only 41, had young children, and her loss devastated my entire family.

Lessons Learned

The death of my sister really put an exclamation point on the whole mess. I’ve learned a lot through this whole ordeal, and I’ll share some of them with you in the future. However, the most important thing I learned was that while career is important, and what we do affects a lot of people, it isn’t as important as your physical and emotional well-being. Family commitments absolutely have to come first. No matter how fascinating the project is, I cannot neglect my personal life for very long. It’s not fair to myself, my family, or my clients.

New Beginnings

It’s always important to focus on what is good in our lives. My career continues to go well. As I was deciding to leave my previous gig, one of my independent consulting clients came to me needing some long term help with their data warehouse project. It’s a fascinating project, terabytes of data, challenging text file imports, and I’m learning C# to boot. This gig should last for several months and has been perfect for me as I transition to the next phase of my career. This good fortune can be directly attributed to my work in the PASS Community. I want to sincerely thank my #SQLFamily for all of their help and support.

For those who don’t know, I’m currently the Data Warehouse Architect for a large scale and complex multi-tenant data warehousing project. Fortunately, the project is being created in SQL Server 2012 and I thrilled to be using the new features. Let me preface this blog post by admitting that my experience with using the SSIS 2012 Catalog, SSISDB, is fairly limited, but then again most people are in the same boat since adoption of 2012 has been fairly slow.

However, I’ve run into some problems which aren’t getting a lot of attention yet, but I’m fairly certain they will in the near future. Let’s break them down.

Initial Catalog Setup Issues

I won’t explain step by step how to create the Integration Services Catalog, because others already have. However, you’ll notice from the following screenshot that very few settings are available when the catalog is created.

I wish that Microsoft would upgrade the Create Catalog screen to enable the user to specify SSISDB settings upon creation. For example, right click the SSISDB under Integration Services Catalogs, and select Properties. The following settings can and should be set:

Whether to Clean Logs Periodically – Defaults to True

Retention Period in Days – Defaults to 365

Default Logging Level – Defaults to Basic

Maximum Number of Old Versions Per Project – Defaults to 10

Whether to Periodically Remove Old Versions – Defaults to True

I wish we could set those at create time. As I’ll discuss in a minute, the default values can be disastrous for many installations.

SSISDB Database Ignores Model Database Settings

For some reason, when you create the SSISDB Catalog, the corresponding SSISDB database ignores the Model database settings. It appears that no matter what my Model database is set to, SSISDB defaults to set file sizes for the Data File and Log, 10% Growth Rate, and Full Recovery Model. I’ve reached out to Microsoft, but haven’t received an answer as to why.

This has become problematic for us because we’ve deliberately set up our Model Databases so that Simple Recovery is used in our Development and QA Environments while Full Recovery is used in our Production environments. I’ve had to set up periodic checking on all of our servers looking for databases in full recovery mode that aren’t getting transaction log backups to avoid runaway T-Log growth issues.

SSISDB Performance Issues

I’ve come to the conclusion that the SSISDB Database will not scale at all. I became painfully aware of this because of a problem we had in a “pseudo” Production environment. Our first nasty surprise was T-Log growth, because the database incorrectly defaulted to Full Recovery mode which I fixed by shrinking the log. Before you mentally criticize me dear reader, please keep in mind this wasn’t an actual Production environment and the data being kept in SSISDB was inconsequential so far.

Another issue became apparent shortly afterward. We were running with the default settings of Basic level logging, 365 Days Retention, and 10 Versions of the project being kept. Our SSIS project is already fairly large with around 50 packages, but it was only being run periodically in this environment approximately once or twice per week. We are not even hitting this thing very hard. However, the SSISDB database size was growing rapidly anyway. In this particular situation, it had grown to 60GB. While I realize that 60GB is not very large, SSISDB performance slowed to a crawl. In any of our environments, when the SSISDB approaches 50GB the performance degrades dramatically. The first sign of this is that the built in reports take longer and longer to return data and eventually seem to stop returning data at all.

Unfortunately this pseudo Production server was set up with very little hard drive space, 100GB. So, I started receiving free space warnings after the SSISDB database reached 60GB.

I did a little research about reducing the size and performance issues with the SSISDB. I lowered the Retention Period to 7 days, set the Default Logging Level to None, and reduced the Maximum Number of Old Project Versions to 1. We don’t need to save old versions of our project as we are using TFS source control. You are using source control right?

Keep in mind that changing your logging level to None is not advisable in most production environments, however as of this time we are using a 3rd party product to monitor our packages so this isn’t an issue for us. I suspect that most of you will use the Basic level, but set the Retention Levels back and run the cleanup job often to avoid performance issues. Anyway I changed my settings using the following script.

Then I proceeded to start the “SSIS Server Maintenance Job” automatically generated when you create the catalog. Wow, that was a mistake.

Don’t Touch That Button

The job ran for 2 hours and blew up the transaction log even though it was in Simple Recovery Model. I then started receiving emails that the server had completely run out of space. When I checked the error log, I found an error indicating that the Log was out of space and a checkpoint record could not be written.

I immediately started researching and found this article about the cleanup job causing user packages to fail. If you read between the lines, you’ll see that the SSISDB is not properly indexed to handle the cleanup job. In my case, I had to drop the SSISDB, recreate it, and apply the script above to avoid similar issues in the future. Once again, this was not a huge problem for me because it wasn’t truly a Production environment and we rely on a third party product for our SSIS logging anyway. Don’t forget that even if you do change your settings, you must also schedule the clean up job to run regularly. If you allow too much history to be accumulated, you may suffer the same fate I did when running it.

My take on all of this is that SSISDB is not ready for prime time. I don’t think I’m being unfair with that statement. I do appreciate the fact that the SSIS Catalog is a great addition to SSIS 2012, but I expect to be able to use it in Production, running packages every 5 or 15 minutes, retaining months worth of history, and for it to perform regardless.

After writing this article I ran across this excellent article on SSIS 2012 Catalog Indexing, and time permitting I will implement his suggestions and report back. Are any of you having similar issues with your SSIS 2012 Catalog (SSISDB)?

I’m in for a busy week as I have back to back presentations this Wednesday and Thursday nights.

On Wednesday the 20th, the Forth Worth SQL Server User Group has asked me to present one of my favorites. I’m constantly revising this presentation as I encounter as I come across new mistakes to share. So, even if you’d seen it, there will still be new material to shock and amaze you.

Data Warehouse Mistakes You Can’t Afford to Make

Many data professionals understand the basics of Data Warehouse design, including Dimension and Fact Tables, slowly changing Dimensions, and the use of meaningless surrogate keys. However, it isn’t until you’ve created a dimensional model and put it into production, that you realize just how much of an impact seemingly trivial mistakes can make. They can hobble performance, allow inaccuracy, and perhaps worst of all, inhibit adoption and usage of the new system.

Learn how to avoid many common mistakes, from someone who’s made them and then found ways to correct them.

And much, much more. Each session will build upon the previous sessions so they will be recorded and posted to my blog. Come watch this exciting and progressive series as I take you from padawan to Jedi Master.

“Shit!” I said, followed by even more graphic expletives. I lost you? Sorry, I forget that you don’t follow me around? Let me explain.

Like most people, I have a junk room filled with boxes and plastic storage bins. After Thanksgiving I was in the garage looking for our stored Christmas ornaments and came across a very large bin that was unlabeled. When I opened it, I found a ton of brand new clothes including many pairs of Dockers pants and all of them had the tags still on. To understand my anger we need to back up several years.

Clowning around, but not really happy.

Around 15 years ago, on my way to morbid obesity, I was outgrowing my clothes much faster than they wore out. During the rise, I bought a bunch of Dockers Slacks and other clothes for work because I got them on sale. However, because my weight rose so fast, I had pairs of them that were never worn. I never even took the tags off. I could have returned them, but I was embarrassed because of the reason. Those of you who know me, know that I’m unfailingly honest. I couldn’t look a clerk in the face and tell them that I had to return the pants because I was too fat to wear them. For these all of these years I have kept them in a plastic bin, telling myself that I would lose weight and wear them again someday.

I carted those pants from apartment to apartment. I had garage sales and didn’t sell those pants. Even as I gave away half my stuff when I packed up everything to move across country years ago, I kept that damn bin. I knew it would be feel great to be able to wear them. However, I lost track of it in my latest move.

So, why the expletives? Well, in case anyone doesn’t know I had weight loss surgery back in April. I figured that as my weight dropped I would be able to use the brand new Dockers (and other clothes) from that bin. However, by the time I found it they were already several sizes too large for me anyway. Sometime in December I passed 100 lbs of weight lost. So, my immediate reaction was anger that I had wasted the money, time, and effort carting around those stupid pants only to never get to wear the damn things.

Then, “Bang” I felt like something hit me upside the head.

Gratitude

Who’s this guy? – Photo courtesy of Tim Mitchell

What the heck was I thinking? I have so many things for which I should be grateful. I’ve lost a hundred pounds, 10 inches from my waist, and feel great. Weight Loss Surgery saved my life. I would absolutely do it all over again, and again, and again if necessary. I’ve reached a point in my career where money isn’t a problem, who cares if I had to donate a couple of hundred dollars in clothes to Goodwill? I’ve already donated thousands of dollars of clothes as I’ve gone down through the sizes. There are lots of people in my #SQLFamily who care and I’m surrounded by great people at the North Texas SQL Server User Group as well.

I have been working for the past several months for a company in Carrollton, Texas on a fascination multi-tenant data warehouse project as the Data Warehouse Architect. The project is amazing, we’re using all of the best toys (tools), and I’ve brought in some of the very best consultants to assist with it. I have my SQL Family to thank for that. More on that in a later article.

Anyway, life is too short to dwell on meaningless things like Dockers pants. Life is good.

Resolutions

This is the first year that my primary resolution is NOT to lose weight, though I intend lose another 20 to 40 pounds this year. I’m back in the gym and have been investigating Aikido Dojos in the area. I have Steve Jones (Blog/Twitter) to thank for pushing me back toward the art. I can tell you that if he likes you, the man is relentless. So, I promise to be attending one before the end of January.

I want to start getting certified in the BI Stack. I’m calling myself out right now in front of all (both) of my readers. I will get at least 3 certifications this year and will shoot for more.

Regardless of my difficult work schedule, I’ll take more time to blog and present this year. In fact, I’ve just agreed to present an ongoing series at the North Texas SQL Server User Group meetings called, “Data Warehouse Series – Basics and Beyond” and I hope to see some of you there. I love giving back to my #SQLFamily and need to do as much as I can this year. Mentoring others in our profession makes me feel great.

Having shed almost a full person, I have more energy than ever. Watch out folks, you ain’t seen nothing yet.

Changes and The Future

I’m particularly geeked about this one because so much has changed in the past year. I’ve lost almost 100 pounds and feel so much better now. Also I’ve switched jobs and am now working as a corporate employee on a huge data warehousing project which is unlike anything I’ve ever seen before. Last night I went to dinner with most of the authors of SQL Server 2012 Integration Design Patterns, shared a few tidbits of the project, and they agreed that it was something special. More on the book and my current project to come in a future article.

A Scare Brings Focus

Anyway, a few recent things really brought me back to why I am so committed to this community. I thought that I had lost the blog a few weeks ago. One of my readers notified me that my site was down and I had no idea how long it had been down. My provider did not respond immediately to my email and when they did the response was something like, “It’s down? Oh, I’d better check that out.”

Because it’s a hosted site, my backup/restore options are limited, and I had never performed a disaster recovery test of the blog. To be honest, I wouldn’t know how to. I was surprised at my visceral reaction when I thought I had lost it, even though I’ve been neglecting the blog for the past year. It turned out that the problem wasn’t data loss, but rather a Denial of Service Attack which my provider was unaware of. Anyway, everything turned out alright, but it reminded me of how important the blog and SQL Community are to me and I intend to make it a priority in the future.

My Calling

Another reminder came after the latest SQL Saturday for the North Texas SQL Server User Group. For those who don’t know, I recently gave a precon for SQL Saturday Dallas. It went extremely well because I really love presenting, and I eagerly read all of the Session Evaluations I received. I don’t think that your average attendee realizes how much we care about what they have to say and I love pouring over the comments to find ways to improve. One attendee said something which really touched me and made me re-focus on my community involvement.

“It can be easily seen that David has found his calling in life. He is incredibly knowledgeable and passionate about Data Warehousing. I only wish we had more time for a deeper dive on some topics…”

Well, this person really nailed it. I have found my calling and love presenting, blogging, and volunteering in the PASS Community. I am fascinated by Data Warehousing, Integration, and Analytics. There’s something special about sharing knowledge with your fellow database pros (and friends) which feels so good that it’s worth all of the effort.

Last night at dinner, Tim Mitchell (Blog/Twitter) summed up the feeling very well. He’s a rabid Rangers baseball fan and therefore used a baseball metaphor. I’m paraphrasing but he likened his precon experience to the feeling when a hush comes over a crowd, the batter makes solid contact with the ball, the “crack of the bat” echos through the stadium, and it’s a home run. You just know when you hit it out of the park, and it’s one of the best feelings ever.

When I’m at the top of my game, this is exactly how it feels. When you are connected to the audience, you can feel that they are learning, are entertained, and most importantly you’ve made a difference. You know that they’ll take what you’ve taught and go back to their jobs to make things better.

Join the SQL Family #SQLFamily

Well, I hope you’ll forgive my rambling. It’s 4am as I write this, and I don’t adjust well to timezone changes. The official conference starts tomorrow and I can barely contain my excitement. It’s great seeing all of my friends again, some of whom I haven’t seen since last year. If you see me out and about, and manage to recognize me since I’ve lost almost 100 lbs, please stop me even if we’ve never met before.

I’ve already met some folks who I’m sure will become part of my #SQLFamily, and am always looking to connect with more.

If you’re not here, then make plans for next year. You cannot beat the networking and educational experience you get at a PASS Summit.

Every time I attend SQL Saturdays I am re-energized, and SQL Saturday 163 BI Edition was no different. This one was particularly special because of my precon presentation, “Data Warehousing In a Day”. I just love getting up in front of a group and teaching and it’s obvious when to the attendees.

If all of the emails I’ve received are any indication, the attendees enjoyed it as much as I did. I want to thank those who attended the precons (not just mine) and attended SQL Saturday. You folks make the PASS Community worthwhile.

As anyone who reads my blog knows, my time has been ridiculously scarce lately and I wasn’t able to do much more than get my presentations ready. It’s nice to know that if I can’t give my usual effort that everyone else picks up the slack. Everything went off great. Congrats guys.

Once again folks, please accept my sincere apologies that I haven’t made enough time for my community work, especially blogging. I just recently severed ties with a project that was taking all my time and preventing me from blogging. That, and another event I’ll talk about soon, reminded me how important the SQL Community is to me. I will endeavor to make this the last time I need to apologize.

Anyway, coming up in future posts I’ll cover the following:

More on the Merge Statement including a wrinkle I ran into last week regarding foreign keys.

The Balanced Data Distributor Transform for SISS.

A series of articles regarding Dimensions, SCDs, and how to load them in SSIS 2008 through SSIS 2012.

This is just a quick reminder that I and two of my friends are presenting Precons tomorrow for SQL Saturday Dallas. While I am biased and think that you should sign up for mine, any of these sessions are a huge educational value. If you’re in the DFW area, do yourself a favor and attend a Precon and SQL Saturday. The abstracts are below.

Hope to see you there!

Become an Enterprise DBA by Sean McCown

Speaker Bio: Sean McCown is a Microsoft Certified Master in SQL Server 2008. He is also a SQL Server MVP with over 15yrs experience in databases. Sean is also founder and co-owner of the renowned website MidnightDBA.com where he records free SQL Server training videos and co-hosts the popular Webshow DBAs@Midnight
In addition to this training, Sean has decided to offer all of the Enterprise DBA precon attendees a FREE copy of the DBA Roadmap webinar! Which means, of course, that the precon pays for itself…or you get both for 50% off, or however you want to look at it. Bottom line: attend Sean’s precon, learn massive amounts of stuff, AND improve your career exponentially. Win-win-win.

Data Warehousing in a Day by Dave Stein

Speaker Bio: David Stein is a Senior Business Intelligence Consultant with WaveTwo in Irving, TX. He’s worked in SQL Server Business Intelligence for almost a decade. He’s designed Data Warehouses from scratch as well as saved projects in serious jeopardy. He is also very active in the community, having spoken at PASS events at all levels, including the PASS Summit, SQL Rally, and at numerous SQL Saturday events and user groups. He’s also given multi-day pre-conference seminars at third party events regarding Microsoft Business Intelligence. You can learn more about Data Warehousing, Business Intelligence, and other technical topics on his blog at Made2Mentor.com or follow him on Twitter @Made2Mentor
Attend this full day pre-con and let David be your guide through this process. He’ll save you time, money, and headaches.

Additional information about my presentation can be found on my blog at: http://www.made2mentor.com/2012/06/sql-saturday-dallas-and-pre-cons/

Real World SSIS: A Survival Guide by Tim Mitchell

Speaker Bio: Tim Mitchell is a business intelligence consultant, database developer, speaker, and trainer. He has been working with SQL Server for over 9 years, working primarily in business intelligence, ETL/SSIS, database development, and reporting. He has earned a number of industry certifications and holds a Bachelor’s Degree in Computer Science from Texas A&M at Commerce, and is a Microsoft SQL Server MVP. Tim is a business intelligence consultant for Artis Consulting in the Dallas, Texas area.
As an active member of the community, Tim has spoken at national, regional, and local venues including the SQL PASS summit, SQL Connections, numerous SQL Saturday events, and various user groups and PASS virtual chapters. He is a board member and speaker at the North Texas SQL Server User Group in Dallas, serves as the co-chair of the PASS BI Virtual Chapter, and is an active volunteer for PASS. Tim is an author and forum contributor on SQLServerCentral.com and has published dozens of SQL Server training videos on SQLShare.com.
Tim is a contributing author on the charity book project MVP Deep Dives 2, and is coauthoring SSIS Design Patterns (available July 16th).
You can visit his website and blog at TimMitchell.net or follow him on Twitter at @Tim_Mitchell.

Let me just start this post with an apology to all of my readers, well both of you. There have been a lot of changes in my life lately and I’m been overwhelmed with demands on my time. I won’t go into details but those demands will be letting up soon, and I will be able to get back to blogging, community involvement, and catching up on sleep.

However, this topic is important enough that I had to take a break from frantic schedule to jot down a few words about it. As I started to write this, I got a notification of a blog article by my friend and fellow North Texas SQL Server User Group member, Ryan Adams (Blog/Twitter). As I read his article I realized that our views were nearly identical about two candidates, Sri Sridhara (Blog/Twitter) and Allen Kinsel (Blog/Twitter). I urge you to read what he wrote because I echo each and every one of his sentiments.

However, I’d like to say just a few words about each candidate myself.

Allen Kinsel

I met Allen at my first SQL Saturday of which I was an active volunteer. Allen was the first PASS member that I connected with who wasn’t in one of my local groups. The guy is tireless, has a pragmatic point of view about the community, cares about member’s opinions, and I’ve never heard anyone have a negative thing to say about him. When I have something to comment (whine about) on regarding PASS at the national level Allen is the guy I go to and I can tell you that he truly cares… or at least is good at acting like he does.

Sri Sridhara

Sri is an interesting PASS leader and volunteer. I’ve worked with him for several years organizing PASS events like SQL Rally, multiple SQL Saturdays, etc. In some ways, he’s a lot like me. Some people take to him, some not so much, but what I’ve found is that people in both camps are reacting to the same thing. The man is highly driven. Nobody can question his dedication, drive, and single minded focus on pushing the community farther. Sri embodies what PASS is and should continue to be, relentlessly driven to spread the adoption of SQL Server and to raise the ability and skill set of DBAs everywhere.

He and I don’t always agree, but he’s a guy you can count on to always follow through on his commitments, to do what he says he will do. Without belaboring the point, what happened last year was shameful, yet to this day Sri hasn’t had a negative thing to say about it, even privately. Let’s not let that happen again, Vote for Sri.

Like Ryan said, I’m not commenting on the other candidates, but not because they aren’t good people. I just happen to know these two exceptionally well, and intend to vote for them. You should too.

This is just a friendly plug for my friends at RedGate regarding SQL in the City, a free series of events traveling around the country. It’s a free day of training and networking led by spectacular presenters such as Steve Jones (Blog/Twitter) and Grant Fritchey (Blog/Twitter), who will be headlining the Austin event next week.

Steve and I at my first Summit

Grant and I at SQL Sat. Chicago

I had originally planned on going myself, but I recently got a new gig (more on that in the near future) and can’t get away at the moment. However, these guys are top notch, and you really shouldn’t miss the chance to learn from them.

Hey folks, just a quick message that I’ll be presenting this evening at North Texas SQL Server User Group. I’m presenting an updated version of “Have you got the Urge to Merge?” which I gave last month in Fort Worth. I’ve made good progress with my code which generates T-SQL Merge Statements and will be able to show more of it tonight. I hope to post a beta version of the code next week. The abstract is below.

Have you got the Urge to Merge?

The T-Sql Merge statement, introduced in SQL 2008, is powerful, versatile, and performant. However, it is also one of T-SQL’s least understood statements. Most DBAs know they can use It to insert, update, and delete data in a target table in a single transaction, generally outperforming separate statements. Fewer know, however, that it can be used to load slowly changing dimensions, facilitate auditing, and for many other uses. Come to the user group meeting and learn the in and outs of T-SQL Merge. I’ll even show you how to write complex Merge statements with lightning speed.