Category Archives: Lessons Learned

After building a new cluster to replace a 2005 cluster here in the office, I discovered Analysis Services would not start. Usually I do not put Analysis Services on the same box however one of our vendors requires SSRS, SSAS, and the Database Engine to be on the same instance. Frustrating, I know.

When I installed the cluster, everything was operational. However after I applied service pack two to the 2008 R2 Enterprise nodes, Analysis Services would not start. I didn’t notice the first time that it was after the service pack was applied until I removed everything and started over fresh.

The logs were not much help and I could not find much on the internet until I eventually found a friend who suggested that the service running SSAS needed to be in the local administrators group. Problem solved. Now my lesson learned is here for you as well. Enjoy!

I have written in the past about the MAP Toolkit (Microsoft Assessment and Planning) and how it helps with licensing issues including core counts. With that being said, it is time for us to renew our Software Assurance maintenance agreement and this was the original reason that I installed the MAP Toolkit a couple of months ago.

This week I rescanned all of my instances to figure out how many cores we had licensed and for which version. I wanted to get all of my documentation ready to go to our contract manager when I noticed a new server that had 24 Enterprise cores that I had never heard of before. My supervisor had never heard of the box either.

When I logged into the box, sure enough it had SQL Server installed. The server was for a monitoring tool and when the administrators installed the software on the box instead of asking me where they could install the database they found a disk and installed SQL Server themselves. Unlicensed. On the same box as the software. Without telling anyone.

When the contract manager told them how much the 24 cores would cost their department they quickly called me and asked me to kindly move their 13 GB database to one of my other servers. It just happened recently because I scan the network every few months and had never seen this before. Not only did they put us in jeopardy with our licensing, they had no backups setup for the database. When confronted they said yeah that was taking up too much space on the disk.

The lesson here today boys and girls is to scan your instances and look for unknown installations regularly. Carry on and enjoy!

Today, I ran the MAP toolkit (Microsoft Assessment and Planning) to identify our licensing structure and found an unwelcome surprise. The four new cluster nodes I installed earlier this year showed up as developer edition and not enterprise edition. Our administrator who handles the license keys, contracts, and downloads from MS gave me the license key and the ISO. When I pointed it out today, she said there was no key for an enterprise installation. That means she gave me the wrong ISO and/or she did not know it did not require a key back ten months ago. The point of this blog is not to assess blame. But to point out a pitfall in your installation through my lesson learned.

Now the real question is how do I fix it? A friend told me that I could change a registry key, but some others have indicated that I need to do an in-place upgrade. I will keep you posted. Enjoy!

We interrupt this regularly scheduled blog series to bring you a rant. Now wait don’t click away just yet. There is a lesson here somewhere, I hope.

To give you some background, we have a vendor, henceforth known as XYZ, to whom we have worked with for several years on their financial package. The package is currently in Oracle and it was decided that we could reduce costs by upgrading to the newest version of their package but migrating it to SQL Server to reduce our Oracle licenses which makes sound financial sense. Being a predominantly Oracle shop, I have been masterminding the demise of Oracle for the year that I have been here quietly chipping away.

This project has been in the planning stages for several months. During which time we order four identical super servers to be clustered into a production and development/acceptance active-passive clusters. I cannot deny that I indeed was excited about this project whole heartedly because of the hardware as well as the chance to reduce the Oracle footprint and to champion SQL Server as the preferred database. Plus, I have never built clusters from the ground up.

We took our time setting this servers up with Windows 2008 R2 Enterprise making sure that everything was well tuned. Then we setup our SQL Server 2008 R2 Enterprise clusters on all four boxes even bringing in our Microsoft Premier Field Engineer to ensure a successful migration ensuring that best practices were in effect. Most would view bringing in help as an insult to their pride, but I welcomed the learning opportunity and it helped with the learning of our green junior DBA who has no server or SQL experience. In addition, whenever I can be around our PFE, I am the eager padawan and she is the jedi master especially since she has an extensive Oracle background as well.

Fast forward and these machines are ready to go and all of the specifications were discussed and communicated several times through planning meetings. I even spoke with their DBAs during the install process to ensure that our settings were commensurate with the project. Now on Thursday of this week, the day before we are to begin migrating some of the Oracle data to the development box, I discover on one of the documents that the only version supported is SQL Server 2008 SP3 running on Windows 2008 R2 Enterprise. Hold the phone!

Rewind, notice I said we communicated several times that we were going to install SQL Server 2008 R2 Enterprise clusters on our new boxes and the vendor was compliant offering assistance if we needed it. THEY NEVER MENTIONED THAT THEY DID NOT SUPPORT R2. Now we have to uninstall R2 and install plain jane 2008 in effect putting us two versions behind and this project does not go live for another year after extensive testing. We even offered to be a beta testing site so that they could certify and say that they supported R2 since we have a year of testing ahead of us. DENIED! They were not interested whatsoever since none of the third-party tools such as BOXI support R2, according to them. Now I have to uninstall my beautiful creations and go backwards, this is progress. :-(

The moral of the story is to get the vendor to verify and sign off that the version you are installing is indeed supported before you install it. My supervisor and I thought by telling the vendor in the meetings that this would be evident. Next time we will force the issue before proceeding.

When I returned to work this morning after my excellent getaway I learned of a power outage on Friday afternoon that took down the desktops in the Systems Administration area. Oh great!

When I logged into my machine, the login screen was different and then all of my files were missing. Oh no, my profile is hosed. No wait, there are no files under a previous profile. Oh crap!

After getting thoroughly pissed off that my day would be filled with installing software and recreating my machine, a sinking reality set it. I have a bunch of project files on my desktop that have not been saved to my share drive. I know better, but sometimes we don’t always take heed to our own best advice. Today, I thought was a good day to correct that.

The practical joke was that my buddies switched out the hard drive so no files were lost. This time!

I found a PowerShell Script to Backup Files Using Windows PowerShell and set it to backup my desktop to share drive automagically using task scheduler. I’m still testing it to see if it is the script to do what I need it to do. I could write one myself, but I am a PowerShell newbie and why reinvent the wheel.

Enjoy!

[Edited:]

The script above did not work out too well as it did not copy over folders. I created a simple script with the following line:

Last night on one of my production clusters, all of the maintenance plans died. On my drive in to work, I thought it was probably a drive space issue, but it was also strange that I did not receive an alert for that. Finally at the office looking at the Job Activity Monitor then view history, I get a lot of useless information that says simply “the step failed” during my reorganize index routine. This is not much help.

When looking at the Maintenance Plan view history, then I actually see a useful error: “Alter failed for Server <servername>.” When looking up this error ,I narrowed it down to two plausible solutions. The first being to check to make sure that “Automatically set I/O affinity mask for all processors” is checked. We don’t generally change that from the default. Next.

The second option was to make sure sp_configure “allow updates” was set to zero. I thought it was, especially since this is a 2005 box and the feature is supposed to be deprecated (books online confirmed that for me, interesting though it says the functionality is unavailable). Low and behold “allow updates” was set to one. A quick reconfigure and the maintenance plans are running again.

This is a temporary fix because the new server I am migrating these to at the end of the year will be running Ola Hallengren’s backup solution, like my other new servers. Bye bye maintenance plans!

This morning when an aggravated user contacted me that they could not get into a new database that I setup on Friday, I discovered that I had made a mistake. Well that’s never happened before, well not today anyway.

Alright, here we go. I created three new logins that would access development and acceptance copies of a production database that I had just setup on an acceptance server using Red Gate‘s SQL Compare and SQL Data Compare (which I will blog about soon, excellent product). In my possession, actually a sharepoint list, I had the password for each of these three new logins. Today I found out that the password I had listed was out of date. No problem, easy fix!

Using SSMS I would easily change the passwords and then update my sharepoint list to the new passwords, this is DBA 101 stuff. Not so fast my friend! SSMS then presented an error stating that the password did not meet our ultra-rigid fort knox password policy (see posts on security). No problem, I forgot to uncheck the ‘Enforce password policy’ check box. That happens all the time when I get in a hurry. Now it is unchecked and SSMS presents the following error:

The CHECK_POLICY and CHECK_EXPIRATION options cannot be turned OFF when MUST_CHANGE is ON.

There was my real mistake committed on Friday; I forgot to uncheck the ‘User must change password at next login’ check box when I created the logins. It was a rookie mistake, but I was in a hurry to get things done so I could go and get my root canal in the afternoon. Luckily for me this was not a resume updating event. In order to fix this we need to do the following:

Maybe your users cannot access your database, or maybe your preferred monitoring software is reporting limited or non-existent disk space on your log drive for a particular SQL Server. Maybe you get a call from a Subject Matter Expert telling you their application is wigging out, or maybe you get a call from a user or even better, from your helpdesk. We have all gotten that phone call or e-mail at one point or another. A transaction log has gone awry and wreaked havoc on your limited drive allocated to the log files. What next?

Do we know the database? We could look at the drive and find the culprit. However, we will need the logical name of the log file in order to shrink it, remember it isn’t always the same as the filename. Run the following in SQL Server Management Studio (SSMS) in order to get the log file name:

SELECT name FROM sys.database_files WHERE type_desc = ‘LOG’

At this point, we can try to shrink the database by issuing the following command:

DBCC SHRINKFILE(‘<DB log file name>’, 0)

If this does not shrink the file, then run the following command to see the log_reuse_wait_desc column (way over to the right):

SELECT * FROM sys.databases WHERE name = ‘<DB Name>’

The following values of the log_reuse_wait_desc are the most common reasons that the shrink operation has failed:

CHECKPOINT: This means that there is no database checkpoint in the transaction logs. You should manually run the CHECKPOINT command or you could also perform a database backup.

ACTIVE_TRANSACTION: This indicates that there is an open transaction. You can use DBCC OPENTRAN to find out what is going on (I will write another blog on this soon and link back here).

LOG_BACKUP: This indicates that the log file has wrapped around to the beginning of the file and the free space is somewhere in the middle of the file. Run another log backup in order to move the free space to the beginning or end of the log file.

After completing one of the three afore-mentioned steps, we can now issue the SHRINKFILE command again….And all will be well with the universe. Enjoy!

With the new 2008R2 servers that were recently built, I am hoping to institute some standard best practices on the boxes moving forward. The first one was to move the TempDB to its own drive on a separate LUN and since I have moved a TempDB before, no problems here. The second best practice was to make a separate drive on a separate LUN for the databases and log files, check and check. The third practice was to move the system databases, which sounded simple in theory but I have never done it before, uncheck–it did not work.

I followed Microsoft’s directions but it did not cover my contingency. The following excerpt is taken from Microsoft’s MSDN site for SQL Server 2008 R2 in order to Move the Master Database, with changes for my separate drives.

To move the master database, follow these steps.

From the Start menu, point to All Programs, point to Microsoft SQL Server, point to Configuration Tools, and then click SQL Server Configuration Manager.

Edit the Startup Parameters values to point to the planned location for the master database data and log files, and click OK. Moving the error log file is optional.The parameter value for the data file must follow the -d parameter and the value for the log file must follow the -l parameter. The following example shows the parameter values for the default location of the master data and log files.

Stop the instance of SQL Server by right-clicking the instance name and choosing Stop.

Move the master.mdf and mastlog.ldf files to the new location.

Restart the instance of SQL Server. Here is where I had my issue, it would not restart. Through my troubleshooting I realized that the new folder did not have the same permissions as the original folder. I added the SQLServerMSSQLUser$<machine name>$<instance name> user that was created by the SQL installer. Problem solved.

Verify the file change for the master database by running the following query.