In SQL Server, the Job Agent has its own retention that you can set. This retention determines how much execution history of SQL Server Agent jobs is being kept. But if you have a lot of instances across your domain, or if you reinstall a new server, how do you determine if this value is set correctly?

Doing it old-school: via the GUI
You can check the SQL Server agent retention via the GUI, by right-clicking on the SQL Server Agent, and clicking on properties. In the window that popped-up, click on History, and you’ll see something like this:

T-SQL / Registry
The way I wanted to check it is with T-SQL. Because I want to run the statement against a 40+ instances, I use a multi-server query for that. The reason I do that, is that this returns all the results in a single table, which makes reviewing and comparing the results a lot easier.

But there doesn’t seem to be a way to query the set values. But with a little bit of searching, I found out that you use the stored procedure below to change one of the values:

EXEC msdb.dbo.sp_set_sqlagent_properties @jobhistory_max_rows = 999

So by dissecting this procedure, I found out SQL Server just reads the value from a registry key. I took out the piece that I need to read the retention values:

xp_regread reads the literal registry path that you specify. xp_instance_regread “converts” the path you specify so that it matches the instance of SQL Server that you’re currently using.

So if you run multiple instances on your machine, you want to use the instance version of the stored procedure. If you’re just running 1 (default) instance, you can use both versions.

So eventually, I searched for the instance folder, and I did find the registry keys:

Conclusion
Especially for very large environments, it’s a good idea to write system- and/or server-checks in T-SQL. That way, you can script them in SCOM (or another monitoring tool), or if you use SQL Server Policy Based Management (and I would really like to recommend this Pluralsight course from a very good friend of mine: Boris Hristov (Blog | @BorisHristov)!), or just a multi-server query.

Last week an interesting issue popped-up when I was working on an SQL Server environment migration. A SQL Server Agent Job was throwing an exception, but we didn’t know what the issue was. And it worked in the old environment…

Finding the problem
During the installation of the new hardware, we used new accounts for everything (security reasons). This could mean a lot of issues with permissions, but gladly these issues weren’t that bad. Until we found the problem with the job, that threw this exception:

The UPDATE permission was denied on the object ‘[TableName]’, database ‘[DatabaseName]’, schema ‘[SchemaName]’.

Looking at the service account, I saw that this account had the sysadmin role. So why did it threw the exception? After some research, I found out why.

Creating the resources
In order to test this, we need to create a SQL login:

If we run the job now, and use sp_WhoIsActive to check out which user it runs under, it shows that the job runs under the SQL user:

This seems strange, because the SQL Server Agent service runs under a different account, and that account is sysadmin on the instance. But what if I add the sysadmin role to the SQL account we just created to reproduce this issue?

ALTER SERVER ROLE [sysadmin] ADD MEMBER [TEST_JEFFREY]
GO

And now run the job again:

When the SQL account is sysadmin, it does run under the configured SQL Server Agent account.

Check it to be sure
More information about this issue can be found here and here. To me, it was a bit unclear why this went wrong, but I’m glad I got some help on this (thanks Mark!). But when you encounter this yourself, or when you’re creating new jobs, make sure you check this. It could save you valuable time.

T-SQL Tuesday is a recurring blog party, that is started by Adam Machanic (Blog | @AdamMachanic). Each month a blog will host the party, and everyone that want’s to can write a blog about a specific subject.

This month the subject is “Automation”. If you want to read the opening post, please click the image below to go to the party-starter: Hemanth D. (Blog | @SqlChow).

Being a DBA, you want to automate as many processes as you can, in order to save time that you can spend on more important things. But have you ever considered over-automating your processes?

We’re safe! What can go wrong here?
At one of the companies I worked for, they thought they had everything sorted out. Indexes were rebuild every day or every week (depended on the database), databases and logfiles were shrinked, databases were checked for corruption, backups were running, etc. They felt safe, knowing that if something happened they could anticipate on any situation SQL Server would throw at them. It would blow up in their faces eventually…

One of the first things I checked were the backups. The backup job was running, but only a few databases were actually selected for backup. And the biggest database (500+ GB), which was pretty important, was skipped because it took too long to backup. And guess what, they didn’t EVER test recovering from a backup, because of a lack of disk space and time. And there you have it: a false sense of safety!

I don’t have to tell you not to shrink your database and logfiles. Everybody knows that every time you shrink your database, a kitten dies… Or an index dies… Or the soul of your database… I’m not sure which one, but take your pick. It causes (and I quote Paul Randal (Blog | @PaulRandal) on this!): “*massive* index fragmentation”. Read more about that over at Paul’s blog. Besides that, if your next query needs more space in a data- or logfile you'll see more wait time because of file growth.

The indexes were rebuild every night on the important databases, and every weekend on less used databases. But they never checked if the problem they had before was fixed when switching to this solution.

Also the corruption check was run only on user databases. They never heard of running a corruption check on system databases. The system database were in the backup process, but they never took the time checked if they could restore them or were running a backup of a corrupted database.

Focus on the important stuff
So instead of automating all your processes, maybe you should focus on what’s really important. You just automated your backup process. But does it run every time? Are the backups actually written to disk? Can you restore one of the backups you created?

What I’m trying to say is, you can automate tasks whenever and wherever you like, but don’t forget to test them. Once you’ve automated something, plan regular tests to verify if the automated process runs the way you expect it to. And is the end result really the result you want and expect?

Don’t reinvent the wheel
Another tip is: don’t reinvent the wheel. There are more people that encountered the same issue, and wrote about it or logged about a solution. So before you build your own maintenance solution, or automate health status reports, check with your community members. There’s can be found help for every problem, but the checkup on that solution is all on you.