Ask the Unicorn: Why is it Important to Know Tables in Filegroups?

Earlier this week, I wrote a quick blog post on a throw-away script I had written. The blog post was DETERMINING FILEGROUP FOR A TABLE. At the time, I didn’t explain why I needed the information; which prompted the following from one of my readers:

Read your mini-blog in “how to determine the filegroup for a table.” [...] I just would like to know why will [it] be important for me to know about it, or in what scenarios knowing this will be helpful.

For the purposes that I put the script together, I was building a data warehouse, and needed to ensure an even distribution of tables across filegroups and LUNs. The database was built using a script, so if my math was off, I could have ended up with more tables in the various filegroups than was expected. I’ve always been a fan of the phrase “Trust, but Verify.”

In general, though, there are a number of other reasons to know which filegroups tables (or rather their indexes and partitions) are stored in.

One use case, is when you need to perform a filegroup restore of a database. You’ll need to know what is in the filegroup. Or more importantly, you’ll want to know which filegroup to restore to get the table desired. Knowing this, you can then restore the right items.

Another use case is when you have Page Free Space (PFS) page contention and multiple filegroups. Typically, you’ll want to add more files to the affected filegroups, but it might also be worthwhile to split the tables apart at the source of the contention. This script can tell you which tables are stored together.

One more use case is the management of archival production data. Maybe the data over a year old isn’t accessed as often as fresher data. If so, it might make sense to push this data to slower storage. That might be the plan in place, but plans will go awry and you’ll want to verify that this is happening as expected.

One final reason, you may want to split apart non-clustered indexes from their clustered indexes or heaps. This can help alleviate disk pressure. Sure, with SANs, solid state, and flash storage this is much less of a concern, but not everyone has these resources. Some databases will always be stored on commodity storage and we’ll need to rely on old best practices for managing storage.

What do you think? What advice would offer to someone that asked the same question? Leave your replies in the comments below. Also, if you want “the unicorn” to answer a question in a future post, send me a message on LinkedIn or at AskTheUnicorn.