Monday, August 15, 2011

I was reading the Editorial from my DatabaseWeekly.com newsletter by Steve Jones (Blog|Twitter). The gist of the article was how knowledge is use-it-or-lose-it, the longer you go without using a skill the less of it you remember. One of his links was to an excellent article by Michael Swart (Blog|Twitter) about the Acid properties of a database. I am not going to try to rewrite his article, instead I'll give you a link.

I hope that every(other) Monday I can find an article somewhere on the web that goes into one of the database fundamentals that either you haven't taken the time to learn or maybe just need a quick refresher on. These posts will be in addition to and not instead of my "regular" blog posts.

Friday, August 5, 2011

Ok, so I lied last time about having a fortnightly schedule. But this time I mean it, I will post every other week. This week I am going to talk about Database Files and Filegroups. On your server your data is stored in tables and those tables are stored in a database, but where does the actual database live? The answer is that there are physical files on your server where the databases are stored. There are at least two files for each database, the primary data file and the transaction log, but you can split your database up into many files for various reasons.

The Data Files

Primary Data File - This is the default file that is created when you create your database. This is where all your database objects will reside unless you add additional data files. The primary data file usually has the file extension .mdf (SQL server doesn't force you to use the extension but its standard practice). There is one primary data file for every database you create.

Transaction log - The transaction log does just what it says, it logs transactions. All data modifications are written to the transaction log before being written to the primary (or secondary) data file. This is what is commonly referred to as "write-ahead caching". I'll go more into how data is written to disk a little further down. Remember, you must have at least one Transaction Log file for each database. The transaction log file usually has the file extension .ldf

Secondary Data Files - Secondary data files can be created to supplement the data storage space of the primary data file. These data files are optional and usually have the file extension .ndf. You can use secondary data files for various reasons. For example, you could use them to stripe you data across multiple disks to speed up data access or to physically separate different database schemas into separate files.

Writing Data to Disk

We talked about the transaction log and how it uses write-ahead caching, well here is the process for writing data to disk. First the data pages are loaded into memory (unless they are already there from some previous activity). Then a record is created in the log cache. Once the transaction has been committed it is written to the transaction log from the log cache. Transactions are then written to primary or secondary data file when a checkpoint occurs. The Checkpoint process occurs for a variety of reasons.

If you issue the CHECKPOINT statement

If you stop the SQL Server Process

If you backup the Database or do a Transaction Log backup

If you have the database in log-truncate mode and the log becomes 70% full

If you perform a minimally logged operation

Based on the "recovery interval" server configuration option.

Filegroups

You can group your database files together into filegroups to aid with administration or allocation. There are two types of file groups: Primary and User-defined. The Primary file group contains the primary data file (.mdf) and any other files not assigned to a user-defined file group. It also contains the system database and tables. User-defined filegroups are just what they sound like, a filegroup created by the user. Log files are not associated with a filegroup. You have to add files to a filegroup when they are created. You can't move files around after they have been added to the database. Filegroups are handy because they can be backed up or restored separately and even be taken offline and brought back online separately. An example of when you could use them is if you had some tables that were read from more often (like archive data) you could put those tables into a secondary data file that you put into a user-defined filegroup and then move that filegroup to a separate disk to increase performance.

Hopefully I have given you a good overview of Database Files and filegroups. There is plenty more to learn so don't be afraid to use google to dig a bit deeper into the subject. Next time I will be discussing Creating and Modifying Databases.