4 Answers
4

A backup is a physical operation, not a logical operation. It reads all extents containing allocated pages (i.e. even though only a single page from an 8-page extent is allocated, it will backup the entire 64K extent), and it does it in physical order.

A restore is a physical operation, not a logical operation. It lays down the extents in their rightful places in the data files.

Rebuilding an index (or anything like it) is a logical operation, which must be logged. Backup and restore manipulate the data files directly, without going through the buffer pool, which is one reason why this cannot be done. Another reason this cannot be done is that backup and restore have no understanding of what is contained in the data being backed up.

The main reason this cannot be done, however, is that moving pages around during a restore operation would break the b-tree pointers. If page A points to page B, but page A is moved by the restore process, how is page B updated to point to page A? If it's updated right away, then it may be overwritten by the rest of the restore process. If it's deferred-updated, what if the restore process restored some transaction log that removed page A, or page B? It simply cannot be done.

Bottom line - backup and restore are physical operations that never change the data.

Hope this helps!

PS Although it doesn't directly address this question, check out the article I wrote for the July TechNet Magazine which explains how the various backups work internally: Understanding SQL Server Backups. The September magazine will have the next in the series on understanding restores.

A native SQL backup is just a page-by-page dump of the backup files, so the answer there is "no". A Quest lightspeed backup likely uses some sort of compression compression algorithm, but it still won't "rebuild" the data files or indexes, which would take a horrendously large amount of time on a big database.

Yeah, but it has to write every page to disk anyway. Why not write them in logical sequence, instead of in a fragmented order? (Maybe this is more of a backup question instead of a restore question: does the backup write the pages in physical order, or in logical order?)
–
BradCJun 29 '09 at 19:44

assuming there was a product that wrote the data out in index order, which order would you like the table saved in? Lets say I have a table with 3 columns product_id, productname, and price. Which is the correct column to sort on to save the pages in the indexed order? BTW there is nothing stoping you from indexing on the entire table (a clustered index) or each of the rows (composite index).
–
Jim BJun 29 '09 at 20:19

@Jim B: That's easy. Tables would be saved in clustered index order. Non-clustered indexes would be stored in key order. Heaps would be kept in original (non-sorted) order. (Aaron and Paul have mentioned valid reasons that backup/restore doesn't do this. Not being able to figure out the "preferred order" isn't one of these reasons. Or else doing a full index rebuild would have the same problem.)
–
BradCJun 29 '09 at 20:40

The data is backed up in the exact same physical page order that it is saved in within the database files. When the data is restored it is restored in the same page order that it was backed up in. SQL doesn't move data around for various reasons. Including that there could be issues with transactions restoring logs and page chaining issues, not to mention the massive amounts of extra time needed to shuffle the data around on multi-TB databases.
–
mrdennyJul 26 '09 at 7:38

Backup is done regularly and very often (I hope). So designers made sure backup is as quick as possible. What is the quickest I/O? Sequential. You read blocks from disks in exact physical order, you have the best performance.

Why on Earth should database perform cumbersome random I/O operation every single night, trashing the disks' heads all over the place? The difference would be around two orders of magnitude. There is no possible gain in this.

I agree with your overall point, but depending on the underlying storage configuration, random I/O may not be orders of magnitude worse than sequential I/O (a SAN drive that is spread over dozens of spindles, for example). In fact, if the data file is fragmented on the hard drive, then even "sequential" I/O isn't really sequential at all. But Paul's points override this anyway (the problem with updating pointers, and that defragmentation should be a logged operation)
–
BradCJun 29 '09 at 20:58

Hmmm. BradC, do you have worked with Firebird/Interbase before - where the main backup/restore utility/API is more alike the "Copy Database..." of the SSMS/EM ? If so, know that MS SQL Server is NOT like it.

A SQLServer Backup is more a database dump which is restored "AS-IS" - so it's more like a confortable online shortcut for an "detach-copy-reattach on other place" operation. The restored database is almost an exact copy of the original database file (almost because you can change the placement of database files of an restored database)...