Author: jdrussel
Date: 2011-03-26 00:48:35 +0100 (Sat, 26 Mar 2011)
New Revision: 25562
Log:
Added some basic conceptual info about row formats.
Rearranged and changed headings so the chapter covers all 4 current row formats,
the Barracuda ones first.
Modified:
trunk/innodb-1.1/innodb-row-format.xml
Modified: trunk/innodb-1.1/innodb-row-format.xml
===================================================================
--- trunk/innodb-1.1/innodb-row-format.xml 2011-03-25 21:49:17 UTC (rev 25561)
+++ trunk/innodb-1.1/innodb-row-format.xml 2011-03-25 23:48:35 UTC (rev 25562)
Changed blocks: 4, Lines Added: 108, Lines Deleted: 85; 11918 bytes
@@ -7,7 +7,7 @@
]>
<chapter id="innodb-row-format">
- <title>Storage of Variable-Length Columns</title>
+ <title>How InnoDB Stores Variable-Length Columns</title>
<!-- JDR: chapter title tries to put "row format" into plain English; probably need to be a bit clearer that we are talking about row format -->
@@ -15,109 +15,100 @@
<!-- JDR: hard to summarize this chapter even after reading the first couple of sections; come back to this intro text after finishing other reorgs -->
+<!-- JDR: link here from ALTER TABLE section. -->
+
<para>
This section discusses how certain InnoDB features, such as table
- compression and off-page storage of long columns, are controlled by
- the <literal>ROW_FORMAT</literal> clause of the <literal>CREATE
- TABLE</literal> statement.
+ <link linkend="glos_compression">compression</link> and off-page
+ storage of long columns, are controlled by the
+ <literal>ROW_FORMAT</literal> clause of the <literal>CREATE
+ TABLE</literal> statement. It discusses considerations for choosing
+ the right row format and compatibility of row formats between MySQL
+ releases.
</para>
<section id="innodb-row-format-overview">
-<!-- JDR: 'overview' != 'internal details' -->
+ <title>Overview of InnoDB Row Storage</title>
-<!-- JDR: this chapter needs so much TLC, I'll mostly skip over it and come back later -->
+ <para>
+ The storage for rows and associated columns affects performance
+ for queries and DML operations. As more rows fit into a single
+ disk page, queries and index lookups can work faster, less cache
+ memory is required in the InnoDB buffer pool, and less I/O is
+ required to write out updated values for the numeric and short
+ string columns.
+ </para>
- <title>Overview</title>
-
<para>
- All data in InnoDB is stored in database pages comprising a B-tree
- index (the <link linkend="glos_clustered_index">clustered
- index</link> or <link linkend="glos_primary_key">primary
- key</link> index). The nodes of the index data structure contain
- the values of the key columns for each primary key value, plus the
- values of the remaining columns of that row. In some other
- database systems, a clustered index is called an
- <quote>index-organized table</quote>. Secondary indexes in InnoDB
- are also B-trees, containing pairs of values of the index key and
- the value of the primary key, which acts as a pointer to the row
- in the clustered index.
+ All data in InnoDB is stored in database pages that make up a
+ <link
+ linkend="glos_b_tree">B-tree index</link> (the
+ <link linkend="glos_clustered_index">clustered index</link>
+ organized according to the
+ <link linkend="glos_primary_key">primary key</link> columns).
+ Table data and indexes both use this type of structure. The nodes
+ of the index data structure contain the values of all the columns
+ in that row (for the clustered index) or the index columns and the
+ primary key columns (for secondary indexes).
</para>
<para>
- Variable-length columns are an exception to this rule. Such
- columns, such as <literal>BLOB</literal> and
- <literal>VARCHAR</literal>, that are too long to fit on a B-tree
- page are stored on separately allocated disk
- (<quote>overflow</quote>) pages. We call these <quote>off-page
- columns</quote>. The values of such columns are stored on
+ Variable-length columns are an exception to this rule. Columns
+ such as <literal>BLOB</literal> and <literal>VARCHAR</literal>
+ that are too long to fit on a B-tree page are stored on separately
+ allocated disk pages called
+ <link linkend="glos_overflow_page">overflow pages</link>. We call
+ such columns <link linkend="glos_off_page_columns">off-page
+ column</link>. The values of these columns are stored on
singly-linked lists of overflow pages, and each such column has
its own list of one or more overflow pages. In some cases, all or
- a prefix of the long column values is stored in the B-tree, to
+ a prefix of the long column value is stored in the B-tree, to
avoid wasting storage and eliminating the need to read a separate
page.
</para>
+<!-- JDR: does KEY_BLOCK_SIZE have a meaning even for non-compressed tables? If so, need to show
+examples in this chapter. -->
+
<para>
- The Barracuda file format provides a new option
- (<literal>KEY_BLOCK_SIZE</literal>) to control how much column
- data is stored in the clustered index, and how much is placed on
- overflow pages.
+ The <link linkend="glos_barracuda">Barracuda</link> file format
+ provides a new option (<literal>KEY_BLOCK_SIZE</literal>) to
+ control how much column data is stored in the clustered index, and
+ how much is placed on overflow pages.
</para>
</section>
- <section id="innodb-row-format-antelope">
+ <section id="innodb-row-format-specification">
- <title><literal>COMPACT</literal> and <literal>REDUNDANT</literal> Row Format</title>
+ <title>Specifying the Row Format for a Table</title>
<indexterm>
- <primary>ROW_FORMAT</primary>
- <secondary>COMPACT</secondary>
+ <primary>CREATE TABLE</primary>
+ <secondary>ROW_FORMAT</secondary>
</indexterm>
<indexterm>
- <primary>ROW_FORMAT</primary>
- <secondary>REDUNDANT</secondary>
+ <primary>ALTER TABLE</primary>
+ <secondary>ROW_FORMAT</secondary>
</indexterm>
- <para>
- Early versions of InnoDB used an unnamed file format (now called
- Antelope) for database files. With that format, tables were
- defined with <literal>ROW_FORMAT=COMPACT</literal> (or
- <literal>ROW_FORMAT=REDUNDANT</literal>) and InnoDB stored up to
- the first 768 bytes of variable-length columns (such as
- <literal>BLOB</literal> and <literal>VARCHAR</literal>) in the
- index record within the B-tree node, with the remainder stored on
- the overflow page(s).
- </para>
+<!-- JDR: show brief examples and list the row formats with 1-line explanations. -->
<para>
- To preserve compatibility with those prior versions, tables
- created with the &innodb_plugin; use the prefix format, unless one
- of <literal>ROW_FORMAT=DYNAMIC</literal> or
- <literal>ROW_FORMAT=COMPRESSED</literal> is specified (or implied)
- on the <literal role="stmt">CREATE TABLE</literal> statement.
+ You specify the row format for a table with the
+ <literal>ROW_FORMAT</literal> clause of the
+ <literal role="stmt">CREATE TABLE</literal> and
+ <literal role="stmt">ALTER TABLE</literal> statements.
</para>
- <para>
- With the Antelope file format, if the value of a column is 768
- bytes or less, no overflow page is needed, and some savings in I/O
- may result, since the value is in the B-tree node. This works well
- for relatively short <literal>BLOB</literal>s, but may cause
- B-tree nodes to fill with data rather than key values, thereby
- reducing their efficiency. Tables with many
- <literal>BLOB</literal> columns could cause B-tree nodes to become
- too full of data, and contain too few rows, making the entire
- index less efficient than if the rows were shorter or if the
- column values were stored off-page.
- </para>
-
</section>
<section id="innodb-row-format-dynamic">
- <title><literal>DYNAMIC</literal> Row Format</title>
+ <title>Barracuda File Format: <literal>DYNAMIC</literal> and
+ <literal>COMPRESSED</literal> Row Formats</title>
<indexterm>
<primary>ROW_FORMAT</primary>
@@ -129,6 +120,8 @@
<secondary>COMPRESSED</secondary>
</indexterm>
+<!-- JDR: innodb_file_format accept literal string Barracuda or barracuda, or only 0-1 as shown in examples elsewhere? -->
+
<para>
When <literal role="sysvar">innodb_file_format</literal> is set to
Barracuda and a table is created with
@@ -159,39 +152,69 @@
needed for any given row.
</para>
+ <para>
+ The <literal>COMPRESSED</literal> row format uses similar internal
+ details for off-page storage as the <literal>DYNAMIC</literal> row
+ format, with additional storage and performance considerations
+ from the table and index data being compressed and using smaller
+ page sizes. For full details about the
+ <literal>COMPRESSED</literal> row format, see
+ <xref
+linkend="innodb-compression"/>.
+ </para>
+
</section>
- <section id="innodb-row-format-specification">
+ <section id="innodb-row-format-antelope">
-<!-- JDR: this is the good stuff, but gets short shrift; needs to come earlier, relate to examples elsewhere -->
+ <title>Antelope File Format: <literal>COMPACT</literal> and
+ <literal>REDUNDANT</literal> Row Formats</title>
-<!-- JDR: also, nobody thinks "I want to specify the row format" as a goal; retitle and subdivide into the real purposes behind the row formats -->
-
- <title>Specifying the Row Format for a Table</title>
-
<indexterm>
- <primary>CREATE TABLE</primary>
- <secondary>ROW_FORMAT</secondary>
+ <primary>ROW_FORMAT</primary>
+ <secondary>COMPACT</secondary>
</indexterm>
<indexterm>
- <primary>ALTER TABLE</primary>
- <secondary>ROW_FORMAT</secondary>
+ <primary>ROW_FORMAT</primary>
+ <secondary>REDUNDANT</secondary>
</indexterm>
<para>
- The row format used for a table is specified with the
- <literal>ROW_FORMAT</literal> clause of the
- <literal role="stmt">CREATE TABLE</literal> and
- <literal role="stmt">ALTER TABLE</literal> statements. Note that
- <literal>COMPRESSED</literal> format implies
- <literal>DYNAMIC</literal> format. See
- <xref linkend="innodb-compression-usage"/> for more details on the
- relationship between this clause and other clauses of these
- commands.
+ Early versions of InnoDB used an unnamed file format (now called
+ <link linkend="glos_antelope">Antelope</link>) for database files.
+ With that format, tables were defined with
+ <literal>ROW_FORMAT=COMPACT</literal> (or
+ <literal>ROW_FORMAT=REDUNDANT</literal>) and InnoDB stored up to
+ the first 768 bytes of variable-length columns (such as
+ <literal>BLOB</literal> and <literal>VARCHAR</literal>) in the
+ index record within the B-tree node, with the remainder stored on
+ the overflow pages.
</para>
+<!-- JDR: what's "prefix format"? -->
+
+ <para>
+ To preserve compatibility with those prior versions, tables
+ created with the newest InnoDB use the prefix format, unless one
+ of <literal>ROW_FORMAT=DYNAMIC</literal> or
+ <literal>ROW_FORMAT=COMPRESSED</literal> is specified (or implied)
+ on the <literal role="stmt">CREATE TABLE</literal> statement.
+ </para>
+
+ <para>
+ With the Antelope file format, if the value of a column is 768
+ bytes or less, no overflow page is needed, and some savings in I/O
+ may result, since the value is in the B-tree node. This works well
+ for relatively short <literal>BLOB</literal>s, but may cause
+ B-tree nodes to fill with data rather than key values, reducing
+ their efficiency. Tables with many <literal>BLOB</literal> columns
+ could cause B-tree nodes to become too full of data, and contain
+ too few rows, making the entire index less efficient than if the
+ rows were shorter or if the column values were stored off-page.
+ </para>
+
</section>
</chapter>
-<!-- vim: set sw=2 tw=72: -->
+<!-- vim: set sw=2 tw=100: -->

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.