4.9 Tabular data

4.9.1 The table element

In this order: optionally a caption element,
followed by zero or more colgroup elements, followed
optionally by a thead element, followed optionally by
a tfoot element, followed by either zero or more
tbody elements or one or more tr
elements, followed optionally by a tfoot element (but
there can only be one tfoot element child in
total), optionally intermixed with one or more script-supporting elements.

The table element takes part in the table
model. Tables have rows, columns, and cells given by their descendants. The rows and
columns form a grid; a table's cells must completely cover that grid without overlap.

Precise rules for determining whether this conformance requirement is met are
described in the description of the table model.

Authors are encouraged to provide information describing how to interpret complex tables.
Guidance on how to provide such information is given
below.

Tables should not be used as layout aids.
Historically, many Web authors have used tables in HTML as a way to
control their page layout, making it difficult to extract tabular
data from such documents.
In particular, users of accessibility tools, like screen readers,
are likely to find it very difficult to navigate pages with tables
used for layout.
If a table is to be used for layout it must be marked with the
attribute role="presentation" for a user agent to properly represent
the table to an assistive technology and to properly convey the
intent of the author to tools that wish to extract tabular data from
the document.

There are a variety of alternatives to using HTML
tables for layout, primarily using CSS positioning and the CSS table
model. [CSS]

The border
attribute may be specified on a table element to
explicitly indicate that the table element is not being
used for layout purposes. If specified, the attribute's value must
either be the empty string or the value "1".
The attribute is used by certain user agents as an indication that
borders should be drawn around cells of the table.

Tables can be complicated to understand and navigate. To help
users with this, user agents should clearly delineate cells in a
table from each other, unless the user agent has classified the
table as a
layout table.

Authors and implementors
are encouraged to consider using some of the table design techniques
described below to make tables easier to navigate for users.

User agents, especially those that do table analysis on arbitrary
content, are encouraged to find heuristics to determine which tables
actually contain data and which are merely being used for layout.
This specification does not define a precise heuristic, but the
following are suggested as possible indicators:

The position is relative to the rows in the table. The index −1 is equivalent to
deleting the last row of the table.

If the given position is less than −1 or greater than the index of the last row, or if
there are no rows, throws an IndexSizeError exception.

The caption IDL attribute must return, on
getting, the first caption element child of the table element, if any,
or null otherwise. On setting, if the new value is a caption element, the first
caption element child of the table element, if any, must be removed, and
the new value must be inserted as the first node of the table element. If the new
value is not a caption element, then a HierarchyRequestError DOM
exception must be thrown instead.

The createCaption() method must return
the first caption element child of the table element, if any; otherwise
a new caption element must be created, inserted as the first node of the
table element, and then returned.

The deleteCaption() method must remove
the first caption element child of the table element, if any.

The tHead IDL attribute must return, on
getting, the first thead element child of the table element, if any, or
null otherwise. On setting, if the new value is a thead element, the first
thead element child of the table element, if any, must be removed, and
the new value must be inserted immediately before the first element in the table
element that is neither a caption element nor a colgroup element, if
any, or at the end of the table if there are no such elements. If the new value is not a
thead element, then a HierarchyRequestError DOM exception must be thrown
instead.

The createTHead() method must return the
first thead element child of the table element, if any; otherwise a new
thead element must be created and inserted immediately before the first element in
the table element that is neither a caption element nor a
colgroup element, if any, or at the end of the table if there are no such elements,
and then that new element must be returned.

The deleteTHead() method must remove the
first thead element child of the table element, if any.

The tFoot IDL attribute must return, on
getting, the first tfoot element child of the table element, if any, or
null otherwise. On setting, if the new value is a tfoot element, the first
tfoot element child of the table element, if any, must be removed, and
the new value must be inserted immediately before the first element in the table
element that is neither a caption element, a colgroup element, nor a
thead element, if any, or at the end of the table if there are no such elements. If
the new value is not a tfoot element, then a HierarchyRequestError DOM
exception must be thrown instead.

The createTFoot() method must return the
first tfoot element child of the table element, if any; otherwise a new
tfoot element must be created and inserted immediately before the first element in
the table element that is neither a caption element, a
colgroup element, nor a thead element, if any, or at the end of the
table if there are no such elements, and then that new element must be returned.
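The insertion points described above for thead and tfoot elements follow the same pattern: skip a fixed set of leading element types, then insert before the first remaining element, or at the end if there is none. A minimal sketch of that rule, assuming the table's element children are modelled as an array of tag names (the helper insertionIndex is hypothetical and not part of the DOM):

```javascript
// Hypothetical helper, not a DOM API. `children` lists the tag names of the
// table's element children in document order; `skipped` lists the tag names
// that a new section must be placed after.
function insertionIndex(children, skipped) {
  // Index of the first child that is not one of the skipped element types.
  const i = children.findIndex((name) => !skipped.includes(name));
  // No such child: the new element goes at the end of the table.
  return i === -1 ? children.length : i;
}

const tableChildren = ["caption", "colgroup", "tbody"];
// A new thead goes before the first child that is neither caption nor colgroup.
const theadIndex = insertionIndex(tableChildren, ["caption", "colgroup"]);
// A new tfoot additionally skips any thead.
const tfootIndex = insertionIndex(tableChildren, ["caption", "colgroup", "thead"]);
console.log(theadIndex, tfootIndex); // 2 2
```

In a real document the same positions are found by walking the element children of the table; the array model is used here only to keep the example runnable outside a browser.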

The deleteTFoot() method must remove the
first tfoot element child of the table element, if any.

The tBodies attribute must return an
HTMLCollection rooted at the table node, whose filter matches only
tbody elements that are children of the table element.

The createTBody() method must create a
new tbody element, insert it immediately after the last tbody element
child in the table element, if any, or at the end of the table element
if the table element has no tbody element children, and then must return
the new tbody element.

The rows attribute must return an
HTMLCollection rooted at the table node, whose filter matches only
tr elements that are either children of the table element, or children
of thead, tbody, or tfoot elements that are themselves
children of the table element. The elements in the collection must be ordered such
that those elements whose parent is a thead are included first, in tree order,
followed by those elements whose parent is either a table or tbody
element, again in tree order, followed finally by those elements whose parent is a
tfoot element, still in tree order.
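The ordering of the rows collection can be illustrated with a small model. This is only a sketch: the object shapes ({ tag, rows } for sections, { tag: "tr", id } for rows that are direct children of the table) and the function name orderedRows are assumptions made for the example, not DOM interfaces.

```javascript
// Hypothetical model of a table's children, in tree order.
function orderedRows(tableChildren) {
  const head = [];
  const body = [];
  const foot = [];
  for (const child of tableChildren) {
    if (child.tag === "thead") head.push(...child.rows);
    else if (child.tag === "tbody") body.push(...child.rows);
    else if (child.tag === "tr") body.push(child.id); // row directly in table
    else if (child.tag === "tfoot") foot.push(...child.rows);
    // Other children (caption, colgroup, ...) contribute no rows.
  }
  // thead rows first, then table/tbody rows, then tfoot rows; each group
  // keeps tree order because the children were visited in tree order.
  return [...head, ...body, ...foot];
}

const rowOrder = orderedRows([
  { tag: "tfoot", rows: ["f1"] },
  { tag: "tbody", rows: ["b1", "b2"] },
  { tag: "thead", rows: ["h1"] },
  { tag: "tr", id: "r1" },
]);
console.log(rowOrder.join(" ")); // h1 b1 b2 r1 f1
```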

The behavior of the insertRow(index) method depends on the state of the table. When it is called,
the method must act as required by the first item in the following list of conditions that
describes the state of the table and the index argument:

If index is less than −1 or greater than the number of elements in
the rows collection:

If the rows collection has zero elements in it, and the
table has no tbody elements in it:

The method must create a tbody element, then create a tr element,
then append the tr element to the tbody element, then append the
tbody element to the table element, and finally return the
tr element.

4.9.1.1 Techniques for describing tables

For tables that consist of more than just a grid of cells with headers
in the first row and headers in the first column, and for any table in general where the reader
might have difficulty understanding the content, authors should include explanatory information
introducing the table. This information is useful for all users, but is especially useful for
users who cannot see the table, e.g. users of screen readers.

Such explanatory information should introduce the purpose of the table, outline its basic cell
structure, highlight any trends or patterns, and generally teach the user how to use the
table.

For instance, the following table:

Characteristics with positive and negative sides

| Negative | Characteristic | Positive |
| Sad | Mood | Happy |
| Failing | Grade | Passing |

...might benefit from a description explaining the way the table
is laid out, something like "Characteristics are given in the
second column, with the negative side in the left column and the
positive side in the right column".

There are a variety of ways to include this information, such as:

In prose, surrounding the table

<p id="summary">In the following table, characteristics are
given in the second column, with the negative side in the left column and the positive
side in the right column.</p>
<table aria-describedby="summary">
<caption>Characteristics with positive and negative sides</caption>
<thead>
<tr>
<th id="n"> Negative
<th> Characteristic
<th> Positive
<tbody>
<tr>
<td headers="n r1"> Sad
<th id="r1"> Mood
<td> Happy
<tr>
<td headers="n r2"> Failing
<th id="r2"> Grade
<td> Passing
</table>

In the example above the
aria-describedby attribute is used to explicitly associate the information
with the table for assistive technology users.

Authors may also use other techniques, or combinations of the
above techniques, as appropriate.

The best option, of course, rather than writing a description
explaining the way the table is laid out, is to adjust the table
such that no explanation is needed.

In the case of the table used in the examples above, a simple
rearrangement of the table so that the headers are on the top and
left sides removes the need for an explanation as well as removing
the need for the use of headers attributes:

4.9.1.2 Techniques for table design

In visual media, providing column and row borders and alternating row backgrounds can be very
effective to make complicated tables more readable.

For tables with large volumes of numeric content, using monospaced fonts can help users see
patterns, especially in situations where a user agent does not render the borders. (Unfortunately,
for historical reasons, not rendering borders on tables is a common default.)

In speech media, table cells can be distinguished by reporting the corresponding headers before
reading the cell's contents, and by allowing users to navigate the table in a grid fashion, rather
than serializing the entire contents of the table in source order.

Authors are encouraged to use CSS to achieve these effects.

User agents are encouraged to render tables using these techniques whenever the page does not
use CSS and the table is not classified as a layout table.

A caption can introduce context for a table, making it significantly easier to understand.

Consider, for instance, the following table:

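|   | 1 | 2 | 3 | 4 | 5 | 6 |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| 6 | 7 | 8 | 9 | 10 | 11 | 12 |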

In the abstract, this table is not clear. However, with a caption giving the table's number
(for reference in the main prose) and explaining its use, it makes more sense:

<caption>
<p>Table 1.
<p>This table shows the total score obtained from rolling two
six-sided dice. The first row represents the value of the first die,
the first column the value of the second die. The total is given in
the cell that corresponds to the values of the two dice.
</caption>

This provides the user with more context:

Table 1.

This table shows the total score obtained from rolling two
six-sided dice. The first row represents the value of the first
die, the first column the value of the second die. The total is
given in the cell that corresponds to the values of the two dice.

The position is relative to the rows in the table section. The index −1 is equivalent
to deleting the last row of the table section.

If the given position is less than −1 or greater than the index of the last row, or if
there are no rows, throws an IndexSizeError exception.

The rows attribute must return an
HTMLCollection rooted at the element, whose filter matches only tr
elements that are children of the element.

The insertRow(index)
method must, when invoked on a table section element, act as follows:

If index is less than −1 or greater than the number of elements in
the rows collection, the method must throw an
IndexSizeError exception.

If index is −1 or equal to the number of items in the rows collection, the method must create a tr element,
append it to the table section element, and return the newly created
tr element.

Otherwise, the method must create a tr element, insert it as a child of the table section element, immediately before the index-th
tr element in the rows collection, and finally
must return the newly created tr element.

The deleteRow(index)
method must remove the index-th element in the rows collection from its parent. If index is
less than zero or greater than or equal to the number of elements in the rows collection, the method must instead throw an
IndexSizeError exception.
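The index rules for insertRow() and deleteRow() on a table section can be sketched with a plain array standing in for the rows collection. This is an illustration only: the functions below are hypothetical stand-ins, a RangeError is used in place of the IndexSizeError DOM exception, and deleteRow follows the normative paragraph above as written (any index below zero throws).

```javascript
// Hypothetical stand-in for a table section: `rows` is a plain array.
function insertRow(rows, index) {
  if (index < -1 || index > rows.length) {
    throw new RangeError("IndexSizeError"); // stands in for the DOM exception
  }
  const row = { cells: [] }; // stands in for a newly created tr element
  if (index === -1 || index === rows.length) {
    rows.push(row); // append to the section
  } else {
    rows.splice(index, 0, row); // insert before the index-th row
  }
  return row;
}

function deleteRow(rows, index) {
  if (index < 0 || index >= rows.length) {
    throw new RangeError("IndexSizeError");
  }
  rows.splice(index, 1); // remove the index-th row
}

const sectionRows = [];
const last = insertRow(sectionRows, -1); // appended
const first = insertRow(sectionRows, 0); // inserted before `last`
deleteRow(sectionRows, 0);               // removes `first`
console.log(sectionRows.length, sectionRows[0] === last); // 1 true
```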

This example shows a thead element being used. Notice the use of both
th and td elements in the thead element: the first row is
the headers, and the second row is an explanation of how to fill in the table.

The position is relative to the cells in the row. The index −1 is equivalent to
deleting the last cell of the row.

If the given position is less than −1 or greater than the index of the last cell, or
if there are no cells, throws an IndexSizeError exception.

The rowIndex attribute must, if the element has
a parent table element, or a parent tbody, thead, or
tfoot element and a grandparent table element, return the index
of the tr element in that table element's rows collection. If there is no such table element,
then the attribute must return −1.

The sectionRowIndex attribute must, if
the element has a parent table, tbody, thead, or
tfoot element, return the index of the tr element in the parent
element's rows collection (for tables, that's the HTMLTableElement.rows collection; for table sections, that's the
HTMLTableSectionElement.rows collection). If there is no such
parent element, then the attribute must return −1.

The cells attribute must return an
HTMLCollection rooted at the tr element, whose filter matches only
td and th elements that are children of the tr element.

The insertCell(index)
method must act as follows:

If index is less than −1 or greater than the number of elements in
the cells collection, the method must throw an
IndexSizeError exception.

If index is equal to −1 or equal to the number of items in the cells collection, the method must create a td element,
append it to the tr element, and return the newly created td
element.

Otherwise, the method must create a td element, insert it as a child of the
tr element, immediately before the index-th td or
th element in the cells collection, and finally
must return the newly created td element.

The deleteCell(index)
method must remove the index-th element in the cells collection from its parent. If index is less
than zero or greater than or equal to the number of elements in the cells collection, the method must instead throw an
IndexSizeError exception.

User agents, especially in non-visual environments or where displaying the table as a 2D grid
is impractical, may give the user context for the cell when rendering the contents of a cell; for
instance, giving its position in the table model, or listing the cell's header cells
(as determined by the algorithm for assigning header cells). When a cell's header
cells are being listed, user agents may use the value of abbr
attributes on those header cells, if any, instead of the contents of the header cells
themselves.

The th element may have a scope
content attribute specified. The scope attribute is an
enumerated attribute with five states, four of which have explicit keywords:

The row keyword, which maps to the
row state

The row state means the header cell applies to some of the subsequent cells in the
same row(s).

The col keyword, which maps to the
column state

The column state means the header cell applies to some of the subsequent cells in the
same column(s).

The rowgroup keyword, which maps to
the row group state

The row group state means the header cell applies to all the remaining cells in the
row group. A th element's scope attribute must
not be in the row group state if the element is not
anchored in a row group.

The colgroup keyword, which maps to
the column group state

The column group state means the header cell applies to all the remaining cells in the
column group. A th element's scope attribute must
not be in the column group state if the element is
not anchored in a column group.

The auto state

The auto state makes the header cell apply to a set of cells selected based on
context.
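The mapping from scope keywords to states can be sketched as a lookup. Two points in this sketch are assumptions drawn from HTML's general rules for enumerated attributes rather than from the text above: keywords are matched ASCII case-insensitively, and a missing or unrecognised value falls back to the auto state. The function name scopeState is hypothetical.

```javascript
// Keyword-to-state table for the scope attribute, as listed above.
const SCOPE_STATES = new Map([
  ["row", "row"],
  ["col", "column"],
  ["rowgroup", "row group"],
  ["colgroup", "column group"],
]);

// Hypothetical helper: `value` is the attribute value, or null if absent.
function scopeState(value) {
  if (value === null || value === undefined) return "auto"; // assumed default
  // Assumption: keywords match case-insensitively; anything else is auto.
  return SCOPE_STATES.get(value.toLowerCase()) || "auto";
}

console.log(scopeState("COL"));  // column
console.log(scopeState(null));   // auto
console.log(scopeState("rows")); // auto
```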

The th element may have an abbr
content attribute specified. Its value must be an alternative label for the header cell, to be
used when referencing the cell in other contexts (e.g. when describing the header cells that apply
to a data cell). It is typically an abbreviated form of the full header cell, but can also be an
expansion, or merely a different phrasing.

The td and th elements may also have a rowspan content attribute specified, whose value must
be a valid non-negative integer. For this attribute, the value zero means that the
cell is to span all the remaining rows in the row group.

These attributes give the number of columns and rows respectively that the cell is to span.
These attributes must not be used to overlap cells, as described in the
description of the table model.

A th element with ID id is
said to be directly targeted by all td and th elements in the
same table that have headers attributes whose values include as one of their tokens
the ID id. A th element A is said to be targeted by a th or td element
B if either A is directly targeted by B or if there exists an element C that is itself
targeted by the element B and A is directly
targeted by C.
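The "targeted" relation is the transitive closure of "directly targeted", which a short traversal can make concrete. The sketch below assumes a simplified model in which thById maps each th element's ID to that element's own headers tokens; both the model and the function name targetedBy are hypothetical.

```javascript
// Returns the IDs of all th elements targeted by an element B whose headers
// attribute contains the tokens in `headers`.
function targetedBy(thById, headers) {
  const targeted = new Set();
  const queue = [...headers];
  while (queue.length > 0) {
    const id = queue.shift();
    // Skip tokens that name no th element, and IDs already visited
    // (which also keeps the traversal finite if the references form a cycle).
    if (!thById.has(id) || targeted.has(id)) continue;
    targeted.add(id);              // directly targeted by B, or by some C
    queue.push(...thById.get(id)); // follow C's own headers tokens
  }
  return targeted;
}

const thById = new Map([
  ["n", []],      // th id="n" has no headers attribute
  ["r1", ["n"]],  // th id="r1" headers="n"
  ["g", ["r1"]],  // th id="g" headers="r1"
]);
console.log([...targetedBy(thById, ["g"])].join(" ")); // g r1 n
```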

Returns the position of the cell in the row's cells list.
This does not necessarily correspond to the x-position of the cell in the
table, since earlier cells might cover multiple rows or columns.

Returns −1 if the element isn't in a row.

The colSpan IDL attribute must
reflect the colspan content attribute. Its
default value is 1.

The rowSpan IDL attribute must
reflect the rowspan content attribute. Its
default value is 1.

The headers IDL attribute must
reflect the content attribute of the same name.

The cellIndex IDL attribute must, if the
element has a parent tr element, return the index of the cell's element in the parent
element's cells collection. If there is no such parent element,
then the attribute must return −1.

4.9.12 Processing model

The various table elements and their content attributes together define the table
model.

A table consists of cells aligned on a two-dimensional grid of
slots with coordinates (x, y). The grid is finite, and is either empty or has one or more slots. If the grid
has one or more slots, then the x coordinates are always in the range 0 ≤ x < xwidth, and the y coordinates are always in the
range 0 ≤ y < yheight. If one or both of xwidth and yheight are zero, then the
table is empty (has no slots). Tables correspond to table elements.

A cell is a set of slots anchored at a slot (cellx, celly), and with
a particular width and height such that the cell covers
all the slots with coordinates (x, y) where cellx ≤ x < cellx+width and celly ≤ y < celly+height. Cells can either be data cells
or header cells. Data cells correspond to td elements, and header cells
correspond to th elements. Cells of both types can have zero or more associated
header cells.
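As a quick illustration of the definition above, the slots covered by a cell can be enumerated from its anchor and dimensions. The object shape { x, y, width, height } and the function name coveredSlots are assumptions made for this sketch.

```javascript
// Lists every slot (x, y) with cell.x <= x < cell.x + cell.width and
// cell.y <= y < cell.y + cell.height, row by row.
function coveredSlots(cell) {
  const slots = [];
  for (let y = cell.y; y < cell.y + cell.height; y++) {
    for (let x = cell.x; x < cell.x + cell.width; x++) {
      slots.push([x, y]);
    }
  }
  return slots;
}

// A cell anchored at (1, 0) that is 2 slots wide and 2 slots high.
console.log(JSON.stringify(coveredSlots({ x: 1, y: 0, width: 2, height: 2 })));
// [[1,0],[2,0],[1,1],[2,1]]
```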

It is possible, in certain error cases, for two cells to occupy the same slot.

A row is a complete set of slots from x=0 to x=xwidth-1, for a particular value of y. Rows usually
correspond to tr elements, though a row group
can have some implied rows at the end in some cases involving
cells spanning multiple rows.

A column is a complete set of slots from y=0 to y=yheight-1, for a particular value of x. Columns can
correspond to col elements. In the absence of col elements, columns are
implied.

A row group is a set of rows anchored at a slot (0, groupy) with a particular height such that the row group
covers all the slots with coordinates (x, y) where 0 ≤ x < xwidth and groupy ≤ y < groupy+height. Row groups correspond to
tbody, thead, and tfoot elements. Not every row is
necessarily in a row group.

A column group is a set of columns anchored at a slot (groupx, 0) with a particular width such that the column group
covers all the slots with coordinates (x, y) where groupx ≤ x < groupx+width and 0 ≤ y < yheight. Column
groups correspond to colgroup elements. Not every column is necessarily in a column
group.

A table model error is an error with the data represented by table
elements and their descendants. Documents must not have table model errors.

4.9.12.1 Forming a table

To determine which elements correspond to which slots in a table associated with a table element, to determine the
dimensions of the table (xwidth and yheight), and to determine if there are any table model errors, user agents must use the following algorithm:

Let the table be the table represented
by the table element. The xwidth and yheight variables give the table's
dimensions. The table is initially empty.

If the table element has no children elements, then return the
table (which will be empty), and abort these steps.

Associate the first caption element child of the table element with
the table. If there are no such children, then it has no associated
caption element.

Let the current element be the first element child of the
table element.

If a step in this algorithm ever requires the current element to be advanced to the next child of the table when
there is no such next child, then the user agent must jump to the step labeled end, near
the end of this algorithm.

While the current element is not one of the following elements, advance the current element to the next
child of the table:

If yheight > ystart, then let all the last rows in the table from y=ystart to y=yheight-1 form a new row
group, anchored at the slot with coordinate (0, ystart), with height yheight-ystart, corresponding
to the element being processed.

If parsing that value failed or if the attribute is absent, then let rowspan be 1, instead.

If rowspan is zero and the table element's
Document is not set to quirks mode, then let cell grows
downward be true, and set rowspan to 1. Otherwise, let cell grows downward be false.

If xwidth < xcurrent+colspan, then let xwidth be xcurrent+colspan.

If yheight < ycurrent+rowspan, then let yheight be ycurrent+rowspan.

Let the slots with coordinates (x, y) such that xcurrent ≤ x < xcurrent+colspan and ycurrent ≤ y < ycurrent+rowspan be covered by a
new cell c, anchored at (xcurrent, ycurrent),
which has width colspan and height rowspan,
corresponding to the current cell element.

If the current cell element is a th element, let this new
cell c be a header cell; otherwise, let it be a data cell.

If any of the slots involved already had a cell covering
them, then this is a table model error. Those slots now have two cells
overlapping.

If cell grows downward is true, then add the tuple {c, xcurrent, colspan}
to the list of downward-growing cells.

Increase xcurrent by colspan.

If current cell is the last td or th element child in
the tr element being processed, then increase ycurrent by 1, abort this set of steps, and return to the algorithm
above.

Let current cell be the next td or th element child
in the tr element being processed.

Return to the step labeled cells.
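The cell-placement steps above (growing the table dimensions, covering slots, flagging overlaps, and recording downward-growing cells) can be sketched as follows. This is a simplified model under stated assumptions: colspan and rowspan are taken as already-parsed integers, the state object and its field names are hypothetical, and the steps that choose xcurrent are not modelled.

```javascript
// Places one cell at (state.xcurrent, state.ycurrent) and advances xcurrent.
function placeCell(state, element) {
  const colspan = element.colspan;
  let rowspan = element.rowspan;
  let growsDownward = false;
  if (rowspan === 0 && !state.quirksMode) {
    growsDownward = true;
    rowspan = 1;
  }
  const { xcurrent, ycurrent } = state;
  // Grow the table if the cell extends past its current dimensions.
  state.xwidth = Math.max(state.xwidth, xcurrent + colspan);
  state.yheight = Math.max(state.yheight, ycurrent + rowspan);

  const cell = {
    x: xcurrent, y: ycurrent, width: colspan, height: rowspan,
    kind: element.tag === "th" ? "header" : "data",
  };
  for (let y = ycurrent; y < ycurrent + rowspan; y++) {
    for (let x = xcurrent; x < xcurrent + colspan; x++) {
      const key = `${x},${y}`;
      const covering = state.slots.get(key) || [];
      // A slot that is already covered is a table model error.
      if (covering.length > 0) state.errors.push(`overlap at (${x}, ${y})`);
      covering.push(cell);
      state.slots.set(key, covering);
    }
  }
  if (growsDownward) {
    state.downwardGrowing.push({ cell, cellx: xcurrent, width: colspan });
  }
  state.xcurrent += colspan;
  return cell;
}

const state = {
  xwidth: 0, yheight: 0, xcurrent: 0, ycurrent: 0, quirksMode: false,
  slots: new Map(), errors: [], downwardGrowing: [],
};
placeCell(state, { tag: "th", colspan: 2, rowspan: 1 });
console.log(state.xwidth, state.yheight, state.xcurrent); // 2 1 2
```

Keeping a list of cells per slot, rather than a single cell, mirrors the statement that overlapping slots end up with two cells after a table model error.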

When the algorithms above require the user agent to run the algorithm for growing
downward-growing cells, the user agent must, for each {cell, cellx, width} tuple in the list of downward-growing cells, if any, extend that cell so that it also covers the slots with
coordinates (x, ycurrent), where cellx ≤ x < cellx+width.
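A minimal sketch of this growing step, assuming the slots are kept in a Map keyed by "x,y" strings and each cell is a plain object; the function name growDownward and the height bookkeeping are conventions of this example, not requirements of the algorithm.

```javascript
// Extends every downward-growing cell so that it also covers row `ycurrent`.
function growDownward(slots, downwardGrowing, ycurrent) {
  for (const { cell, cellx, width } of downwardGrowing) {
    for (let x = cellx; x < cellx + width; x++) {
      const key = `${x},${ycurrent}`;
      const covering = slots.get(key) || [];
      covering.push(cell);
      slots.set(key, covering);
    }
    // Bookkeeping for this model only: keep the recorded height in sync.
    cell.height = Math.max(cell.height, ycurrent - cell.y + 1);
  }
}

const grownCell = { x: 0, y: 0, width: 2, height: 1 };
const grid = new Map([["0,0", [grownCell]], ["1,0", [grownCell]]]);
growDownward(grid, [{ cell: grownCell, cellx: 0, width: 2 }], 1);
console.log(grownCell.height, grid.has("0,1"), grid.has("1,1")); // 2 true true
```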

4.9.12.2 Forming relationships between data cells and header cells

Each cell can be assigned zero or more header cells. The algorithm for assigning header
cells to a cell principal cell is as follows.

Let header list be an empty list of cells.

Let (principalx, principaly) be the coordinate of the slot to which the principal
cell is anchored.

Take the value of the principal cell's headers attribute and split it on spaces, letting id list be the list of tokens
obtained.

For each token in the id list, if the
first element in the Document with an ID equal to
the token is a cell in the same table, and that cell is not the
principal cell, then add that cell to header list.

If the principal cell is anchored in a row group, then add all header cells that are row group headers and are anchored in the same row group
with an x-coordinate less than or equal to principalx+principalwidth-1 and a y-coordinate less than or
equal to principaly+principalheight-1 to header
list.

If the principal cell is anchored in a column group, then add all header cells that are column group headers and are anchored in the same column
group with an x-coordinate less than or equal to principalx+principalwidth-1 and a y-coordinate less than or
equal to principaly+principalheight-1 to header
list.
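The step earlier in this algorithm that resolves the headers attribute can be sketched as follows. The model is an assumption made for illustration: documentIds maps each ID to the first element in the document with that ID, elements are plain objects with table and isCell fields, and "split on spaces" is taken to mean splitting on ASCII whitespace. The function name explicitHeaders is hypothetical.

```javascript
// Builds the part of the header list that comes from the headers attribute.
function explicitHeaders(principal, documentIds) {
  const headerList = [];
  // Split on runs of ASCII whitespace and drop empty tokens.
  const idList = principal.headers.split(/[ \t\n\f\r]+/).filter(Boolean);
  for (const token of idList) {
    const element = documentIds.get(token);
    if (
      element !== undefined &&
      element.isCell &&                   // must be a cell...
      element.table === principal.table && // ...in the same table...
      element !== principal               // ...other than the principal cell
    ) {
      headerList.push(element);
    }
  }
  return headerList;
}

const table = {};
const negative = { table, isCell: true, headers: "" };
const sad = { table, isCell: true, headers: " n  missing " };
const ids = new Map([["n", negative]]);
console.log(explicitHeaders(sad, ids).length); // 1
```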