Abstract:

A highly flexible and extensible structure is provided for physically
storing tabular data. The structure, referred to as a compression unit,
may be used to physically store tabular data that logically resides in
any type of table-like structure. Techniques are employed to avoid
changing tabular data within existing compression units. Deleting tabular
data within compression units is avoided by merely tracking deletion
requests, without actually deleting the data. Inserting new tabular data
into existing compression units is avoided by storing the new data
external to the compression units. If the number of deletions exceeds a
threshold, and/or the number of new inserts exceeds a threshold, new
compression units may be generated. When new compression units are
generated, the previously-existing compression units may be discarded to
reclaim storage, or retained to allow reconstruction of prior states of
the tabular data.

Claims:

1. A method comprising: storing, within a compression unit, data that
logically belongs to a row of a table; wherein at least a portion of the
data is compressed; and in response to a request to delete the row from
the table, storing data that indicates the row is deleted without
deleting the data for the row from the compression unit; wherein the
method is performed by one or more computing devices.

2. The method of claim 1 further comprising repackaging data from the
compression unit into one or more new compression units in response to
the number of deleted rows of the compression unit exceeding a threshold.

3. The method of claim 1 further comprising, in response to a request to
store data in the table, storing the data in an overflow area external to
the compression unit.

4. The method of claim 3 further comprising repackaging data from the
overflow area into one or more compression units in response to the
amount of data in the overflow area exceeding a threshold.

5. The method of claim 1 wherein the step of storing data that indicates
the row is deleted includes changing a bit, within a delete vector, that
corresponds to the row.

6. The method of claim 5 wherein the delete vector is stored within an
uncompressed section of the compression unit.

7. The method of claim 1 further comprising, in response to a request to
update data stored in the table, storing the update in an overflow area
external to the compression unit and updating a delete vector which
corresponds to the data stored in the table to indicate that the data
stored in the table is deleted.

8. The method of claim 7 further comprising repackaging updates from the
overflow area into one or more compression units in response to the
amount of updates in the overflow area exceeding a threshold.

9. The method of claim 1 further comprising, in response to a request to
store data in the table, storing the data in an uncompressed compression
unit which has space available.

10. The method of claim 9 further comprising, compressing the
uncompressed compression unit in response to the amount of data in the
uncompressed compression unit exceeding a threshold.

11. A non-transitory computer-readable storage storing instructions
which, when executed by one or more processors, cause performance of:
storing, within a compression unit, data that logically belongs to a row
of a table; wherein at least a portion of the data is compressed; and in
response to a request to delete the row from the table, storing data that
indicates the row is deleted without deleting the data for the row from
the compression unit.

12. The non-transitory computer-readable storage of claim 11 further
comprising instructions for repackaging data from the compression unit
into one or more new compression units in response to the number of
deleted rows of the compression unit exceeding a threshold.

13. The non-transitory computer-readable storage of claim 11 further
comprising instructions for, in response to a request to store data in
the table, storing the data in an overflow area external to the
compression unit.

14. The non-transitory computer-readable storage of claim 13 further
comprising instructions for repackaging data from the overflow area into
one or more compression units in response to the amount of data in the
overflow area exceeding a threshold.

15. The non-transitory computer-readable storage of claim 11 wherein the
step of storing data that indicates the row is deleted includes changing
a bit, within a delete vector, that corresponds to the row.

16. The non-transitory computer-readable storage of claim 15 wherein the
delete vector is stored within an uncompressed section of the compression
unit.

17. The non-transitory computer-readable storage of claim 11 further
comprising instructions for, in response to a request to update data
stored in the table, storing the update in an overflow area external to
the compression unit and updating a delete vector which corresponds to
the data stored in the table to indicate that the data stored in the
table is deleted.

18. The non-transitory computer-readable storage of claim 17 further
comprising instructions for repackaging updates from the overflow area
into one or more compression units in response to the amount of updates
in the overflow area exceeding a threshold.

19. The non-transitory computer-readable storage of claim 11 further
comprising instructions for, in response to a request to store data in
the table, storing the data in an uncompressed compression unit which has
space available.

20. The non-transitory computer-readable storage of claim 19 further
comprising instructions for, compressing the uncompressed compression
unit in response to the amount of data in the uncompressed compression
unit exceeding a threshold.

Description:

CROSS-REFERENCE TO RELATED APPLICATIONS; BENEFIT CLAIM

[0001] This application is a divisional of U.S. application Ser. No.
12/617,669 filed Nov. 12, 2009, entitled "Structure of Hierarchical
Compressed Data Structure for Tabular Data", which claims benefit of
Provisional Application No. 61/174,447, filed Apr. 30, 2009. The entire
contents of all the above-referenced applications are incorporated by
reference as if fully set forth herein, under 35 U.S.C. §119(e).

[0003] Computers are used to store and manage many types of data. Tabular
data is one common form of data that computers are used to manage.
Tabular data refers to any data that is logically organized into rows and
columns. For example, word processing documents often include tables. The
data that resides in such tables is tabular data. All data contained in
any spreadsheet or spreadsheet-like structure is also tabular data.
Further, all data stored in relational tables, or similar database
structures, is tabular data.

[0004] Logically, tabular data resides in a table-like structure, such as
a spreadsheet or relational table. However, the actual physical storage
of the tabular data may take a variety of forms. For example, the tabular
data from a spreadsheet may be stored within a spreadsheet file, which in
turn is stored in a set of disk blocks managed by an operating system. As
another example, tabular data that belongs to a relational database table
may be stored in a set of disk blocks managed by a database server.

[0005] How tabular data is physically stored can have a significant effect
on (1) how much storage space the tabular data consumes, and (2) how
efficiently the tabular data can be accessed and manipulated. If
physically stored in an inefficient manner, the tabular data may consume
more storage space than desired, and result in slow retrieval, storage
and/or update times.

[0006] Often, the physical storage of tabular data involves a trade-off
between size and speed. For example, a spreadsheet file may be stored
compressed or uncompressed. If compressed, the spreadsheet file will be
smaller, but the entire file will typically have to be decompressed when
retrieved, and re-compressed when stored again. Such decompression and
compression operations take time, resulting in slower performance.

[0007] The best compression/performance balance is particularly difficult
to achieve when tabular data includes various different types of data
items. For example, a spreadsheet may include some columns that contain
character strings, some columns that contain images, and yet other
columns that contain binary Yes/No indications. The character strings may
be highly compressible using a particular compression technique, but
applying the same compression technique to the other types of data in the
spreadsheet may yield no benefit. On the other hand, the images contained
in the spreadsheet may be highly compressible using a compression
technique that yields no benefit when used on character strings. Under
circumstances such as these, whether the user chooses to compress the
spreadsheet file using one of the techniques, or not at all, the result
is inevitably sub-optimal.

[0008] The approaches described in this section are approaches that could
be pursued, but not necessarily approaches that have been previously
conceived or pursued. Therefore, unless otherwise indicated, it should
not be assumed that any of the approaches described in this section
qualify as prior art merely by virtue of their inclusion in this section.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The present invention is illustrated by way of example, and not by
way of limitation, in the figures of the accompanying drawings and in
which like reference numerals refer to similar elements and in which:

[0010] FIG. 1 is a block diagram of a compression unit, according to an
embodiment of the invention;

[0011] FIG. 2 is a block diagram of a table that is referred to in
examples provided herein;

[0012] FIG. 3 is a block diagram showing two levels of compression units,
according to an embodiment of the invention;

[0013] FIG. 4 is a block diagram showing how tabular data from the table
illustrated in FIG. 2 may be stored in the compression units illustrated
in FIG. 3;

[0014] FIG. 5 is a block diagram illustrating how child compression units
may themselves have child compression units, according to an embodiment
of the invention;

[0015] FIG. 6 is a block diagram illustrating how a compression unit
header is split into two portions, one of which is uncompressed and one
of which is compressed, according to an embodiment of the invention; and

[0016] FIG. 7 is a block diagram of a computing device upon which
embodiments of the invention may be implemented.

DETAILED DESCRIPTION

[0017] In the following description, for the purposes of explanation,
numerous specific details are set forth in order to provide a thorough
understanding of the present invention. It will be apparent, however,
that the present invention may be practiced without these specific
details. In other instances, well-known structures and devices are shown
in block diagram form in order to avoid unnecessarily obscuring the
present invention.

General Overview

[0018] A highly flexible and extensible structure is provided for
physically storing tabular data. The structure, referred to herein as a
"compression unit", may be used to physically store tabular data that
logically resides in any type of table-like structure. For example,
compression units may be used to store tabular data from spreadsheets,
relational database tables, or tables embedded in word processing
documents. There are no limits with respect to the nature of the logical
structures to which the tabular data that is stored in compression units
belongs.

[0019] According to one embodiment, compression units are recursive. Thus,
a compression unit may have a "parent" compression unit to which it
belongs, and may have one or more "child" compression units that belong
to it. There is no limit to the number of recursive levels of compression
units that may be used to store tabular data. For the purpose of
explanation, a compression unit that has no parent is referred to herein
as a "top-level" compression unit, while a compression unit that has no
children is referred to herein as a "bottom-level" compression unit.

[0020] According to one embodiment, each top-level compression unit stores
data for all columns of the corresponding table. For example, if a table
has twenty columns, then each top-level compression unit for that table
will store data for different rows, but each of those rows will have data
for all twenty columns. However, in alternative embodiments, even at the
top-level, data from a table may be divided among compression units based
on columns. Thus, some top-level compression units may store data for the
first ten columns of a table, while other top-level compression units
store data for the second ten columns of the table. In such an
embodiment, a single row of the table may be spread among several
top-level compression units.

[0021] In one embodiment, compression units include metadata that
indicates how the tabular data is stored within them. The metadata for a
compression unit may indicate, for example, whether the data within the
compression unit is stored in row-major or column-major format (or some
combination thereof), the order of the columns within the compression
unit (which may differ from the logical order of the columns dictated by
the definition of their logical container), a compression technique for
the compression unit, the child compression units (if any), etc.

[0022] Techniques are also described hereafter for storing tabular data
into compression units, retrieving data from compression units, and
updating tabular data in compression units. According to one embodiment,
techniques are employed to avoid changing tabular data within existing
compression units. For example, deleting tabular data within compression
units is avoided by merely tracking deletion requests, without actually
deleting the data. As another example, inserting new tabular data into
existing compression units is avoided by storing the new data external to
the compression units. If the number of deletions exceeds a threshold,
and/or the number of new inserts exceeds a threshold, new compression
units may be generated. When new compression units are generated, the
previously-existing compression units may be discarded to reclaim
storage, or retained to allow reconstruction of prior states of the
tabular data.
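
The append-only behavior described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the class names, the JSON-plus-zlib payload, the use of a Python set to track deletions, and the threshold values are all assumptions made for the example.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class CompressionUnit:
    """Compressed payload that is never rewritten, plus tracked deletions."""
    payload: bytes                      # compressed rows (hypothetical encoding)
    row_count: int
    deleted: set = field(default_factory=set)

    @classmethod
    def build(cls, rows):
        return cls(zlib.compress(json.dumps(rows).encode()), len(rows))

    def live_rows(self):
        rows = json.loads(zlib.decompress(self.payload))
        return [r for i, r in enumerate(rows) if i not in self.deleted]


class Table:
    def __init__(self, rows, delete_threshold=2, overflow_threshold=2):
        self.unit = CompressionUnit.build(rows)
        self.overflow = []              # new rows stored outside the unit
        self.retired = []               # old units retained for prior states
        self.delete_threshold = delete_threshold
        self.overflow_threshold = overflow_threshold

    def delete(self, position):
        self.unit.deleted.add(position)  # record the request; payload untouched
        self._maybe_repack()

    def insert(self, row):
        self.overflow.append(row)
        self._maybe_repack()

    def rows(self):
        return self.unit.live_rows() + self.overflow

    def _maybe_repack(self):
        # Generate a new unit once either (assumed) threshold is exceeded.
        if (len(self.unit.deleted) > self.delete_threshold
                or len(self.overflow) > self.overflow_threshold):
            current = self.rows()
            self.retired.append(self.unit)   # or discard to reclaim storage
            self.unit = CompressionUnit.build(current)
            self.overflow = []
```

In this sketch the retired units are kept in a list, which corresponds to the option of retaining prior units; dropping them instead would correspond to reclaiming storage.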

Compressed and Uncompressed Sections

[0023] FIG. 1 is a block diagram of a compression unit 100 according to
one embodiment. In the embodiment illustrated in FIG. 1, compression unit
100 has two primary sections: an uncompressed section 102 and a
compressed section 104. In general, the uncompressed section 102 includes
metadata about the contents and format of the compressed section 104.
Uncompressed section 102 may indicate, for example, what compression
technique (if any) was used to compress the contents of the compressed
section 104, and how those contents are organized.

[0024] For example, assume that compression unit 100 is used for storing
tabular data from the table 200 that is illustrated in FIG. 2. Table 200
has three columns A, B, C and ten rows R1-R10. For the purpose of
explanation, assume that all of the data from Table 200 is stored in
compression unit 100, and that compression unit 100 is both a top-level
compression unit (has no parent) and a bottom-level compression unit (has
no children). Under these circumstances, the uncompressed section 102 of
compression unit 100 may simply include:

[0025] an indication of the compression technique (if any) used to
compress the contents of compressed section 104; and

[0026] an indication that compression unit 100 is a bottom-level
compression unit (and therefore has no children).

[0027] While these two pieces of information may be sufficient to allow
use of compression unit 100, alternative embodiments include several
additional pieces of metadata to provide greater flexibility and
extensibility. For example, in one embodiment, within any compression
unit, tabular data may be stored in column-major format or row-major
format. When stored in row-major format, the tabular data would be stored
within compressed section 104 in the sequence IMAGE1A, NAME1, IMAGE1C,
IMAGE2A, NAME2, IMAGE2C, etc. On the other hand, when stored in
column-major format, the tabular data would be stored within compressed
section 104 in the sequence IMAGE1A, IMAGE2A, IMAGE3A . . . NAME1, NAME2,
NAME3 . . . IMAGE1C, IMAGE2C, IMAGE3C, etc. In an embodiment that allows
the column-major/row-major selection to be made on a
compression-unit-by-compression-unit basis, uncompressed section 102 may
further include an indication of whether the tabular data contained in
the compressed section 104 is stored in row-major or column-major format.
In one embodiment, to conserve space, a compression unit does not include
the names of the columns whose data is contained in the compression unit.
Further, a compression unit may or may not store the rowids of the rows
whose data is contained in the compression unit.
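
The two orderings can be illustrated with a short sketch. The function names are hypothetical, and the three-row table is only an excerpt of table 200 using the cell labels from the sequences above.

```python
def row_major(table):
    """Emit every cell of row 1, then every cell of row 2, and so on."""
    return [cell for row in table for cell in row]


def column_major(table):
    """Emit every cell of column A, then column B, then column C."""
    return [cell for column in zip(*table) for cell in column]


# First three rows of table 200 (columns A, B, C).
table_200 = [
    ["IMAGE1A", "NAME1", "IMAGE1C"],
    ["IMAGE2A", "NAME2", "IMAGE2C"],
    ["IMAGE3A", "NAME3", "IMAGE3C"],
]
```

Column-major order places values of the same type next to each other, which is one reason a per-unit choice of format may matter for compression.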

Recursive Structure

[0028] As mentioned above, embodiments shall be described herein in which
compression units are recursive structures. Thus, a compression unit may
have a parent compression unit and any number of child compression units.
In the example given above, compression unit 100 did not have any child
compression units. However, in situations in which compression unit 100
has child compression units, the compression unit 100 may include a
header that has information about the child compression units. The header
for compression unit 100 may be stored in the uncompressed section 102,
or split between the uncompressed section 102 and the compressed section
104.

[0029] In the situation illustrated in FIG. 3, compression unit 100 has
two child compression units 300 and 310. As illustrated, child
compression units 300 and 310 have the same general structure as their
parent compression unit 100. That is, similar to compression unit 100,
child compression units 300 and 310 include uncompressed sections and
compressed sections. Further, compression units 300 and 310 reside
entirely within the compressed section 104 of their parent compression
unit 100. Consequently, whatever compression is applied to compressed
section 104 at the level of compression unit 100 applies to the entirety
of compression units 300 and 310.

[0030] Because the compression of parent compression units applies to the
entirety of their child compression units, even the uncompressed sections
302 and 312 of child compression units may in fact be compressed. Thus,
the "uncompressed" section of a compression unit is only uncompressed
relative to the level in which the section resides (but may be compressed
based on compression applied at higher level compression units). In
contrast, the compressed section of a compression unit is compressed
relative to the level in which the section resides (in addition to any
compression applied at higher level compression units).

[0031] According to one embodiment, when compression unit 100 is the
parent of one or more child compression units, the header of compression
unit 100 includes additional information. For example, in one embodiment,
the header of compression unit 100 indicates (a) an offset at which each
child compression unit begins, and (b) which data is contained in each
child compression unit.

[0032] For example, assume that a particular compression technique CT1 is
particularly good at compressing images. Under these circumstances, it
may be desirable to compress the images in columns A and C of table 200
using compression technique CT1, while compressing the strings of column
B with a different compression technique CT2. To achieve this compression
combination using the two child compression units 300 and 310,
compression unit 300 may be used to store the images from columns A and
C, while compression unit 310 is used to store the strings from column B.
This distribution of data is illustrated in FIG. 4.

[0033] According to one embodiment, to indicate the distribution of data
illustrated in FIG. 4, the header of the parent compression unit 100
would indicate that the data within compressed section 104 is stored in
column-major format, and that columns A and C are stored in compression
unit 300 while column B is stored in compression unit 310. The
uncompressed section 302 of compression unit 300, in turn, would indicate
that compression technique CT1 applies to compressed section 304.
Similarly, the uncompressed section 312 of compression unit 310 would
indicate that compression technique CT2 applies to compressed section
314.
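
One way to picture the FIG. 4 layout is the sketch below. It is illustrative only: zlib and bz2 stand in for the unnamed techniques CT1 and CT2, the JSON encoding and the 2-byte header length prefix are assumptions, and the "length" entry in the directory is an added convenience (the description mentions only offsets and contents). For brevity the parent applies no compression of its own here.

```python
import bz2
import json
import zlib

# Hypothetical stand-ins for compression techniques CT1 and CT2.
CODECS = {
    "CT1": (zlib.compress, zlib.decompress),
    "CT2": (bz2.compress, bz2.decompress),
}


def pack_child(technique, columns):
    """Child unit = uncompressed section (technique) + compressed section."""
    compress, _ = CODECS[technique]
    header = json.dumps({"technique": technique}).encode()
    body = compress(json.dumps(columns).encode())
    return len(header).to_bytes(2, "big") + header + body


def unpack_child(blob):
    header_len = int.from_bytes(blob[:2], "big")
    header = json.loads(blob[2:2 + header_len])
    _, decompress = CODECS[header["technique"]]
    return json.loads(decompress(blob[2 + header_len:]))


def pack_parent(children):
    """children: list of (column_names, technique, {column_name: values})."""
    directory, body = [], b""
    for names, technique, columns in children:
        blob = pack_child(technique, columns)
        directory.append(
            {"offset": len(body), "length": len(blob), "columns": names})
        body += blob
    return {"format": "column-major", "children": directory}, body


def read_column(header, body, name):
    """Use the parent header to locate and decode only the needed child."""
    for entry in header["children"]:
        if name in entry["columns"]:
            start = entry["offset"]
            return unpack_child(body[start:start + entry["length"]])[name]
    raise KeyError(name)
```

Because the parent header records where each child begins and which columns it holds, a reader interested only in column B need not decode the child that holds columns A and C.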

[0034] Because of the recursive nature of compression units, the
compression units 300 and 310 may themselves be parents to one or more
child compression units. For example, in FIG. 5, compression unit 300 is
shown as having two child compression units 500 and 510. Compression unit
500 stores the images from columns A and C for rows R1 to R5, while
compression unit 510 stores the images from columns A and C for rows R6
to R10. Because the data within compressed portion 304 is distributed
between compression units 500 and 510 based on rows, the uncompressed
section 302 of compression unit 300 would indicate that, at the level of
compression unit 300, the tabular data is organized in row-major format.

[0035] In this example, compression units 500 and 510 are bottom-level
compression units that are two levels below the top-level compression
unit 100. On the other hand, compression unit 310 is a bottom-level
compression unit that resides one level below the top-level compression
unit 100. Thus, in one embodiment, bottom-level compression units that
store tabular data for the same table may be at different depths,
depending on how the tabular data has been spread among compression
units.
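
The differing depths in FIG. 5 can be checked with a small recursive sketch. The `Unit` class and `bottom_level_depths` function are hypothetical names introduced for the example; only the unit numbers come from the figures.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    name: str
    children: list = field(default_factory=list)


def bottom_level_depths(unit, depth=0):
    """Map each bottom-level unit to its distance from the top-level unit."""
    if not unit.children:
        return {unit.name: depth}
    depths = {}
    for child in unit.children:
        depths.update(bottom_level_depths(child, depth + 1))
    return depths


# FIG. 5: unit 100 contains 300 and 310; unit 300 contains 500 and 510.
unit_100 = Unit("100", [
    Unit("300", [Unit("500"), Unit("510")]),
    Unit("310"),
])
```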

Metadata Describing Internal Organization of Compression Units

[0036] Because the information within compression units may be organized
in a virtually infinite number of ways, metadata is maintained to
indicate how each compression unit is organized. Depending on the
implementation, the metadata about the organization of tabular data
within a compression unit may be stored external to the compression unit,
or within the compression unit. When stored within the compression unit,
the metadata may be stored in the uncompressed section, the compressed
section, or divided between both. The actual manner in which the metadata
is stored may vary from implementation to implementation.

[0037] According to one embodiment, the metadata that describes the
organization of tabular data within a compression unit is stored in a
header within the compression unit, and includes both an uncompressed
header portion 600 and a compressed header portion 630, as illustrated in
FIG. 6. It should be understood that the embodiment illustrated in FIG. 6
is merely one example of how the uncompressed header portion 600 may be
organized, and the types of metadata that the uncompressed header portion
600 may contain.

[0038] In the embodiment illustrated in FIG. 6, the initial "length" field
602 stores metadata that indicates the compressed size of the compression
unit. In this context, the "compressed size" means the amount of storage
occupied by the compression unit before any data contained there is
decompressed. However, some compression units may not actually compress
data. In such cases, the "compressed size" would be the same as the
uncompressed size.

[0039] In the embodiment illustrated in FIG. 6, the length field 602 is
followed by a series of flags 604. The flags 604 indicate whether or not
the header contains certain fields. When the flag associated with a field
indicates that the field is not present, then the field is either not
relevant to the particular compression unit, or some "default" value is
assumed for the field. The flags 604, and their corresponding fields,
shall be discussed hereafter in greater detail.
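
A flag-driven header of this kind might be encoded as in the sketch below. The bit assignments, field widths, and byte order are assumptions chosen for illustration; the description does not specify them. Only two optional fields are shown.

```python
import struct
from enum import IntFlag


class Flags(IntFlag):
    # Hypothetical bit assignments.
    VERSION = 0x01
    NUM_COLUMNS = 0x02


def encode_header(length, version=None, num_columns=None):
    """Write the length, the flags, then only the fields that are present."""
    flags = Flags(0)
    fields = b""
    if version is not None:
        flags |= Flags.VERSION
        fields += struct.pack(">H", version)
    if num_columns is not None:
        flags |= Flags.NUM_COLUMNS
        fields += struct.pack(">H", num_columns)
    return struct.pack(">IB", length, int(flags)) + fields


def decode_header(blob, defaults):
    """Read present fields; fall back to defaults for absent ones."""
    length, raw = struct.unpack_from(">IB", blob, 0)
    flags = Flags(raw)
    offset = struct.calcsize(">IB")
    header = {"length": length, **defaults}
    if Flags.VERSION in flags:
        (header["version"],) = struct.unpack_from(">H", blob, offset)
        offset += 2
    if Flags.NUM_COLUMNS in flags:
        (header["num_columns"],) = struct.unpack_from(">H", blob, offset)
        offset += 2
    return header
```

A unit that omits a field pays nothing for it beyond the flag bit, and the reader substitutes a default (for example, the parent's value).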

Example Flags and Fields

[0040] According to one embodiment, flags 604 include a flag that
indicates whether a version number field 606 is present in the header.
The version number field 606 may be used in situations where the
application that is managing the tabular structure (e.g. a spreadsheet
program, word processor, or relational database system) supports
versioning. In systems that support versioning, the version number field
606 is used to store a value that indicates the version of the tabular
data contained within the compression unit. According to one embodiment,
it is assumed that child compression units are associated with the same
version as their parents, so the version number field 606 need only be
used in top-level compression units.

[0041] In one embodiment, flags 604 include a flag that indicates whether the
compression unit includes fields relating to child compression units. In
the embodiment illustrated in FIG. 6, such fields include the "contained
unit information" stored within the compressed header portion 630. If a
compression unit is a bottom-level compression unit, then the compression
unit will not have any child compression units, and therefore will
neither have nor require any header fields relating to child compression units.

[0042] In one embodiment, flags 604 include a flag that indicates whether
the header contains a column order vector 612. If the flag is false, then
it is assumed that the columns are organized within the compression unit
in the same column order as the "parent column order". For child
compression units, the parent column order is the column order specified
by its parent compression unit. For top-level compression units, the
column order is the column order defined by the tabular structure itself.

[0043] For example, the column order defined for table 200 is A, B, C.
Therefore, the parent column order for compression unit 100, which is a
top-level compression unit, is A, B, C. If the column order flag for
compression unit 100 is false, then it would be assumed that the column
order within compression unit 100 is A, B, C. However, as illustrated in
FIG. 4, within compression unit 100 the columns are ordered A, C, B
(where columns A and C are stored in child compression unit 300). Thus,
for compression unit 100, the column order flag would be true, and
compression unit 100 would have a column order vector 612 to indicate
the mapping between the parent column order A, B, C and the new
column order A, C, B.

[0044] The column order vector 612 may indicate the mapping between column
orders in a variety of ways. According to one embodiment, the positions
in the column order vector 612 correspond to the columns in the parent
column order. Thus, the first, second and third positions within the
column order vector 612 correspond to columns A, B, and C, respectively.
However, the values stored at those positions in the column order vector
612 indicate the new order of the columns. For example, in the new column
order (A, C, B) imposed by compression unit 100, column A is still the
first column. Thus, the first position of the column order vector would
store a "1".

[0045] On the other hand, in the new column order (A, C, B) imposed by
compression unit 100, column B is now third in the sequence.
Consequently, the second position in the column order vector 612 would
store the value "3".

[0046] Finally, in the new column order (A, C, B) imposed by compression
unit 100, column C is now second in the sequence. Consequently, the third
position in the column order vector 612 would store the value "2".

[0047] Thus, the column order vector "1, 3, 2" within compression unit 100
would indicate that compression unit 100 has changed the order of the
columns from the parent column order A, B, C, to the new column order A,
C, B.
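
The mapping just described can be reproduced with a short sketch. The function names are hypothetical; positions are 1-based to match the values "1, 3, 2" used above.

```python
def column_order_vector(parent_order, new_order):
    """For each column in parent order, store its 1-based position in the new order."""
    return [new_order.index(column) + 1 for column in parent_order]


def apply_vector(parent_order, vector):
    """Recover the new column order from the parent order and the vector."""
    new_order = [None] * len(parent_order)
    for column, position in zip(parent_order, vector):
        new_order[position - 1] = column
    return new_order
```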

[0048] Metadata that remaps the parent column order in this manner is
merely one example of metadata that may be used to indicate the column
sequence used within a compression unit. Numerous alternatives may be
used. For example, the header may simply store a sequence of column
identifiers, where the column identifiers uniquely identify columns and
the sequence of the identifiers indicates the sequence of the column data
within the compression unit.

[0049] According to one embodiment, flags 604 include an "uncompressed"
flag that indicates whether the unit is compressed or uncompressed. If
the uncompressed flag is true, then the "compressed portion" of the
compression unit is not actually compressed at the current level.
However, as pointed out above, even an "uncompressed" compression unit
may be compressed if it is the descendant of any compression unit that
does apply compression. Similarly, an "uncompressed" compression unit may
store data in child compression units that do apply compression. Thus,
the uncompressed flag only indicates whether compression occurs relative
to the level of the compression unit to which the flag belongs.

[0050] If the uncompressed flag is true, then the header of the
compression unit will not have a compression algorithm field 614. On the
other hand, if the uncompressed flag is false, then the header of the
compression unit will include a compression algorithm field 614. When
present, the compression algorithm field 614 indicates the compression
algorithm that was used to compress the compressed section of the
compression unit.

[0051] The compression algorithm used to compress the compressed section
of a compression unit is distinct from any compression that may be
applied by any parent compression unit, and from any compression that may
be applied by any child compression unit. For example, the header of
compression unit 100 may indicate that compression technique X was used
to compress compressed section 104 of compression unit 100. The header of
compression unit 300 may indicate that compression technique Y was used
to compress compressed section 304 of compression unit 300. Finally, the
header of compression unit 310 may indicate that the compressed section
314 of compression unit 310 is actually uncompressed. Under these
conditions, the data within compressed section 304 will actually be
double compressed, first as part of compressed section 304 using
compression technique Y, and then as part of compressed section 104 using
compression technique X.
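
The layering in this example can be sketched as follows. zlib and bz2 are stand-ins for the unnamed techniques X and Y, the byte strings are placeholder data, and the `boundary` variable is a simplification of the offset information a real header would carry.

```python
import bz2
import zlib

images = b"IMAGE" * 200   # placeholder for the data of columns A and C
names = b"NAME" * 200     # placeholder for the data of column B

# Level of compression unit 300: technique Y (bz2 as a stand-in).
section_304 = bz2.compress(images)

# Compression unit 310 is "uncompressed" at its own level.
section_314 = names

# Level of compression unit 100: technique X (zlib as a stand-in)
# applies to everything inside compressed section 104.
boundary = len(section_304)
section_104 = zlib.compress(section_304 + section_314)

# Reading back peels the layers from the outside in.
inner = zlib.decompress(section_104)
recovered_images = bz2.decompress(inner[:boundary])
recovered_names = inner[boundary:]
```

Here the image data passes through both techniques, while the name data is compressed only by the parent, even though its own unit is flagged as uncompressed.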

[0052] In one embodiment, metadata indicating the decompressed length of
compressed data is also stored in the header of the compression unit.

[0053] In one embodiment, flags 604 include a "number-of-columns" flag
that indicates whether the unit contains information on the number of
columns contained in the unit. The number-of-columns flag may be false,
for example, if the compression unit has exactly the same number of
columns as its parent. For top-level compression units, the
number-of-columns flag may be false if the compression unit contains all
of the columns of the spreadsheet and/or table for which the compression
unit is storing tabular data.

[0054] In the example illustrated in FIG. 4, the number-of-columns flag of
compression unit 100 would be false because compression unit 100 has all
of the columns of table 200. However, the number-of-columns flag of
compression units 300 and 310 would both be true, because they do not
have the same number of columns as their parent compression unit 100.

[0055] In one embodiment, flags 604 include a number-of-rows flag that
indicates whether the unit contains information on the number of rows
contained within the compression unit. Similar to the number-of-columns
flag, the number-of-rows flag may be false if (a) the compression unit
stores all of the rows of its parent compression unit, or (b) the
compression unit is a top-level compression unit that stores all of the
rows of the spreadsheet and/or table for which the compression unit is
storing tabular data.

[0056] In the example illustrated in FIG. 4, the number-of-rows flags of
compression units 100, 300 and 310 would all be false, because all of
them have all rows of table 200. However, in compression units 500 and
510 of FIG. 5, the number-of-rows flag would be true, because compression
units 500 and 510 have subsets of the rows of their parent compression
unit 300.

[0057] In one embodiment, flags 604 include a flag that indicates whether
there is a delete vector field 618 in the header. As shall be described
in greater detail hereafter, the delete vector field 618 may be used to
store a delete vector that indicates that information has been deleted
from the compression unit, without actually deleting the corresponding
data.

[0058] In one embodiment, flags 604 include a checksum flag that indicates
whether there are row checksums in the compression unit. Row checksums
may be used to determine whether data has become corrupted. However, row
checksums consume space, and therefore may be omitted under some
situations or implementations.

[0059] In one embodiment, flags 604 are extensible. Consequently, new
flags may be added to flags 604 as needed.
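
The flags described above may be pictured as an extensible bit field. The following is a minimal sketch, not a definitive layout: the bit positions and the names HeaderFlags and optional_fields are assumptions made for illustration, since no particular encoding of flags 604 is specified.

```python
from enum import IntFlag


class HeaderFlags(IntFlag):
    # Hypothetical bit assignments; the encoding of flags 604 is not specified.
    NUMBER_OF_COLUMNS = 0x01  # header carries a column count
    NUMBER_OF_ROWS = 0x02     # header carries a row count
    DELETE_VECTOR = 0x04      # header carries a delete vector field
    ROW_CHECKSUMS = 0x08      # unit carries row checksums
    # New flags may be added here using unused bits.


def optional_fields(flags: HeaderFlags) -> list[str]:
    """Return the names of the optional items that the flags mark as present."""
    return [flag.name for flag in HeaderFlags if flag in flags]


# Compression unit 300 of FIG. 4: fewer columns than its parent, all of the rows.
unit_300_flags = HeaderFlags.NUMBER_OF_COLUMNS
# Compression unit 500 of FIG. 5: a subset of the rows of its parent.
unit_500_flags = HeaderFlags.NUMBER_OF_ROWS
```

Because each flag occupies its own bit, a reader that does not recognize a newly added flag can still interpret the flags it does recognize.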

Contained Unit Information

[0060] If a compression unit contains no smaller units, then the
(compressed) data for the unit is at the start of the compressed section
of the unit, immediately following the compression unit header 600. On
the other hand, if the compression unit does contain lower-level units,
then instead of starting with the data, the compressed section of the
unit starts with a (compressed) data structure with information on the
contained units. One embodiment of such a contained units structure is
illustrated in FIG. 6 as contained unit information 630.

[0061] In the illustrated embodiment, the contained unit information 630
starts with flags 622. In one embodiment, the first flag indicates
whether the unit is divided based on rows or columns. The second flag
indicates whether there is one column per unit. Thus, if contained unit
information 630 is for a compression unit that contains three columns A,
B and C, and each of the columns is in a different child compression
unit, then the first flag of flags 622 would indicate that the data is
divided based on columns, and the second flag of flags 622 would indicate
that there is one column per child compression unit.

[0062] On the other hand, if contained unit information 630 is for a
compression unit that contains three columns A, B and C, but columns A
and C are in the same child compression unit, then the first flag of
flags 622 would indicate that the data is divided based on columns, and
the second flag of flags 622 would indicate that there is not one column
per child compression unit.

[0063] In the illustrated embodiment, the flags 622 are followed by a
number of units field 624. The number of units field 624 indicates the
number of child compression units. While the illustrated embodiment
includes a number of units field 624, such a field need not be present in
alternative embodiments.

[0064] The number of units field 624 is followed by a map 626 either from
rows to units, or from columns to units, depending on whether the data is
divided by rows or by columns. For example, map 626 for compression unit
100, illustrated in FIG. 4, would indicate that columns A and C are
stored in child compression unit 300, and that column B is stored in
child compression unit 310. On the other hand, map 626 for compression
unit 300, illustrated in FIG. 5, would indicate that rows R1-R5 are
stored in child compression unit 500, and that rows R6-R10 are stored in
child compression unit 510.

[0065] According to one embodiment, in both column major and row major
situations, the map 626 is a vector with length equal to the number of
contained units. In one embodiment, each entry in the vector is the
number of rows or columns in the corresponding child compression unit.
Thus, if the column map has entries 2, 5 and 3, then the first unit
contains the first two columns in the order specified previously in the
header, and then the second unit contains the next five columns, and the
third unit contains the next three columns. If there is one column per
unit, then both the number of units and column mapping may be eliminated.
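
As an illustration of how such a map may be consulted, the following sketch resolves a row or column position to the child compression unit that holds it. The function name unit_for_position and the use of 0-based positions are assumptions made for this example only.

```python
from bisect import bisect_right
from itertools import accumulate


def unit_for_position(unit_map: list[int], position: int) -> int:
    """Return the 0-based index of the child unit that holds the row or
    column at the given 0-based position.

    unit_map holds, per child unit, the number of rows or columns it stores.
    """
    # Running totals: [2, 5, 3] becomes [2, 7, 10].
    boundaries = list(accumulate(unit_map))
    if not boundaries or not 0 <= position < boundaries[-1]:
        raise IndexError("position is outside the compression unit")
    # The first boundary strictly greater than position identifies the unit.
    return bisect_right(boundaries, position)


column_map = [2, 5, 3]
first_unit = unit_for_position(column_map, 1)   # second column overall
second_unit = unit_for_position(column_map, 2)  # third column overall
third_unit = unit_for_position(column_map, 9)   # tenth column overall
```

Storing counts rather than explicit lists keeps the map small, at the cost of computing running totals when a position is looked up.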

[0066] The contained unit information 630 concludes with pointers 628 to
the headers of each of the contained compression units. According to one
embodiment, these pointers are relative to the start of the uncompressed
unit. The pointers are relative to the start of the uncompressed unit
because, in order to make use of the contained unit information 630,
including the pointers 628, the compressed section of the compression
unit would have already been uncompressed.

Obtaining Tabular Data Stored in Compression Units

[0067] The recursive nature of compression units allows tabular data to be
compressed at each of many levels. For example, within a bottom-level
compression unit, data may be compressed using run-length encoding. That
bottom-level compression unit may be a child of an intermediate-level
compression unit that compresses the bottom-level compression unit (and
everything else in its compressed section) using LZO compression. That
intermediate-level compression unit may be a child of a top-level
compression unit that compresses the intermediate-level compression unit
(and everything else in its compressed section) using BZIP2 compression.

[0068] To obtain tabular data, the various compression operations have to
be undone in reverse chronological order. In the example given above, the
data must be decompressed using BZIP2 decompression, then decompressed
using LZO decompression, and then decoded using run-length decoding.
Because each decompression operation consumes resources, some operations
may be performed directly on compressed data (without decompressing it),
for example on data that is run-length encoded. In situations where
decompression is necessary, it is desirable to perform only the
decompression operations necessary for any particular operation.
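
The layering described above may be sketched as follows. This is an illustrative sketch under stated assumptions: zlib is substituted for LZO because LZO is not available in the Python standard library, JSON is used as a stand-in serialization, and the function names are hypothetical.

```python
import bz2
import json
import zlib


def rle_encode(values):
    """Run-length encode a list into [value, run_length] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original list."""
    return [value for value, length in runs for _ in range(length)]


def pack(values) -> bytes:
    bottom = json.dumps(rle_encode(values)).encode()  # bottom level: run-length
    middle = zlib.compress(bottom)   # intermediate level (zlib stands in for LZO)
    return bz2.compress(middle)      # top level: BZIP2


def unpack(blob: bytes):
    # Undo the compression operations in reverse order of application.
    middle = bz2.decompress(blob)
    bottom = zlib.decompress(middle)
    return rle_decode(json.loads(bottom))


def count_value(runs, target) -> int:
    """Count occurrences of target directly on run-length encoded data."""
    return sum(length for value, length in runs if value == target)
```

The count_value function illustrates an operation that works on the run-length encoded form without expanding it back into individual values.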

[0069] For example, assume that a request is made for the names associated
with rows R1 to R10 of table 200. As illustrated in FIG. 4, those names
are in column B, which is stored in child compression unit 310. Thus, to
obtain the names, the compressed section 104 would be decompressed. Once
decompressed, the contained unit information within compressed section
104 can be read to determine that column B is stored in compression unit
310. The pointer to compression unit 310 is followed to find the header
for compression unit 310. The header, which is stored in uncompressed
section 312, contains metadata that indicates how compressed section 314
was compressed. Compressed section 314 may then be decompressed to obtain
the names.

[0070] Significantly, during the process of obtaining the names from
column B, the compressed section 304 of compression unit 300 was not
uncompressed, because compressed section 304 did not have any data or
metadata necessary to obtain the names from rows R1 to R10. Conversely,
if the request was for images and not names, compressed section 304 of
compression unit 300 would have to be decompressed, while compressed
section 314 of compression unit 310 would not be decompressed.
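
The selective decompression described above may be sketched as follows. The byte layout, the use of zlib and JSON, and the names build_unit and read_column are assumptions made for illustration. In this sketch the pointers are offsets measured from the end of the contained unit information within the decompressed parent section.

```python
import json
import zlib


def build_unit(child_columns: list[dict]) -> bytes:
    """Build a parent unit from one {column name: values} dict per child unit."""
    bodies = [zlib.compress(json.dumps(cols).encode()) for cols in child_columns]
    pointers, position = [], 0
    for body in bodies:
        pointers.append([position, len(body)])
        position += len(body)
    info = json.dumps({
        "column_map": [list(cols) for cols in child_columns],
        "pointers": pointers,
    }).encode()
    # Contained unit information first, then the separately compressed children.
    payload = len(info).to_bytes(4, "big") + info + b"".join(bodies)
    return zlib.compress(payload)


def read_column(unit: bytes, column: str, touched: list[int]) -> list:
    """Return one column, decompressing only the child unit that holds it."""
    payload = zlib.decompress(unit)  # analogous to decompressing section 104
    info_length = int.from_bytes(payload[:4], "big")
    info = json.loads(payload[4:4 + info_length])
    base = 4 + info_length
    for index, names in enumerate(info["column_map"]):
        if column in names:
            start, length = info["pointers"][index]
            child = zlib.decompress(payload[base + start:base + start + length])
            touched.append(index)  # record which child was decompressed
            return json.loads(child)[column]
    raise KeyError(column)


unit_100 = build_unit([
    {"A": [1, 2], "C": ["image-1", "image-2"]},  # plays the role of unit 300
    {"B": ["Ann", "Bob"]},                       # plays the role of unit 310
])
touched_units: list[int] = []
names = read_column(unit_100, "B", touched_units)
```

After the call, touched_units records only the child unit that stores column B, reflecting that the child unit storing columns A and C was never decompressed.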

Mixing Compressed and Uncompressed Data

[0071] According to one embodiment, the system may store data in
compression units in uncompressed form or in compressed form. The system
may, based on how many rows are in the compression unit, or based on the
compressibility of the data, choose not to compress the compression unit.

[0072] According to one embodiment, a table may contain compression units
and rows which are stored external to compression units. A row may be
stored in conventional row-major disk blocks, or may be stored using a
row-based compression technique, such as the technique described in U.S.
patent application
Ser. No. 11/875,642 entitled "ON-LINE TRANSACTION PROCESSING (OLTP)
COMPRESSION AND RE-COMPRESSION OF DATABASE DATA" filed on Oct. 19, 2007,
the entire contents of which are incorporated herein by reference. When
some tabular data for a table is stored in compression units, and other
tabular data for the same table is stored external to compression units,
the location of the data that is stored external to compression units is
referred to herein as the "overflow area".

[0073] In one embodiment, in response to the data in the overflow area
exceeding a particular threshold, the overflow data may be automatically
moved into one or more new compression units. For example, several DML
operations may result in the overflow area having thousands of rows. In
response to detecting that the size of the data in the overflow area has
exceeded some threshold, the data from the overflow may be repackaged
into one or more new compression units. Similar to the bulk load
situation, the new top-level compression units that are created to store
the data from the overflow area may have the same internal structure as
compression.
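
The threshold-driven repackaging of the overflow area may be sketched as follows. The class name Table, the row-count threshold, and the use of zlib and JSON are assumptions made for this example; the threshold could equally be measured in bytes.

```python
import json
import zlib


class Table:
    """Hypothetical table holding compression units plus an overflow area."""

    def __init__(self, overflow_threshold: int):
        self.overflow_threshold = overflow_threshold
        self.compression_units: list[bytes] = []  # each entry is a compressed blob
        self.overflow: list[dict] = []            # rows stored outside the units

    def insert(self, row: dict) -> None:
        # New rows go to the overflow area, never into an existing unit.
        self.overflow.append(row)
        if len(self.overflow) > self.overflow_threshold:
            self.repackage_overflow()

    def repackage_overflow(self) -> None:
        # Move every overflow row into one new compression unit.
        blob = zlib.compress(json.dumps(self.overflow).encode())
        self.compression_units.append(blob)
        self.overflow = []


table = Table(overflow_threshold=3)
for i in range(4):
    table.insert({"id": i})
```

In this sketch the fourth insert pushes the overflow area past its threshold, so all four rows are moved into a single new compression unit and the overflow area is emptied.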

[0074] According to one embodiment, tabular data is deleted from, inserted
into, and updated within compression units directly, in response to
operations performed on the corresponding table. In the case where the
compression unit contains compressed data, performing such operations on
the data itself incurs overhead, due to the need to decompress the data
before making the change, and then recompress the data after making the
change. In the case where the compression unit contains uncompressed
data, no such cost is incurred and the data may be acted upon directly.

Deleting Tabular Data Stored in Compression Units

[0075] In one embodiment, the delete vector in delete vector field 618
(illustrated in FIG. 6) is used to delete rows from a table without
actually deleting, from the compression unit, the data that the rows
contain. For example, assume that a particular compression unit stores
data for 1000 rows. The corresponding delete vector may include 1000
bits, where the position of the bit indicates the row to which the bit
corresponds. If a request is received to delete the 10th row from the
compression unit, then the 10th bit of the delete vector is set to
indicate that the corresponding row is deleted. However, the data for the
10th row is not actually deleted from the compression unit.
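
A delete vector of this kind may be sketched as a packed bit array. The class name DeleteVector, the 0-based row numbering, and the bit ordering within each byte are assumptions made for illustration.

```python
class DeleteVector:
    """One bit per row; a set bit marks the row as deleted."""

    def __init__(self, row_count: int):
        self.row_count = row_count
        self.bits = bytearray((row_count + 7) // 8)  # eight rows per byte

    def _check(self, row: int) -> None:
        if not 0 <= row < self.row_count:
            raise IndexError("row is outside the compression unit")

    def mark_deleted(self, row: int) -> None:
        # Only the vector changes; the row data itself is left untouched.
        self._check(row)
        self.bits[row // 8] |= 1 << (row % 8)

    def is_deleted(self, row: int) -> bool:
        self._check(row)
        return bool(self.bits[row // 8] & (1 << (row % 8)))

    def deleted_count(self) -> int:
        return sum(bin(byte).count("1") for byte in self.bits)


vector = DeleteVector(1000)
vector.mark_deleted(9)  # the 10th row, using 0-based numbering
```

For a unit of 1000 rows the vector occupies 125 bytes, and marking a row as deleted touches a single byte of the header.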

[0076] Various benefits result from treating deletions in this manner. For
example, by using the delete vector, deletions do not incur the overhead
associated with decompressing the compressed section of a compression
unit (and any lower-level compression units contained therein), because
the delete vector is in the uncompressed section of the compression unit.

[0077] Further, the decompression overhead is not the only overhead
avoided by using the delete vector. Specifically, if the compressed
section was decompressed to remove the deleted row, then the compressed
section would have to be recompressed after the row data was removed,
thereby incurring more overhead. In addition, deletion of data from a
compressed set of data may, under some circumstances, increase the
compressed size of the data.

[0078] In one embodiment, rather than include a delete vector in the
header of all compression units, the delete vector is only included at
the top-level compression units. Inspection of the top-level delete
vector indicates which rows have been deleted without having to access
the headers of any lower-level compression units.

[0079] According to one embodiment, if the number of rows that are deleted
exceeds a particular threshold, then the entire compression unit is
rewritten. For example, if the delete vector indicates that more than
some threshold percentage of the rows within a compression unit have been
deleted, the compression unit may be decompressed, and the
not-yet-deleted rows may be stored in a new compression unit. If
sufficiently few rows remain, the system may store the new compression
unit in uncompressed form to avoid the overhead of decompressing it
later. Alternatively, during this process, the data from many
compression units may be combined into a new, smaller set of compression
units which may be compressed.
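
The rewrite decision described above may be sketched as follows. The function name rewrite_if_needed, both threshold parameters and their default values, and the use of zlib and JSON are assumptions made for this example.

```python
import json
import zlib


def rewrite_if_needed(payload: bytes, deleted: list[bool],
                      max_deleted_fraction: float = 0.25,
                      min_rows_to_compress: int = 4):
    """Return a new unit holding only live rows, or None if no rewrite is due.

    payload is the compressed row data; deleted plays the role of the
    delete vector, with one entry per row.
    """
    if not deleted or sum(deleted) / len(deleted) <= max_deleted_fraction:
        return None  # below the threshold: the existing unit is kept as is
    rows = json.loads(zlib.decompress(payload))
    live = [row for row, gone in zip(rows, deleted) if not gone]
    raw = json.dumps(live).encode()
    # With few rows remaining, skip compression to avoid later overhead.
    compress = len(live) >= min_rows_to_compress
    return {
        "compressed": compress,
        "payload": zlib.compress(raw) if compress else raw,
        "deleted": [False] * len(live),
    }


old_payload = zlib.compress(json.dumps(list(range(10))).encode())
untouched = rewrite_if_needed(old_payload, [False] * 10)
rewritten = rewrite_if_needed(old_payload, [True] * 8 + [False] * 2)
```

In the second call, eight of ten rows are marked deleted, so the two surviving rows are written to a new unit, which is left uncompressed because so few rows remain.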

Inserting Tabular Data

[0080] According to one embodiment, the insertion of data into a
compression unit may be done directly. However, the addition of data into
a compressed compression unit could incur significant overhead penalties,
due to the decompression and recompression that would be required.
Further, the resulting compression unit may be larger than desired. In
the case that the compression unit contains data in uncompressed form,
and the block contains sufficient space, the data may be inserted
directly without such overhead.

[0081] According to one embodiment, newly added tabular data is not
inserted into existing compression units. Instead, the newly added
tabular data is either stored in the overflow area or stored in newly
formed compression units which may be compressed or uncompressed
depending on the amount of data inserted so far.

[0082] In one embodiment, if a small number of rows are being inserted
into table 200, these rows may be stored external to compression units in
the overflow area or they may be inserted into an uncompressed
compression unit that has space available. If the insertion results in
that compression unit exceeding some threshold, the system may compress
the data in the compression unit.

[0083] In one embodiment, when the amount of data to be inserted into
table 200 exceeds a threshold, then the data is not stored in the
overflow area or existing uncompressed compression units. Rather, the new
data is stored in new compression units. For example, if a bulk load
operation is performed to add thousands of rows to table 200, then one or
more new compression units may be created to store the tabular data for
the new rows. According to one embodiment, the new top-level compression
units would automatically inherit the same internal structure as
compression unit 100, including the structure and organization of the
compression units that descend from compression unit 100.
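
The choice among the insertion targets described in this section may be sketched as a simple routing function. The function name, its parameters, and the preference for an uncompressed compression unit over the overflow area are assumptions made for illustration; the embodiments above permit either destination for small inserts.

```python
def choose_insert_target(rows_to_insert: int,
                         bulk_threshold: int,
                         free_rows_in_uncompressed_unit: int) -> str:
    """Pick a destination for newly inserted rows."""
    if rows_to_insert > bulk_threshold:
        # Large loads bypass the overflow area and existing units entirely.
        return "new compression units"
    if rows_to_insert <= free_rows_in_uncompressed_unit:
        # Small inserts fit into an uncompressed unit with available space.
        return "existing uncompressed compression unit"
    return "overflow area"


bulk_load = choose_insert_target(5000, bulk_threshold=1000,
                                 free_rows_in_uncompressed_unit=50)
small_insert = choose_insert_target(10, bulk_threshold=1000,
                                    free_rows_in_uncompressed_unit=50)
no_space = choose_insert_target(10, bulk_threshold=1000,
                                free_rows_in_uncompressed_unit=0)
```

In no case does the function route new rows into an existing compressed compression unit, which avoids the decompression and recompression overhead discussed above.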

Updating Tabular Data

[0084] According to one embodiment, data may be updated directly within a
compression unit. However, updating data within a compressed compression
unit could incur significant overhead penalties, due to the decompression
and recompression that would be required. Further, the resulting
compression unit may be larger than desired. In the case that the
compression unit contains data in uncompressed form, and the block
contains sufficient space, the data may be updated directly without such
overhead.

[0085] According to one embodiment, updates are treated as deletions
combined with inserts. Thus, when a value is updated in a row of table
200, the delete vector in compression unit 100 is updated to indicate
that the row is deleted, and a row with the updated values is stored in
the overflow area.

[0086] Frequently, there will be some columns of an updated row that are
not changed by an update operation. Consequently, prior to storing the
updated row in the overflow area, the compressed section of the
compression unit (and any child compression units) may have to be
decompressed to recover the pre-update values of the row. The new row
stored in the overflow area includes the pre-update values of the columns
of the row that were not changed, and new values for the columns of the
row that were changed.
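
Treating an update as a deletion combined with an insert may be sketched as follows. The dictionary layout of the unit, the function name update_row, and the use of zlib and JSON are assumptions made for this example.

```python
import json
import zlib


def update_row(unit: dict, overflow: list, row_index: int, changes: dict) -> dict:
    """Apply an update as a delete-vector mark plus an overflow insert.

    unit is {"payload": compressed rows, "deleted": one bool per row}.
    """
    if unit["deleted"][row_index]:
        raise KeyError("row has already been deleted")
    # Decompression is needed only to recover the unchanged column values.
    rows = json.loads(zlib.decompress(unit["payload"]))
    new_row = {**rows[row_index], **changes}
    unit["deleted"][row_index] = True  # the compressed payload is not rewritten
    overflow.append(new_row)
    return new_row


rows = [{"id": 1, "name": "Ann", "city": "Oslo"},
        {"id": 2, "name": "Bob", "city": "Lima"}]
unit = {"payload": zlib.compress(json.dumps(rows).encode()),
        "deleted": [False, False]}
payload_before = unit["payload"]
overflow_area: list = []
update_row(unit, overflow_area, 0, {"city": "Rome"})
```

After the call, the overflow area holds the complete updated row, the delete vector marks the original row as deleted, and the compressed payload is byte-for-byte unchanged.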

Reading Tabular Data

[0087] In an embodiment that uses an overflow area, table scans must read
both the data that is stored in the overflow area, and the data that is
stored in compression units. Thus, a single table scan may involve
combining data from several differently organized compression units, from
compressed data in the overflow area, and from uncompressed data in the
overflow area.
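
A table scan that combines these sources may be sketched as follows. The dictionary layout of each unit, the function name scan_table, and the use of zlib and JSON are assumptions made for illustration. The sketch also skips rows marked in each unit's delete vector, as described in the section on deleting tabular data.

```python
import json
import zlib


def scan_table(compression_units: list[dict], overflow_rows: list):
    """Yield every live row from the compression units, then the overflow area."""
    for unit in compression_units:
        payload = unit["payload"]
        raw = zlib.decompress(payload) if unit["compressed"] else payload
        for row, deleted in zip(json.loads(raw), unit["deleted"]):
            if not deleted:  # rows marked deleted are skipped, not removed
                yield row
    yield from overflow_rows


units = [
    {"compressed": True,
     "payload": zlib.compress(json.dumps(["r1", "r2", "r3"]).encode()),
     "deleted": [False, True, False]},
    {"compressed": False,
     "payload": json.dumps(["r4"]).encode(),
     "deleted": [False]},
]
result = list(scan_table(units, ["r2-updated"]))
```

Here the scan returns rows from a compressed unit, an uncompressed unit and the overflow area, and the row that was superseded by an update appears only in its updated form.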

Hardware Overview

[0088] According to one embodiment, the techniques described herein are
implemented by one or more special-purpose computing devices. The
special-purpose computing devices may be hard-wired to perform the
techniques, or may include digital electronic devices such as one or more
application-specific integrated circuits (ASICs) or field programmable
gate arrays (FPGAs) that are persistently programmed to perform the
techniques, or may include one or more general purpose hardware
processors programmed to perform the techniques pursuant to program
instructions in firmware, memory, other storage, or a combination. Such
special-purpose computing devices may also combine custom hard-wired
logic, ASICs, or FPGAs with custom programming to accomplish the
techniques. The special-purpose computing devices may be desktop computer
systems, portable computer systems, handheld devices, networking devices
or any other device that incorporates hard-wired and/or program logic to
implement the techniques.

[0089] For example, FIG. 7 is a block diagram that illustrates a computer
system 700 upon which an embodiment of the invention may be implemented.
Computer system 700 includes a bus 702 or other communication mechanism
for communicating information, and a hardware processor 704 coupled with
bus 702 for processing information. Hardware processor 704 may be, for
example, a general purpose microprocessor.

[0090] Computer system 700 also includes a main memory 706, such as a
random access memory (RAM) or other dynamic storage device, coupled to
bus 702 for storing information and instructions to be executed by
processor 704. Main memory 706 also may be used for storing temporary
variables or other intermediate information during execution of
instructions to be executed by processor 704. Such instructions, when
stored in storage media accessible to processor 704, render computer
system 700 into a special-purpose machine that is customized to perform
the operations specified in the instructions.

[0091] Computer system 700 further includes a read only memory (ROM) 708
or other static storage device coupled to bus 702 for storing static
information and instructions for processor 704. A storage device 710,
such as a magnetic disk or optical disk, is provided and coupled to bus
702 for storing information and instructions.

[0092] Computer system 700 may be coupled via bus 702 to a display 712,
such as a cathode ray tube (CRT), for displaying information to a
computer user. An input device 714, including alphanumeric and other
keys, is coupled to bus 702 for communicating information and command
selections to processor 704. Another type of user input device is cursor
control 716, such as a mouse, a trackball, or cursor direction keys for
communicating direction information and command selections to processor
704 and for controlling cursor movement on display 712. This input device
typically has two degrees of freedom in two axes, a first axis (e.g., x)
and a second axis (e.g., y), that allows the device to specify positions
in a plane.

[0093] Computer system 700 may implement the techniques described herein
using customized hard-wired logic, one or more ASICs or FPGAs, firmware
and/or program logic which in combination with the computer system causes
or programs computer system 700 to be a special-purpose machine.
According to one embodiment, the techniques herein are performed by
computer system 700 in response to processor 704 executing one or more
sequences of one or more instructions contained in main memory 706. Such
instructions may be read into main memory 706 from another storage
medium, such as storage device 710. Execution of the sequences of
instructions contained in main memory 706 causes processor 704 to perform
the process steps described herein. In alternative embodiments,
hard-wired circuitry may be used in place of or in combination with
software instructions.

[0094] The term "storage media" as used herein refers to any media that
store data and/or instructions that cause a machine to operate in a
specific fashion. Such storage media may comprise non-volatile media
and/or volatile media. Non-volatile media includes, for example, optical
or magnetic disks, such as storage device 710. Volatile media includes
dynamic memory, such as main memory 706. Common forms of storage media
include, for example, a floppy disk, a flexible disk, hard disk, solid
state drive, magnetic tape, or any other magnetic data storage medium, a
CD-ROM, any other optical data storage medium, any physical medium with
patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any
other memory chip or cartridge.

[0095] Storage media is distinct from but may be used in conjunction with
transmission media. Transmission media participates in transferring
information between storage media. For example, transmission media
includes coaxial cables, copper wire and fiber optics, including the
wires that comprise bus 702. Transmission media can also take the form of
acoustic or light waves, such as those generated during radio-wave and
infra-red data communications.

[0096] Various forms of media may be involved in carrying one or more
sequences of one or more instructions to processor 704 for execution. For
example, the instructions may initially be carried on a magnetic disk or
solid state drive of a remote computer. The remote computer can load the
instructions into its dynamic memory and send the instructions over a
telephone line using a modem. A modem local to computer system 700 can
receive the data on the telephone line and use an infra-red transmitter
to convert the data to an infra-red signal. An infra-red detector can
receive the data carried in the infra-red signal and appropriate
circuitry can place the data on bus 702. Bus 702 carries the data to main
memory 706, from which processor 704 retrieves and executes the
instructions. The instructions received by main memory 706 may optionally
be stored on storage device 710 either before or after execution by
processor 704.

[0097] Computer system 700 also includes a communication interface 718
coupled to bus 702. Communication interface 718 provides a two-way data
communication coupling to a network link 720 that is connected to a local
network 722. For example, communication interface 718 may be an
integrated services digital network (ISDN) card, cable modem, satellite
modem, or a modem to provide a data communication connection to a
corresponding type of telephone line. As another example, communication
interface 718 may be a local area network (LAN) card to provide a data
communication connection to a compatible LAN. Wireless links may also be
implemented. In any such implementation, communication interface 718
sends and receives electrical, electromagnetic or optical signals that
carry digital data streams representing various types of information.

[0098] Network link 720 typically provides data communication through one
or more networks to other data devices. For example, network link 720 may
provide a connection through local network 722 to a host computer 724 or
to data equipment operated by an Internet Service Provider (ISP) 726. ISP
726 in turn provides data communication services through the world wide
packet data communication network now commonly referred to as the
"Internet" 728. Local network 722 and Internet 728 both use electrical,
electromagnetic or optical signals that carry digital data streams. The
signals through the various networks and the signals on network link 720
and through communication interface 718, which carry the digital data to
and from computer system 700, are example forms of transmission media.

[0099] Computer system 700 can send messages and receive data, including
program code, through the network(s), network link 720 and communication
interface 718. In the Internet example, a server 730 might transmit a
requested code for an application program through Internet 728, ISP 726,
local network 722 and communication interface 718.

[0100] The received code may be executed by processor 704 as it is
received, and/or stored in storage device 710, or other non-volatile
storage for later execution.

[0101] In the foregoing specification, embodiments of the invention have
been described with reference to numerous specific details that may vary
from implementation to implementation. Thus, the sole and exclusive
indicator of what is the invention, and is intended by the applicants to
be the invention, is the set of claims that issue from this application,
in the specific form in which such claims issue, including any subsequent
correction. Any definitions expressly set forth herein for terms
contained in such claims shall govern the meaning of such terms as used
in the claims. Hence, no limitation, element, property, feature,
advantage or attribute that is not expressly recited in a claim should
limit the scope of such claim in any way. The specification and drawings
are, accordingly, to be regarded in an illustrative rather than a
restrictive sense.