Sample Files

Implementation Strategy

Eventually, the code handling SpreadsheetML import will be merged with the existing Excel binary filter, so the new code needs to be designed with that in mind. Understanding how the existing binary filter works while implementing the XML filter will make the future merge less painful.

Code Organization

Source files for handling the SpreadsheetML format are located in inc/oox/xls and source/xls in the oox module. A good place to start tracing the code is ExcelFilter::Import; follow the calls it makes from there.

A substream in the XML package is called a "fragment", and each fragment has an associated *fragment.hxx header file. For instance, the code for loading the workbook.xml fragment is found in workbookfragment.hxx, and so on.

A nested element is called a "context", and, like the fragments, each context has an associated *context.hxx header file. For instance, the code for parsing the <sheetData> element is found in sheetdatacontext.hxx.

Here, the term workbook refers to an entire document, which includes worksheets and other document metadata, whereas the term worksheet refers to each individual sheet in the workbook.

The corresponding code of the existing binary Excel filter is located in sc/source/filter/(inc|excel)/xi(page|view).(c|h)xx. The XclImpTabViewSettings class handles importing a sheet's view settings, which correspond to the worksheet/sheetViews context in the XML format.

The UNO interface code is found in sc/source/ui/uno.

Handling Fragment

WorkbookFragment (class)

Handles loading of the workbook.xml fragment. It loads the associated relationship file (xl/_rels/workbook.xml.rels) in the constructor.

Handling Context

In most cases the fragment handler handles all nested contexts by itself to increase performance. For this, some helper classes have been implemented that do all the work needed to deal with nested contexts: ContextHelper, FragmentBase, and ContextBase, found in contexthelper.hxx, excelfragmentbase.hxx, and excelcontextbase.hxx respectively. In general, a new fragment or context handler has to implement the interface of the ContextHelper class from contexthelper.hxx. The classes FragmentBase (excelfragmentbase.hxx) and ContextBase (excelcontextbase.hxx) already provide default implementations of all virtual functions, but a derived class is free to override them.
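The pattern described above (an interface, a base class with default implementations, and a fragment handler that deals with its nested contexts itself) can be sketched roughly as follows. This is only an illustration: HandlerInterface, HandlerBase, SheetFragmentSketch and feed are hypothetical names, and their member functions are assumptions that do not mirror the real ContextHelper, FragmentBase or ContextBase signatures.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the handler interface (NOT the real ContextHelper).
class HandlerInterface {
public:
    virtual ~HandlerInterface() = default;
    // Returns true if this handler processes the nested element itself.
    virtual bool acceptsChild(const std::string& element) const = 0;
    virtual void onStartElement(const std::string& element) = 0;
    virtual void onEndElement(const std::string& element) = 0;
};

// Hypothetical base class with do-nothing defaults, playing the role that
// FragmentBase and ContextBase are described as playing.
class HandlerBase : public HandlerInterface {
public:
    bool acceptsChild(const std::string&) const override { return false; }
    void onStartElement(const std::string&) override {}
    void onEndElement(const std::string&) override {}
};

// A fragment handler that handles its nested contexts by itself and only
// overrides the functions it needs.
class SheetFragmentSketch : public HandlerBase {
public:
    bool acceptsChild(const std::string& element) const override {
        return element == "sheetData" || element == "row" || element == "c";
    }
    void onStartElement(const std::string& element) override {
        if (element == "row")
            ++rowCount;
        else if (element == "c")
            ++cellCount;
    }
    int rowCount = 0;
    int cellCount = 0;
};

// Minimal driver: feeds a flat list of element names to the handler and
// skips every element the handler does not accept.
void feed(HandlerInterface& handler, const std::vector<std::string>& elements) {
    for (const auto& element : elements) {
        if (handler.acceptsChild(element)) {
            handler.onStartElement(element);
            handler.onEndElement(element);
        }
    }
}
```

Because onEndElement is not overridden in SheetFragmentSketch, the default from HandlerBase is used, which is the convenience the base classes are meant to provide.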

Relation (class)

Holds three strings: ID, Type and Target (need more info).
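As a rough illustration, such a record and a lookup by ID might look like the sketch below. RelationSketch and RelationsSketch are hypothetical names, not the actual classes; the only thing taken as given is that each entry in a .rels file carries an Id, a Type and a Target.

```cpp
#include <map>
#include <string>

// Hypothetical record mirroring the three strings of a relationship entry.
struct RelationSketch {
    std::string id;      // e.g. "rId1"
    std::string type;    // relationship type URI
    std::string target;  // path of the related part, relative to the source part
};

// Hypothetical container that resolves a relation ID to its record.
class RelationsSketch {
public:
    void add(const RelationSketch& relation) { byId_[relation.id] = relation; }

    // Returns nullptr if no relation with the given ID exists.
    const RelationSketch* find(const std::string& id) const {
        const auto it = byId_.find(id);
        return it == byId_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, RelationSketch> byId_;
};
```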

AddressConverter (class)

Converts strings to addresses and ranges, and tracks invalid addresses (e.g. a cell at address ZZZ1000000 that cannot be imported). Later, this information is used to generate an "Imported document contains data outside of sheet limits" warning box after loading. Header: addressconverter.hxx
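The idea of converting an address string while remembering overflows can be sketched as follows. AddressConverterSketch and its members are hypothetical and much simpler than the real class (no ranges, no sheet names, no absolute markers); the sheet limits are passed in by the caller rather than assumed.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical zero-based cell address.
struct CellAddressSketch {
    std::int32_t column = 0;
    std::int32_t row = 0;
};

// Hypothetical converter: parses "A1"-style strings and remembers whether
// any parsed address was outside the sheet limits.
class AddressConverterSketch {
public:
    AddressConverterSketch(std::int32_t maxColumn, std::int32_t maxRow)
        : maxColumn_(maxColumn), maxRow_(maxRow) {}

    // Returns true and fills 'out' only for a well-formed address inside the
    // limits. Out-of-range addresses return false and set the overflow flags.
    bool convert(const std::string& text, CellAddressSketch& out) {
        const std::int64_t kSaturate = 1000000000;  // avoids integer overflow
        std::size_t i = 0;
        std::int64_t column = 0;  // one-based while parsing
        while (i < text.size() && std::isalpha(static_cast<unsigned char>(text[i]))) {
            const int letter = std::toupper(static_cast<unsigned char>(text[i])) - 'A' + 1;
            column = std::min(column * 26 + letter, kSaturate);
            ++i;
        }
        if (i == 0 || i == text.size())
            return false;  // no column letters or no row digits
        std::int64_t row = 0;  // one-based while parsing
        for (; i < text.size(); ++i) {
            if (!std::isdigit(static_cast<unsigned char>(text[i])))
                return false;
            row = std::min(row * 10 + (text[i] - '0'), kSaturate);
        }
        if (row == 0)
            return false;  // row numbers start at 1
        const bool columnOk = column - 1 <= maxColumn_;
        const bool rowOk = row - 1 <= maxRow_;
        columnOverflow_ = columnOverflow_ || !columnOk;
        rowOverflow_ = rowOverflow_ || !rowOk;
        if (!columnOk || !rowOk)
            return false;
        out.column = static_cast<std::int32_t>(column - 1);
        out.row = static_cast<std::int32_t>(row - 1);
        return true;
    }

    bool columnOverflow() const { return columnOverflow_; }
    bool rowOverflow() const { return rowOverflow_; }
    // True if a warning about data outside the sheet limits would be due.
    bool hasOverflow() const { return columnOverflow_ || rowOverflow_; }

private:
    std::int32_t maxColumn_;
    std::int32_t maxRow_;
    bool columnOverflow_ = false;
    bool rowOverflow_ = false;
};
```

After loading, a caller would check hasOverflow() once and show the warning box if it returns true, instead of reporting every failing address individually.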

UnitConverter (class)

Provides basic unit conversion, including font-dependent conversions such as calculating a column width from a specific number of characters. Header: unitconverter.hxx
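A font-dependent conversion of this kind could look roughly like the sketch below. It assumes that the width of a digit in the default font has already been measured in 1/100 mm and that a fixed padding is added per column; both values, and the names UnitConverterSketch, columnWidthFromChars and pointsToMm100, are hypothetical and not taken from unitconverter.hxx.

```cpp
#include <cmath>

// Hypothetical converter; all lengths are in 1/100 mm.
struct UnitConverterSketch {
    double digitWidthMm100;  // measured digit width of the default font (assumed input)
    double paddingMm100;     // fixed padding per column (assumed input)

    // Column width for a given number of characters, rounded to the nearest unit.
    long columnWidthFromChars(double charCount) const {
        return std::lround(charCount * digitWidthMm100 + paddingMm100);
    }
};

// Font-independent conversion: 1 point = 1/72 inch = 2540/72 hundredths of a millimetre.
inline double pointsToMm100(double points) {
    return points * 2540.0 / 72.0;
}
```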

Development

| Feature | Developer | Status | OOo 3.0 | OOo 3.2 | OOo 3.3 | Comments/Missing |
|---|---|---|---|---|---|---|
| Framework, fragment handling | cl/dr | done | yes | yes | yes | |
| Password, decryption | cmc/dr | done | | yes | yes | For all filters: Word, Excel, Powerpoint |
| Workbook, worksheet fragment, sheet names | tbe/dr | done | yes | yes | yes | |
| Simple cell contents (values, strings) | dr | done | yes | yes | yes | missing Calc feature: error cells |
| Shared strings | dr | done | yes | yes | yes | |
| Simple cell formatting (alignment, protection, borders, fill) | dr | done | yes | yes | yes | |
| Builtin number formats | dr | done | yes | yes | yes | |
| Font handling for cells | tbe/dr | done | yes | yes | yes | |
| Cell styles (names, formatting) | dr | done | yes | yes | yes | |
| Column settings (format, width, outlines) | tbe/dr/kohei | done | yes | yes | yes | missing Calc feature: outline symbol position |
| Row settings (format, height, outlines) | tbe/dr/kohei | done | yes | yes | yes | missing Calc feature: outline symbol position |
| Rich text in cells | dr | done | yes | yes | yes | |
| Scheme fragment, scheme colors | dr | done | yes | yes | yes | |
| Cell formulas, array formulas, shared formulas | dr | done | yes | yes | yes | |
| Conditional formatting | dr | done | yes | yes | yes | |
| External references/links | dr/er | done | partly | yes | yes | |
| Print ranges, builtin defined names | dr | done | yes | yes | yes | |
| Page/print settings | dr | done | yes | yes | yes | |
| Header/footer | kohei/dr | done | yes | yes | yes | |
| Column/row breaks | kohei | done | yes | yes | yes | |
| Sheet/document view settings | dr | done | yes | yes | yes | |
| Cell hyperlinks | dr/kohei | done | yes | yes | yes | Because a cell hyperlink is an in-cell text field object in Calc, as opposed to a cell property in Excel, a hyperlink will not get imported when the cell is a value cell. |

External References

xl/_rels/workbook.xml.rels contains the path(s) to the external link fragments, which are usually xl/externalLinks/externalLink*.xml.

xl/externalLinks/externalLink*.xml contains a summary of the content of the external book being referenced. It should provide enough information to display the content of the external cells. It also refers to the path of the external file by rID.

xl/externalLinks/_rels/externalLink*.xml.rels contains the path to the external file, stored by rID.

Inside a cell that contains an external cell reference, the formula is given in the following format:

<f>[1]Sheet1!A1</f>
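Splitting such a formula string into the external book index, the sheet name and the cell address could be sketched like this. ExternalRefSketch, parseExternalRef and resolveTarget are hypothetical helpers; the sketch handles only the simple unquoted form shown above, and the map from book index to file path is assumed to have been built elsewhere from the external link fragments and their relationship files.

```cpp
#include <map>
#include <string>

// Hypothetical result of splitting "[1]Sheet1!A1".
struct ExternalRefSketch {
    int bookIndex = 0;  // number inside the square brackets
    std::string sheet;  // sheet name between ']' and '!'
    std::string cell;   // cell address after '!'
};

// Parses the simple, unquoted form "[index]Sheet!Cell".
// Returns false for anything else; quoted sheet names are not handled.
bool parseExternalRef(const std::string& formula, ExternalRefSketch& out) {
    if (formula.size() < 2 || formula[0] != '[')
        return false;
    const std::size_t close = formula.find(']');
    if (close == std::string::npos || close == 1)
        return false;  // no closing bracket or empty index
    const std::size_t bang = formula.find('!', close);
    if (bang == std::string::npos || bang == close + 1 || bang + 1 >= formula.size())
        return false;  // missing '!', empty sheet name, or empty cell address
    int index = 0;
    for (std::size_t i = 1; i < close; ++i) {
        if (formula[i] < '0' || formula[i] > '9')
            return false;
        index = index * 10 + (formula[i] - '0');
        if (index > 100000)
            return false;  // implausibly large index, treat as malformed
    }
    out.bookIndex = index;
    out.sheet = formula.substr(close + 1, bang - close - 1);
    out.cell = formula.substr(bang + 1);
    return true;
}

// Looks up the external file path for a book index in a caller-supplied map.
// Returns an empty string if the index is unknown.
std::string resolveTarget(const std::map<int, std::string>& targetsByIndex, int bookIndex) {
    const auto it = targetsByIndex.find(bookIndex);
    return it == targetsByIndex.end() ? std::string() : it->second;
}
```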

Data Validations

ScConditionEntry needs new methods for setting pFormula1 and pFormula2, which are of type ScTokenArray. ScValidationData is a direct subclass of ScConditionEntry and represents the data validation attribute of a cell.

ScTableValidationObj implements the UNO wrapper for ScValidationData. This class needs to implement the css::sheet::XFormulaTokens interface so that client code can pass formula tokens to it; those tokens eventually need to find their way into ScValidationData as an ScTokenArray.
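The intended data flow (client passes tokens to the wrapper, wrapper hands them on to the core object through new setters) can be sketched as below. All types here are hypothetical stand-ins: ConditionEntrySketch, ValidationDataSketch and TableValidationObjSketch only imitate the roles of ScConditionEntry, ScValidationData and ScTableValidationObj, TokenArraySketch is a plain vector rather than ScTokenArray, and the setTokens signature is an assumption that does not reproduce the UNO interface.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical formula token and token array.
struct FormulaTokenSketch {
    int opCode;
    std::string data;
};
using TokenArraySketch = std::vector<FormulaTokenSketch>;

// Stand-in for the condition entry, with the proposed new setters.
class ConditionEntrySketch {
public:
    virtual ~ConditionEntrySketch() = default;
    void setFormula1(TokenArraySketch tokens) { formula1_ = std::move(tokens); }
    void setFormula2(TokenArraySketch tokens) { formula2_ = std::move(tokens); }
    const TokenArraySketch& formula1() const { return formula1_; }
    const TokenArraySketch& formula2() const { return formula2_; }

private:
    TokenArraySketch formula1_;
    TokenArraySketch formula2_;
};

// Stand-in for the validation data, a direct subclass of the condition entry.
class ValidationDataSketch : public ConditionEntrySketch {};

// Stand-in for the UNO wrapper: collects tokens from client code and later
// transfers them into the core validation object.
class TableValidationObjSketch {
public:
    // index 0 selects the first formula, index 1 the second.
    // Returns false for any other index.
    bool setTokens(int index, TokenArraySketch tokens) {
        if (index == 0)
            tokens1_ = std::move(tokens);
        else if (index == 1)
            tokens2_ = std::move(tokens);
        else
            return false;
        return true;
    }

    // Copies the collected tokens into the core object via the new setters.
    void applyTo(ValidationDataSketch& data) const {
        data.setFormula1(tokens1_);
        data.setFormula2(tokens2_);
    }

private:
    TokenArraySketch tokens1_;
    TokenArraySketch tokens2_;
};
```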