OOXML, Macros and Security

As we all know, rich desktop editors, such as those provided in Microsoft Office, offer a range of end-user programming options, such as Visual Basic macros. These can be used to automate repetitive clerical tasks, such as a mail merge, or to add a custom user interface over a data entry form. These capabilities have existed in personal productivity applications since the late 1980s, so for 20 years now. This is not a cutting-edge feature.

Such scripting capabilities are essential for the creation of high-value scripted documents, and they are essential in modern applications. Almost every word processor or spreadsheet today has automation capabilities. Even open source applications like OpenOffice have macro features. So, considering the popularity and value of scripting in a productivity application, it is much lamented that DIS 29500 does not define how scripts or macros are to work. This lack will cause serious interoperability concerns, as each vendor, lacking standards guidance, will implement these features in incompatible ways.

Specifically, in order to have any interoperability among scripted documents, it is necessary to define:

How and where a script is stored and located within the Open Packaging Convention (OPC) container file.

How is the script bound to the document? In other words, how does the document content associate itself with the macro?

What is the runtime language of the script?

What are the core and extension API’s available to the script?

What is the security model?

OOXML defines none of these. So how can it meet its goal to “represent faithfully the existing corpus of word-processing documents, spreadsheets and presentations that have been produced by Microsoft Office applications (from Microsoft Office 97 to Microsoft Office 2008 inclusive)”? How can it do that and ignore the macros that have been around for decades?

Note that there is ample precedent for a markup standard answering these questions in a flexible and interoperable manner. For example the common web paradigm would be:

Script is located via URL specified in a “src” attribute of a script element, or is given inline

The script is invoked by a function call at a particular point in the document, or triggered from a standard event such as onLoad().

Multiple runtime languages are supported, often EcmaScript

The API’s allowed are defined by the W3C’s DOM API

There is a defined security model to deal with hazards such as cross-frame scripting, etc.

OOXML provides none of this, so interoperability of these high value documents is not possible. Note again that scripting is widespread and has been around for 20 years. So it is especially unfortunate that a newly proposed standard lacks this capability.
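To make the web precedent concrete: because HTML defines where scripts live (a script element with a “src” attribute or inline content, plus event attributes such as onload), a third-party tool can enumerate them without any knowledge of the authoring application. The following is a minimal sketch in Python using only the standard library. The class and function names (ScriptLocator, locate_scripts) are made up for illustration, and the check is deliberately simple rather than a complete treatment of every way a page can carry script.

```python
from html.parser import HTMLParser


class ScriptLocator(HTMLParser):
    """Collects external script URLs, inline script bodies and on* handlers.

    Hypothetical helper for illustration only, not a full HTML script auditor.
    """

    def __init__(self):
        super().__init__()
        self.external = []   # values of "src" attributes on script elements
        self.inline = []     # text content of inline script elements
        self.handlers = []   # (tag, attribute, code) for on* event attributes
        self._in_inline = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "script":
            if attr_map.get("src"):
                self.external.append(attr_map["src"])
            else:
                # Start collecting the body of an inline script.
                self._in_inline = True
                self.inline.append("")
        for name, value in attrs:
            # Event attributes such as onload or onclick bind code to content.
            if name.startswith("on") and value:
                self.handlers.append((tag, name, value))

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline = False

    def handle_data(self, data):
        if self._in_inline:
            self.inline[-1] += data


def locate_scripts(html_text):
    """Parse html_text and return the populated ScriptLocator."""
    locator = ScriptLocator()
    locator.feed(html_text)
    locator.close()
    return locator
```

A mail gateway or virus scanner could run something like this over any HTML page, whoever produced it. The point of the comparison is that no equivalent can be written against DIS 29500 alone, since the text gives no corresponding rules for where to look.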

Note however that scripting is not without its problems. We all remember the Word Macro Viruses of several years ago, such as Melissa. Portable code has well-known risks, and these risks have well-known counter-measures. For example, it is common for anti-virus software to scan Word documents for viruses. It is also common for mail servers to scan incoming emails for attachments with viruses, and even remove the macros or block documents with macros, according to admin policy. So there is a need to enable 3rd party applications that can locate, retrieve, scan and delete scripting elements from documents. However, since OOXML does not define even where the scripts are stored, or how they can be located, such 3rd party applications cannot be written in general for a document described by this specification. The standard provides an insufficient foundation for implementing a reasonable security policy around OOXML documents.

For example, take Ecma Response 101, approved in Geneva in a 9-4 vote as part of a large batch of 1027 changes, without discussion or opportunity for dissent. Four NB’s, in their ballot comments from last September, pointed out that Section 2.16.5.41 of DIS 29500’s Part 4 defines a “MACROBUTTON” field that allows the definition of a button in the document that will trigger a macro. But nothing is said about how the macro is stored, how it is bound, what API’s are available, what the security model is, etc.

The request from one NB was to “Describe this feature to a level where cross-platform, cross-application interoperability is possible.” However, what Ecma provided in their draft Disposition of Comments report, approved in batch by the BRM without discussion or opportunity for objection, was something quite different. They merely added the following text:

The mechanism by which the command specified by text in field-argument-1 is located and/or executed by an application is implementation-defined

So not only is it impossible to have cross-platform interoperability of this feature, it is not even possible to implement a reasonable security policy to detect, scan or block macros. Even the location of the macro is outside the scope of the standard. It could be just another file in the Zip. It could be a binary blob with an obscure content type that varies from application to application. It could be Base64-encoded in the XML. Or it could be steganographically encoded in low-order bits of an image file. The OOXML standard is singularly unhelpful in telling us how to deal with the risks of this macro function.
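To illustrate the problem, here is a rough sketch of what a macro scanner is reduced to: a heuristic that walks the parts of an OPC package and flags anything matching a vendor convention. It assumes the content type and file name that Microsoft Office is known to use for its VBA projects (application/vnd.ms-office.vbaProject, vbaProject.bin). Those values come from one vendor's practice, not from the text of DIS 29500. The function name find_suspect_parts is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the OPC [Content_Types].xml part.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# ASSUMPTION: vendor conventions observed in Microsoft Office files,
# not values defined by DIS 29500. Another application could use anything.
SUSPECT_CONTENT_TYPES = {"application/vnd.ms-office.vbaProject"}
SUSPECT_NAME_HINTS = ("vbaproject",)


def find_suspect_parts(package_path):
    """Return (part_name, content_type) pairs that look like macro storage.

    Heuristic only: a macro stored under an unknown content type, embedded
    in the XML, or hidden in an image would not be found.
    """
    suspects = []
    with zipfile.ZipFile(package_path) as pkg:
        types = ET.fromstring(pkg.read("[Content_Types].xml"))
        # Default entries map a file extension to a content type.
        defaults = {
            d.get("Extension", "").lower(): d.get("ContentType")
            for d in types.findall(CT_NS + "Default")
        }
        # Override entries map one specific part to a content type.
        overrides = {
            o.get("PartName", "").lstrip("/"): o.get("ContentType")
            for o in types.findall(CT_NS + "Override")
        }
        for name in pkg.namelist():
            if name == "[Content_Types].xml" or name.endswith("/"):
                continue
            ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
            ctype = overrides.get(name, defaults.get(ext))
            name_hit = any(h in name.lower() for h in SUSPECT_NAME_HINTS)
            if ctype in SUSPECT_CONTENT_TYPES or name_hit:
                suspects.append((name, ctype))
    return suspects
```

A scanner like this catches the files one vendor happens to produce today. Every other storage choice listed above passes straight through it, and nothing in the standard says those choices are wrong.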

Finally, note that this lack of information on how to locate macros within a document makes it impossible for anyone to programmatically combine or divide OOXML documents which may contain macros. For example, imagine a 2-page spreadsheet, with a macro on sheet one only. How can it be split into two one-page documents, if there is no defined way to locate the script associated with page one? This is the type of automated composition and document manipulation that OOXML should be enabling. Similarly, how can one combine two single documents containing macros into one document, if there are no defined rules for locating and naming macros? Many basic types of applications, such as merging slide shows, will break in the presence of macros.

The above topic was of interest to several NB’s in Geneva, but could not be discussed for lack of time at the BRM.

Some countries seem to be breaking silence where they can about the BRM… hopefully the NBs will get the message that the DIS 29500 spec is not ready to be an ISO standard.

Incidentally on groklaw there is a very interesting discussion about the ISO press release. Read one way it sounds like no or abstain votes can be changed to yes but it doesn’t comment on yes to no votes. Any comments on that Rob? AFAIK countries are free to change their vote any way they want for the next 30 days? If a country changes to yes and others don’t realise they can change to no (such as the US) it could still lead to the spec being railroaded through…

Thank you for posting this comment about OOXML not documenting macros and its impact on interoperability – especially with MS documents that are free to use MS’ closed and proprietary scripting languages.

I don’t think that’s been brought up anywhere as a natural consequence of the MACROBUTTON comment to the DIS. It certainly opened my eyes.

I agree with your statement that it appears that ISO is making up the rules as it goes along and that nothing would surprise me at this point.

…Well – almost nothing. I think I’d be very surprised if MS/ECMA pulled the DIS from consideration for ISO standardization, but I don’t think I’d be surprised by anything less….

Further to my previous comment (and having now read the meeting notes and resolutions accepted) I am of the distinct impression the prevailing attitude seems to have been:

NB1: “Gah this is awful… It should have been done this way in the first place as it is the ‘proper’ way to do it”

NB2: “I agree it would be a ‘better’ spec if we do it that way”

Rest: “Agreed – The editor is instructed to incorporate resolution X”

So the end result is that yes the spec was improved but only to the extent that those who deal with standards creation often feel the original was awful and near enough ‘any’ change had to be an improvement. Thus to deny a resolution would be to leave the text as was and as professionals they felt an obligation to at least get an improved text out – even if the result was the final text still wouldn’t be up to scratch for a standard.

Consequently most of the time at the BRM (including in the evening out of session) seems to have been spent getting it into a shape where it could ‘start’ being discussed as opposed to final polish – this, of course, resulting in no time to actually work on ‘worthwhile’ changes to polish it up and make it worthy of an ISO stamp….

James, those appear to be edited versions of the notes which were taken during the meeting. In particular, the “resolutions” document appears to be a list of the resolutions that were approved, stripping out the ones that either positively failed or died for lack of time to bring them to a vote.

There were a mix of NB views expressed. I’m not sure the meeting notes give a good flavor of that. For example, some NB’s did not raise any points and just said “We are delighted with DIS 29500” when it was their turn. Since they did not propose any text changes, their view was not recorded in the resolutions. Similarly, those NB’s who, in the view of the Convenor, had proposals that were not achievable within the time constraints of the BRM, these could not be brought up for discussion. So you won’t see them in the meeting notes or resolutions. The net result is the resolutions are a slice of the more moderate opinions in the room. But the atmosphere was far more charged than these notes would suggest.

I really wish I could have been a fly on the wall last week then… or at least the meetings could have been recorded for archive later…

The most frustrating thing is that the results of this will directly affect me in my professional life (Linux Systems Administrator) as well as my private life, but there is very little that seems possible outside of hoping that the UK NB sticks by the No vote and that more NBs see the sense of the matter….

Incidentally, over at Brian Jones’ blog there are aspersions being cast on your macro comments and the state of ISO 26300, unsurprisingly.