If you insert comments into a Microsoft Office Word 2007 document or set of documents, you might want to remove those comments before publishing or distributing the documents. You could open each document individually to remove the comments, or you can use the Office Open XML File Formats to remove the comments programmatically, without opening the documents in Word 2007. This technique requires a significant amount of code, but because it never starts Word, it runs efficiently, even across many documents. Working with the Office Open XML File Formats requires knowledge of how Word 2007 stores content, of the System.IO.Packaging API, and of XML programming.

Create a Microsoft Windows Application project in Microsoft Visual Studio 2005. Open the code editor, right-click, select Insert Snippet, and then select the Word: Remove Comments snippet from the list of available Office 2007 snippets. If you use Microsoft Visual Basic, inserting the snippet also adds a reference to WindowsBase.dll and the Imports statements that the code requires.

If you use Microsoft Visual C#, you must add the reference to the WindowsBase.dll assembly and the corresponding using statements yourself, so that you can compile the code. (Code snippets in C# cannot set references or insert using statements.) If the WindowsBase.dll reference does not appear on the .NET tab of the Add Reference dialog box, click the Browse tab, locate the C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0 folder, and then click WindowsBase.dll.
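As a rough guide, the code shown in this article uses types from the namespaces listed in the following sketch. The ReferenceCheck class and its method are hypothetical helpers, added only so that you can confirm that the packaging reference resolves. They are not part of the snippet.

```csharp
using System;                 // Uri, UriKind
using System.IO;              // FileMode, FileAccess
using System.IO.Packaging;    // Package, PackagePart, PackUriHelper
using System.Xml;             // XmlDocument, XmlNamespaceManager

static class ReferenceCheck
{
    // Returns the name of the assembly that supplies the Package class.
    // If this compiles and runs, the packaging reference is in place.
    public static string PackagingAssemblyName()
    {
        return typeof(Package).Assembly.GetName().Name;
    }
}
```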

The WDDeleteComments snippet loads the contents of the Word document and removes the comments from the document. To test it, create a sample document that contains comments, and save it somewhere easy to find (for example, C:\Comments.docx). In a Windows application, insert the WDDeleteComments snippet and then call it with the path of your document, modifying the names to meet your needs. After the code runs, open the Word document to verify that all the comments are gone.
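One way to call the snippet is sketched below. It assumes that WDDeleteComments takes a single file-name string and returns nothing, which is an assumption rather than something this topic shows. CallerSketch and TryRemoveComments are hypothetical names. The routine is passed in as a delegate so that the sketch does not depend on the snippet's exact signature.

```csharp
using System;
using System.IO;

static class CallerSketch
{
    // Runs the supplied comment-removal routine against one file and
    // reports whether it completed without an I/O failure.
    public static bool TryRemoveComments(string fileName, Action<string> deleteComments)
    {
        if (!File.Exists(fileName))
        {
            return false;
        }
        try
        {
            deleteComments(fileName);
            return true;
        }
        catch (IOException)
        {
            // For example, the document may still be open in another program.
            return false;
        }
    }
}
```

From a button's Click handler, the call might then look like CallerSketch.TryRemoveComments(@"C:\Comments.docx", WDDeleteComments), provided that the snippet has the assumed signature.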

The WDDeleteComments procedure begins by creating constants that refer to the schemas and namespaces it requires, and it retrieves a reference to the package itself by calling the Package.Open method. It also creates variables that refer to the document part and the document Uniform Resource Identifier (URI).
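A minimal sketch of that outer procedure follows. The variable and constant names match the fragments later in this topic, and the constant values are the relationship types and namespace that the Office Open XML File Formats define for the main document, the comments part, and WordprocessingML. The enclosing class name and the exact layout are assumptions. The Code removed here… comment marks the spot that the later fragments fill in.

```csharp
using System;
using System.IO;
using System.IO.Packaging;

static class RemoveCommentsSkeleton
{
    // Relationship types and namespace used by the procedure.
    const string documentRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
    const string commentRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/comments";
    const string wordmlNamespace =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static void WDDeleteComments(string fileName)
    {
        // Filled in by the code that locates the main document part.
        PackagePart documentPart = null;
        Uri documentUri = null;

        // Open the package for reading and writing. The using block
        // closes the package and flushes changes when it exits.
        using (Package wdPackage = Package.Open(
            fileName, FileMode.Open, FileAccess.ReadWrite))
        {
            // Code removed here…
        }
    }
}
```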

Copy the following code and replace the Code removed here… comment in the previous code.

// Get the main document part (document.xml).
foreach (System.IO.Packaging.PackageRelationship relationship in
    wdPackage.GetRelationshipsByType(documentRelationshipType))
{
    documentUri = PackUriHelper.ResolvePartUri(
        new Uri("/", UriKind.Relative), relationship.TargetUri);
    documentPart = wdPackage.GetPart(documentUri);
    // There is only one document.
    break;
}
// Code removed here…

Given a reference to the package, the code finds the document part by calling the Package.GetRelationshipsByType method and passing in the constant that contains the document relationship name (see Figure 1). The code loops through the returned relationships and retrieves the document URI, relative to the root of the package. Because GetRelationshipsByType returns a collection, you must loop through the PackageRelationship objects to retrieve the one you want, even though this loop executes only one time.

Copy the following code and replace the Code removed here… comment in the previous code.

This code performs an important task: it finds the relationship for the comments part (see Figure 2), deletes the relationship, and then deletes the comments part. Note that as in the previous search for a particular relationship type, the code must loop through all the matching relationships, even though there is only one relationship to a comments part in a well-formed Word 2007 document.
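The step just described might look like the following sketch. CommentsPartSketch and DeleteCommentsPart are hypothetical names. The method takes the package and the document part as parameters instead of using the local variables of WDDeleteComments, and it returns a Boolean value so that the caller can tell whether a comments part was found. Both choices are assumptions made for illustration.

```csharp
using System;
using System.IO.Packaging;

static class CommentsPartSketch
{
    const string commentRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/comments";

    // Deletes the relationship from the document part to the comments part,
    // then deletes the comments part itself. Returns false if the document
    // has no comments relationship.
    public static bool DeleteCommentsPart(Package wdPackage, PackagePart documentPart)
    {
        foreach (PackageRelationship relationship in
            documentPart.GetRelationshipsByType(commentRelationshipType))
        {
            // Resolve the target relative to the document part's own URI.
            Uri commentUri = PackUriHelper.ResolvePartUri(
                documentPart.Uri, relationship.TargetUri);

            documentPart.DeleteRelationship(relationship.Id);
            if (wdPackage.PartExists(commentUri))
            {
                wdPackage.DeletePart(commentUri);
            }
            // There is only one comments part, so stop after the first match.
            return true;
        }
        return false;
    }
}
```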

Copy the following code and replace the Code removed here… comment in the previous code.

// Manage namespaces to perform Xml XPath queries.
NameTable nt = new NameTable();
XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
nsManager.AddNamespace("w", wordmlNamespace);
// Get the document part from the package.
// Load the XML in the part into an XmlDocument instance:
XmlDocument xdoc = new XmlDocument(nt);
xdoc.Load(documentPart.GetStream());
// Code removed here…

After it finds the document part, the code creates an XmlNamespaceManager instance loaded with the namespace used to perform XPath searches of the document, and creates an XmlDocument instance to contain the contents of the document. The code then loads the XML content into the XmlDocument instance.

Copy the following code and replace the Code removed here… comment in the previous code.

The Word 2007 document contains a start element and an end element for each comment in the document. This code retrieves a collection of nodes corresponding to each type of element, and deletes all of the commentRangeStart and commentRangeEnd nodes.
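Removing the range markers could be sketched as follows. CommentRangeSketch and RemoveCommentRanges are hypothetical names, and the method builds its own namespace manager so that it stands alone. The sketch copies the matching nodes into a list before it removes any of them. That is a cautious choice, not necessarily what the snippet does.

```csharp
using System.Collections.Generic;
using System.Xml;

static class CommentRangeSketch
{
    const string wordmlNamespace =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Removes every commentRangeStart and commentRangeEnd element and
    // returns the number of elements removed.
    public static int RemoveCommentRanges(XmlDocument xdoc)
    {
        XmlNamespaceManager nsManager = new XmlNamespaceManager(xdoc.NameTable);
        nsManager.AddNamespace("w", wordmlNamespace);

        // Snapshot the matches first so that removing nodes cannot
        // disturb the XPath result while it is being enumerated.
        List<XmlNode> matches = new List<XmlNode>();
        foreach (XmlNode node in xdoc.SelectNodes(
            "//w:commentRangeStart | //w:commentRangeEnd", nsManager))
        {
            matches.Add(node);
        }

        foreach (XmlNode node in matches)
        {
            node.ParentNode.RemoveChild(node);
        }
        return matches.Count;
    }
}
```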

Copy the following code and replace the Code removed here… comment in the previous code.

This code block handles the commentReference elements. These elements contain the references to the comments in the comments part, and the code must remove them as well. As described earlier, the code retrieves a collection of nodes that match the XPath expression for the reference nodes, and then deletes each node.
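A comparable sketch for the reference nodes appears below. The names CommentReferenceSketch and RemoveCommentReferences are hypothetical, and the XPath expression, which matches commentReference elements that sit inside a run (w:r), is an assumption about where Word 2007 places them. The sketch removes only the reference element and leaves the enclosing run in place.

```csharp
using System.Collections.Generic;
using System.Xml;

static class CommentReferenceSketch
{
    const string wordmlNamespace =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Removes every commentReference element found inside a run and
    // returns the number of elements removed.
    public static int RemoveCommentReferences(XmlDocument xdoc)
    {
        XmlNamespaceManager nsManager = new XmlNamespaceManager(xdoc.NameTable);
        nsManager.AddNamespace("w", wordmlNamespace);

        // Collect the matches before changing the tree.
        List<XmlNode> references = new List<XmlNode>();
        foreach (XmlNode node in xdoc.SelectNodes(
            "//w:r/w:commentReference", nsManager))
        {
            references.Add(node);
        }

        foreach (XmlNode node in references)
        {
            node.ParentNode.RemoveChild(node);
        }
        return references.Count;
    }
}
```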

Copy the following code and replace the Code removed here… comment in the previous code. This code saves the XML content back to the document part.
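The save step could be sketched as follows. SavePartSketch and SaveToPart are hypothetical names. The sketch opens the part's stream with FileMode.Create so that the old content is discarded first. Otherwise, leftover bytes from the longer original XML could remain after the shorter, comment-free XML.

```csharp
using System.IO;
using System.IO.Packaging;
using System.Xml;

static class SavePartSketch
{
    // Writes the XML document back into the part, replacing whatever
    // the part contained before.
    public static void SaveToPart(XmlDocument xdoc, PackagePart documentPart)
    {
        using (Stream partStream = documentPart.GetStream(
            FileMode.Create, FileAccess.Write))
        {
            xdoc.Save(partStream);
        }
    }
}
```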

It is important to understand the file structure of a simple Word 2007 document, so that you can work with the comments. To do that, create a Word 2007 document, and add some comments to the document. Save the document in a convenient location, and close Word. (This how-to topic assumes that you named your document C:\Comments.docx.)

To investigate the contents of the document

In Windows Explorer, rename the document Comments.docx.zip.

Open the ZIP file using Windows Explorer or a ZIP-management application.

View the _rels\.rels file, shown in Figure 1. This file contains information about the relationships between the parts in the document. Note the value for the document.xml part, as highlighted in the figure. This information allows you to find the specific part you need.

Figure 1. References to top-level document parts in the .rels file

View the \word\_rels\document.xml.rels file. It contains the relationship between the document and the associated comments (see Figure 2). This relationship makes it possible to find the comments part, so that the code can delete it.

Figure 2. References to document-related parts in the document.xml.rels file

View the document specified in the .rels file, \word\document.xml. Locate the commentRangeStart, commentRangeEnd, and commentReference elements (see Figure 3). The code in this article shows how to remove these items.

Figure 3. Comment-related elements

Close the tool you are using to investigate the document, and rename the file with its original .docx extension.
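If you prefer not to rename the file, the same relationships can be listed in code. The following sketch is an alternative to the manual steps, not part of the snippet. RelationshipLister and DescribeRelationships are hypothetical names, and the output format is arbitrary.

```csharp
using System.Collections.Generic;
using System.IO.Packaging;

static class RelationshipLister
{
    // Returns one line per relationship: first the package-level
    // relationships (the content of _rels\.rels), then the relationships
    // that each part owns (for example, word\_rels\document.xml.rels).
    public static List<string> DescribeRelationships(Package package)
    {
        List<string> lines = new List<string>();

        foreach (PackageRelationship rel in package.GetRelationships())
        {
            lines.Add("package -> " + rel.TargetUri +
                " (" + rel.RelationshipType + ")");
        }

        // Copy the parts into a list before querying each one.
        List<PackagePart> parts = new List<PackagePart>(package.GetParts());
        foreach (PackagePart part in parts)
        {
            // Relationship parts (.rels) cannot own relationships.
            if (PackUriHelper.IsRelationshipPartUri(part.Uri))
            {
                continue;
            }
            foreach (PackageRelationship rel in part.GetRelationships())
            {
                lines.Add(part.Uri + " -> " + rel.TargetUri +
                    " (" + rel.RelationshipType + ")");
            }
        }
        return lines;
    }
}
```

To try it, you could open the package with Package.Open(@"C:\Comments.docx", FileMode.Open, FileAccess.Read) and write each returned line to the console.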