DB2 CommonStore – Some sage advice for eDiscovery and email archiving

I have a customer who is archiving data using IBM DB2 CommonStore for Lotus Domino, and I have recently upgraded their environment. There are some things worth knowing up front, because this advice is not easy to glean from the IBM support portal.

Analysis and Planning

Setting up a corporate email archiving solution can be a much larger undertaking than you think. There are storage, legal, regulatory, communal, and of course cost concerns. All of these will factor into the total price. Before you commit to setting up an archive system of ANY kind, sit back and ask yourself the following questions:

What is the maximum time I’m willing to keep emails?

What is the minimum time I’m required to keep emails? (think regulatory compliance)

Can I depend on my users to abide by our archiving standards? (hint: no, you cannot)

What is the risk of not being able to find key emails if we are sued?

What is the legal risk of keeping email?

What is the cost of keeping email for 1 year, 3 years, 7 years? (think storage costs)

How can I keep a user from deleting email younger than our retention period?
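The storage-cost question above is easy to rough out before talking to vendors. The following is only a back-of-the-envelope sketch: the mailbox count, growth per mailbox, and price per GB are made-up placeholder numbers, and it assumes mail accumulates linearly with nothing purged and no compression or single-instance storage.

```python
def archive_storage_cost(mailboxes, gb_per_mailbox_per_year, cost_per_gb_per_year, years):
    """Cumulative storage cost of keeping ALL mail for `years` years.

    Assumes linear growth and no purging: in year N you are paying
    to store N years' worth of accumulated mail.
    """
    total = 0.0
    for year in range(1, years + 1):
        stored_gb = mailboxes * gb_per_mailbox_per_year * year
        total += stored_gb * cost_per_gb_per_year
    return total


# Hypothetical inputs: 500 mailboxes, 2 GB of new mail per mailbox per year,
# $0.50 per GB per year. Substitute your own figures.
for years in (1, 3, 7):
    print(years, "years:", archive_storage_cost(500, 2.0, 0.50, years))
```

The point is less the exact figure than the shape: because every year's mail keeps costing money for as long as it is retained, the cumulative cost grows much faster than the retention period itself.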

If you only want to reduce your storage costs, then most likely the email platform you are using already has archiving features available to you. Lotus Domino has a decent archiving system that lets you centralize a policy and offload archived emails to local desktops or to remote desktop servers. Exchange and GroupWise have similar features. Google Apps has a very rudimentary system of archiving. If you don’t foresee your organization ever having to perform what is called an ‘eDiscovery’ for litigation, then this may be the most economical solution.

However, many industries are required to retain any communication with customers for a specified amount of time (SEC in particular). SOX only requires you to define a time period and stick with it. HIPAA has privacy and security concerns you need to address, as well as long-term retention demands. Bottom line, you need to define a policy first. This should be both an archiving policy and an email acceptable usage policy. Consider archiving instant messaging data as well. That too is admissible as evidence, and I promise you your employees are using instant messaging whether you “allow” it in your organization or not! It’s best to allow it so you can collect it. Collecting it means controlling it.
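To make "define a policy first" concrete, here is a minimal sketch of a retention policy expressed as two numbers: a minimum you are required to keep mail and a maximum you are willing to keep it. The class and method names are hypothetical and are not part of CommonStore, Domino, or any IBM product; it only illustrates the logic an archiving tool would have to enforce.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy: keep mail at least min_days, at most max_days."""
    min_days: int  # regulatory floor: nobody may delete before this age
    max_days: int  # legal-risk ceiling: purge once mail is older than this

    def __post_init__(self):
        if self.min_days > self.max_days:
            raise ValueError("min_days must not exceed max_days")

    def user_may_delete(self, received: date, today: date) -> bool:
        # A user may delete only mail that has reached the minimum age.
        return (today - received).days >= self.min_days

    def purge_due(self, received: date, today: date) -> bool:
        # Mail past the maximum age should be removed from the archive.
        return (today - received).days > self.max_days


# Example: keep everything at least 1 year, nothing longer than 7 years.
policy = RetentionPolicy(min_days=365, max_days=7 * 365)
today = date(2020, 1, 1)
print(policy.user_may_delete(date(2019, 6, 1), today))  # mail about 7 months old
print(policy.purge_due(date(2010, 1, 1), today))        # mail about 10 years old
```

Writing the policy down in this form forces the two questions from the list above (minimum and maximum retention) to get actual numbers, which is what SOX-style "define a period and stick with it" requirements come down to.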

DB2 CommonStore & InfoSphere Content Collector

Both of these products are considered high-end email collection tools. CommonStore has been around for a few years and is the elder brother of the two. InfoSphere was acquired by IBM along with a whole portfolio of products. CommonStore has three flavors: one for Domino, one for Exchange, and one for SAP. InfoSphere has one flavor. If you have a choice between CommonStore and InfoSphere, go with InfoSphere. The price is the same, but it’s easier to configure, and there are fewer tentacles into the Domino server. The largest inconvenience I see with CommonStore for Lotus Domino (CSLD) is that it requires custom updates to the Notes mail templates, which need to be upgraded with each CSLD patch and also must be reconciled with the new mail templates in each Domino release. PITA. InfoSphere Content Collector (ICC), however, can crawl both Domino and Exchange systems, which is ideal if you have a mixed environment. ICC is definitely the more flexible of the two. Both, however, require a content management back end. Namely, you need either DB2 Content Manager (CommonStore only) or FileNet P8 (both products). CommonStore can also crawl a Tivoli Storage Manager archive and Content Manager OnDemand, but (pardon the pun) demand for the latter two archiving styles is waning.

eMail Search vs. eDiscovery Manager

First off, if you have purchased eMail Search, you should have found that it has been discontinued and that you have been upgraded to InfoSphere eDiscovery Manager (eDM) in your Passport Advantage center. You will want to rip out eMail Search as soon as you can and get running on eDM. eDM has a very slick Web 2.0 type interface that is MUCH easier to use than eMail Search. The online help is superb, and it is much easier to configure. The product runs on top of WebSphere App Server. You will want to run this on a separate server. Avoid the temptation to just stack it on top of CommonStore, which in turn is on top of Content Manager, which is also on top of WebSphere App Server and DB2 Enterprise. The tool can crawl both CommonStore and InfoSphere Content Collector stores for several email back ends.

CommonStore Setup

Follow the documentation to the letter and read it carefully without distractions. There is a lot to configure. One thing I will add to the documentation is that when you are setting up your item types in Content Manager, you should set up attributes for the CC and BCC fields. If you take the default route, you’ll find out later that these were not part of the defaults. Also take care to patch the system to the latest releases of DB2, WebSphere App Server, Content Manager, and CSLD. There are several interim fixes for CSLD. If installing new, install DB2 9.5 and CM 8.4.2 with the latest fixpack. Install WebSphere App Server 7.0.0.11 (the latest build as of this blog post).

If you are using CommonStore, you will not find any InfoCenter for it. You’ll have to use the published PDFs. I’ve included both of those below for convenience as well as a few links to get you going.

There is not much of a community or forum for CommonStore like there is for many other products, but feel free to post questions here and we’ll see what we can do to answer them. You may find posts in the Lotus forums on IBM developerWorks, but I’ve found them either riddled with inaccuracies or terribly old.
