The World at Your Fingertips: Federation Plus Localization

Traditionally, companies spend a tremendous amount of time and money making sure their structured repositories are backed up, secure and compliant. These repositories consist of ERP and related business systems, such as accounts payable, accounts receivable and CRM. Consistently delivering the right information to the right people at the right time increases profitability and competitive advantage – while also ensuring compliance.

Unfortunately, many companies do not understand that these structured data repositories make up only 15 percent of the total critical data within the company. The rest of a company's critical data – nearly 85 percent – is unstructured. To put this in perspective, research shows that 80 percent of all critical business decisions are made from this unstructured data. Unstructured content includes word processing documents, email, images, scanned documents, instant messages, spreadsheets, PDFs, presentations and more. In addition, it is commonly estimated that almost half of a business's unstructured data is "noise" – no longer of any use to the organization. Yet global users must still sift through that noise, pulling it back and forth across the network, to find the meaningful information necessary to do their jobs effectively.

As mid- to large-sized global companies are discovering, the bulk of data management and IT costs accrues through locating this unstructured content, managing out-of-control email and archiving mission-critical information. News headlines of insiders leaking confidential information and trade secrets remind us daily that this information is truly vulnerable. Management costs are compounded should the company find itself in litigation: The average cost to defend a corporate lawsuit now exceeds $1.5 million per case. According to the “Law360 2009 Litigation Almanac,” litigation in U.S. federal courts rose 9 percent in 2008 – and this trend is expected to continue into 2011 and beyond. Furthermore, as companies compete globally, they find themselves hiring a diverse workforce using different languages and bringing their own cultural sensibilities to data management.

Federated Content: The Future of Data Management

Federating content turns the storage of unstructured content into a distributed model by setting up a server at each location and connecting those servers over the network. Yet unlike a set of separate regional file shares, a federated system has a central repository that is aware of all of the distributed servers containing unstructured content, and of the specific content and metadata stored on each of them.
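The idea of a central repository that knows what each regional server holds, without holding the content itself, can be sketched in a few lines of Python. This is only an illustrative sketch, not any vendor's design: the class names, the metadata fields and the site names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DocumentRecord:
    """Metadata the central catalog keeps about one document (hypothetical fields)."""
    doc_id: str
    title: str
    language: str
    site: str  # name of the regional server that stores the content


@dataclass
class FederatedCatalog:
    """Central index: knows every site and what it holds, but stores no content."""
    records: dict[str, DocumentRecord] = field(default_factory=dict)

    def register(self, record: DocumentRecord) -> None:
        # Called when a regional server reports a new or changed document.
        self.records[record.doc_id] = record

    def search(self, term: str) -> list[DocumentRecord]:
        # Match on title metadata only; the content stays on its regional server.
        needle = term.casefold()
        return [r for r in self.records.values() if needle in r.title.casefold()]

    def locate(self, doc_id: str) -> str:
        # Tell the caller which site to fetch the document from.
        return self.records[doc_id].site


catalog = FederatedCatalog()
catalog.register(DocumentRecord("d1", "Rapport d'essai clinique", "fr", "paris"))
catalog.register(DocumentRecord("d2", "Clinical trial report", "en", "new-york"))
hits = catalog.search("clinical")
```

A user in any office searches the catalog first and pulls a document across the network only after finding it, which is the difference from copying whole file shares between regions.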

To remain competitive and compliant in today's global marketplace, businesses need to bring their unstructured data under control, making it secure, compliant, auditable and easy to locate. They also need to find and remove duplicate content, provide version management, and enable federated content and search. This is a daunting task by any standard.

According to a 2003 study by UC-Berkeley, content is growing exponentially to the tune of 5 exabytes of data per year globally (one exabyte is one quintillion bytes). The study equates that to the information contained in 37,000 libraries the size of the Library of Congress book collection. In June 2009, Mark Hurd, then CEO of Hewlett-Packard, put it this way: "More data will be created in the next four years than in the history of the planet." Marissa Mayer, VP of search products and user experience at Google Inc., noted in her presentation at PARC, intriguingly entitled "The Physics of Data," that there were 5 exabytes of data online in 2002, which had risen to 281 exabytes by 2009. That's roughly 56-fold growth over seven years.
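For readers who want to check the arithmetic behind those figures, the short calculation below works out the overall growth factor and the compound annual rate it implies, roughly 78 percent per year. It uses only the two numbers quoted above; the variable names are illustrative, and the annual rate assumes steady compounding, which the quoted figures do not establish.

```python
# Figures quoted in the text: exabytes of data online in 2002 and in 2009.
start_eb, end_eb = 5, 281
years = 2009 - 2002

# Overall growth factor across the whole period.
growth_factor = end_eb / start_eb

# Implied compound annual growth rate, assuming steady year-over-year growth.
annual_rate = growth_factor ** (1 / years) - 1

print(f"{growth_factor:.1f}x over {years} years, about {annual_rate:.0%} per year")
```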

No longer is a single, static repository a cost-effective solution for managing data. Federated content management, on the other hand, allows existing content to remain in its place and existing applications to remain intact, while at the same time providing a single access point for users to manage content across the entire enterprise, regardless of their physical location or the physical location of the data. Without federation, giving users access to the content they need when they need it requires huge expenses to increase bandwidth. Otherwise, productivity suffers tremendously.

Federation reduces the cost of maintaining consistently high levels of bandwidth. In addition, it cuts the lag time in getting data to the desktop, so the knowledge worker can work with the information in a timely and efficient way, reducing frustration in the process. For instance, consider a life sciences company with offices in New York, Paris and Tokyo. The French documents, used every day in the Paris office, are stored locally on the network in Paris. These documents can be searched from the end-user federated application in New York or Tokyo as if they were stored locally, then transmitted by fractional T1 line for viewing or editing.

Federation with Localization

Federation alone, however, is an incomplete answer. Although approximately 375 million people speak English as their first language, there are more than 6.6 billion people in the world. Federating content without consideration of local language and culture presents huge barriers. Adding localization is critical to successful federation in a global business.

"Localization" means adapting a product or service to a particular language. Well-developed content management applications already present menus and navigation in the local language being used on the desktop. This can be accomplished on the fly or by installing a localized version of the application. For instance, Microsoft provides 37 different language interface packs.

In our earlier example, content is distributed around the globe so that knowledge workers in the regional offices have the quickest access to the documents they work with most frequently. Obviously, people work most effectively and efficiently in their native language. Software solution providers have long understood the concept of localization, and it has been built into most of the content management solutions used for federating content. Effective localization strategies tie into the desktop language and then render the user interface in the native language. More advanced systems on the market today not only render the UI in the local language at each federated location, but can also apply the native language to the search indices so that searches are translated on the fly.

But language translation of the user interface is not enough. True localization must include the translation of search terms, metadata and indices. In short, localization must facilitate understanding and foster cooperation and efficiency in a global company with multinational office locations.

For example, if you are searching for a "last will and testament" for George Smith, the search terms may need to be translated if that document is stored in Germany. Furthermore, George Smith might actually spell his name “Georg Smith” or “Jorge Smith,” depending on whether the document is stored in Germany or Puerto Rico. In this example, the end user doesn't need to know the language of the original document or where it is stored. The search term can be automatically translated to the other local languages and searched across all of the federated content. User access with a single sign-on and in a single language becomes very powerful. Before federated content, the searcher would have to log on separately to each region's system and search each one individually. With federation, the user does not need to know in advance where the content is located, and with localization, the user does not even need to know the language of the original document.
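To make the George Smith example concrete, here is a minimal Python sketch of how one English query might be expanded into a localized query for each federated region. It rests on stated assumptions: the two lookup tables are small hand-written stand-ins for a real translation service, and the function name and language codes are hypothetical.

```python
# Hypothetical lookup tables; a real system would call a translation service.
TERM_TRANSLATIONS = {
    "last will and testament": {"de": "Testament", "es": "testamento"},
}
NAME_VARIANTS = {
    "george": {"de": "Georg", "es": "Jorge"},
}


def localize_query(phrase, first_name, last_name, languages):
    """Return one (language, query string) pair per language, English first."""
    queries = [("en", f"{phrase} {first_name} {last_name}")]
    for lang in languages:
        # Fall back to the original wording when no translation is known.
        term = TERM_TRANSLATIONS.get(phrase.casefold(), {}).get(lang, phrase)
        name = NAME_VARIANTS.get(first_name.casefold(), {}).get(lang, first_name)
        queries.append((lang, f"{term} {name} {last_name}"))
    return queries


queries = localize_query("last will and testament", "George", "Smith", ["de", "es"])
for lang, text in queries:
    print(lang, text)
```

Each localized query would then be run against the index for the matching region, so the person searching types the request once, in one language.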

Sophisticated engines working across federated content also have the ability to return a real-time translated version of the document. This can be particularly important during e-discovery. In the event of a lawsuit, queries from New York run across a global federated system would return all relevant documents across all repositories, translated into the local language. This minimizes risk and exposure and reduces the cost of collecting relevant documents in a timely fashion. Relevant is the key word here: Producing too much material significantly adds to the cost of e-discovery, because lawyers must be retained to sift through all of the returned documents. Searching for and extracting just the necessary documents can save companies millions of dollars versus "imaging" or copying down all documents from the regional repositories.

As corporations expand globally, federated content and localization are a must. Truly federated and localized repositories – functioning as a single cohesive database regardless of language – result in happier, more effective and efficient knowledge workers. Streamlining content through federation and localization eliminates the cultural barriers that reduce collaboration, mitigates risk and goes right to a company’s bottom line.