Semantic Publishing Platform

Improving Content Production Through a Unified Semantic Publishing Platform

Euromoney Institutional Investor PLC (Euromoney) is one of the largest publishers of business and financial publications in Europe. With 84 different brands and business units, the company publishes more than 100 magazines, newsletters and journals, including the leading financial market titles Euromoney and Institutional Investor.

The Goal

With more than 100 publications produced across the company, Euromoney realized that content creation by its 84 business units could be dramatically streamlined by implementing a single, consolidated platform for creating and presenting content. The new platform had to make it easy to reuse and repurpose content within and between the business units, and to provide rich navigation and analytics functionality.

Initially, the solution will be adopted by Euromoney’s BCA Research unit, a provider of research, analysis and forecasts for global economies. Eventually, it will be rolled out company-wide.

The Challenge

Euromoney has grown, in part, by acquiring other firms in the financial publishing sector. As a result of these acquisitions, Euromoney has inherited a wide variety of production, content management and presentation systems that cannot communicate with one another. A single, group-wide platform would need to address the wide variety of domains covered by the different business units – from trading in commodities to macro-economic research. Addressing this content diversity requires flexible domain modeling and different content enrichment mechanisms.

To provide rich content analytics and interaction to subscribers, the content needs to be enriched with metadata according to a well-formed domain model. In other words, the content has to be semantically analyzed, enriched and integrated in a streamlined production environment with complete visibility across the various business units. Doing this manually for more than 100 publications would be prohibitively expensive.

The Solution

To help tackle these challenges, Euromoney contracted software and information architecture consultants who had worked with the BBC and the Press Association on similar projects. The consultants recommended a semantic approach to publication production, including triplestores and automated semantic annotation based on text analytics. This approach would focus on seven key aspects:

Triplestore with flexible schema support and reasoning

Domain modeling through ontologies, covering both horizontally valid concepts and industry-sector-specific ones

Sourcing internal and external data and transforming it according to the domain models

Text analytics driven by the domain models and instance data in the triplestore, automatically enriching the content

Subject matter expert curation of the annotations and the instance data

Hybrid semantic index over content, data and the annotations that link them

Query-driven definitions of new publication aggregates and rich search and navigation capabilities
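The last item, query-driven publication aggregates, can be illustrated with a minimal sketch. It is not GraphDB or SPARQL code: it uses a small in-memory set of triples, and every identifier in it (the `ex:` and `doc:` names, the `ex:mentions` property, and the `match`, `subclasses` and `aggregate` helpers) is a hypothetical stand-in rather than part of Euromoney's actual domain model. The sketch assumes that an aggregate is simply the set of documents that mention any instance of a chosen class, with subclass reasoning applied.

```python
# Toy data. All identifiers are hypothetical, not the real domain model.
TRIPLES = {
    ("ex:CurrencyPair", "rdfs:subClassOf", "ex:FinancialInstrument"),
    ("ex:EURUSD", "rdf:type", "ex:CurrencyPair"),
    ("ex:Gold", "rdf:type", "ex:Commodity"),
    ("doc:1", "ex:mentions", "ex:EURUSD"),
    ("doc:2", "ex:mentions", "ex:Gold"),
    ("doc:3", "ex:mentions", "ex:EURUSD"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def subclasses(triples, cls):
    """Return cls together with all of its transitive subclasses."""
    found, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for child, _, _ in match(triples, p="rdfs:subClassOf", o=current):
            if child not in found:
                found.add(child)
                frontier.append(child)
    return found


def aggregate(triples, cls):
    """Documents mentioning any instance of cls or of one of its subclasses."""
    classes = subclasses(triples, cls)
    instances = {s for s, _, o in match(triples, p="rdf:type") if o in classes}
    return sorted(s for s, _, o in match(triples, p="ex:mentions") if o in instances)


# An aggregate is defined by a query, not by a hand-maintained list of documents.
instrument_docs = aggregate(TRIPLES, "ex:FinancialInstrument")
print(instrument_docs)
```

Because the aggregate is computed from the data, a newly annotated document that mentions a currency pair would appear in it without any editorial step. In a real triplestore the same idea would normally be expressed as a query against the store rather than as Python functions.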

Euromoney selected Ontotext GraphDB Enterprise (formerly OWLIM), deployed in its enterprise setup as a replication cluster, together with the semantic publishing platform based on proprietary text analytics engines. The GraphDB cluster was also extended with semantic annotation plugins that dynamically update the text analytics pipelines whenever data in the triplestore changes, ensuring a “single version of the truth”. This resulted in the first RDF database providing semantic annotation of content as an out-of-the-box capability.
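The idea of keeping the text analytics in step with the triplestore can be sketched as follows. This is an illustration only and does not use the GraphDB plugin API: `TripleStore`, `GazetteerAnnotator` and the `ex:ECB` entity are hypothetical names, and the sketch assumes the simplest possible annotator, a dictionary lookup over entity labels that is rebuilt each time the store changes.

```python
import re


class TripleStore:
    """Toy in-memory store that notifies subscribers after every change."""

    def __init__(self):
        self.triples = set()
        self.listeners = []

    def subscribe(self, callback):
        self.listeners.append(callback)

    def add(self, s, p, o):
        self.triples.add((s, p, o))
        for callback in self.listeners:
            callback(self)


class GazetteerAnnotator:
    """Label-lookup annotator whose dictionary mirrors the store's labels."""

    def __init__(self, store):
        self.labels = {}
        store.subscribe(self.refresh)
        self.refresh(store)

    def refresh(self, store):
        # Rebuild the lookup table from the store's current label facts.
        self.labels = {
            o.lower(): s for s, p, o in store.triples if p == "rdfs:label"
        }

    def annotate(self, text):
        """Return sorted (start, end, entity) spans for every known label."""
        found = []
        lowered = text.lower()
        for label, entity in self.labels.items():
            pattern = r"\b" + re.escape(label) + r"\b"
            for m in re.finditer(pattern, lowered):
                found.append((m.start(), m.end(), entity))
        return sorted(found)


store = TripleStore()
annotator = GazetteerAnnotator(store)
text = "The Fed held rates while the ECB cut."

before = annotator.annotate(text)          # no labels known yet
store.add("ex:ECB", "rdfs:label", "ECB")   # data change in the store
after = annotator.annotate(text)           # annotator picked up the new label
```

The point of the design is that the store is the only place where entities are maintained; the annotator never holds a separately edited copy of the dictionary.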

The entire text analytics pipeline was developed from scratch using the GATE framework and Ontotext proprietary components. It covers sophisticated entity, composite entity, relation and event extraction, including Markets, Financial Instruments, Regions, Indicators, Currency Pairs, People, Key Phrases, and the Economic Conditions and Views expressed by the macro-economists. Automatic recognition of concepts would lead to improved author productivity and context-based search results. The first product in production based on the new platform provided live charts to the BCA Research customers. One example of the automatic recognition of concepts can be seen in the next snapshot to the right (the snapshot is not from the production system):

The various annotations of entities, concepts and macro-economic conditions can be seen in the next snippet:

Each of the entities and concepts was not only identified automatically using text mining, but also stored in GraphDB as semantic facts linked to the originating document. This semantic analysis can inform authors in real time as they write new articles. The resulting intelligence is also used to deliver targeted recommendations to readers based on their searches.
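One simple way to turn such document-to-concept facts into recommendations is to rank documents by how many of the concepts recognized in a reader's search they mention. The sketch below assumes exactly that scoring rule, which is an illustrative choice and not a description of the production system; the `MENTIONS` data and the `recommend` function are hypothetical.

```python
from collections import Counter

# (document, concept) pairs standing in for the stored annotation facts.
# All identifiers are made up for illustration.
MENTIONS = [
    ("doc:1", "ex:EURUSD"), ("doc:1", "ex:ECB"),
    ("doc:2", "ex:Gold"),
    ("doc:3", "ex:ECB"), ("doc:3", "ex:Inflation"),
]


def recommend(mentions, query_concepts, limit=3):
    """Rank documents by the number of query concepts they mention."""
    wanted = set(query_concepts)
    # set(mentions) ensures a repeated fact is not counted twice.
    scores = Counter(doc for doc, concept in set(mentions) if concept in wanted)
    # Highest score first; ties broken by document identifier for stability.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc for doc, _ in ranked[:limit]]


# Concepts assumed to have been recognized in the reader's search.
top = recommend(MENTIONS, ["ex:ECB", "ex:Inflation"])
print(top)
```

Documents that mention none of the query concepts are left out entirely, so the result can be shorter than `limit`.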

“In 2013, Euromoney Institutional Investor PLC, the international online information and events group and its subsidiary, BCA Research, set out to create a new publishing and information platform which would include the latest authoring, storing, and display technologies including semantic search and a triple store repository. Following a successful proof of concept using the OWLIM (now GraphDB) semantic repository and KIM text extraction (which is using Gate), Euromoney commissioned Ontotext to create a semantic database for both their macroeconomic investment research reports and their more market oriented content, like merger and acquisition deals news, IPOs, Equity prospects, Bonds issuance, etc.”