Artificial Intelligence Powers Quality Journalism

Since its early days in the 1950s, artificial intelligence (AI) has been a linguistic technology: it has been about understanding natural language in written or spoken form, converting one into the other, or translating content into another language. AI was also a child of the Cold War, its development driven by intelligence agencies and mistrust between the rival blocs. Despite misgivings about this lineage as a defense-related technology, civilians have long been able to enjoy the benefits of AI.

There’s much to admire about how AI improves our everyday life; just look at digital assistants such as Siri or Alexa. It also plays an important role in journalism. Writing articles, though, is so complex a task that it has remained a marginal feature of AI systems. For the time being, we don’t have to worry that robo-journalism will replace creative minds.

Algorithms can describe what has happened, but they can’t interpret why it happened. “The capabilities of algorithms to observe society are limited, as are their capabilities to perform journalistic tasks such as putting things in context and shaping public opinion.” [1]

Even if creating quality content is too difficult, robo-journalism can accomplish something else: “It can contribute to democratisation (...) via content automation, create publicity and generate additional advertising income. Text that flows well makes you want to read more. Robo-journalism can cultivate new readers, generate information and in the process make it easier to understand content and to disseminate it on a broader scale. This has beneficial consequences for reporting practices and cultural habits since those articles deal with local events that no journalist and no medium would have covered due to the costs involved.” [2]

Big data research

There are many examples of the type of content AI systems can author: hyperlocal weather reports, real-time sports news and stock market updates. Those types of articles are straightforward and don’t have to sound particularly catchy or witty, and nobody would ever expect them to. Complex issues are a different story. Readers want some guidance and expect information that is both well researched and thoroughly analysed. As communication scientist Prof. Andreas Graefe explains: “It will be difficult to fully automate the creation of articles around issues which are really relevant for society and really interest people.” [3]

But if we look at the reasons why automated articles work well for topics such as the stock market and weather, we can clearly see one of the major advantages that AI offers. It’s uniquely capable of evaluating complex data in enormous quantities, and such data has become a new currency for journalists. They have to navigate an increasingly complex information landscape, filtering a large number of news sources, evaluating them and continuously processing information across multiple channels.

“News Stream” - the new analysis tool

In the “News Stream” research project, Fraunhofer IAIS and Neofonie are working together with Deutsche Welle and dpa to develop new research and analysis tools. With just a few clicks, journalists can aggregate thousands of content pieces from video platforms, RSS feeds, media archives, or social media on their screens and get a bundled view of current events in real time.

Take an editor who wants to produce a story on the controversial issue of toll roads. He or she can easily generate a comprehensive overview of the subject combining many different data sources, tracking how the issue is covered or discussed on blogs, Twitter and other social media. “News Stream” also makes ongoing research easier. As soon as a keyword such as “car tolls” pops up, say in a debate in the German parliament or a news report, the analysis updates automatically.
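How such an automatically updating analysis might work can be illustrated with a minimal Python sketch. It is not the News Stream implementation: the class `KeywordWatch`, the source names and the sample texts are all invented for illustration, and a real system would match keywords far more robustly than with a plain substring test.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class KeywordWatch:
    """Running per-source mention counts for one watched keyword (hypothetical helper)."""
    keyword: str
    mentions: Counter = field(default_factory=Counter)

    def ingest(self, source: str, text: str) -> bool:
        """Update the counts if the incoming text mentions the keyword."""
        if self.keyword.lower() in text.lower():
            self.mentions[source] += 1
            return True
        return False


watch = KeywordWatch("car tolls")

# Invented sample items standing in for a live stream of content.
stream = [
    ("parliament", "The Bundestag debated car tolls for foreign drivers."),
    ("twitter", "Traffic was terrible again this morning."),
    ("news", "Car tolls remain controversial among commuters."),
    ("twitter", "Nobody I know wants car tolls."),
]
for source, text in stream:
    watch.ingest(source, text)

print(dict(watch.mentions))
```

Each new item only touches the running counts, so the overview stays current without reprocessing everything that came before.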

German news agency dpa wants to test and further develop this platform in its day-to-day business. “The goal is to have an end-to-end, rapid, highly focused overview of all this information. In many cases, that gives our editors and by extension our customers an enormous research advantage,” explained Sven Gösmann, editor-in-chief of dpa. Deutsche Welle also plans to upgrade its editorial offices with such state-of-the-art research and analysis tools. “A major benefit for our journalists is that they can discover new stories in the enormous flood of data. Bundling information coming from social networks allows us to watch important issues in greater detail through our users’ eyes,” said Gerda Meuer, program director of Deutsche Welle. [4]

Neofonie is a leading partner in the News Stream project and is responsible for building new text analysis services. The Berlin-based company’s current and future text-mining components will be integrated into News Stream. One key goal is to improve the processing speed for big data applications and to boost the error tolerance when processing multi-modal data. The work is based on TXT Werk, an API provided by Neofonie to analyse German-language texts. Core features include named entity recognition, automatic keywording and report classification.

Golf: A car or a game?

In report classification, the input text is classified into a small group of pre-defined general topics, which correspond most closely to the various sections of a newspaper (e.g. “Arts & Culture”). The TXT Werk API’s auto-tagging service, in turn, identifies keywords and phrases that are characteristic of a particular text. Potential candidates are identified via linguistic patterns, and keywords are then selected by a support vector machine using various features.
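The two-step idea of finding candidates first and selecting keywords second can be sketched in a few lines of Python. This is a toy under stated assumptions, not the TXT Werk API: the noun list stands in for a real part-of-speech tagger, the two features (frequency and phrase length) are chosen for illustration, and no support vector machine is trained here. Instead, the sketch applies a linear decision function of the form a trained linear SVM would produce, with hand-picked, hypothetical weights.

```python
import re
from collections import Counter

# Stand-in for a real part-of-speech tagger: a tiny hand-made noun list.
NOUNS = {"road", "tolls", "coalition", "critics", "minister"}

# Hypothetical weights; a real system would learn them from labelled articles.
WEIGHTS = {"frequency": 1.0, "length": 0.5}
BIAS = -2.5


def candidate_phrases(text):
    """Collect maximal runs of nouns, a very simple 'linguistic pattern'."""
    phrases, run = [], []
    # Punctuation is kept as a token so that a run never spans two sentences.
    for token in re.findall(r"[a-z]+|[.!?]", text.lower()):
        if token in NOUNS:
            run.append(token)
        elif run:
            phrases.append(" ".join(run))
            run = []
    if run:
        phrases.append(" ".join(run))
    return phrases


def select_keywords(text):
    """Keep the candidates whose linear decision value is positive."""
    counts = Counter(candidate_phrases(text))
    keywords = []
    for phrase, frequency in counts.items():
        score = (WEIGHTS["frequency"] * frequency
                 + WEIGHTS["length"] * len(phrase.split())
                 + BIAS)
        if score > 0:
            keywords.append(phrase)
    return keywords


article = ("Road tolls divide the coalition. Critics say that road tolls "
           "are unfair. The minister defends road tolls.")
print(select_keywords(article))  # prints ['road tolls']
```

Only “road tolls” clears the threshold, because it is both frequent and a multi-word phrase; the single mentions of “coalition”, “critics” and “minister” score below zero.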

Dates and times form a key category of referenced expressions, as do mentions of entities such as persons, organisations, or places. Here, too, machine learning (e.g. conditional random fields) is essential, as purely lexical methods quickly reach their limits. For one thing, even large knowledge bases such as Wikidata or DBpedia are incomplete by their very nature. They are, after all, only digital models of the world as it’s currently known. For another, ambiguity poses a problem for entity recognition. The word “Golf,” for instance, can refer either to a car or to the game. Is “Peter Müller” the German politician or the skier? Or are we talking about an entirely different person by the same name who hasn’t yet become prominent enough to be listed in one of the knowledge bases? Assigning each mention to exactly one entry in a knowledge base is called entity linking (also known as “named entity disambiguation”). [5]
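A minimal Python sketch makes the “Golf” problem concrete. It is a deliberately naive illustration, not how TXT Werk or any production linker works: the knowledge base, its identifiers and the context words are invented, and the disambiguation is a simple word-overlap count rather than a trained model. It does show the two cases described here: choosing between several known entries, and returning nothing when the knowledge base has no suitable entry.

```python
import re

# Invented toy knowledge base; the ids are not real Wikidata or DBpedia identifiers.
KNOWLEDGE_BASE = {
    "Golf": [
        {"id": "golf_car",
         "context": {"car", "engine", "volkswagen", "drive", "model"}},
        {"id": "golf_sport",
         "context": {"course", "hole", "club", "tournament", "player"}},
    ],
}


def link_entity(mention, sentence, min_overlap=1):
    """Return the id of the best-matching entry, or None if nothing fits."""
    words = set(re.findall(r"\w+", sentence.lower()))
    best_id, best_overlap = None, 0
    for entry in KNOWLEDGE_BASE.get(mention, []):
        overlap = len(entry["context"] & words)
        if overlap > best_overlap:
            best_id, best_overlap = entry["id"], overlap
    # Too little evidence, or an unknown name: leave the mention unlinked.
    if best_overlap < min_overlap:
        return None
    return best_id


print(link_entity("Golf", "The new Golf has a more efficient engine."))
print(link_entity("Golf", "She won the Golf tournament on a difficult course."))
print(link_entity("Peter Müller", "Peter Müller opened the exhibition."))
```

The first two calls resolve to the car and the sport respectively, while the third returns `None` because the name is missing from the toy knowledge base. Returning no link at all is the honest answer for a person who is not listed anywhere.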

Networked thinking

“The methods used by artificial intelligence, such as natural language processing and machine learning, could allow new journalistic formats to be created, for example platforms for local content,” said Peter Adolphs, head of Neofonie’s research department.

What does the rise of new tools and services mean for news professionals? “Networked thinking is a major challenge for journalists,” said Nathalie Wappler-Hagen, program director of MDR. Vishal Sikka, CEO of the Indian IT group Infosys, believes that this trend will make major human strengths even more important. “The corollary of the age of AI is that we can focus on what makes us human: the ability to learn, discover new things, and structure them. We can use our creativity and the power of our imagination to find problems, gather new experiences, and create valuable items which currently don’t exist. We need to become life-long learners.” [6]