People make sense of a text by identifying the semantic relations which connect the entities or concepts described by that text. A system which aspires to human-like performance must also be equipped to identify, and learn from, semantic relations in the texts it processes. Understanding even a simple sentence such as "Opportunity and Curiosity find similar rocks on Mars" requires recognizing relations (rocks are located on Mars, signalled by the word "on") and drawing on already known relations (Opportunity and Curiosity are instances of the class of Mars rovers). A language-understanding system should be able to find such relations in documents and progressively build a knowledge base or even an ontology. Resources of this kind assist continuous learning and other advanced language-processing tasks such as text summarization, question answering and machine translation.

This book discusses the recognition in text of semantic relations which capture interactions between base noun phrases. After a brief historical background, we introduce a range of relation inventories of varying granularity, which have been proposed by computational linguists. There is also variation in the scale at which systems operate, from snippets all the way to the whole Web, and in the techniques for recognizing relations in texts, from full supervision through weak or distant supervision to self-supervised or completely unsupervised methods. A discussion of supervised learning covers available datasets, feature sets which describe relation instances, and successful algorithms. An overview of weakly supervised and unsupervised learning zooms in on the acquisition of relations from large corpora with hardly any annotated data. We show how bootstrapping from seed examples or patterns scales up to very large text collections on the Web. We also present machine learning techniques in which data redundancy and variability lead to fast and reliable relation extraction.

About the Author(s)

Vivi Nastase, FBK, Trento, Italy
While writing this book, Vivi Nastase was a visiting professor at the University of Heidelberg. She is now a researcher at the Fondazione Bruno Kessler in Trento, working mainly on lexical semantics, semantic relations, knowledge acquisition and language evolution. She holds a Ph.D. from the University of Ottawa, Canada.

Preslav Nakov, QCRI, Qatar Foundation
Preslav Nakov, a Research Scientist at the Qatar Computing Research Institute, part of Qatar Foundation, holds a Ph.D. from the University of California, Berkeley. His research interests include computational linguistics and NLP, machine translation, lexical semantics, Web as a corpus and biomedical text processing.

Diarmuid O Seaghdha, Computer Laboratory, University of Cambridge, UK
Diarmuid O Seaghdha, a postdoctoral Research Associate at the University of Cambridge Computer Laboratory, holds a Ph.D. from the University of Cambridge. His research interests include lexical and relational semantics, machine learning for NLP (probabilistic models and kernel methods), scientific text mining and social media analysis.

Stan Szpakowicz, EECS, University of Ottawa & ICS, Polish Academy of Sciences
Stan Szpakowicz, a professor of Computer Science at the University of Ottawa, holds a Ph.D. from Warsaw University. He has been active in NLP for 44 years. His recent interests include lexical resources, semantic relations, emotion analysis and text summarization.