14 Ignoring Information: Lexical Acquisition
  studying syntactic patterns, e.g. finding verbs in a corpus and displaying their possible arguments
  e.g. gave, in 100 files of the Penn Treebank corpus
  replaced internal details of each noun phrase with NP:
    gave NP
    gave up NP in NP
    gave NP up
    gave NP help
    gave NP to NP
  use in lexical acquisition, grammar development
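
The idea of replacing each noun phrase with NP and then listing the contexts of a verb can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the chunked-sentence representation, the function names `collapse` and `verb_patterns`, the `window` parameter, and the three toy sentences are all hypothetical and are not taken from the Penn Treebank or from any library.

```python
from collections import Counter

# Assumed representation: a chunked sentence is a list whose items are either
# a (word, tag) pair or an ("NP", [(word, tag), ...]) pair for an NP chunk.

def collapse(sentence):
    """Replace each NP chunk with the bare symbol 'NP'; keep other words."""
    symbols = []
    for item in sentence:
        if item[0] == "NP" and isinstance(item[1], list):
            symbols.append("NP")      # ignore the chunk's internal details
        else:
            symbols.append(item[0])   # plain token: keep the word itself
    return symbols

def verb_patterns(sentences, verb, window=3):
    """Count the collapsed contexts following `verb`, up to `window` items."""
    counts = Counter()
    for sentence in sentences:
        symbols = collapse(sentence)
        for i, symbol in enumerate(symbols):
            if symbol != verb:
                continue
            context = []
            for following in symbols[i + 1 : i + 1 + window]:
                if following in {".", ",", ";"}:   # stop at punctuation
                    break
                context.append(following)
            counts[" ".join([verb] + context)] += 1
    return counts

# Toy data, invented for this sketch.
sentences = [
    [("NP", [("She", "PRP")]), ("gave", "VBD"),
     ("NP", [("the", "DT"), ("book", "NN")]), ("to", "TO"),
     ("NP", [("Mary", "NNP")]), (".", ".")],
    [("NP", [("They", "PRP")]), ("gave", "VBD"),
     ("NP", [("the", "DT"), ("plan", "NN")]), ("up", "RP"), (".", ".")],
    [("NP", [("He", "PRP")]), ("gave", "VBD"),
     ("NP", [("it", "PRP")]), ("to", "TO"),
     ("NP", [("us", "PRP")]), (".", ".")],
]

patterns = verb_patterns(sentences, "gave")
for pattern, count in patterns.most_common():
    print(count, pattern)
```

Collapsing the chunks first is what lets different noun phrases count as the same pattern, so the frequency of each argument frame can be read off directly.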

26 Evaluating Chunk Parsers
  Process:
    1 take some already chunked text
    2 strip off the chunks
    3 rechunk it
    4 compare the result with the original chunked text
  ChunkScore.score()
    precision: what fraction of the returned chunks were correct?
    recall: what fraction of correct chunks were returned?
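
The four steps and the two measures can be sketched without any library. This is a minimal illustration, not the implementation behind ChunkScore: the sentence representation, the helper names, and the rule-based `toy_chunker` are assumptions made for the example.

```python
# Assumed representation: a chunked sentence is a list whose items are either
# a (word, tag) tuple (outside any chunk) or a list of such tuples (one chunk).

def spans_and_tokens(chunked):
    """Steps 1-2: take chunked text, strip the chunks, remember the gold spans."""
    tokens, spans = [], set()
    for item in chunked:
        if isinstance(item, list):                       # a chunk
            spans.add((len(tokens), len(tokens) + len(item)))
            tokens.extend(item)
        else:                                            # a lone token
            tokens.append(item)
    return tokens, spans

def toy_chunker(tokens):
    """Step 3: hypothetical rechunker grouping maximal runs of DT/JJ/NN tags."""
    spans, start = set(), None
    # A sentinel token closes any chunk still open at the end of the sentence.
    for i, (_, tag) in enumerate(tokens + [("", "END")]):
        inside = tag in {"DT", "JJ", "NN"}
        if inside and start is None:
            start = i
        elif not inside and start is not None:
            spans.add((start, i))                        # end offset is exclusive
            start = None
    return spans

def precision_recall(guessed, correct):
    """Step 4: compare the returned chunks with the original ones."""
    hits = len(guessed & correct)
    precision = hits / len(guessed) if guessed else 0.0  # returned chunks that were correct
    recall = hits / len(correct) if correct else 0.0     # correct chunks that were returned
    return precision, recall

# Toy gold-standard sentence, invented for this sketch.
gold = [
    [("the", "DT"), ("cat", "NN")], ("sat", "VBD"), ("on", "IN"),
    [("the", "DT"), ("mat", "NN")], ("with", "IN"), [("him", "PRP")],
]

tokens, gold_spans = spans_and_tokens(gold)
guessed_spans = toy_chunker(tokens)
precision, recall = precision_recall(guessed_spans, gold_spans)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here a chunk counts as correct only if both of its boundaries match the original exactly, so the toy chunker scores perfect precision but misses the pronoun chunk and loses recall.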

