NLP is AI-complete
All the difficult problems in artificial intelligence manifest themselves in NLP problems.
Turing Test: links machine intelligence with the ability to process language.
The interrogator C needs to determine which player, A or B, is a computer and which is a human.

Passing the Turing Test
Turing (1950): "I believe that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning."
ELIZA (Weizenbaum, 1966): the first computer dialogue system, based on keyword matching.
young woman: Men are all alike.
eliza: In what way?
young woman: They're always bugging us about something or other.
eliza: Can you think of a specific example?
young woman: Well, my boyfriend made me come here.
eliza: Your boyfriend made you come here?
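ELIZA's keyword matching can be illustrated with a toy sketch along the following lines. This is a minimal illustration, not Weizenbaum's actual rule set: the patterns and responses below are invented to handle the sample dialogue above.

```python
import re

# Toy ELIZA-style rules: match a keyword pattern in the input and
# echo part of it back, or give a canned response.
RULES = [
    (re.compile(r"my (.+) made me come here", re.I),
     "Your {0} made you come here?"),
    (re.compile(r"always (\w+)", re.I),
     "Can you think of a specific example?"),
    (re.compile(r"men are all (\w+)", re.I),
     "In what way?"),
]

def respond(utterance):
    # Return the response for the first rule whose pattern matches.
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please go on."  # default when no keyword matches

print(respond("Men are all alike."))                     # In what way?
print(respond("Well, my boyfriend made me come here."))  # Your boyfriend made you come here?
```

The point of the sketch is that the system has no model of meaning at all: it only pattern-matches on keywords, which is why ELIZA can seem to "understand" while understanding nothing.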

Why is NLP Hard? (example from L. Lee)
"At last, a computer that understands you like your mother"

Ambiguity at Syntactic Level
Different structures lead to different interpretations: "understands you like your mother" can be parsed with "like your mother" as an adverbial ("understands you as well as your mother understands you") or with "like" as a verb heading a complement clause ("understands that you like your mother").

Ambiguity at Semantic Level
"Alice says they've built a computer that understands you like your mother"
Two senses of mother:
– female parent
– a stringy slimy substance consisting of yeast cells and bacteria; is added to cider or wine to produce vinegar
This is an instance of word sense ambiguity; resolving it is the task of word sense disambiguation.
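Both senses above are in fact noun senses of "mother" listed in WordNet. A quick sketch for inspecting them, assuming the nltk package is installed and the WordNet data has been fetched once with nltk.download("wordnet"):

```python
# Print all noun senses of "mother" recorded in WordNet.
from nltk.corpus import wordnet as wn

for synset in wn.synsets("mother", pos=wn.NOUN):
    print(synset.name(), "-", synset.definition())
# Among the senses printed are the "female parent" sense and the
# yeast-and-bacteria sense used in making vinegar.
```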

Ambiguity at Discourse Level
"Alice says they've built a computer that understands you like your mother, but she …"
– … doesn't know any details
– … doesn't understand me at all
This is an instance of anaphora: "she" co-refers with some other discourse entity.

NLP History: Symbolic Era
"(1) Colorless green ideas sleep furiously. (2) Furiously sleep ideas green colorless. It is fair to assume that neither sentence (1) nor (2) (nor indeed any part of these sentences) had ever occurred in an English discourse. Hence, in any statistical model for grammaticalness, these sentences will be ruled out on identical grounds as equally 'remote' from English. Yet (1), though nonsensical, is grammatical, while (2) is not." (Chomsky, 1957)
1970s and 1980s: statistical NLP is in disfavor
– emphasis on deeper models, syntax
– toy domains / manually developed grammars (SHRDLU, LUNAR)
– weak empirical evaluation

Case Study: Determiner Placement
Task: automatically place the determiners a, the, or null in a text. The passage below is the input: a news story with all determiners removed.
Scientists in United States have found way of turning lazy monkeys into workaholics using gene therapy. Usually monkeys work hard only when they know reward is coming, but animals given this treatment did their best all time. Researchers at National Institute of Mental Health near Washington DC, led by Dr Barry Richmond, have now developed genetic treatment which changes their work ethic markedly. "Monkeys under influence of treatment don't procrastinate," Dr Richmond says. Treatment consists of anti-sense DNA - mirror image of piece of one of our genes - and basically prevents that gene from working. But for rest of us, day when such treatments fall into hands of our bosses may be one we would prefer to put off.

Relevant Grammar Rules
Determiner placement is largely determined by:
– type of noun (countable, uncountable)
– uniqueness of reference
– information value (given, new)
– number (singular, plural)
However, many exceptions and special cases play a role:
– the definite article is used with newspaper titles (The Times), but the zero article with names of magazines and journals (Time)
It is hard to encode this information manually!

Statistical Approach: Determiner Placement
A simple approach (sketched in code below):
– Collect a large collection of texts relevant to your domain (e.g. newspaper text).
– For each noun seen during training, estimate the probability of its taking each determiner.
– Given a new noun, select the determiner with the highest likelihood as estimated on the training corpus.
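A minimal sketch of this approach in Python, assuming the corpus has already been reduced to (noun, determiner) pairs; the toy pairs below are invented for illustration:

```python
from collections import Counter, defaultdict

# Toy training data: (head noun, determiner) pairs extracted from a corpus;
# "null" marks a noun occurrence with no article.
training_pairs = [
    ("way", "a"), ("reward", "a"), ("monkeys", "null"),
    ("treatment", "the"), ("treatment", "the"), ("treatment", "a"),
    ("FBI", "the"), ("defendant", "the"),
]

# Count, for each noun, how often it occurs with each determiner.
counts = defaultdict(Counter)
for noun, determiner in training_pairs:
    counts[noun][determiner] += 1

def predict_determiner(noun, default="the"):
    """Return the most frequent determiner for this noun in training,
    falling back to a default for unseen nouns."""
    if noun in counts:
        return counts[noun].most_common(1)[0][0]
    return default

print(predict_determiner("treatment"))  # "the" (seen twice vs. once for "a")
print(predict_determiner("FBI"))        # "the"
```

Note that this is a pure relative-frequency estimate per noun: it uses no grammatical knowledge at all, which is exactly what makes the next slide's result interesting.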

Does it work?
Implementation details:
– Training: the first 21 sections of the Wall Street Journal corpus; testing: section 23.
– Prediction accuracy: 71.5%
The results are not great, but they are surprisingly high for such a simple method: a large fraction of nouns in this corpus always appear with the same determiner ("the FBI", "the defendant").
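Continuing the sketch above, the evaluation itself is just an accuracy computation over held-out (noun, determiner) pairs; the test pairs here are invented stand-ins for WSJ section 23:

```python
def accuracy(test_pairs, predict):
    """Fraction of (noun, gold determiner) pairs the predictor gets right."""
    correct = sum(predict(noun) == gold for noun, gold in test_pairs)
    return correct / len(test_pairs)

# Toy held-out pairs; "reward" appears with "a" in training but "null" here,
# illustrating the kind of error the per-noun model cannot avoid.
test_pairs = [("FBI", "the"), ("defendant", "the"),
              ("way", "a"), ("reward", "null")]
print(f"Prediction accuracy: {accuracy(test_pairs, predict_determiner):.1%}")
```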

The NLP Cycle
Get a corpus.
Build a baseline model.
Repeat:
– Analyze the most common errors.
– Find out what information could be helpful.
– Modify the model to exploit this information: use new features, change the structure of the model, or employ a new machine learning method.