Named entity learning and verification: EM in large corpora

The regularity of named entities is used to learn
names and extract named entities. Starting from only
a few name elements and a set of patterns, the algorithm
learns new names and their elements. A
verification step ensures quality using a large
background corpus. Further improvement is
achieved by classifying the newly learnt
elements at the character level. Moreover, unsupervised
rule learning is discussed.
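
The learn-and-verify loop summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the class labels (TI, FN, LN), the pattern format with a single "?" slot, the acceptance threshold, and all function names are assumptions made for illustration, and the corpora are toy data.

```python
from typing import Dict, List, Set, Tuple

Sentence = List[str]
# Hypothetical pattern format: a sequence of class labels containing exactly
# one "?" slot, plus the class assigned to the word found in that slot.
Pattern = Tuple[Tuple[str, ...], str]


def match_candidates(tokens: Sentence, lexicon: Dict[str, str],
                     patterns: List[Pattern]) -> Set[Tuple[str, str]]:
    """Propose (word, class) pairs for unknown capitalized words in pattern slots."""
    found = set()
    for classes, target in patterns:
        n = len(classes)
        for i in range(len(tokens) - n + 1):
            candidate = None
            for tok, cls in zip(tokens[i:i + n], classes):
                if cls == "?":
                    # The slot must hold an unknown, capitalized word.
                    if tok[:1].isupper() and tok not in lexicon:
                        candidate = tok
                    else:
                        candidate = None
                        break
                elif lexicon.get(tok) != cls:
                    candidate = None
                    break
            if candidate is not None:
                found.add((candidate, target))
    return found


def verify(word: str, cls: str, background: List[Sentence],
           lexicon: Dict[str, str], patterns: List[Pattern],
           threshold: float) -> bool:
    """Accept a candidate if enough of its background occurrences fit a pattern."""
    occurrences = supported = 0
    for sent in background:
        if word in sent:
            occurrences += 1
            if (word, cls) in match_candidates(sent, lexicon, patterns):
                supported += 1
    return occurrences > 0 and supported / occurrences >= threshold


def bootstrap(corpus: List[Sentence], background: List[Sentence],
              seeds: Dict[str, str], patterns: List[Pattern],
              threshold: float = 0.5, max_rounds: int = 10) -> Dict[str, str]:
    """Alternate between proposing and verifying elements until nothing new is accepted."""
    lexicon = dict(seeds)
    for _ in range(max_rounds):
        proposed = set()
        for sent in corpus:
            proposed |= match_candidates(sent, lexicon, patterns)
        accepted = sorted((w, c) for w, c in proposed
                          if verify(w, c, background, lexicon, patterns, threshold))
        if not accepted:
            break
        for word, cls in accepted:
            lexicon.setdefault(word, cls)
    return lexicon


# Toy data (invented for this sketch).
seeds = {"Dr.": "TI", "John": "FN"}
patterns = [(("TI", "FN", "?"), "LN"), (("TI", "?", "LN"), "FN")]
corpus = [["Dr.", "John", "Smith", "arrived"],
          ["Dr.", "Mary", "Smith", "spoke"],
          ["Dr.", "John", "Monday", "called"]]
background = [["Dr.", "John", "Smith", "said"],
              ["Dr.", "John", "Smith", "left"],
              ["Dr.", "Mary", "Smith", "agreed"],
              ["On", "Monday", "it", "rained"],
              ["Monday", "was", "quiet"]]

learned = bootstrap(corpus, background, seeds, patterns)
print(learned)
```

In this toy run, "Smith" is learnt first from the seed elements, which in turn lets the second pattern learn "Mary"; "Monday" is proposed by a pattern but rejected, because none of its occurrences in the background corpus fit a pattern.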