Wednesday, August 23, 2006

Language Learning

The well-known specialist in neural networks Kohonen developed the idea of Dynamically Expanding Context (DEC) grammars. He applied the concept successfully to, among other things, musical composition. The basic idea behind DEC-grammars is the notion that the occurrence of a symbol in a string is determined by its predecessors. A DEC-grammar consists of rules that determine which symbol can occur in which context. A context is a string symbols that precedes a symbol. A rule of a DEC-grammar has the following form:X => yHere ‘X’ is a context, ‘y’ is a symbol. These rules can be used to generate a string of symbols: in context ‘X’ write symbol ‘y’.

The algorithm to learn a DEC-grammar (i.e. to find a DEC-Grammar which describes a particular string) is very simple. Say you want to find a DEC-Grammar for the string ‘^abae’. You start to read the first symbol of the string, and you form the rule:^ => aThen you read the next symbol (which is ‘b’) and you form the rule:a => bYou proceed along these lines, generating rules, until you obtain the grammar:^ => aa => bb => aa => eThis grammar is not deterministic anymore, because there are two possible successors of ‘a’, namely ‘b’ and ‘e’. In order to repair this you update the rules in the grammar that violate the deterministic structure by expanding their context as necessary: (a => b) is replaced by (^a => b) and (a => e) is replaced by (ba => e). The final grammar becomes:^ => a^a => bb => aba => e