A language blog by Warren M Tang

Tag Archives: token

The type-token ratio (TTR) is used to compare two corpora in terms of lexical complexity. It is the number of types (distinct word forms) divided by the number of tokens (running words). The closer the ratio is to 1, the greater the lexical complexity; the closer to 0, the greater the repetition of words. No particular ratio is ideal in itself. One corpus can only be called more or less complex than another when the two are of similar size, because TTR tends to fall as a text grows longer and common words recur. Approximate size equivalence is therefore an important criterion when using TTR.
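
To make the formula concrete, here is a minimal Python sketch of the calculation. The function names `tokenize` and `type_token_ratio` are my own, and the tokenizer is an assumption: it lowercases the text and counts each run of letters or apostrophes as one token. A real study would need to decide how to treat case, punctuation, numbers and inflected forms, and those decisions change the ratio.

```python
import re


def tokenize(text):
    # Assumption: lowercase everything and treat each run of
    # letters/apostrophes as one token. Other schemes give other counts.
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    tokens = tokenize(text)
    if not tokens:
        # An empty text has no tokens; return 0.0 rather than divide by zero.
        return 0.0
    types = set(tokens)  # distinct word forms
    return len(types) / len(tokens)


# Two made-up samples of similar length.
sample_a = "the cat sat on the mat"                  # 5 types / 6 tokens
sample_b = "a quick brown fox jumps over lazy dogs"  # 8 types / 8 tokens

print(type_token_ratio(sample_a))
print(type_token_ratio(sample_b))
```

The second sample scores higher because none of its words repeat. The two samples are deliberately close in length, in line with the size criterion above.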