Integration of Speech and Natural Language

Finding an effective way of using natural-language understanding
technology to improve speech recognition has been a long-standing goal
of spoken-language understanding research, but achieving positive
results has proved difficult. Under its Improved Spoken-Language
Understanding project, SRI International has demonstrated a
significant reduction in speech recognition error by using a
natural-language processing system to rescore recognition hypotheses.

The difficulty of this task is due, at least in part, to a lack of
robustness when the natural-language system is unable to analyze an
utterance as a single coherent phrase or sentence. SRI's innovative
approach to this problem has three steps: (1) find the best analysis
of a recognition hypothesis as a sequence of semantically meaningful
fragments; (2) estimate the probability of an utterance consisting of
a sequence of fragments of the linguistic types found; and (3) combine
that probability with estimates of the probability of each fragment
type being realized as the corresponding word sequence in the
hypothesis. The result is an overall linguistic probability for the
hypothesis, which is used to modify the score that the baseline speech
recognizer assigned to it.
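
The rescoring idea described above can be illustrated with a minimal
sketch. It is not SRI's implementation. It assumes that the fragment
analysis has already been done, that the fragment-type sequence is
modeled with a simple bigram model, and that the linguistic log
probability is added to the recognizer's log score with a tunable
weight. The summary does not specify any of these choices, and all
names, probabilities, and scores below are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Fragment:
    ftype: str          # linguistic type of the fragment, e.g. "S" or "NP"
    log_p_words: float  # assumed estimate of log P(words | fragment type)


@dataclass
class Hypothesis:
    words: str
    recognizer_score: float    # log score from the baseline recognizer
    fragments: List[Fragment]  # best fragment analysis (assumed given)


def type_sequence_log_prob(types: List[str],
                           bigrams: Dict[Tuple[str, str], float],
                           floor: float = math.log(1e-4)) -> float:
    """Log probability of a fragment-type sequence under a bigram model.

    The bigram model is an assumption of this sketch; unseen type pairs
    fall back to a small floor probability.
    """
    seq = ["<s>"] + list(types) + ["</s>"]
    return sum(bigrams.get(pair, floor) for pair in zip(seq, seq[1:]))


def linguistic_log_prob(hyp: Hypothesis,
                        bigrams: Dict[Tuple[str, str], float]) -> float:
    """Combine the type-sequence probability with per-fragment word
    probabilities, working in log space."""
    types = [f.ftype for f in hyp.fragments]
    words_given_types = sum(f.log_p_words for f in hyp.fragments)
    return type_sequence_log_prob(types, bigrams) + words_given_types


def rescore(hyps: List[Hypothesis],
            bigrams: Dict[Tuple[str, str], float],
            weight: float = 1.0) -> Hypothesis:
    """Return the hypothesis with the best combined score.

    The weighted sum is one possible way to "modify" the recognizer
    score; the weight would normally be tuned on held-out data.
    """
    return max(
        hyps,
        key=lambda h: h.recognizer_score
        + weight * linguistic_log_prob(h, bigrams),
    )


# Hypothetical bigram log probabilities over fragment types.
bigrams = {
    ("<s>", "S"): math.log(0.6),
    ("S", "</s>"): math.log(0.7),
    ("S", "NP"): math.log(0.2),
    ("NP", "</s>"): math.log(0.5),
}

# Two made-up recognition hypotheses for the same utterance.
hyps = [
    Hypothesis("show me flights to boston", -100.0,
               [Fragment("S", -8.0)]),
    Hypothesis("show me flights two boston", -99.5,
               [Fragment("S", -5.0), Fragment("NP", -9.0)]),
]

best_by_recognizer = max(hyps, key=lambda h: h.recognizer_score)
best_after_rescoring = rescore(hyps, bigrams)
```

In this toy example the baseline recognizer slightly prefers the
second hypothesis, but it can only be analyzed as two fragments, so
its linguistic probability is lower and rescoring selects the first
hypothesis instead.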

This method was tested in the December 1994 DARPA benchmark
evaluations, where it reduced word recognition error by 15% in
relative terms (from 2.5% to 2.1%). These results represent the only
significant improvement we are aware of obtained by using a
linguistically based natural-language knowledge source in conjunction
with a current state-of-the-art recognizer, in a
blind test on spontaneous, natural speech. For more information, see