Speech Recognition Research 2.0: A Web 2.0 Approach to Speech Recognition Research

Introduction:

PodCastle
is a speech retrieval web service
that collects and amplifies voluntary contributions by anonymous users.
Our goal is to provide users with a public web service based on
speech recognition and crowdsourcing
so that they can experience state-of-the-art speech recognition performance
through a useful service.
PodCastle enables users to
find speech data (such as podcasts and video clips on video sharing services)
that include a search term,
read full texts of their recognition results, and
easily correct recognition errors
by simply selecting from a list of candidates.
The resulting corrections are used
to improve both speech retrieval and speech recognition performance.
In our experience
operating the service over the past four years (since December 2006),
over half a million recognition errors
in about one hundred thousand speech data items
were corrected by anonymous users, and
we confirmed that the speech recognition performance of PodCastle
was actually improved by those corrections.
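
The correction workflow described above (picking the right word from a list of competing candidates, then searching the corrected text) can be sketched roughly as follows. This is only an illustration under stated assumptions, not PodCastle's actual implementation: the `Slot` class, the functions `correct`, `transcript_text`, and `contains_term`, and the sample candidate words are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    """One word position in a transcript with competing candidates (hypothetical)."""
    candidates: List[str]    # recognizer's candidates, best guess first
    chosen: int = 0          # index of the candidate currently displayed
    corrected: bool = False  # True once a user has made a selection here

    @property
    def word(self) -> str:
        return self.candidates[self.chosen]


def correct(slots: List[Slot], position: int, candidate_index: int) -> None:
    """Record a user's selection of one candidate at one word position."""
    slot = slots[position]
    if not 0 <= candidate_index < len(slot.candidates):
        raise IndexError("no such candidate")
    slot.chosen = candidate_index
    slot.corrected = True


def transcript_text(slots: List[Slot]) -> str:
    """Join the currently chosen words into the readable full text."""
    return " ".join(slot.word for slot in slots)


def contains_term(slots: List[Slot], term: str) -> bool:
    """Whole-word, case-insensitive search over the current transcript."""
    return term.lower() in (slot.word.lower() for slot in slots)


# Assumed sample data: the recognizer's top guess for the second word is wrong.
slots = [
    Slot(["speech", "beach"]),
    Slot(["wreck", "recognition", "recondition"]),
    Slot(["research", "search"]),
]
before = transcript_text(slots)
correct(slots, 1, 1)  # the user selects "recognition" from the candidate list
after = transcript_text(slots)
```

In this sketch a search for "recognition" fails before the correction and succeeds after it, which is the sense in which user corrections directly improve retrieval. Using the corrected text to retrain recognition models is a separate step that the sketch does not attempt to show.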

PodCastle is the world's first speech service based on
crowdsourcing and
the wisdom of crowds,
and the first instance of our research approach,
"Speech Recognition Research 2.0",
which is aimed at providing users with a web service based on Web 2.0
and at promoting
speech recognition technologies in cooperation with anonymous users.

Jun Ogata and Masataka Goto:
PodCastle: A Spoken Document Retrieval System for Podcasts and Its Performance Improvement by Anonymous User Contributions,
Proceedings of the Third Workshop on Searching Spontaneous Conversational Speech (SSCS 2009),
pp.37-38, October 2009.

Jun Ogata and Masataka Goto:
PodCastle: Collaborative Training of Acoustic Models on the Basis of Wisdom of Crowds for Podcast Transcription,
Proceedings of the 10th Annual Conference of the
International Speech Communication Association (Interspeech 2009),
pp.1491-1494, September 2009.

Jun Ogata, Masataka Goto, and Kouichirou Eto:
Automatic Transcription for a Web 2.0 Service to Search Podcasts,
Proceedings of the 8th Annual Conference of the
International Speech Communication Association (Interspeech 2007),
pp.2617-2620, August 2007.