Voice Data Collection

Our client takes the stress out of human-to-tech communication. Their innovations in voice, natural language understanding, and systems integration work together to create more human-oriented technology: tech that adapts to the way people communicate instead of forcing people to adapt to machines.

THE CHALLENGE WITH VOICE DATA COLLECTION

The challenge was developing the next generation of in-car speech recognition technology. Our client needed support with voice data collection: hundreds of hours of voice data across various languages, demographics, and locations around the world. The data would be used to teach in-car systems to communicate with human beings, so it had to be a precise and comprehensive collection of all the terms, accents, and phrases that would be used to communicate in the vehicle.

OUR APPROACH TO VOICE DATA

To collect high-quality voice data in the right environment and conditions, Globalme traveled to 10 countries and collected data from more than 2,000 participants over three months. The project began with voice data collections in China, Russia, Japan, Korea, Poland, Italy, Turkey, and Spain. We presented participants with various loosely structured scenarios, and they phrased their requests however they liked. Natural language data is critically important because terminology and sentence structure vary between participants. Culture, education, dialect, social environment, and many other attributes affect how a user articulates a request.

When our project team returned to our home base in Vancouver with suitcases full of valuable data, we conducted data collection in another 15 languages. Thanks to the multicultural nature of Vancouver, we were able to find speakers of almost every language we needed within the city limits. Globalme collected data in Russian, Dutch, Korean, and more from over 40 participants per language. With the data Globalme collected, our client was able to build their research base and continue innovating in human-machine interaction.

OUR DATA COLLECTION SOLUTIONS

Our data collection services include more than just in-field voice data collection. We offer terminology and lexicon development, multilingual transcription, and linguistic analysis. Find details on our data collection services page or reach out to us below.

