Add exercises where the user has to pronounce the word.
Voice recognition (maybe Google has an API?) could be used to verify that the user said the correct word.
Further, analysis of the recording could be used to point out and correct mispronunciations.
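
A minimal sketch of the verification step, assuming whichever speech recognizer is chosen returns a plain-text transcript of what the user said. The function names and the 0.8 threshold are hypothetical, and the fuzzy text match (Python's difflib) is only a stand-in: it compares the recognized text to the expected word and does not analyse the audio itself.

```python
import difflib
import string
import unicodedata
from typing import Tuple


def normalize(text: str) -> str:
    # Unicode-normalize, casefold, and drop ASCII punctuation and outer whitespace
    # so that "Hello." and "hello" compare as equal.
    text = unicodedata.normalize("NFC", text).casefold()
    return text.translate(str.maketrans("", "", string.punctuation)).strip()


def check_pronunciation(transcript: str, expected: str,
                        threshold: float = 0.8) -> Tuple[bool, float]:
    # transcript: text returned by the recognizer (assumed input, not a real API call).
    # expected:   the word the exercise asked the user to pronounce.
    # Returns (accepted, similarity), where similarity is in [0, 1].
    said, target = normalize(transcript), normalize(expected)
    similarity = difflib.SequenceMatcher(None, said, target).ratio()
    return similarity >= threshold, similarity


if __name__ == "__main__":
    accepted, score = check_pronunciation("Hello.", "hello")
    print(accepted, round(score, 2))
```

Returning the similarity score alongside the accept/reject flag leaves room for the exercise to distinguish a near miss from a completely different word.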