Input From Audio File

Hi,
I have a large collection of documentary videos and I'm looking for a way to extract frequently used words from them in order to index them in a database. Extracting the audio is easy, and recognition accuracy is not very important, as I only plan to save the 50 most used words.

Is there any way to hook Tazti up between an audio file and a text file, so that it writes out every word it recognises? I can then determine the most used words in a separate process. A command line tool would be preferred.
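
For context, the separate counting step I have in mind is roughly the Python sketch below. It only assumes that the recognised words end up in a plain text file; the function names and the file name transcript.txt are placeholders of mine, and nothing here touches Tazti itself.

```python
import re
from collections import Counter


def top_words(text, n=50):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Lowercase everything so "The" and "the" are counted together, and
    # keep simple contractions such as "don't" as a single word.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words).most_common(n)


def top_words_in_file(path, n=50):
    """Read a transcript file (hypothetical output of the recogniser) and count its words."""
    with open(path, encoding="utf-8") as f:
        return top_words(f.read(), n)
```

Calling top_words_in_file("transcript.txt") would give me the list to store in the database. Very common words such as "the" and "and" will dominate the result, so I would probably filter those out with a stop-word list before saving.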