We’ve been using OpenEars for several years now, and it’s been fantastic. We ran into an issue today. In our standard setup, speech rec is generally suspended. A three-finger tap resumes speech rec, at which point the user can issue their speech rec command. We let PocketSphinx detect the silence and return the hypothesis. After the hypothesis is returned, we suspend OE again. I think of this as the “standard” method of doing things.

That setup has always worked like a charm. Now, we’re adding a wrinkle. For some commands, the user indicates the start of the command by holding down a button, and the end of the command by releasing it. Then, we execute runRecognitionOnWavFile for that .wav. This is all done while OE is in suspended mode.

Both of these work fine, up until you switch between them. If we issue a command using the “standard” method and follow that with our new method using runRecognitionOnWavFile, everything is fine. However, after using runRecognitionOnWavFile, if we go back to the “standard” method for the next command, OE crashes in the OEContinuousModel thread. The log isn’t super-revealing, but I’ve included it below (OE logging and verbose PocketSphinx).
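
The call order that triggers the crash can be sketched roughly as follows. This is a simplified Swift illustration, not the app's actual code: the `SpeechRecognizer` protocol, `LoggingRecognizer`, `CommandFlow`, and the file name `command.wav` are hypothetical stand-ins, and the method signatures are shortened versions of the OEPocketsphinxController calls mentioned above. It only records the order of calls so the sequence is easy to see.

```swift
import Foundation

// Hypothetical, simplified stand-in for the OEPocketsphinxController calls
// described above (the real runRecognitionOnWavFile call takes more arguments).
protocol SpeechRecognizer {
    func suspendRecognition()
    func resumeRecognition()
    func runRecognitionOnWavFile(atPath path: String)
}

// Records each call so the order can be inspected without OpenEars linked in.
final class LoggingRecognizer: SpeechRecognizer {
    private(set) var calls: [String] = []
    func suspendRecognition() { calls.append("suspend") }
    func resumeRecognition() { calls.append("resume") }
    func runRecognitionOnWavFile(atPath path: String) { calls.append("wav:\(path)") }
}

struct CommandFlow {
    let recognizer: SpeechRecognizer

    // "Standard" method: a three-finger tap resumes listening, and
    // recognition is suspended again once the hypothesis comes back.
    func standardCommand() {
        recognizer.resumeRecognition()
        // ... PocketSphinx detects silence and returns a hypothesis ...
        recognizer.suspendRecognition()
    }

    // New method: button press/release produces a .wav file, which is
    // recognized while live listening stays suspended.
    func pushToTalkCommand(wavPath: String) {
        recognizer.runRecognitionOnWavFile(atPath: wavPath)
    }
}

let recognizer = LoggingRecognizer()
let flow = CommandFlow(recognizer: recognizer)

flow.standardCommand()                          // fine
flow.pushToTalkCommand(wavPath: "command.wav")  // fine
flow.standardCommand()                          // in the real app, the crash follows this step

print(recognizer.calls)
```

The third call is the interesting one: the same `standardCommand()` sequence that works on its own is the step that fails once a wav-file recognition has run before it.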

2019-01-07 09:08:54.817597-0500 otto[1393:496245] Starting OpenEars logging for OpenEars version 2.506 on 64-bit device (or build): iPad running iOS version: 12.100000
2019-01-07 09:08:54.818944-0500 otto[1393:496245] Creating shared instance of OEPocketsphinxController
2019-01-07 09:08:56.242688-0500 otto[1393:496245] Since there is no cached version, loading the language model lookup list for the acoustic model called AcousticModelEnglish
2019-01-07 09:08:56.280052-0500 otto[1393:496245] The word GUMPS was not found in the dictionary of the acoustic model /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle. Now using the fallback method to look it up. If this is happening more frequently than you would expect, likely causes can be that you are entering words in another language from the one you are recognizing, or that there are symbols (including numbers) that need to be spelled out or cleaned up, or you are using your own acoustic model and there is an issue with either its phonetic dictionary or it lacks a g2p file. Please get in touch at the forums for assistance with the last two possible issues.
2019-01-07 09:08:56.284656-0500 otto[1393:496245] Using convertGraphemes for the word or phrase gumps which doesn’t appear in the dictionary
2019-01-07 09:08:56.296099-0500 otto[1393:496245] Elapsed time to generate unknown word phonemes in English is 0.015762
2019-01-07 09:08:56.296521-0500 otto[1393:496245] the graphemes “G AA M P S” were created for the word GUMPS using the fallback method.
2019-01-07 09:08:56.334851-0500 otto[1393:496245] The word ZULU was not found in the dictionary of the acoustic model /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle. Now using the fallback method to look it up. If this is happening more frequently than you would expect, likely causes can be that you are entering words in another language from the one you are recognizing, or that there are symbols (including numbers) that need to be spelled out or cleaned up, or you are using your own acoustic model and there is an issue with either its phonetic dictionary or it lacks a g2p file. Please get in touch at the forums for assistance with the last two possible issues.
2019-01-07 09:08:56.335538-0500 otto[1393:496245] Using convertGraphemes for the word or phrase zulu which doesn’t appear in the dictionary
2019-01-07 09:08:56.340676-0500 otto[1393:496245] Elapsed time to generate unknown word phonemes in English is 0.005588
2019-01-07 09:08:56.341070-0500 otto[1393:496245] the graphemes “Z UW L UW” were created for the word ZULU using the fallback method.
2019-01-07 09:08:56.341358-0500 otto[1393:496245] I’m done running performDictionaryLookup and it took 0.088049 seconds
2019-01-07 09:08:56.516564-0500 otto[1393:496245] Returning a cached version of LanguageModelGeneratorLookupList.text
2019-01-07 09:08:56.526831-0500 otto[1393:496245] The word CESS was not found in the dictionary of the acoustic model /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle. Now using the fallback method to look it up. If this is happening more frequently than you would expect, likely causes can be that you are entering words in another language from the one you are recognizing, or that there are symbols (including numbers) that need to be spelled out or cleaned up, or you are using your own acoustic model and there is an issue with either its phonetic dictionary or it lacks a g2p file. Please get in touch at the forums for assistance with the last two possible issues.
2019-01-07 09:08:56.527585-0500 otto[1393:496245] Using convertGraphemes for the word or phrase cess which doesn’t appear in the dictionary
2019-01-07 09:08:56.531316-0500 otto[1393:496245] Elapsed time to generate unknown word phonemes in English is 0.004267
2019-01-07 09:08:56.531699-0500 otto[1393:496245] the graphemes “S EH S” were created for the word CESS using the fallback method.
2019-01-07 09:08:56.573433-0500 otto[1393:496245] The word ZULU was not found in the dictionary of the acoustic model /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle. Now using the fallback method to look it up. If this is happening more frequently than you would expect, likely causes can be that you are entering words in another language from the one you are recognizing, or that there are symbols (including numbers) that need to be spelled out or cleaned up, or you are using your own acoustic model and there is an issue with either its phonetic dictionary or it lacks a g2p file. Please get in touch at the forums for assistance with the last two possible issues.
2019-01-07 09:08:56.574259-0500 otto[1393:496245] Using convertGraphemes for the word or phrase zulu which doesn’t appear in the dictionary
2019-01-07 09:08:56.578103-0500 otto[1393:496245] Elapsed time to generate unknown word phonemes in English is 0.004419
2019-01-07 09:08:56.578463-0500 otto[1393:496245] the graphemes “Z UW L UW” were created for the word ZULU using the fallback method.
2019-01-07 09:08:56.578631-0500 otto[1393:496245] I’m done running performDictionaryLookup and it took 0.061926 seconds
SUSPEND
2019-01-07 09:08:56.755857-0500 otto[1393:496328] [avas] AVAudioSessionPortImpl.mm:56:ValidateRequiredFields: Unknown selected data source for Port Speaker (type: Speaker)
2019-01-07 09:08:56.756406-0500 otto[1393:496328] Audio route has changed for the following reason:
2019-01-07 09:08:56.756784-0500 otto[1393:496328] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2019-01-07 09:08:56.777773-0500 otto[1393:496328] [avas] AVAudioSessionPortImpl.mm:56:ValidateRequiredFields: Unknown selected data source for Port Speaker (type: Speaker)
2019-01-07 09:08:56.778783-0500 otto[1393:496328] This is not a case in which OpenEars notifies of a route change. At the close of this method, the new audio route will be <Input route or routes: “MicrophoneBuiltIn”. Output route or routes: “Speaker”>. The previous route before changing to this route was “<AVAudioSessionRouteDescription: 0x116504420,
inputs = (
“<AVAudioSessionPortDescription: 0x1165045f0, type = MicrophoneBuiltIn; name = iPad Microphone; UID = Built-In Microphone; selectedDataSource = Front>”
);
outputs = (
“<AVAudioSessionPortDescription: 0x116504970, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>”
)>”.
2019-01-07 09:08:56.779209-0500 otto[1393:496328] [avas] AVAudioSessionPortImpl.mm:56:ValidateRequiredFields: Unknown selected data source for Port Speaker (type: Speaker)
2019-01-07 09:08:58.156527-0500 otto[1393:496245] Attempting to start listening session from startListeningWithLanguageModelAtPath:
2019-01-07 09:08:58.162090-0500 otto[1393:496245] User gave mic permission for this app.
2019-01-07 09:08:58.162538-0500 otto[1393:496245] setSecondsOfSilence wasn’t set, using default of 0.700000.
2019-01-07 09:08:58.167801-0500 otto[1393:496385] Starting listening.
2019-01-07 09:08:58.168008-0500 otto[1393:496385] About to set up audio session
2019-01-07 09:08:58.184123-0500 otto[1393:496385] Creating audio session with mixing disabled.
2019-01-07 09:08:58.184248-0500 otto[1393:496385] Done setting audio session category.
2019-01-07 09:08:58.184513-0500 otto[1393:496385] audioMode is incorrect, we will change it.
2019-01-07 09:08:58.203788-0500 otto[1393:496385] audioMode is now on the correct setting.
2019-01-07 09:08:58.243139-0500 otto[1393:496385] Done setting preferred sample rate to 16000.000000 – now the real sample rate is 16000.000000
2019-01-07 09:08:58.245279-0500 otto[1393:496385] number of channels is already the preferred number of 1 so not setting it.
2019-01-07 09:08:58.246798-0500 otto[1393:496385] Done setting session’s preferred I/O buffer duration to 0.128000 – now the actual buffer duration is 0.128000
2019-01-07 09:08:58.246897-0500 otto[1393:496385] Done setting up audio session
CURRENT LOAD PROGRESS: 0%
2019-01-07 09:08:58.247388-0500 otto[1393:496385] About to set up audio IO unit in a session with a sample rate of 16000.000000, a channel number of 1 and a buffer duration of 0.128000.
2019-01-07 09:08:58.300913-0500 otto[1393:496385] Done setting up audio unit
2019-01-07 09:08:58.301021-0500 otto[1393:496385] About to start audio IO unit
2019-01-07 09:08:58.494435-0500 otto[1393:496385] Done starting audio unit
INFO: pocketsphinx.c(145): Parsed model-specific feature parameters from /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn current current
-cmninit 8.0 40
-compallsen no no
-debug 0
-dict /var/mobile/Containers/Data/Application/19F5A73A-1E2C-47A8-A147-45533591DC4E/Library/Caches/GENERAL.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle/noisedict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle/feat.params
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle
-input_endian little little
-jsgf /var/mobile/Containers/Data/Application/19F5A73A-1E2C-47A8-A147-45533591DC4E/Library/Caches/GENERAL.gram
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 1.000000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle/mdef
-mean /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle/means
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle/sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec 0-12/13-25/26-38
-tmat /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle/transition_matrices
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 69
-vad_prespeech 20 10
-vad_startspeech 10 10
-vad_threshold 2.0 2.300000e+00
-var /var/containers/Bundle/Application/E55C078E-ED64-4F36-80AF-5F2FD98F7B60/otto.app/AcousticModelEnglish.bundle/variances
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
