(Versions and console output at end of post)
I’m attempting to use the Chinese pretrained model provided in the main git repo. I have gotten the English model/scorer to work perfectly.
However, the output I am getting from the Chinese model/scorer is just “���” repeated when read as UTF-8, and “锟斤拷” repeated when read as GBK (simplified Chinese), regardless of the audio input.
I was able to find that “锟斤拷” is a well-known artifact that shows up in Chinese text when there is a problem with the encoding. I’m writing the file with writeFileSync from the Node.js standard library, with an encoding of utf-8. gbk is not a valid encoding to specify there, so I haven’t been able to try that. I did try a charset detection package, which told me it was 100% confident that token.text was ASCII, which I find hard to believe. I also tried an encoding converter to convert to and from a variety of encodings. No luck there either.
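To narrow down where the text goes bad, it can help to look at the code points and bytes of the string before it is written to disk. The sketch below is only illustrative: `describeText` and `decodeAsGbk` are hypothetical helper names, not part of the deepspeech package. If the string already contains U+FFFD (the Unicode replacement character), the original bytes were lost before writeFileSync was called. The last lines show that the UTF-8 bytes of two U+FFFD characters, read as GBK, come out as “锟斤拷” when the Node.js build ships a GBK decoder.

```javascript
// Hypothetical helper: summarize what a string really contains.
function describeText(text) {
  const chars = Array.from(text); // iterate by code point, not UTF-16 unit
  return {
    codePoints: chars.map(
      (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
    ),
    utf8Hex: Buffer.from(text, "utf8").toString("hex"),
    replacementCount: chars.filter((ch) => ch === "\uFFFD").length,
  };
}

// Hypothetical helper: decode bytes as GBK, or return null if this
// Node.js build was compiled without the ICU data for GBK.
function decodeAsGbk(bytes) {
  try {
    return new TextDecoder("gbk").decode(bytes);
  } catch (err) {
    return null;
  }
}

// A string made of two replacement characters, as seen in the bad output.
const sample = "\uFFFD\uFFFD";
console.log(describeText(sample));

// The same bytes (ef bf bd ef bf bd) interpreted as GBK.
console.log(decodeAsGbk(Buffer.from(sample, "utf8")));
```

Calling `describeText(token.text)` on a real token would show whether the replacement characters are already present in memory, in which case no later encoding conversion can recover the text.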
Any ideas why I might be getting badly encoded text back from the Chinese pretrained model in token.text? I’m not even sure encoding is the real issue.
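One possible cause worth ruling out (an assumption, not something confirmed for this model) is that each token carries only part of a multi-byte UTF-8 character. The sketch below uses only the Node.js Buffer API to show that decoding the three bytes of a single Chinese character one at a time yields three replacement characters, while decoding the same bytes together recovers the character.

```javascript
// UTF-8 bytes of one Chinese character (U+4E2D): e4 b8 ad.
const bytes = Buffer.from("中", "utf8");

// Decode each byte on its own, as would happen if every token held
// a single byte and was converted to a string separately.
const piecewise = Array.from(bytes, (b) => Buffer.from([b]).toString("utf8")).join("");

// Concatenate the bytes first, then decode once.
const whole = Buffer.concat(Array.from(bytes, (b) => Buffer.from([b]))).toString("utf8");

console.log(JSON.stringify(piecewise)); // three U+FFFD replacement characters
console.log(whole);                     // the original character
```

If this is what is happening, the damage occurs when each piece is turned into a string, so changing the encoding passed to writeFileSync would not help.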
I confirmed that the audio input I am using is .wav, 16 kHz, single channel.
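For anyone who wants to double-check the same thing, the format fields can be read straight from the WAV header. This is a minimal sketch, assuming a canonical RIFF/WAVE file where the "fmt " chunk sits directly after the RIFF header; `readWavFormat` is a hypothetical helper name, and files with extra chunks before "fmt " would need a real parser. The header is built in memory here so the example runs without a file.

```javascript
// Hypothetical helper: read the format fields of a canonical PCM WAV header.
// Assumes the "fmt " chunk starts at byte offset 12.
function readWavFormat(buf) {
  if (buf.toString("ascii", 0, 4) !== "RIFF" || buf.toString("ascii", 8, 12) !== "WAVE") {
    throw new Error("not a RIFF/WAVE file");
  }
  if (buf.toString("ascii", 12, 16) !== "fmt ") {
    throw new Error("fmt chunk is not at the expected offset");
  }
  return {
    audioFormat: buf.readUInt16LE(20), // 1 means uncompressed PCM
    channels: buf.readUInt16LE(22),
    sampleRate: buf.readUInt32LE(24),
    bitsPerSample: buf.readUInt16LE(34),
  };
}

// Build a 44-byte header describing 16 kHz, mono, 16-bit PCM.
const header = Buffer.alloc(44);
header.write("RIFF", 0, "ascii");
header.write("WAVE", 8, "ascii");
header.write("fmt ", 12, "ascii");
header.writeUInt32LE(16, 16);    // size of the fmt chunk
header.writeUInt16LE(1, 20);     // audio format: PCM
header.writeUInt16LE(1, 22);     // channels
header.writeUInt32LE(16000, 24); // sample rate
header.writeUInt32LE(32000, 28); // byte rate = rate * channels * bytes per sample
header.writeUInt16LE(2, 32);     // block align
header.writeUInt16LE(16, 34);    // bits per sample
header.write("data", 36, "ascii");

const fmt = readWavFormat(header);
console.log(fmt);
```

With a real recording, the buffer would come from something like `fs.readFileSync("audio.wav")`, where the path is a placeholder.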
I’m very new to this stuff. I have some standard Mandarin audio that I plan on running through this, as well as collecting user audio, and cross-referencing the two to give the user feedback on how good their pronunciation is. I don’t yet know if I can get everything I need here, or if I am going to be able to utilize something like SCTK/SCLite. Still exploring! Thanks for the assistance.
Tried running in two environments:
- Windows 10.0.19042 / Python 3.9.0
- WSL2 Debian 10 / Python 3.7.3
In both, I am just running inference, using the ‘deepspeech’ npm package.
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
I haven’t been able to find anything anywhere, and I’ve looked hard. I can’t remember the last time I had to post a question. Thank you again for any assistance.
My console output is:
TensorFlow: v2.3.0-6-g23ad988fcd
DeepSpeech: v0.9.3-0-gf2e9c858
2021-03-21 20:47:21.357944: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.