Hello,
I am working on Ubuntu 20.04, Python 3.9.5.
I created a venv and installed the requirements from the requirements.txt file of the Mic VAD Streaming example, so that I can stream audio and generate text.
I installed deepspeech-tflite==0.9.3 because, due to resource constraints, I want to run the pre-trained tflite model from the GitHub repo rather than the pbmm one.
I renamed the files and tried to run the following command:
python DeepSpeech/mic_vad_streaming/mic_vad_streaming.py --model DeepSpeech/models/ds_tflite.tflite --scorer DeepSpeech/models/ds_scorer.scorer
but I ran into the following errors:
Initializing model...
INFO:root:ARGS.model: DeepSpeech/models/ds_tflite.tflite
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2021-05-29 11:52:03.018818: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Data loss: Can't parse DeepSpeech/models/ds_tflite.tflite as binary proto
Traceback (most recent call last):
  File "/home/ub2004/DeepSpeech/mic_vad_streaming/mic_vad_streaming.py", line 224, in <module>
    main(ARGS)
  File "/home/ub2004/DeepSpeech/mic_vad_streaming/mic_vad_streaming.py", line 163, in main
    model = deepspeech.Model(ARGS.model)
  File "/home/ub2004/PycharmProjects/GlobalDWS_Assistant/lib/python3.9/site-packages/deepspeech/__init__.py", line 38, in __init__
    raise RuntimeError("CreateModel failed with '{}' (0x{:X})".format(deepspeech.impl.ErrorCodeToErrorMessage(status),status))
RuntimeError: CreateModel failed with 'Error reading the proto buffer model file.' (0x3005)
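One thing that may be worth checking at this point is whether the file itself is a TFLite flatbuffer, since the "binary proto" wording suggests the loader was expecting a protobuf graph. Below is a minimal sketch, assuming that TFLite model files carry the flatbuffer file identifier TFL3 at byte offset 4. The helper name looks_like_tflite is hypothetical and is not part of DeepSpeech.

```python
import sys
from pathlib import Path

# Assumption: TFLite models are flatbuffers whose file identifier is "TFL3",
# stored in bytes 4..8 of the file.
TFLITE_IDENTIFIER = b"TFL3"


def looks_like_tflite(model_path):
    """Return True if bytes 4..8 of the file hold the TFLite identifier."""
    with Path(model_path).open("rb") as handle:
        header = handle.read(8)
    return len(header) == 8 and header[4:8] == TFLITE_IDENTIFIER


if __name__ == "__main__":
    # Usage: python check_model.py DeepSpeech/models/ds_tflite.tflite
    for path in sys.argv[1:]:
        verdict = "TFLite flatbuffer" if looks_like_tflite(path) else "not a TFLite flatbuffer"
        print(f"{path}: {verdict}")
```

If the file does look like a TFLite flatbuffer, the model file is probably fine and the runtime that was loaded is the more likely suspect.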
I then tried the non-tflite version (i.e. the regular .pbmm model): I installed deepspeech==0.9.3 and ran the following command:
python DeepSpeech/mic_vad_streaming/mic_vad_streaming.py --model DeepSpeech/models/ds_pbmm.pbmm --scorer DeepSpeech/models/ds_scorer.scorer
Some warnings popped up, but it worked nonetheless:
Initializing model...
INFO:root:ARGS.model: DeepSpeech/models/ds_pbmm.pbmm
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
2021-05-29 11:48:22.203950: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO:root:ARGS.scorer: DeepSpeech/models/ds_scorer.scorer
ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
ALSA lib setup.c:547:(add_elem) Cannot obtain info for CTL elem (MIXER,'IEC958 Playback Default',0,0,0): No such file or directory
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
Listening (ctrl-C to exit)...
Recognized: testing
However, as I stated, due to resource constraints I only want to work with the tflite model. I looked for solutions online but could not find any that address this specific issue.
Here are the packages installed in the venv after all of this testing:
Package           Version
----------------- -------
colorama          0.4.4
deepspeech        0.9.3
deepspeech-tflite 0.9.3
halo              0.0.31
log-symbols       0.0.14
numpy             1.20.3
pip               20.0.2
pkg-resources     0.0.0
PyAudio           0.2.11
scipy             1.6.3
setuptools        44.0.0
six               1.16.0
spinners          0.0.24
termcolor         1.1.0
webrtcvad         2.0.10
wheel             0.36.2
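Note that both deepspeech and deepspeech-tflite appear in this list. As far as I understand, the two wheels provide the same `deepspeech` import package, so one may be shadowing the other. That is an assumption on my part, not something I have confirmed. Here is a minimal sketch that reports which of the two distributions are present in the active environment, using only the standard-library importlib.metadata module (Python 3.8+). The function name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    report = installed_versions(["deepspeech", "deepspeech-tflite"])
    for name, ver in report.items():
        print(f"{name}: {ver or 'not installed'}")
    if all(report.values()):
        # Assumption: both wheels install into the same `deepspeech` package.
        print("Both distributions are installed in this environment.")
```

If both are reported, starting from a clean venv with only deepspeech-tflite==0.9.3 installed would be one way to rule this out.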
Any help is greatly appreciated.