Thanks in advance for your help. Here is the output:
Running an example:
pi@raspberrypi:~/DeepSpeech $ deepspeech --model deepspeech-0.9.3-models.tflite --scorer deepspeech-0.9.3-models.scorer --audio audio/2830-3980-0043.wav
Loading model from file deepspeech-0.9.3-models.tflite
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
Loaded model in 0.0578s.
Loading scorer from files deepspeech-0.9.3-models.scorer
Loaded scorer in 0.0177s.
Running inference.
**experience proves this**
Inference took 7.162s for 1.975s audio file.
Running a file that was recorded on a cellphone and then converted to .wav. This worked pretty well.
pi@raspberrypi:~/DeepSpeech $ deepspeech --model deepspeech-0.9.3-models.tflite --scorer deepspeech-0.9.3-models.scorer --audio audio/DeepSpeechTest44100khz.wav
Loading model from file deepspeech-0.9.3-models.tflite
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
Loaded model in 0.00259s.
Loading scorer from files deepspeech-0.9.3-models.scorer
Loaded scorer in 0.000485s.
Warning: original sample rate (44100) is different than 16000hz. Resampling might produce erratic speech recognition.
Running inference.
one two three four five six seven eight nine ten this is a test pick up boxes collected boxes get me boxes one above is going to be wonderful the weather is bad
Inference took 18.162s for 23.127s audio file.
Running a .wav file recorded on the Raspberry Pi (with the 4-mic ReSpeaker):
pi@raspberrypi:~/DeepSpeech $ deepspeech --model deepspeech-0.9.3-models.tflite --scorer deepspeech-0.9.3-models.scorer --audio audio/DeepSpeechTestArecord64000khz.wav
Loading model from file deepspeech-0.9.3-models.tflite
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
Loaded model in 0.00286s.
Loading scorer from files deepspeech-0.9.3-models.scorer
Loaded scorer in 0.000537s.
Warning: original sample rate (64000) is different than 16000hz. Resampling might produce erratic speech recognition.
Running inference.
> one two three four five six seven eight nine ten is is in test pickaxes cornet oceanos your botanising be wonderful to wiesbaden
Inference took 21.370s for 25.000s audio file.
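Both of my own recordings trigger the resampling warning, so it may help to look at what is actually in the WAV headers. Below is a minimal sketch using only Python's standard `wave` module. The function names `describe_wav` and `matches_model_input` are made up for this post, and the check for mono 16-bit audio is my assumption about what the model wants; the log above only confirms the 16000 Hz part.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, sample_width_bytes, duration_s) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        duration = wav.getnframes() / rate
    return rate, channels, width, duration


def matches_model_input(path, expected_rate=16000):
    """True if the file is 16-bit mono at expected_rate.

    Assumption: the model wants mono 16-bit PCM. Only the 16000 Hz
    requirement is stated in the DeepSpeech warning itself.
    """
    rate, channels, width, _ = describe_wav(path)
    return rate == expected_rate and channels == 1 and width == 2
```

Calling `describe_wav("audio/DeepSpeechTestArecord64000khz.wav")` would show the rate and, more interestingly for a 4-mic board, how many channels ended up in the file.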
Using the microphone (4-mic ReSpeaker). I played the audio file back from the cellphone so that there is no difference in emphasis and so on.
pi@raspberrypi:~/DeepSpeech/DeepSpeech-examples/mic_vad_streaming $ python3 mic_vad_streaming.py -m deepspeech-0.9.3-models.tflite -s deepspeech-0.9.3-models.scorer
Initializing model…
INFO:root:ARGS.model: deepspeech-0.9.3-models.tflite
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
INFO:root:ARGS.scorer: deepspeech-0.9.3-models.scorer
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
ALSA lib pcm_a52.c:823:(_snd_pcm_a52_open) a52 is only for playback
ALSA lib conf.c:5014:(snd_config_expand) Unknown parameters {AES0 0x6 AES1 0x82 AES2 0x0 AES3 0x2 CARD 0}
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM iec958:{AES0 0x6 AES1 0x82 AES2 0x0 AES3 0x2 CARD 0}
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_hw.c:1822:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1822:(_snd_pcm_hw_open) Invalid value for card
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
Listening (ctrl-C to exit)…
> Recognized: one two three four for
> Recognized: seven
> Recognized:
> Recognized: not
> Recognized:
> Recognized: this is a test
> Recognized: books
> Recognized: octopus
> Recognized: get your boxes
> Recognized:
> Recognized: professing to be wonderful to water is bad
^CTraceback (most recent call last):
  File "mic_vad_streaming.py", line 224, in <module>
    main(ARGS)
  File "mic_vad_streaming.py", line 182, in main
    for frame in frames:
  File "mic_vad_streaming.py", line 130, in vad_collector
    for frame in frames:
  File "mic_vad_streaming.py", line 114, in frame_generator
    yield self.read()
  File "mic_vad_streaming.py", line 82, in read
    return self.buffer_queue.get()
  File "/usr/lib/python3.7/queue.py", line 170, in get
    self.not_empty.wait()
  File "/usr/lib/python3.7/threading.py", line 296, in wait
    waiter.acquire()
KeyboardInterrupt
The original text of the recorded file is:
one two three four five six seven eight nine ten
this is a test
pick up boxes
get new boxes
collect boxes
the new apartment is gonna be wonderful
the weather is bad
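To put a rough number on how far each run is from this text, a word error rate could be computed per transcript. This is only a sketch: `word_error_rate` is a name I made up, it uses a plain word-level edit distance, and it does no normalization beyond splitting on whitespace.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

Passing the text above as `reference` and one of the recognized lines as `hypothesis` gives a fraction where 0.0 means a perfect match; values above 1.0 are possible when the hypothesis has many extra words.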
I hope this is what you meant.