Real-time DeepSpeech Analysis using built-in microphone

Now I have run this file as a Python file and I am getting the above error. Kindly help.

Please use proper code format (try the forum toolbox and the preview)

Looks like a version mismatch, please make sure that you are using the same version tags for the client and model.
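
For example (assuming you installed the wheel with pip), pip show deepspeech will print the installed client version; compare that against the release tag of the model files you downloaded.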

A version mismatch of TensorFlow??

Of the deepspeech Python wheel. The code you wrote does not use the same API as the module you have installed …

Ok, @sehar_capricon, you seriously need to make an effort on your end and read and follow the instructions we are giving you to help you. Please read the code of the examples in the git repo; the link and the instructions were already shared with you earlier. We are welcoming to newcomers, but we cannot do this work for you. If you refuse to make any effort, we won’t be able to help you.

I don’t think you are using deepspeech here. I believe you are simply using speech_recognition’s default STT. Without deepspeech, if you just install pyaudio and SpeechRecognition, you can type python -m speech_recognition and it will work without pointing to an STT engine.

What’s the difference between output_graph.pb and output_graph.pbmm?

If you read the documentation, you will learn that it is a protocol-buffer file converted to make it mmap()able.
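
For reference, the documentation shows the conversion being done with TensorFlow’s convert_graphdef_memmapped_format tool, roughly like this (check the docs of your release for the exact invocation):

convert_graphdef_memmapped_format --in_graph=output_graph.pb --out_graph=output_graph.pbmm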

How do I get the output_graph.pbmm file?

What is the process to get output_graph.pbmm?

Read the documentation

Which documentation? Can you provide me the link?

Hi! Starting from @duys’ script and @sehar_capricon’s issue, I adapted the script to match the __init__.py of DeepSpeech 0.6.0-g6d43e21 installed on Python 3.7, and I made my first attempts with the English pre-trained model (downloaded following the documentation) and audio streaming (no .wav file). Here’s the code:

from deepspeech import Model
import numpy as np
import speech_recognition as sr

# model sample rate and decoder hyper-parameters
sample_rate = 16000
beam_width = 500
lm_alpha = 0.75
lm_beta = 1.85

models_folder = 'deepspeech-0.6.0-models/'
model_name = models_folder + "output_graph.pbmm"
language_model = models_folder + "lm.binary"
trie = models_folder + "trie"

# the 0.6.0 constructor only takes the model path and the beam width;
# the alphabet, n_features and n_context arguments of older releases are gone
ds = Model(model_name, beam_width)
ds.enableDecoderWithLM(language_model, trie, lm_alpha, lm_beta)

r = sr.Recognizer()
with sr.Microphone(sample_rate=sample_rate) as source:
    print("Say Something")
    audio = r.listen(source)
    # DeepSpeech expects a 16-bit PCM buffer
    audio = np.frombuffer(audio.frame_data, np.int16)
    print(ds.stt(audio))
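
Note: the pre-trained English model expects 16 kHz, mono, 16-bit audio, which is why sample_rate is pinned to 16000 when opening the microphone.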

Hope it helps

Please, I have been working on real-time speech to text, and I noticed DeepSpeech can actually give me what I want. But then I noticed the algorithm only accepts a WAV file and not a microphone, whereas I want to record and get text in real time. Does this approach finally work in real time?

No hijacking of old threads, please; delete your post and start a new thread, or simply google DS examples. The mic is not a problem.

Nobody is hijacking, I thought it is a normal thing for someone to ask a question under a particular thread that references one’s concern.

Reading doesn’t seem to be your strong suit either … I’m out

No, if you read the API carefully you can see it accepts neither a WAV file nor a microphone; the library accepts chunks of audio data. Feeding it from a file or from the microphone is your responsibility.
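
To make that concrete, here is a minimal, untested sketch of feeding microphone chunks to the 0.6 streaming API (createStream / feedAudioContent / finishStream) through PyAudio; the model path, beam width and chunk size are assumptions:

from deepspeech import Model
import numpy as np
import pyaudio

ds = Model('deepspeech-0.6.0-models/output_graph.pbmm', 500)
stream_ctx = ds.createStream()

pa = pyaudio.PyAudio()
mic = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
              input=True, frames_per_buffer=1024)

print("Say something, Ctrl+C to decode")
try:
    while True:
        # feed each raw 16-bit chunk to the decoder as it arrives
        chunk = np.frombuffer(mic.read(1024), np.int16)
        ds.feedAudioContent(stream_ctx, chunk)
except KeyboardInterrupt:
    # close the stream and print the final transcription
    print(ds.finishStream(stream_ctx))
finally:
    mic.stop_stream()
    mic.close()
    pa.terminate()

The examples repository linked below does the same thing with proper voice activity detection instead of Ctrl+C.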

Except you are reviving a thread that is more than one year old, bringing back different context.

There are multiple examples at https://github.com/mozilla/DeepSpeech-examples/ already implementing what you need, please read them.

Thanks for the link, @othiele. Sorry, I’m actually new here, just troubled by my current issues.