I am new to DeepSpeech. I followed this link to write speech-to-text code, but my results are nowhere near the original speech. I am using DeepSpeech 0.6.1 and have installed the matching pretrained model. I used this link to create my WAV file with the default options. Below is my code.
import numpy as np
import wave
from deepspeech import Model
from scipy.io import wavfile as wav
import speech_recognition as sr
audio_file = "D:/Dataset/DeepSpeech/nz.wav"
ds = Model('D:/Dataset/DeepSpeech/deepspeech-0.6.1-models/models/output_graph.pbmm',500)
ds.enableDecoderWithLM('D:/Dataset/DeepSpeech/deepspeech-0.6.1-models/models/lm.binary','D:/Dataset/DeepSpeech/deepspeech-0.6.1-models/models/trie', 0.75, 1.85)
rate, audio = wav.read(audio_file)
print(audio)
transcript = ds.stt(audio)
print(transcript)
I suspect the issue is caused by my audio format or something similar. Please help me with this issue; how can I make the most of the DeepSpeech library?
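To test the audio-format suspicion, a minimal sketch like the one below could print the basic parameters of the WAV file using only the standard-library `wave` module. The helper names `describe_wav` and `matches_expected_format` are hypothetical, and the 16 kHz, mono, 16-bit target is an assumption about what the released 0.6.1 English model expects; it should be checked against the release notes of the model actually installed.

```python
import wave


def describe_wav(path):
    """Return the basic format parameters of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": wf.getframerate(),
            "frames": wf.getnframes(),
        }


def matches_expected_format(info, expected_rate=16000):
    """Check for mono, 16-bit samples at the expected rate.

    expected_rate=16000 is an assumption about the pretrained model,
    not something read from the model file.
    """
    return (
        info["channels"] == 1
        and info["sample_width_bytes"] == 2
        and info["sample_rate_hz"] == expected_rate
    )
```

Calling `describe_wav("D:/Dataset/DeepSpeech/nz.wav")` and passing the result to `matches_expected_format` would show whether the file is, for example, stereo or recorded at 44.1 kHz, either of which could plausibly explain a poor transcript.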
UPDATE:
I used the configuration below to create the WAV file.
After that, I used Audacity to export my .wav file as WAV (Microsoft) signed 16-bit PCM.
Also, I am getting different output from the command line and from my code, even though I have added the lm.binary file and trie in my code.
I don't know how to generate the .wav file from my Python code, so I opted for this long process.
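As a rough illustration of doing the conversion in Python instead of by hand, the sketch below reads a 16-bit PCM WAV file, averages the channels to mono, resamples to 16 kHz, and writes the result. It uses only the standard library. All function names are hypothetical, the 16 kHz target is an assumption about the pretrained model, and the resampler is a naive linear interpolation with no anti-aliasing filter, so a dedicated tool such as SoX or ffmpeg would likely give cleaner audio.

```python
import array
import sys
import wave

TARGET_RATE = 16000  # assumed sample rate of the pretrained model


def to_mono(samples, channels):
    """Average interleaved int16 channels into a single channel."""
    if channels == 1:
        return list(samples)
    return [
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples) - channels + 1, channels)
    ]


def resample_linear(samples, src_rate, dst_rate):
    """Naive linear-interpolation resampler (no anti-aliasing filter)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, int(len(samples) * dst_rate / src_rate))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source signal
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(int(round(samples[lo] * (1 - frac) + samples[hi] * frac)))
    return out


def convert_wav(src_path, dst_path, dst_rate=TARGET_RATE):
    """Write a mono, 16-bit copy of src_path at dst_rate."""
    with wave.open(src_path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM input")
        channels = wf.getnchannels()
        src_rate = wf.getframerate()
        samples = array.array("h", wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()                 # WAV sample data is little-endian
    mono = resample_linear(to_mono(samples, channels), src_rate, dst_rate)
    out = array.array("h", mono)
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(dst_rate)
        wf.writeframes(out.tobytes())
```

For example, `convert_wav("nz.wav", "nz_16k_mono.wav")` would produce a file (the output name is only illustrative) that could then be loaded the same way as in the code above.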
Below is my output:
original: New Zealand run chase off to a solid start
command line: the news and ranges offers anitar
Code: and he also a
I am also attaching my audio file in case it helps: nz.zip (272.3 KB)
Command used to run the same file through the command line:
deepspeech --model output_graph.pbmm --lm lm.binary --trie trie --audio nz.wav
*Note: I am using Windows 8.