Getting probability with intermediateDecode in Python

Dear DeepSpeech-Team,
I am currently in the process of finishing my application where DeepSpeech is a central component and I am already astonished by the results. The versions I am using are Python 3.6.8 and DeepSpeech 0.6.1.

One thing I haven’t managed to do is obtain a list or tuple of the candidates for the most recent word that has been spoken/decoded with intermediateDecode. I read the documentation and searched this forum, but I am not sure whether or how I can do this in combination with model.intermediateDecode (not model.stt or model.sttWithMetadata).

An example: I say “two”. DeepSpeech will consider “two”, “to” or “too”, finally decide on one, and I get back a string from intermediateDecode. How can I obtain the other words/candidates?

My code snippet for feeding/resampling/decoding (working):

import deepspeech
import numpy as np
import pyaudio
from scipy import signal

model = deepspeech.Model(MODEL_FILE_PATH, BEAM_WIDTH)
model.enableDecoderWithLM(LM_FILE_PATH, TRIE_FILE_PATH, LM_ALPHA, LM_BETA)
context = model.createStream()

def process_audio(in_data, frame_count, time_info, status):
    # PyAudio stream callback: resample each chunk to 16 kHz, feed it, decode.
    global text_so_far
    global TextCut
    data16 = np.frombuffer(in_data, dtype=np.int16)
    resample_size = int(len(data16) / RATE * 16000)  # RATE is 44.1 kHz in this case
    resample = signal.resample(data16, resample_size)  # using scipy
    # Clip before casting so resampling overshoot cannot wrap around in int16.
    resample16 = np.clip(resample, -32768, 32767).astype(np.int16)  # numpy
    model.feedAudioContent(context, resample16)
    text = model.intermediateDecode(context)
    if text != text_so_far:
        text_so_far = text
        TextCut = cut_strings(text_so_far)

    return (in_data, pyaudio.paContinue)

Thank you very much in advance!
Regards,
Arjaan Auinger

I added DS_IntermediateDecodeWithMetadata only recently on master, so you’ll have to wait for 0.7.0.
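
For anyone planning ahead, here is a rough sketch of how the candidates could be pulled out once that call is available. It rests on assumptions that this thread does not confirm: that the Python binding exposes the new function on the stream as something like intermediateDecodeWithMetadata(num_results), and that it returns an object with a `transcripts` list whose entries carry `tokens` (one per character, each with a `text` field) and a `confidence`. The helper `last_word_candidates` and the stand-in classes are hypothetical and only there so the sketch runs without DeepSpeech installed. Note that these would be whole-transcript alternatives from the beam, not per-word probabilities.

```python
from collections import namedtuple


def last_word_candidates(metadata):
    """Return (last_word, confidence) pairs, one per candidate transcript.

    `metadata` is assumed to look like the result of the metadata decode:
    metadata.transcripts -> items with .tokens (each with .text) and .confidence.
    """
    candidates = []
    for transcript in metadata.transcripts:
        # Tokens are assumed to be single characters; join them into a string.
        text = "".join(token.text for token in transcript.tokens)
        words = text.split()
        if words:
            candidates.append((words[-1], transcript.confidence))
    return candidates


# Hypothetical stand-ins that mimic the assumed metadata layout.
Token = namedtuple("Token", "text")
Transcript = namedtuple("Transcript", "tokens confidence")
Metadata = namedtuple("Metadata", "transcripts")


def fake_transcript(text, confidence):
    """Build a stand-in transcript with one token per character."""
    return Transcript([Token(ch) for ch in text], confidence)


# Made-up confidence values, for illustration only.
fake = Metadata([
    fake_transcript("turn to", -4.1),
    fake_transcript("turn two", -5.3),
    fake_transcript("turn too", -6.0),
])

print(last_word_candidates(fake))
```

In the callback above, the string returned by model.intermediateDecode(context) would be replaced by the metadata object from the new call, which is then passed to the helper. The exact call name and arguments should be checked against the 0.7.0 documentation once it is released.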

Hello Reuben,
thanks for the info! Any idea when 0.7.0 will be released?