Storing Output From Inference



I am currently running DeepSpeech in a virtual environment on Linux and can get decently accurate transcriptions of my sound files (some technical words aren’t being picked up). However, the only way I currently review the results is by reading what gets printed to the command line. Is there a way to output the results of:

deepspeech models/output_graph.pb my_audio_file.wav models/alphabet.txt models/lm.binary models/trie

to a text file? Ideally, I’d like to append the outputs from several short wav files, in chronological order, so the result is one long transcript of the larger wav file the short ones came from.

I figured this would have already been answered, but I can’t find an answer on the Mozilla forums or in the feature requests/issues on GitHub.


(Reuben Morais) #2

Just redirect output to a file:

deepspeech models/output_graph.pb my_audio_file.wav models/alphabet.txt models/lm.binary models/trie >> text_file.txt

This should append the transcriptions to the end of text_file.txt.
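
If you would rather drive this from Python than from the shell, here is a minimal sketch that runs the same command once per wav file and appends each result to the text file. The model paths are copied from the command above. The function names (`run_deepspeech`, `append_transcripts`) are made up for illustration, and it assumes your file names sort into chronological order (for example `part_001.wav`, `part_002.wav`). Note that it still starts the `deepspeech` command, and so reloads the model, for every file.

```python
import subprocess
from pathlib import Path

# Model file paths copied from the command in this thread; adjust to your setup.
MODEL_ARGS = ["models/alphabet.txt", "models/lm.binary", "models/trie"]


def run_deepspeech(wav_path):
    """Run the deepspeech command on one file and return what it prints to stdout."""
    cmd = ["deepspeech", "models/output_graph.pb", str(wav_path)] + MODEL_ARGS
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, universal_newlines=True, check=True
    )
    return result.stdout.strip()


def append_transcripts(wav_dir, out_file, run=run_deepspeech):
    """Append one transcript line per wav file, in sorted file name order.

    `run` is a parameter so a stand-in can be passed for testing
    without DeepSpeech installed.
    """
    wav_paths = sorted(Path(wav_dir).glob("*.wav"))
    # Mode "a" appends, which mirrors what ">>" does in the shell.
    with open(out_file, "a") as out:
        for wav_path in wav_paths:
            out.write(run(wav_path) + "\n")
    return wav_paths
```

Calling `append_transcripts(".", "text_file.txt")` from the folder that holds the short wav files should then build up the combined transcript, provided the `models/` paths resolve from that folder.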

(Yv) #3

Although that’s possible, a batch mode that processes several wavs at once, rather than initializing the model for each wav separately, would be useful.

Correct me if I am wrong, but I suspect that currently the easiest way to achieve that without recompiling deepspeech from source is to fiddle with the installed Python script deepspeech/ and wrap the inference part in a loop.

It can be found by running
pip install --verbose deepspeech


I don’t have my Linux machine with me, but I presume you are speaking of DeepSpeech/native_client/python/ in the GitHub repository (

I’m much better at Python than I am at command line commands, so I believe I can work through that and have Python search the appropriate folder for all the wav files to run inference on. One thing I know offhand is that getting Python to recognize files in a different folder is a bit wonky, so can I run a command like:

python path_to_client/ path_to_pretrained_model/output_graph.pb path_to_pretrained_model/alphabet.txt path_to_pretrained_model/lm.binary path_to_pretrained_model/trie

while currently in the folder that contains all my wav files, then have the script grab all the wav files in the folder and run inference on them?


Ah, I knew it’d be stupidly simple. Thanks!

(Yv) #6

I think it should work in principle. Getting a list of files in a given directory and passing them to the wav reader + deepspeech (basically what lines 66-73 in the file you have found do) is not specific to deepspeech, and you should be able to find many examples on Python forums.
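
As a rough sketch of that loop, using only the standard library: it lists the wav files in a folder, reads each one, and hands the samples to a function that wraps a model you have already loaded once. The `transcribe` argument is a hypothetical placeholder for whatever the inference call looks like in your installed client script; it is not part of DeepSpeech, and the real call may want the samples in a different container (a NumPy array, for instance). It also assumes 16-bit PCM wav files whose names sort chronologically.

```python
import sys
import wave
from array import array
from pathlib import Path


def read_wav(wav_path):
    """Return (samples, sample_rate) for a 16-bit PCM wav file."""
    with wave.open(str(wav_path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio: %s" % wav_path)
        sample_rate = wav.getframerate()
        samples = array("h")  # signed 16-bit integers
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # wav data is little-endian
    return samples, sample_rate


def transcribe_folder(wav_dir, transcribe):
    """Call transcribe(samples, sample_rate) once per wav file and join the text.

    `transcribe` is a hypothetical callable standing in for the inference
    call of a model that was loaded once, before this loop starts.
    """
    pieces = []
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        samples, sample_rate = read_wav(wav_path)
        pieces.append(transcribe(samples, sample_rate))
    return " ".join(pieces)
```

Because the model is created outside the loop, it is initialized once rather than once per file, which is the point of the batch approach. Passing `"."` as `wav_dir` picks up the files in the folder you run the script from.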