Storing Output From Inference

DJ-Hay · January 15, 2018, 4:54pm

Hello,

I am currently running DeepSpeech in a virtual environment on Linux and get decently accurate transcriptions of my sound files (some technical words aren’t being picked up). However, my only way of reviewing the results so far is to read what gets printed to the command line. Is there a way to output the results of:

deepspeech models/output_graph.pb my_audio_file.wav models/alphabet.txt models/lm.binary models/trie

to a text file? Ideally, I’d like to collect the outputs from several short wav files, in chronological order, so that the result is one long transcript of the larger wav file the short ones were cut from.

I figured this would have already been answered, but I can’t find an answer within the Mozilla forums or the feature requests/issues on github.

Thanks!

reuben · January 15, 2018, 7:44pm

Just redirect output to a file:

deepspeech models/output_graph.pb my_audio_file.wav models/alphabet.txt models/lm.binary models/trie >> text_file.txt

Because >> appends rather than overwrites, each run should add its transcription to the end of text_file.txt.
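
For several short wav files, the same idea can be scripted. The following is only a rough sketch, not part of DeepSpeech: it calls the command from this thread once per file and appends whatever the command prints to stdout to one text file. The function names are made up for illustration, the model paths are copied from the command above, and it assumes the file names sort in chronological order (for example part_001.wav, part_002.wav).

```python
import subprocess
from pathlib import Path

# Model file locations copied from the command in this thread.
MODEL_ARGS_BEFORE = ["models/output_graph.pb"]
MODEL_ARGS_AFTER = ["models/alphabet.txt", "models/lm.binary", "models/trie"]


def build_command(wav_path):
    """Return the deepspeech command line for one wav file."""
    return ["deepspeech", *MODEL_ARGS_BEFORE, str(wav_path), *MODEL_ARGS_AFTER]


def run_cli(wav_path):
    """Run the command once and return what it printed to stdout."""
    result = subprocess.run(
        build_command(wav_path),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise if the command exits with an error
    )
    return result.stdout.strip()


def append_transcripts(wav_dir, out_path, transcribe=run_cli):
    """Append one transcript line per wav file, in file-name order."""
    wav_paths = sorted(Path(wav_dir).glob("*.wav"))
    # Mode "a" appends, like >> in the shell.
    with open(out_path, "a", encoding="utf-8") as out:
        for wav_path in wav_paths:
            out.write(transcribe(wav_path) + "\n")
    return wav_paths


# Example call (assumes the deepspeech command is on PATH):
# append_transcripts(".", "text_file.txt")
```

Note that this still starts the program, and therefore loads the model, once per wav file.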

yv001 · January 15, 2018, 8:15pm

Although that’s possible, a batch mode that processes several wavs at once, rather than initializing the model separately for each wav, would be useful.

Correct me if I am wrong, but I suspect that currently the easiest way to achieve that without recompiling DeepSpeech from source is to modify the installed Python script deepspeech/client.py and wrap the inference part in a loop.

The location of the installed script can be found by running
pip install --verbose deepspeech

DJ-Hay · January 15, 2018, 8:36pm

I don’t have my Linux machine with me, but I presume you mean DeepSpeech/native_client/python/client.py in the GitHub repository (https://github.com/mozilla/DeepSpeech/blob/master/native_client/python/client.py).

I’m much better at Python than at the command line, so I believe I can work through that and have Python search the appropriate folder for all wav files to run inference on. One thing I know offhand is that getting Python to recognize files in a different folder is a bit wonky, so can I run a command like:

python path_to_client/modified_client.py path_to_pretrained_model/output_graph.pb path_to_pretrained_model/alphabet.txt path_to_pretrained_model/lm.binary path_to_pretrained_model/trie

while in the folder that contains all my wav files, and have modified_client.py grab every wav file in that folder and run inference on it?

DJ-Hay · January 15, 2018, 8:37pm

Ah, I knew it’d be stupidly simple. Thanks!

yv001 · January 15, 2018, 10:50pm

I think it should work in principle. Getting a list of files in a given directory and passing them to the wav reader and DeepSpeech (basically what lines 66-73 of the file you found do) is not specific to DeepSpeech, and you should be able to find many examples on Python forums.
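
As a rough illustration of the file-listing part only, here is a minimal sketch using the standard glob module. The names list_wavs, transcribe_folder and infer are hypothetical; infer stands in for the read-the-wav-and-run-the-model steps in client.py and is not a real DeepSpeech call. A relative folder such as "." is resolved against the directory the script is started from, not the directory the script lives in, so running the script from inside the wav folder should pick up the files there.

```python
import glob
import os


def list_wavs(folder="."):
    """Return the wav files in `folder`, sorted by file name.

    "." means the current working directory, i.e. wherever the
    script was started from.
    """
    pattern = os.path.join(folder, "*.wav")
    return sorted(glob.glob(pattern))


def transcribe_folder(folder, infer):
    """Call `infer` on every wav file and return (path, text) pairs.

    `infer` is a hypothetical placeholder for the inference code in
    client.py; the model would be loaded once, before this loop.
    """
    return [(path, infer(path)) for path in list_wavs(folder)]
```

Passing the folder as an argument instead of relying on "." would also let the script be run from anywhere.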