I am currently running DeepSpeech in a virtual environment on Linux and can get decently accurate transcriptions of my sound files (though some technical words aren’t being picked up). However, my current analysis is just reading the command-line print statements. Is there a way to output the results of:
to a text file? Ideally, I’d like to append the outputs from a series of chronologically ordered short wav files, so the result would be one long transcript for the larger wav file they were split from.
I figured this would have already been answered, but I can’t find an answer in the Mozilla forums or among the feature requests/issues on GitHub.
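Shell redirection should do it, since the client prints the transcript to stdout. A minimal sketch; the model path and flag names here are placeholders and differ between DeepSpeech releases:

```sh
# '>' overwrites the file, '>>' appends, so successive runs build one transcript
deepspeech --model models/output_graph.pbmm --audio part1.wav > transcript.txt
deepspeech --model models/output_graph.pbmm --audio part2.wav >> transcript.txt
```

The model-loading messages go to stderr, so they stay on the terminal instead of ending up in the file.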
Although that’s possible, a batch mode that processes several wavs at once, rather than initializing the model separately for each wav, would be useful.
Correct me if I am wrong, but I suspect that currently the easiest way to achieve that without recompiling DeepSpeech from source is to fiddle with the installed Python script deepspeech/client.py and wrap the inference part in a loop.
Its location can be found by running `pip install --verbose deepspeech`.
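To sketch what that loop might look like (this assumes the 0.7-era Python API, where Model takes just the model path; older releases need extra constructor arguments, and the model path and usage below are made up):

```python
import sys
import wave

import numpy as np
from deepspeech import Model

# Load the model once, outside the loop; this is the expensive step
# that the stock client repeats on every invocation.
ds = Model('models/output_graph.pbmm')

for wav_path in sys.argv[1:]:
    with wave.open(wav_path, 'rb') as fin:
        # Assumes 16-bit, 16 kHz, mono PCM, which is what DeepSpeech expects.
        audio = np.frombuffer(fin.readframes(fin.getnframes()), np.int16)
    # One transcript line per input file, printed to stdout, so the shell
    # redirection shown above still works.
    print(ds.stt(audio))
```

Called as, say, `python batch_client.py part1.wav part2.wav >> transcript.txt` (a hypothetical script name), it would append everything to one file.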
I’m much better at Python than I am at command-line commands, so I believe I can work through that and have Python search the appropriate directory for all the wav files to run inference on. One thing I know offhand is that getting Python to recognize files in a different folder can be a bit wonky, so can I run a command like:
while currently in the folder that contains all my wav files, and then have modified_client.py grab all the wav files in that folder and run inference on them?
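In Python terms, I’m imagining the file-gathering step would be something like this (just a sketch):

```python
import glob

# Gather every wav in the current working directory; sorting by name keeps
# the chronological order as long as the chunks are named that way.
wav_files = sorted(glob.glob('*.wav'))
print(wav_files)
```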
I think it should work in principle. Getting a list of files in a given directory and passing them to the wav reader + deepspeech (basically what lines 66-73 in the file you have found do) is not specific to DeepSpeech, and you should be able to find many examples on Python forums.
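The wav-reading part is also short. Roughly, from memory of the client (the real script additionally resamples input that is not 16 kHz):

```python
import sys
import wave

import numpy as np

def read_wav(path, expected_rate=16000):
    """Read a wav file into the int16 buffer that DeepSpeech's stt() expects."""
    with wave.open(path, 'rb') as fin:
        if fin.getframerate() != expected_rate:
            # The bundled client resamples here; this sketch only warns.
            print('Warning: %s is %d Hz, not %d Hz'
                  % (path, fin.getframerate(), expected_rate), file=sys.stderr)
        return np.frombuffer(fin.readframes(fin.getnframes()), np.int16)
```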