I tested the TIMIT dataset with the pre-trained model and got a WER of 26%


(jackhuang) #1

Does anyone get the same result as me?


(Lissyx) #2

Can you elaborate a little bit on your process?


(Keelan Evanini) #3

I ran the following command using the pre-trained model for each of the 1680 files in the TIMIT test partition:

$ python DeepSpeech/native_client/python/client.py DeepSpeech/models/output_graph.pb <audio_file> DeepSpeech/models/alphabet.txt

This resulted in a micro-averaged WER of 31.7% (3,817 substitutions, 362 insertions, and 424 deletions for 14,518 reference words).
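For anyone wanting to reproduce this kind of number on their own transcripts: micro-averaged WER is total errors (substitutions + insertions + deletions) divided by the total number of reference words across all files, not the mean of per-file WERs. Below is a minimal sketch of that arithmetic using a word-level Levenshtein alignment; it assumes you have already collected (reference, hypothesis) text pairs from the client, and the function names are just for illustration.

```python
# Sketch: micro-averaged WER over a set of (reference, hypothesis) pairs.
# Only the WER arithmetic is shown; producing the hypotheses (e.g. with the
# DeepSpeech client) is assumed to have happened already.

def edit_counts(ref_words, hyp_words):
    """Word-level Levenshtein alignment returning (subs, ins, dels)."""
    m, n = len(ref_words), len(hyp_words)
    # dp[i][j] = (cost, subs, ins, dels) for ref_words[:i] vs hyp_words[:j]
    dp = [[None] * (n + 1) for _ in range(m + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, m + 1):
        dp[i][0] = (i, 0, 0, i)   # empty hypothesis: all deletions
    for j in range(1, n + 1):
        dp[0][j] = (j, 0, j, 0)   # empty reference: all insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            c, s, ins, d = dp[i - 1][j - 1]
            best = (c + sub, s + sub, ins, d)
            c, s, ins, d = dp[i][j - 1]
            if c + 1 < best[0]:
                best = (c + 1, s, ins + 1, d)
            c, s, ins, d = dp[i - 1][j]
            if c + 1 < best[0]:
                best = (c + 1, s, ins, d + 1)
            dp[i][j] = best
    return dp[m][n][1:]

def micro_wer(pairs):
    """Micro-averaged WER: total errors over total reference words."""
    subs = ins = dels = n_ref = 0
    for ref, hyp in pairs:
        r, h = ref.split(), hyp.split()
        s, i, d = edit_counts(r, h)
        subs, ins, dels = subs + s, ins + i, dels + d
        n_ref += len(r)
    return (subs + ins + dels) / n_ref
```

For example, `micro_wer([("a b c d", "a x c")])` aligns one substitution and one deletion against four reference words, giving a WER of 0.5.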
