Output of DeepSpeech test epoch and DeepSpeech native client application are not the same

I am using DeepSpeech version 0.6.1 and a model trained with it. The first part below shows the result of the test epoch that runs at the end of training, and the second shows the result of the native client application. The language model and other parameters are the same in both cases.

File 1:

WER: 1.000000, CER: 0.481481, loss: 27.261322

  • wav: file:///home/sranjeet/Documents/Speech_DataSet/internal_test_data/en-US/pcm_general_ckohls_american_accent/noagc/1583527502552_share_gas_stations_nearby.wav
  • src: “show me gas stations nearby”
  • res: “they gas stations and your by”

deepspeech --model /home/rsandhu/sns/sns-app-android/app/src/main/assets/asr/en-US/output_graph.pbmm --lm /home/rsandhu/sns/sns-app-android/app/src/main/assets/asr/en-US/lm.binary --trie /home/rsandhu/sns/sns-app-android/app/src/main/assets/asr/en-US/trie --audio /home/rsandhu/Downloads/en-US_testing/pcm_general_ckohls_american_accent/noagc/1583527502552_share_gas_stations_nearby.wav

inference: surely guessed stations in your by
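The WER/CER figures above come from edit distance against the reference transcript. A minimal CER sketch (plain character-level Levenshtein; not necessarily the exact implementation DeepSpeech uses, so the numbers may differ slightly):

```python
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def cer(ref, hyp):
    # Character error rate: edits normalized by reference length.
    return edit_distance(ref, hyp) / len(ref)
```

For example, `cer("turn off the ac", "those")` gives a high error rate because almost every character must be edited.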

File 2:

WER: 1.000000, CER: 0.800000, loss: 27.743788

  • wav: file:///home/sranjeet/Documents/Speech_DataSet/internal_test_data/en-US/pcm_general_ckohls_american_accent/noagc/1583527528416_tenacity_see.wav
  • src: “turn off the ac”
  • res: “those”

deepspeech --model /home/rsandhu/sns/sns-app-android/app/src/main/assets/asr/en-US/output_graph.pbmm --lm /home/rsandhu/sns/sns-app-android/app/src/main/assets/asr/en-US/lm.binary --trie /home/rsandhu/sns/sns-app-android/app/src/main/assets/asr/en-US/trie --audio /home/rsandhu/Downloads/en-US_testing/pcm_general_ckohls_american_accent/noagc/1583527528416_tenacity_see.wav

inference: i also

Are you sure about that? lm_alpha, lm_beta and beam width were handled separately in that version.

@lissyx I re-ran the test by specifying beam_width 1024, lm_alpha 0.75, lm_beta 1.85 and the results are still the same.
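For what it's worth, on the 0.6.x client those decoder parameters can be passed as explicit flags. A sketch that builds such an invocation, so both runs are guaranteed to use identical settings (flag names assumed from the 0.6.1 command-line client; double-check against `deepspeech --help`):

```python
# Build a deepspeech 0.6.1 client command with explicit decoder parameters,
# so the test epoch and the native client are compared on equal footing.
def build_cmd(model, lm, trie, audio,
              beam_width=1024, lm_alpha=0.75, lm_beta=1.85):
    return [
        "deepspeech",
        "--model", model,
        "--lm", lm,
        "--trie", trie,
        "--beam_width", str(beam_width),
        "--lm_alpha", str(lm_alpha),
        "--lm_beta", str(lm_beta),
        "--audio", audio,
    ]
```

The returned list can be handed to `subprocess.run` directly, which avoids shell-quoting mistakes with long asset paths.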

Well, sorry, but there’s nothing I can think of …


I agree with lissyx: if the parameters were the same, you should get the same result. Double-check that everything really is the same. To me it looks like you changed either the language model or the pbmm file.
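One quick way to rule that out is to compare checksums of the pbmm, lm.binary and trie files used in each run; a minimal sketch:

```python
import hashlib

def sha256sum(path, bufsize=1 << 20):
    # Hash a file in chunks so large model files don't need to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()
```

If `sha256sum` differs for any of the three files between the training box and the device assets, the two runs were not using the same model.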

You could check whether the WAV files are in a format other than 16 kHz PCM and whether the downsampling differs, but usually the same conversion is applied.
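A minimal sketch of such a format check using Python's standard `wave` module:

```python
import wave

# Report the basic PCM parameters of a WAV file so the training-set audio
# and the file fed to the native client can be compared directly.
def wav_format(path):
    with wave.open(path, "rb") as w:
        return {
            "sample_rate_hz": w.getframerate(),
            "channels": w.getnchannels(),
            "sample_width_bytes": w.getsampwidth(),
        }
```

For DeepSpeech 0.6.x you would expect 16000 Hz, mono, 2-byte (16-bit) samples; anything else means a resampling step is happening somewhere.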

Also, you removed the training command line, the training log, and the stdout/stderr output of inference, so we can't cross-check what you are doing …