For the comparison, I used the release acoustic model and the release language model files (lm and trie for 0.6.1, scorer for 0.9.3) for both versions.
Following is the command that I used for 0.9.3:
./deepspeech --model /storage/emulated/10/Android/data/com.visteon.sns.ww.app/files/sns/ww/output_graph.tflite --scorer /data/local/tmp/native_client_093.arm64.cpu.android/kenlm.scorer --beam_width 32 --lm_alpha 1.0545920026574804 --lm_beta 3.2744955478757265 -t --audio /data/local/tmp/sns_ww_cli_test_data/alexa/en-JP_155427693_alexa_2019-09-18T12:57:03.017Z.wav
Following is the command that I used for 0.6.1:
./deepspeech --model /data/local/tmp/native_client_061.arm64.cpu.android/output_graph.tflite --lm /data/local/tmp/native_client_061.arm64.cpu.android/lm.binary --trie /data/local/tmp/native_client_061.arm64.cpu.android/trie --lm_alpha 1.0545920026574804 --lm_beta 3.2744955478757265 --beam_width 32 -t --audio /data/local/tmp/sns_ww_cli_test_data/alexa/en-JP_155427693_alexa_2019-09-18T12:57:03.017Z.wav
I then repeated the test without an LM/scorer for both versions. The inference times for the same files, in seconds, are recorded in the table below.
| Audio file | 0.6.1 native client, release model, with LM (s) | 0.6.1 native client, release model, no LM (s) | 0.9.3 native client, release model, with LM (s) | 0.9.3 native client, release model, no scorer (s) |
|---|---|---|---|---|
| en-JP_155427693_alexa_2019-09-18T12:57:03.017Z.wav | 1.50574 | 2.61644 | 6.24147 | 9.66468 |
| global-GLOBAL_149392486_hey_siri_2020-04-22T043640.343Z.wav | 1.60139 | 2.76046 | 6.3522 | 9.94327 |
| global-GLOBAL_28819796_ok_google_2020-04-23T054509.144Z.wav | 2.51439 | 1.70396 | 4.03806 | 6.30115 |
| global-GLOBAL_221614382_visteon_2020-04-23T064454.172Z.wav | 2.77821 | 1.19478 | 2.84475 | 3.59143 |
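To make the regression easier to read at a glance, here is a small sketch that computes the 0.9.3 / 0.6.1 slowdown per file from the numbers in the table above. The values are copied as listed, and the column order in the tuples is my reading of the table headers; the names `TIMES`, `slowdown` and `RATIOS` are just illustrative.

```python
# Inference times in seconds, copied from the table above.
# Assumed column order per file:
#   (0.6.1 with LM, 0.6.1 no LM, 0.9.3 with LM/scorer, 0.9.3 no scorer)
TIMES = {
    "en-JP_155427693_alexa_2019-09-18T12:57:03.017Z.wav":
        (1.50574, 2.61644, 6.24147, 9.66468),
    "global-GLOBAL_149392486_hey_siri_2020-04-22T043640.343Z.wav":
        (1.60139, 2.76046, 6.3522, 9.94327),
    "global-GLOBAL_28819796_ok_google_2020-04-23T054509.144Z.wav":
        (2.51439, 1.70396, 4.03806, 6.30115),
    "global-GLOBAL_221614382_visteon_2020-04-23T064454.172Z.wav":
        (2.77821, 1.19478, 2.84475, 3.59143),
}


def slowdown(times):
    """Return (with-LM ratio, no-LM ratio) of 0.9.3 time over 0.6.1 time."""
    lm_061, nolm_061, lm_093, nolm_093 = times
    return lm_093 / lm_061, nolm_093 / nolm_061


# Per-file slowdown factors; a value above 1.0 means 0.9.3 was slower.
RATIOS = {name: slowdown(t) for name, t in TIMES.items()}

if __name__ == "__main__":
    for name, (with_lm, no_lm) in RATIOS.items():
        print(f"{name}: {with_lm:.2f}x with LM/scorer, {no_lm:.2f}x without")
```

With these numbers every ratio comes out above 1.0, i.e. 0.9.3 is slower than 0.6.1 for all four files in both configurations.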