I wanted to check how fast evaluate_tflite is, and it turns out to be a couple of orders of magnitude slower than evaluate. But what surprised me the most was the much worse inference quality: with evaluate_tflite I got 40% WER, and with evaluate 20% WER. Is this a known issue?
I don't see any place in the evaluate_tflite script, though, where I could specify the model; the model variable is not used in tflite_worker.
So my guess is that maybe the beam width is much smaller than what I use with evaluate.py? (There I have the default beam_width of 1024.)
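For reference, with the 0.7+ Python bindings the beam width would have to be set explicitly on the model, something like the sketch below (the model path is a placeholder, and the 1024 is just the value I pass to evaluate.py, not what the script actually does):

```python
# Sketch of the DeepSpeech 0.7+ Python API; the model path is a placeholder.
from deepspeech import Model

ds = Model("output_graph.tflite")
ds.setBeamWidth(1024)  # the default beam_width I use with evaluate.py
```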
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
Please be specific about what you tested. evaluate_tflite uses the Python bindings and spawns multiple processes; if you compare that to GPU-backed, big-batch evaluation, it's not surprising that it is slower.
Sure it is. But it is not used in the tflite_worker function.
I'll double-check it, because if your tests indicate similar performance, it must be a mistake on my side.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
Sorry, but your message was very unclear. That looks like a bug you could send a fix for; it's easy to fix: we lack a call to enable the external scorer.
Can you file a bug at least, and make a PR if you can? Since you are working on that, you can verify whether it works or not.
If we are missing the scorer, it may very well explain your discrepancy.
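For completeness, here is a hedged sketch of what the missing call would look like in a bindings-based worker. This is not the actual evaluate_tflite.py code: the function signature, the read_wav helper, and the alpha/beta values are placeholders, only enableExternalScorer/setScorerAlphaBeta/stt are the real 0.7+ API calls.

```python
# Sketch only, not the actual evaluate_tflite.py code.
import wave
import numpy as np
from deepspeech import Model

def read_wav(path):
    # Load 16-bit PCM samples as the int16 buffer that Model.stt() expects.
    with wave.open(path, "rb") as w:
        return np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

def tflite_worker(model_path, scorer_path, wav_paths):
    ds = Model(model_path)
    # The call referred to above: without it, decoding runs without the
    # external scorer, which would explain the WER gap.
    ds.enableExternalScorer(scorer_path)
    ds.setScorerAlphaBeta(0.93, 1.18)  # placeholder alpha/beta values
    return [ds.stt(read_wav(p)) for p in wav_paths]
```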