How can I use intel-tensorflow for DeepSpeech inference?

Those are very reasonable suggestions. I’m planning to:

  1. Test streaming
  2. Test (in a hacky way) TF using oneDNN for inference, just to see if there is any improvement (see the benchmark sketch after this list)
  3. If there is an improvement, then perhaps build DS with TF and oneDNN. Maybe I could share a patch (changes to BUILD) with you if the results are good.
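For step (2), here is a minimal sketch of the kind of isolated-op timing that could serve as a first check. It assumes a TensorFlow 2.x Python wheel for brevity (DeepSpeech builds against TF 1.x, so adapt accordingly); the matmul size and repeat count are arbitrary stand-ins for the heavy layers of the acoustic model, and a proper comparison would run the DeepSpeech client itself. The idea is simply to run the same script once under the stock `tensorflow` wheel and once under `intel-tensorflow`, and compare the numbers.

```python
# Rough single-op benchmark: run under both a stock TensorFlow wheel and
# intel-tensorflow (oneDNN) and compare the timings. The matmul size is an
# arbitrary stand-in for the heavy layers in the acoustic model.
import time
import numpy as np
import tensorflow as tf

def bench(n=2048, repeats=20):
    a = tf.constant(np.random.rand(n, n).astype(np.float32))
    b = tf.constant(np.random.rand(n, n).astype(np.float32))

    # Warm-up run so one-time graph/kernel setup is not measured.
    _ = tf.matmul(a, b)

    start = time.perf_counter()
    for _ in range(repeats):
        _ = tf.matmul(a, b).numpy()  # .numpy() forces the computation to complete
    elapsed = time.perf_counter() - start
    print(f"{repeats} matmuls of {n}x{n}: {elapsed:.3f}s "
          f"({elapsed / repeats * 1000:.1f} ms each)")

if __name__ == "__main__":
    print("TensorFlow", tf.__version__)
    bench()
```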

That would be welcome, but adding that kind of complexity will have to be balanced against maintainability, as well as the impact on deployment, which is why you need to go through (2).

Again, building TensorFlow with AVX2 enabled, for example, can give a nice speedup, but:

  • it’s not constant across all Intel CPUs; we saw significant variations from one CPU to another
  • it generates only AVX2-enabled code, and thus will completely fail to run on a CPU without AVX2; we had AVX2 enabled at first, but too many people were blocked by the lack of it on their (otherwise powerful enough) CPU, so we had to go back and stick to AVX only. A runtime check like the sketch below can detect this up front.
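As a side note, a deploy or packaging script can check the host CPU before picking an AVX2-only binary, rather than letting it crash with an illegal-instruction error. A minimal Linux-only sketch, assuming /proc/cpuinfo is available (the helper name is just illustrative):

```python
# Minimal sketch (Linux-only, reads /proc/cpuinfo) of the kind of check a
# deploy script could do before shipping an AVX2-only binary: refuse to use
# it if the host CPU does not advertise the avx2 flag.
def cpu_flags():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
print("AVX :", "avx" in flags)
print("AVX2:", "avx2" in flags)
```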

You can build TensorFlow for AVX and still benefit from AVX2 through oneDNN. This is because oneDNN generates its kernels with a JIT at inference time, and only when the CPU it is running on actually supports AVX2 (or other newer ISAs). So even if TF itself is built for AVX only, with oneDNN enabled the oneDNN implementations will still use AVX2 on CPUs that are capable of it.
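One way to verify this on a given machine is oneDNN’s own verbose logging. The sketch below assumes a oneDNN-enabled TensorFlow build (e.g. the intel-tensorflow wheel); the DNNL_VERBOSE / MKLDNN_VERBOSE environment variables are read when the library loads, so they have to be set before importing tensorflow. On an AVX2-capable CPU, the per-primitive lines printed to stderr should name an avx2 (or newer) implementation even if TF itself was compiled for AVX only.

```python
# Confirm which ISA oneDNN's JIT kernels dispatch to at runtime via oneDNN's
# verbose logging. The variables must be set before TensorFlow (and thus
# oneDNN) is loaded.
import os
os.environ.setdefault("DNNL_VERBOSE", "1")     # newer oneDNN releases
os.environ.setdefault("MKLDNN_VERBOSE", "1")   # older MKL-DNN releases

import numpy as np
import tensorflow as tf

x = tf.constant(np.random.rand(1, 64, 64, 8).astype(np.float32))
w = tf.constant(np.random.rand(3, 3, 8, 16).astype(np.float32))
# Convolution goes through oneDNN in oneDNN-enabled builds, so running one
# should emit verbose lines on stderr naming the kernel, e.g. "jit:avx2".
y = tf.nn.conv2d(x, w, strides=1, padding="SAME")
print(y.shape)
```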

I’m just explaining the current status here.

Thanks for the explanations. When I have something to share, I will come back to you.
