I experience very long inference times on my desktop. For example, it takes about 200 ms for an .ogg file of less than 1 s in duration with no speech content, and for a regular 20 s .ogg file it takes at least 6 seconds.
I’m using the prebuilt model in Rust with the deepspeech-rs binding.
I want to use my GPU. I’m already using the native_client.amd64.cuda.linux build. What else can I do?
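
For reference, this is roughly how I measure the inference time. It's a minimal sketch, assuming the standard `deepspeech` 0.9 crate API; the model path and the silent audio buffer are placeholders standing in for my real decoding code:

```rust
use std::path::Path;
use std::time::Instant;

use deepspeech::Model;

fn main() {
    // Placeholder path to the prebuilt acoustic model.
    let model_path = Path::new("deepspeech-0.9.3-models.pbmm");

    // Model is loaded once up front; only the inference call is timed below.
    let mut model = Model::load_from_files(model_path).expect("failed to load model");

    // speech_to_text expects 16 kHz mono 16-bit PCM samples; in my real code
    // these come from decoding the .ogg file (decoding/resampling omitted here).
    let audio_buf: Vec<i16> = vec![0; 16_000]; // ~1 s of silence as a stand-in

    let start = Instant::now();
    let text = model.speech_to_text(&audio_buf).expect("inference failed");
    println!("transcript: {}", text);
    println!("inference took {:?}", start.elapsed());
}
```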