Is there any possibility of getting the alternative transcripts if I am not using the bindings? From what I have seen, the sttWithMetadata bindings function is now capable of doing this, but I wanted to ask if there is any other way without the bindings.
Specifically, I am using an evaluate.py-like function for efficient inference on the GPU and would also like to get multiple alternative transcripts.
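For reference, this is roughly what I mean by the bindings doing it (a sketch, assuming a 0.7+ release of the deepspeech Python package; file names are placeholders):

```python
import wave

import numpy as np
from deepspeech import Model

model = Model('deepspeech-0.9.3-models.pbmm')

with wave.open('audio_16k_mono.wav', 'rb') as wav:  # 16 kHz mono 16-bit PCM
    audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

# The second argument is the number of candidate transcripts to return.
metadata = model.sttWithMetadata(audio, 5)
for candidate in metadata.transcripts:
    text = ''.join(token.text for token in candidate.tokens)
    print(candidate.confidence, text)
```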
lissyx:
You’d have to hack the CTC decoder Python interface to try and expose them, I guess, but that feels like a lot of painful work.
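Concretely, in an evaluate.py-style setup the batch CTC decoder already hands back a ranked list of (score, transcript) candidates per sample, and evaluate.py only keeps the best one. A rough sketch (the variables come from the usual evaluate.py loop; the num_results keyword is an assumption that may not exist in older ds_ctcdecoder versions):

```python
from ds_ctcdecoder import ctc_beam_search_decoder_batch, Scorer

# batch_logits / batch_lengths are the acoustic model outputs, exactly as in
# evaluate.py; alphabet, beam_width, num_processes and scorer as configured there.
decoded = ctc_beam_search_decoder_batch(
    batch_logits,
    batch_lengths,
    alphabet,
    beam_width,
    num_processes=num_processes,
    scorer=scorer,
    num_results=5,  # assumed kwarg: how many candidates to keep per sample
)

for candidates in decoded:
    # evaluate.py only uses candidates[0][1] (the best transcript); the rest
    # of the list holds the alternatives, ordered best first.
    for score, transcript in candidates:
        print(score, transcript)
```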
lissyx:
What is your exact need? We have had some people trying to get more efficient GPU inference using the library, so maybe there is something doable there.
Yes, exactly, and since there is already a way to do it with the bindings, maybe there is an opportunity to reuse that. I understand that I would have to try to hack it myself, then.
Nothing really special: I am running inference with the trained model loaded from the checkpoint, as recommended here: Deepspeech inference with multiple gpu - #5 by lissyx. I read the incoming JSON files, which list the files to process, and run inference on them.
lissyx:
Right, but why do you rely on this setup instead of a CUDA-enabled libdeepspeech.so? Do you need batching for a high volume of transcriptions?
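With the GPU wheel you load the model once and just loop over your file list, something like this (a sketch; `pip install deepspeech-gpu` pulls in a CUDA-enabled libdeepspeech.so, and the paths below are placeholders):

```python
import glob
import wave

import numpy as np
from deepspeech import Model

# Load once and reuse; with deepspeech-gpu the acoustic model runs through
# the CUDA-enabled libdeepspeech.so.
model = Model('deepspeech-0.9.3-models.pbmm')

def read_wav(path):
    with wave.open(path, 'rb') as wav:  # expects 16 kHz mono 16-bit PCM
        return np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

for path in sorted(glob.glob('to_process/*.wav')):
    print(path, model.stt(read_wav(path)))
```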
Yes, exactly. Should I then use libdeepspeech.so to achieve it? I checked the documentation for it, but it only mentions how to compile it. With evaluate.py it was easier because I had a specific example I could follow.
lissyx: