Alternative candidate transcripts without bindings

Is there any possibility of getting alternative transcripts if I am not using the bindings? From what I have seen, the sttWithMetadata bindings function is now capable of it, but I wanted to ask whether there is another way without the bindings.
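For reference, this is roughly what I mean about the bindings already supporting it. A minimal sketch with the deepspeech Python package (the file paths are placeholders, and I am assuming the 0.7+ API where sttWithMetadata takes a num_results argument):

```python
import wave

import numpy as np
from deepspeech import Model

ds = Model("model.pbmm")                 # placeholder path to the acoustic model
ds.enableExternalScorer("kenlm.scorer")  # optional external scorer, placeholder path

with wave.open("audio.wav", "rb") as w:  # 16 kHz mono 16-bit WAV, placeholder
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

# Ask the decoder to keep several candidate transcripts instead of only the best one
metadata = ds.sttWithMetadata(audio, num_results=3)
for candidate in metadata.transcripts:
    text = "".join(token.text for token in candidate.tokens)
    print(candidate.confidence, text)
```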

Specifically, I am using an evaluate.py-like function for efficient inference on the GPU and would also like to get multiple alternative transcripts.

You’d have to hack the CTC decoder Python interface to try and expose them, I guess, but that feels like a lot of painful work.
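Something like this might actually be most of the hack: the batch decoder that evaluate.py calls already returns a beam of candidates per sample, and evaluate.py just keeps the best one. A sketch, assuming the ds_ctcdecoder package from the training code; decode_with_alternatives is a hypothetical helper, and the num_results keyword may not exist in older releases:

```python
from ds_ctcdecoder import ctc_beam_search_decoder_batch

def decode_with_alternatives(batch_logits, batch_lengths, alphabet, scorer,
                             beam_width=500, num_processes=8, num_results=5):
    # batch_logits and batch_lengths are the acoustic model outputs exactly as
    # evaluate.py computes them; alphabet and scorer come from the same Config.
    decoded = ctc_beam_search_decoder_batch(
        batch_logits, batch_lengths, alphabet, beam_width,
        num_processes=num_processes, scorer=scorer,
        num_results=num_results)  # assumed keyword; may be absent in old versions
    # evaluate.py keeps only decoded[i][0][1] (the best text); each beam entry
    # is a (confidence, text) pair, so the alternatives are already in there.
    return decoded
```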

What is your exact need? We have had people trying to get more efficient GPU inference out of the library, so maybe there is something doable there.

Yes, exactly. And if there is already a way to do it with the bindings, maybe there is an opportunity to reuse that. I understand that I would have to try to hack it myself, then.

Nothing really special: I run inference with the trained model loaded from a checkpoint, as recommended here: Deepspeech inference with multiple gpu. I read JSON files as they appear, each listing the files to process, and run inference on them.

Right, but why do you rely on this setup instead of the CUDA-enabled libdeepspeech.so? Do you need batching for a high volume of transcriptions?

Yes, exactly. Should I then use libdeepspeech.so to achieve that? I checked its documentation, but it only mentions how to compile it. With evaluate.py it was easier because I had a specific example to follow.

Sorry, but don’t we have clear enough docs on how to use the API? https://mozilla-voice-stt.readthedocs.io/en/latest/C-API.html

We don’t have batching support in the API, but there are patches that have been pending for a very long time, so any help is welcome.