I’ve found the confidence score to be nowhere near reliable. Even when the transcription is spot on, the confidence comes out as 99% or 0%, seemingly at random. If I understand correctly, the number given is the logit value, and to get an actual usable probability/confidence I calculate:
exp(confidence) / (1 + exp(confidence))
right?
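To make it concrete, this is roughly the conversion I mean, as a small Python sketch. It assumes the reported value really is a logit, which is exactly the part I'm unsure about. The function name `logit_to_probability` is just something I made up for illustration; it is not part of any library. The two branches only avoid overflow in `exp` for large magnitudes and otherwise compute the same formula as above.

```python
import math

def logit_to_probability(confidence: float) -> float:
    """Map a logit to a probability in [0, 1] with the logistic function.

    Assumption: `confidence` is a logit (log-odds). Hypothetical helper,
    not an API of the speech recognition engine.
    """
    if confidence >= 0:
        # Equivalent form 1 / (1 + exp(-x)); exp(-x) cannot overflow here.
        return 1.0 / (1.0 + math.exp(-confidence))
    # For negative inputs exp(x) is at most 1, so this form is safe.
    e = math.exp(confidence)
    return e / (1.0 + e)

if __name__ == "__main__":
    for value in (-5.0, 0.0, 5.0):
        print(value, logit_to_probability(value))
```

With this, a logit of 0 maps to exactly 50%, and values beyond about ±5 are already pushed very close to 0% or 100%, which might explain part of the "all or nothing" behaviour I'm seeing.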
It also differs a lot between the default language model and a custom language model with just 20 vocabulary entries. With the custom model I get more plausible results, but with the default model it's almost always 0%, even when the transcription is good.
Is this normal?