I used the Google Translate TTS with “Who is your creator”. The problem I have seen since yesterday is that the result alternates: one attempt comes out right and the next comes out wrong, then right again, then wrong again.
Why does that happen?
??? TTS is the other forum, but I guess you mean STT. Do you use an audio file? With what? I already asked you to provide more information.
Let me explain better:
I placed the microphone next to the speaker while I played something in Google Translate (this way I made sure that it always played the same thing, with the same pronunciation).
I used the phrase “Who is your creator”. Sometimes it was recognized correctly, and other times it was not.
I used DeepSpeech to recognize text in audio.
The way to reproduce errors is to have a wav file that you feed directly to DeepSpeech. This could be anything. DeepSpeech will give you the same output for the same input.
So it’s all down to the wav file?
No, but if you want to debug something you have to keep the other factors constant to rule certain causes out.
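To make the "keep the input constant" idea concrete, here is a minimal sketch using only the Python standard library. It checks whether two wav files really contain the same audio, and whether a transcriber gives one answer when fed the same file several times. The helper names (`wav_fingerprint`, `same_audio`, `distinct_outputs`) are made up for illustration, and `transcribe` is a placeholder for whatever call you use to run DeepSpeech on raw PCM bytes; none of this is part of DeepSpeech itself.

```python
import hashlib
import wave


def wav_fingerprint(path):
    """Return (sample_rate, channels, sample_width, sha256 of the PCM frames)."""
    with wave.open(path, "rb") as wf:
        frames = wf.readframes(wf.getnframes())
        return (
            wf.getframerate(),
            wf.getnchannels(),
            wf.getsampwidth(),
            hashlib.sha256(frames).hexdigest(),
        )


def same_audio(path_a, path_b):
    """True only if both files have the same format and identical samples."""
    return wav_fingerprint(path_a) == wav_fingerprint(path_b)


def distinct_outputs(transcribe, path, runs=3):
    """Feed the same PCM bytes to `transcribe` several times.

    `transcribe` is a hypothetical callable (bytes -> str) that wraps
    your speech-to-text call. Returns the set of different results seen.
    """
    with wave.open(path, "rb") as wf:
        pcm = wf.readframes(wf.getnframes())
    return {transcribe(pcm) for _ in range(runs)}
```

Two recordings made through a microphone next to a speaker will almost certainly fail `same_audio`, even if they sound identical, because room noise and timing differ each time. If `distinct_outputs` returns more than one result for a single file, the variation comes from somewhere other than the audio.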