I imported the dataset once more, but with my edited alphabet, which contains (ä, ü, ö). That was my mistake.
I found that out by testing different alphabets: (32,) is the number of letters in my edited alphabet version and (29,) the number of letters in the standard version.
I haven't had that error again, but then this error came up:
ValueError: Alphabet cannot encode transcript "ich hoffe es" while processing sample "/media/sf_de/clips/common_voice_de_21632146.wav", check that your alphabet contains all characters in the training corpus. Missing characters are: [' ', ' '].
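For anyone else puzzling over the (32,) versus (29,) shapes mentioned above, a rough way to see where such numbers can come from is to count the labels in each alphabet file. This is only a sketch: it assumes the usual alphabet.txt layout (one label per line, lines starting with # are comments, \# is a literal #) and assumes the output layer has one extra unit for the CTC blank. The function names and the two sample alphabets are made up for illustration.

```python
def read_labels(text):
    """Return the labels found in alphabet-file contents (assumed format)."""
    labels = []
    for line in text.split("\n"):
        if line.startswith("\\#"):
            labels.append("#")      # escaped hash counts as a real label
        elif line.startswith("#"):
            continue                # comment line, not a label
        elif line != "":
            labels.append(line)     # keep the line unstripped: " " is a valid label
    return labels


def output_units(text):
    # Assumption: one output unit per label plus one for the CTC blank.
    return len(read_labels(text)) + 1


# Hypothetical alphabets: space, a-z and apostrophe, then the same plus umlauts.
english = "# sample English alphabet\n" + "\n".join(" abcdefghijklmnopqrstuvwxyz'") + "\n"
german = english + "\n".join("äöü") + "\n"

print(output_units(english))  # 29
print(output_units(german))   # 32
```

If the two counts differ, a checkpoint or imported dataset built with one alphabet will not line up with a run that uses the other.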
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
And yet your command line does not show any specific alphabet being passed, which is consistent with your error.
The reason I’m asking is because obviously there is something wrong, so just telling us “I followed that doc” does not really help: we know the doc. Maybe you did something else, prior.
Like, have you checked what I said about a prior checkpoint being reloaded automatically, one that would have been produced with a different alphabet than yours?
I saved the new alphabet to the same path where the old one was stored,
so I didn't have to add the "--alphabet_config_path" argument.
Yes, I understand your point.
I have removed all checkpoints in $USER/.local/share/deepspeech/checkpoints/.
Is there another path where they are stored, or do I need to enter any command to check this?
lissyx
Do you mean you just did that, or you did that prior to the error?
That depends if you pass a specific checkpoint dir.
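To check whether a given directory still holds old checkpoints, something like the following could help. It is a minimal sketch that assumes TensorFlow-style checkpoint naming (a file called checkpoint plus .index, .meta and .data-* files); the helper name and the demo file names are hypothetical, so adjust the patterns to whatever your directory actually contains.

```python
import tempfile
from pathlib import Path


def find_checkpoint_files(directory):
    """List files that look like checkpoint data (assumed naming patterns)."""
    root = Path(directory).expanduser()
    if not root.is_dir():
        return []
    found = []
    for path in sorted(root.rglob("*")):
        name = path.name
        if path.is_file() and (
            name == "checkpoint"
            or name.endswith((".index", ".meta"))
            or ".data-" in name
        ):
            found.append(path)
    return found


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("checkpoint", "train-100.index",
                 "train-100.data-00000-of-00001", "notes.txt"):
        (Path(tmp) / name).touch()
    demo_names = [p.name for p in find_checkpoint_files(tmp)]

print(demo_names)
```

Point it at the default location and at any checkpoint directory you pass on the command line; an empty list means nothing matching these patterns is left to reload.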
Sorry @lissyx,
I think I expressed myself badly. I solved the error with (29,) and (32,) by replacing the English alphabet with the German one when importing.
Now I am stuck with this error:
But I think for this I should open a new discussion. Right?
Thanks for helping.
lissyx
Ok, I was misled and thought you still had the error.
Well, it looks like your alphabet does not match what you have in your data, so you need to fix that as well. Likely you have some UTF-8 non-breaking space?
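Characters like a non-breaking space look identical to a normal space when printed, so it can help to list the code points instead. The snippet below is a small sketch using only the standard library; the sample transcript is invented and deliberately contains U+00A0.

```python
import unicodedata


def describe_chars(text):
    """Return (char, code point, Unicode name) for each distinct character."""
    rows = []
    for ch in sorted(set(text)):
        rows.append((ch, "U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN")))
    return rows


# Hypothetical transcript with a non-breaking space after "ich".
sample = "ich\u00a0hoffe es"
for ch, code, name in describe_chars(sample):
    print(repr(ch), code, name)
```

Any whitespace entry whose name is not plain SPACE is a candidate for the "missing characters" in the error message.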
I don't know what was wrong, but I have now used check_characters.py from DeepSpeech 0.6.1 and compared its output to my alphabet. They look the same. I copied and pasted the characters, and now it works. Maybe you were right.
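The comparison described above can also be done programmatically instead of by eye. This is a sketch, not what check_characters.py does internally: the function name, the label list and the transcripts are all hypothetical, and in practice you would fill them from your own alphabet file and CSV transcripts.

```python
def missing_characters(transcripts, alphabet_labels):
    """Return characters used in the transcripts but absent from the alphabet."""
    used = set()
    for transcript in transcripts:
        used.update(transcript)
    return sorted(used - set(alphabet_labels))


# Hypothetical alphabet and transcripts for illustration.
labels = list(" abcdefghijklmnopqrstuvwxyz'") + ["ä", "ö", "ü"]
transcripts = ["ich hoffe es", "schöne grüße", "ich\u00a0hoffe"]

missing = missing_characters(transcripts, labels)
print(["U+%04X" % ord(c) for c in missing])  # ['U+00A0', 'U+00DF']
```

Printing code points rather than the characters themselves makes look-alike whitespace visible, which a visual copy-and-paste comparison can easily miss.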
I am also facing the same error, related to an outer layer shape mismatch.
I am trying to replace the Hindi alphabet file with the English one and provide the pre-trained DeepSpeech model for initialisation.
You don’t give much information, but it says above that you can’t change the alphabet for a trained model. I guess you did change that? Hard to tell without the error msg and some more info.