I have some questions about the 0.5 model that aren’t answered in the README.
- How many hours of data were used to train it? What proportion of this was from Common Voice?
- Is the CV audio data recent enough to contain the wiki sentences?
- Do the regular and lite models have different error rates? Are there any other limitations we should expect from using the lite model compared to the regular one?