I had some questions about the pre-trained model for 0.4.1.
How many hours of data in total were used to train the pre-trained model?
What proportion of the training data came from each speech corpus? For example, is it mainly LibriSpeech, mainly Common Voice, or an even mix of all of them?
The release notes say the model is optimized for American English, but also that it uses the English Common Voice corpus. Presumably that corpus isn’t filtered by accent first, and so the training data contains all English accents?