Hi, a friend is combining Common Voice’s labeled data (audio + text) with unlabeled, audio-only data from YouTube to train STT models for various languages. They are looking for YouTube channels with lots of long videos in the languages below. If you know any, please send a link. Thanks.
Albanian
Azerbaijani
Belarusian
Bosnian
Bulgarian
Croatian
Czech
Greek
Hungarian
Kazakh
Kyrgyz
Macedonian
Moldovan
Polish
Romanian
Serbian
Slovak
Slovenian
Tajik
Turkish
Ukrainian
Uzbek