Hey everyone,
It looks like Esperanto has come a long way: we already have 20 hours of audio in the dataset, recorded by over 140 contributors. I am very impressed by your work and am trying to add some value to the dataset myself. I also talk about this project on r/Esperanto, Mastodon, Duolingo and lernu.net to bring in more contributors.
I am looking forward to seeing the first experiments with this dataset, because AFAIK no one has ever tried machine learning with constructed languages. I would guess that the regularity and the lack of exceptions lead to very good results, even with small sample sizes.
I would love to experiment with text-to-speech in Esperanto. If it gives good results, this could become a breakthrough for audiobooks in Esperanto. Does the Mozilla TTS engine run on a consumer laptop without a fast GPU?
PS: Some parts of the Common Voice website are not translated into Esperanto yet. I tried to help a little, but unfortunately I am only an intermediate Esperanto speaker and don’t want to translate longer texts. Does anybody want to help me translate the missing parts?
BTW: The article about Common Voice in the Esperanto Wikipedia could use some work; I will try to expand it this weekend.