I’ve just noticed that Persian (fa) has 276 hours of validated voice recordings.
But I’ve just checked, and Persian only has 12K sentences on the system, which means people are recording the same sentences again and again. We know this is not ideal for the quality of the dataset.
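To put a rough number on that repetition, here is a back-of-the-envelope sketch. The hours and sentence count come from the figures above; the average clip length of 5 seconds is purely an assumption for illustration, not a measured value, so treat the result as an order-of-magnitude estimate only.

```python
# Rough estimate of how many validated recordings exist per sentence.
VALIDATED_HOURS = 276      # from the post
SENTENCE_COUNT = 12_000    # "12K" from the post
AVG_CLIP_SECONDS = 5.0     # assumption: not a figure from the project


def recordings_per_sentence(hours, sentences, avg_clip_seconds):
    """Divide the estimated number of clips by the number of sentences."""
    total_clips = hours * 3600 / avg_clip_seconds
    return total_clips / sentences


estimate = recordings_per_sentence(VALIDATED_HOURS, SENTENCE_COUNT, AVG_CLIP_SECONDS)
print(f"~{estimate:.1f} validated recordings per sentence")
```

Under that assumed clip length, each sentence would have been recorded well over ten times on average; a longer or shorter average clip changes the figure proportionally.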
This is a call to action to Persian speakers with technical knowledge to help with the Persian Wikipedia extraction:
Important: Please do not use the Sentence Collector to submit Wikipedia sentences; we must use the process described in the link above.
This would give the project far more sentences without repetition, increasing the quality of the Persian dataset.