Would it be possible to use LibriVox?

Do you think it would be possible to make use of some of the recordings from LibriVox (https://librivox.org/)?

Just an idea; I do not know how practical this would be.

Yes, we will! Thank you for the suggestion :slight_smile:

Another possibility is to take the samples from Wiktionary (https://en.wiktionary.org/wiki/Wiktionary:Main_Page): they have recorded many, many single words. It might be a good little addition to the data set.

This might also be a good source: https://tatoeba.org/eng/sentences/show/2544351 (apparently they have 6,128,636 sentences in 322 languages).

Thanks @rain1! We are definitely working with Tatoeba. That is a great project!