Thank you, good to know
I think it is doable in a controlled environment. A fork can be used to modify the software for a set of people who give consent. A bigger limitation comes from the fact that you need to be at least 20 years old (or have consent and constant supervision from a legal guardian). So, there are almost no voices from children here (only a few), but one can devise a similar setup for a project about children’s speech-related problems, language learning, etc.
There was a related discussion recently that also points to security implications: Tags for voice (accent)