Hi David,
You can have a look here, if you haven’t already.
There’s an instant messaging group (on Telegram) where a few of us French speakers interested in Common Voice discuss this kind of thing. Feel free to join the discussion and share your ideas!
Generally speaking, I think we should try to reach out to scholars and researchers in linguistics and the social sciences. They may know people willing to donate their voice, and the potential applications of Common Voice may be of direct interest to them (e.g. it gives linguists a new spoken corpus to study; it’s also a way to improve speech recognition technologies, which may be useful for the interview transcriptions done by sociologists, etc.). I have started to see how the land lies with some colleagues who work with spoken corpora. They showed some interest, but my guess is that they will only really get involved if they have a direct and obvious stake in it.
Another kind of organization we could involve is associations that give French lessons to foreign learners. The benefit for them is that Common Voice gives their learners sentences to practice pronunciation with; the benefit for Common Voice is that we would get a wider variety of accents. The problem is probably that we can’t have complete beginners participate, because there would be too many pronunciation mistakes. So in my opinion, it should be limited to intermediate and advanced learners. It would be great to have the point of view of someone involved in this kind of association, to see how realistic it is.
Another observation is that the current corpus lacks regional accents, as well as accents from countries other than France. I don’t know how we could correct that, but there are millions of French speakers outside of France, and it’s a shame that they’re not involved; reaching out to them would make the corpus grow much faster.
If you want to talk about the project on social media and share it with your friends and contacts, describing the applications that Common Voice may have is a fairly effective way to get people to participate (e.g. how it helps develop voice recognition systems, which may be useful for people with disabilities, etc.). The difficulty seems to be getting people to participate in the long run rather than just occasionally, even if it’s still great to have people participating at all!