At first, I thought the same way you do, but my models say otherwise, as I explained before.
It might seem a waste of volunteer time for now, but please be aware that dataset-building projects like Common Voice have long time spans, measured in years.
Accuracy in machine learning models improves only with exponentially more data: each fixed gain requires far more additional data than the last. Just to give some hypothetical numbers: you can drop WER from 50% to 40% with +100h of recordings, but to go from 20% to 10% you might need +1000h, and much more again to go from 5% to 3%.
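To make the diminishing-returns idea concrete, here is a small sketch assuming WER follows a power law, wer ≈ a · hours^(−b), which is a common empirical model for ASR scaling. The constants `a` and `b` below are made up for illustration (chosen so that halving WER costs roughly 10× the data), not measured from Common Voice:

```python
# Illustrative only: hypothetical power-law scaling of WER with data size.
# wer = a * hours^(-b)  =>  hours = (a / wer)^(1 / b)
# With b ≈ 0.3, halving the WER requires roughly 10x the training hours.

def hours_needed(target_wer: float, a: float = 2.0, b: float = 0.3) -> float:
    """Hours of speech needed to reach target_wer under the assumed power law."""
    return (a / target_wer) ** (1.0 / b)

for wer in (0.5, 0.4, 0.2, 0.1, 0.05, 0.03):
    print(f"WER {wer:.0%}: ~{hours_needed(wer):,.0f} h of recordings")
```

The exact numbers are not the point; the shape of the curve is. Every further improvement costs an order of magnitude more recordings, which is why a long-running collection effort matters.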
Georgian has ~4M native speakers, and you have 1,254 distinct (?) voices, which is a good number (at least much better than our sample size for Turkish). You probably won't be able to reach 1% of the population (40,000) through campaigns etc., so you will grow that number gradually; meanwhile, having those existing contributors record more will matter more. As I explained previously, more data is better.
One can easily use 5k recordings from Nemo in training. In a couple of years, many people in the community will also reach 10k+, so her recordings will be used even more; they will not be wasted (except bandwidth, perhaps). She will probably quit at some point, but her contribution will live on…
We will never be able to get 1M different people recording diverse sentences (the ideal case). So we need 1,000–2,000 people each recording thousands of sentences, while we keep trying to enlarge the voice diversity.