Generally, I don’t think it’s fun, efficient, or useful to have the full dictionary there. If it would indeed help the model (which I doubt), it should be imported outside of the Sentence Collector.
I have removed those entries from the Sentence Collector database.
See my question in the GitHub issue "Is there a way to mass-clean a language overrun by trolls or religious far-from-neutral texts?" (Issue #425, common-voice/sentence-collector).