CVSS. Is offensive vocabulary strictly forbidden, or is it only that it should not be addressed to anyone?

I want to increase the diversity of language situations and words covered by the dataset, not only cover something specific.

Yeah, you need specific solutions for specific situations, but there are always some common solutions, and I thought of CV as one of them. Isn’t it a goal of Common Voice to collect all types of spontaneous speech to make a dataset for ASR models that can convert all possible speech to text? If not, then what are your specific goals? Which cases are you trying to cover by collecting this dataset?

In the specific case I mentioned, I meant that once a model trained on CV is able to transcribe all possible words, you can easily add or train another model that analyzes the transcript based on your needs, such as finding offensive words, off-topic answers, etc.
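
To make that second step concrete, here is a minimal sketch of what I have in mind. It assumes the transcript has already been produced by some ASR model; the function name `find_flagged_words` and the word list are purely hypothetical placeholders, not part of Common Voice or any existing tool, and a real application would likely use something smarter than a word list.

```python
import re

# Hypothetical word list for illustration only; each application
# would supply its own list (or a trained classifier instead).
FLAGGED_WORDS = {"damn", "heck"}


def find_flagged_words(transcript, flagged=FLAGGED_WORDS):
    """Return the flagged words found in a transcript, in order of appearance."""
    # Split the lowercased transcript into runs of letters, dropping punctuation.
    tokens = re.findall(r"[^\W\d_]+", transcript.lower())
    return [token for token in tokens if token in flagged]


# The transcript string stands in for the output of an ASR model.
print(find_flagged_words("Well, damn it, what the heck!"))
```

The point is that this filtering happens after transcription, so the ASR model itself still has to be able to output those words.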

Finally I see at least something about domains, thanks… Why is it so hard to find concrete information about them? On the page for adding sentences I see only a list of them, with no explanation of what they are used for, and in the guidelines they are just mentioned as “theme of sentence” without any explanation of this “feature”. But I think my misunderstanding of this is too big for this answer. I will open a separate topic for it.