Does inflammatory content in sentences matter?

I’m currently reviewing some sentences, and am unsure how to proceed.
Example: 台湾是中华人民共和国的神圣领土的一部分
Translation: Taiwan is part of the sacred territory of the People’s Republic of China

The text is potentially inflammatory, but there are no rules regarding this. However, “we are trying to make recording sentences as much fun as possible”, and I’m not sure these sentences are much fun to read. :expressionless:
Any advice?

Perhaps the guidelines could be updated to include a rule that sentences shouldn’t be political or controversial to certain groups?

I believe this is up to individual communities to decide. See e.g. here. People can always report the sentence.

Well, I believe it is still okay. Just read it for the sake of speech recognition, you know. If Common Voice cannot understand words or phrases that are political or controversial, that would prove to be a disadvantage in the long run.


I think you make a good point! It’s definitely important to cover the corpus of words that come up. I’m just wary of turning away contributors. I think these sentences are grammatically sound, so I’ll allow them?

Yes, I believe you should. Since you’re concerned about turning away contributors, how about showing a warning message before they read that particular sentence aloud, saying:

"Please be aware that the following sentence may be offensive or politically incorrect. For the sake of voice recognition, please take part. If you feel uncomfortable, feel free to skip this sentence."

Something like what I wrote above. So you can just tell the viewers that the following sentence may be a bit sensitive.

Hope I helped!

Sorry, I’m still new to contributing and reviewing sentences (I used to do recording and listening). I’m not sure how to add a warning message?

Where did you find the sentences? In the Sentence Collector, on the Common Voice website while recording, or in the downloaded dataset?

I think it’s fine to ignore them if the local community has no problem with them.

We previously removed some Simplified Chinese sentences because they were politically sensitive for local participants. If you find any sentences on the Common Voice site with a similar problem, you can also report them in the following thread.

If you found the sentences in the Sentence Collector, you can just reject them if you feel they will upset some contributors.

This could be very difficult because many sentences came from the Wikipedia scraper. If we set up such a guideline, we’re basically asking contributors to manually censor every single sentence.