Vulgarities in common speech

What’s the policy and future plans for vulgarities? Offensive words and phrases are a part of the common tongue and how many people communicate, but I haven’t seen any at all when recording. Even worse than that, there’s a facility to report language that is even “disrespectful”, presumably so it can be purged.

Does this mean that Mozilla will be enforcing standards of taste and decency on what can and can’t be spoken to machines?

To take a couple of extreme examples… I’m British with family in Scotland, so the “c-word” is not very offensive to me or those around me, and it’s the sort of word I’d unflinchingly use in a group chat. Among England’s working class it’s an ungendered term for a spiteful or vindictive person, while in Scotland or Australia it’s actually synonymous with “friend”. In America on the other hand, it’s an extremely offensive, misogynistic term that I’m sure most people wouldn’t want to be exposed to let alone record.

Similarly, as a white man I can’t even type the N-word, but I wouldn’t want to dissuade black people (or anyone else for that matter) from using it.

The c-word is my word, the n-word isn’t, and presumably both of them strongly contrast with Mozilla’s culture. Will Mozilla be inclusive enough to take a liberal stance on this, or is the culture of progressive-imperialism too strong?

Are there plans to incorporate vulgarities, maybe via their own language sets or as extensions, tagging words and phrases etc?

We want anyone to be able to record without running into swear words, so the experience is optimized for most people.

Having said that, in the future we can consider allowing people to opt in to an additional set of sentences that include offensive language, but right now we don’t have that technical capability.

@rosana might have some thoughts on the quality of the dataset and these ideas in the future.

2 Likes

You want to be able to record/review the sentences even in public spaces; if someone shy encountered a higher level of profanity in there, chances are they would just close the website and never come back.

1 Like

Yeah I think you’re right about public obscenities. I wouldn’t mind but I’m a pretty crass and shameless individual.

What’s the technical barrier to adding sexual or obscene phrases, and how can it be overcome?

1 Like

The technical limitation is that we don’t have a way to do it now, and the current dev efforts are focused on the most immediate 2020 needs, so it’s not a priority at the moment.

This year we are optimizing for dataset quality. What we can do is add this to the backlog and evaluate its priority later in the year, once priority 1 issues have been solved.

2 Likes