Problem with some French sentences


(Jean-Baptiste Bertrand) #1

Hello everyone,

When recording or validating sentences through the interface in French, I stumbled upon a couple of sentences with spelling mistakes, and some others which weren’t in French at all.

I guess there is a problem somewhere. Where or how should I report such sentences the next time I come across them?

Thanks!


(Lissyx) #2

It’d be great if you added some samples. Sentences are in the text files: https://github.com/mozilla/voice-web/tree/master/server/data/fr


(Jean-Baptiste Bertrand) #3

Thanks!

At the time I came across them, I didn’t think at all about writing them down somewhere to report them later, but I’ll definitely do that the next time it happens.

I’ll have a look at the link you gave to see if I can find the ones I stumbled upon.


(Jean-Baptiste Bertrand) #4

I found two of the non-French examples again, on lines 4391 and 4392 of this file:

Relijion gozh ar Gelted.
Kredennoù kozh ar pobloù amerindian

These are the two I came across when validating sentences, maybe there are others.

I sent a pull request to correct the issue.
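
For anyone who wants to hunt for similar lines, here is a rough sketch of a heuristic check. It is only an assumption about what might work, not how the project validates sentences: it flags any line that contains none of a small, hand-picked set of common French function words. The names `FRENCH_FUNCTION_WORDS`, `looks_suspicious` and `flag_lines` are made up for this example, and short but valid French sentences can be flagged too, so the output still needs a human review.

```python
import re

# Assumption: a small, hand-picked set of very common French function words.
# It is not exhaustive; extend it to reduce false positives.
FRENCH_FUNCTION_WORDS = {
    "le", "la", "les", "l", "un", "une", "des", "du", "de", "d",
    "et", "est", "en", "au", "aux", "ce", "il", "elle", "je", "tu",
    "nous", "vous", "ne", "pas", "que", "qui", "à", "sur", "dans", "pour",
}

# Matches runs of letters, including accented ones (no digits or underscores).
TOKEN_RE = re.compile(r"[^\W\d_]+")


def looks_suspicious(sentence):
    """Return True if the sentence has words but no common French function word."""
    tokens = [t.lower() for t in TOKEN_RE.findall(sentence)]
    return bool(tokens) and not any(t in FRENCH_FUNCTION_WORDS for t in tokens)


def flag_lines(lines):
    """Return (line_number, text) pairs for lines worth a manual look."""
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if looks_suspicious(line)
    ]


sample = [
    "Relijion gozh ar Gelted.",
    "Kredennoù kozh ar pobloù amerindian",
    "Le chat est sur la table.",
]
for number, text in flag_lines(sample):
    print(number, text)
```

To run it on one of the sentence files, open the file with `encoding="utf-8"` and pass the file object to `flag_lines`; the reported numbers then match the line numbers in that file.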


(Lissyx) #5

Thanks for sending a PR!


(Francois) #6

In some addresses, there are place names which are very local. Only people who live near the address can read the name correctly (Oyonnax, Werendeheim…).
And some abbreviations can’t be read clearly, such as RTE, CH. …
Perhaps it would be useful to add a “wrong sentence” button?


(Jean-Baptiste Bertrand) #7

Hi François,

I don’t think we should exclude rare words; quite the contrary.

When I’m not sure how to pronounce a word I’m not familiar with, I either skip the sentence or try to guess the pronunciation. In some instances I have also looked up the pronunciation in Wiktionary, but that requires knowing the International Phonetic Alphabet.

As for short names, I guess that “RTE” is an acronym; I don’t know how acronyms should be dealt with, but my guess is that we should simply read the letters one by one when we come across one.

That being said, I think that a button to report problematic sentences may be a good idea, but it would require displaying some recommendations about what kinds of sentences should be reported (e.g. spelling mistakes).