Important questions: punctuation, time available to read, errors in written sentences


(Dam) #1

Hello everyone, this is my first post here, but I’ve been a contributor since July. I have some big questions/doubts, so I sent @mhenretty an email asking for help. He told me to post the questions here so we can all discuss them:

  • Should I pronounce the punctuation in the sentences (e.g. full stop, comma, colon, slash and so on)? So far I have validated audio where people didn’t pronounce it, and I didn’t pronounce it in my own recordings either. I think many other volunteers did the same. What can we do?

  • Another question: sometimes sentences are too long for the time available to read them, and people often read them badly, skipping words. Is it possible to extend the time available to read these sentences?

  • Last question: the written sentences sometimes have errors, which the reader corrects in the audio. Technically the reading is wrong because what was read is not the same as what is written, but in terms of grammar and syntax the spoken sentence is correct. What should we do in these cases? I think there should be a button to report writing errors in sentences. What do you think?

Thank you for your help! :slight_smile:


(Luc Salommez) #2

Hello BluLion :),

  • I think we shouldn’t pronounce the punctuation, otherwise it won’t be very intuitive for contributors and it could be a bit painful to read out loud. Also, some punctuation marks can be spoken in several ways.

For example, for " you can say in French “Guillemets” (quote), “Ouvrez les guillemets” (open quote), “Fermez les guillemets” (close quote), etc.

Also, how would we distinguish between words that belong to the sentence and punctuation that is being read out?

One more thing, which might only be my opinion: I think punctuation is already expressed implicitly in the way you talk most of the time. For example, the way you speak changes when you read the symbols : , . ? ! so they are implicitly encoded in the speech.

I would say we should read sentences as if we were reading a paper to a friend or a book in a classroom, which seems more natural and would lead to more consistency in the gathered data, as most people would instinctively read this way.

  • I have also brought up this issue, and I agree that more time should be allowed for longer sentences, or that overly long sentences should be split into smaller ones where possible. I know some processing is already done, but sometimes sentences are still a bit too long to be spoken in the allowed time.

  • I share your opinion. I think sentences should be read the way they are written, and I wish we could report sentences, or at least propose a new version, in order to fix typos or remove inappropriate sentences.

The team at Common Voice is currently doing a great job adding an easy way to propose sentences and to validate them before they are added to the corpus of sentences to be spoken. I read somewhere that this should be available soon, so this issue should be fixed before long.


(Michael Henretty) #3

Yes, please do not pronounce the punctuation (meaning don’t say “comma” in the phrase if it contains a comma).

In this case, the source sentence is probably too long. We aim to have the recordings be less than 10 seconds, which is ideal for training DeepSpeech. We could extend the time, but better would be to allow users to report overly long sentences. See: https://github.com/mozilla/voice-web/issues/272#issuecomment-427512992
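
To make the 10-second target concrete, here is a rough sketch of how overly long sentences might be flagged before they are shown to readers. It is only an illustration, not how Common Voice actually filters sentences: the speaking rate of 2.5 words per second and the function names are assumptions, and a real check would need to be tuned per language.

```python
# Assumed average reading speed; real speakers and languages vary a lot.
WORDS_PER_SECOND = 2.5
# Target maximum recording length mentioned above.
MAX_SECONDS = 10.0


def estimated_seconds(sentence: str) -> float:
    """Estimate how long a sentence takes to read aloud, from its word count."""
    return len(sentence.split()) / WORDS_PER_SECOND


def is_probably_too_long(sentence: str, max_seconds: float = MAX_SECONDS) -> bool:
    """Return True if the estimated reading time exceeds the limit."""
    return estimated_seconds(sentence) > max_seconds


if __name__ == "__main__":
    sample = "This is a short sentence that should be easy to read."
    print(estimated_seconds(sample), is_probably_too_long(sample))
```

A word-count estimate like this is crude (it ignores long words, numbers and abbreviations), so it would complement a way for users to report sentences rather than replace it.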

Yup, we are working on that:


(Dam) #4

Thank you Luc for your exhaustive answer!
I agree with you, but regarding punctuation I was thinking about one particular use case: dictation. I think many people would use Mozilla Common Voice to dictate a document (e.g. a .docx or .odt) and would expect it to work as usual. How can we deal with that problem?


(Dam) #5

Thank you for your reply and for working on this, Michael!! :clap:


(Luc Salommez) #6

It would be up to the dictation software to convert the words “comma” “period” “new line” to special characters, in my opinion.
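
As a minimal sketch of what that post-processing step could look like, assuming the speech recognizer returns a plain-text transcript: the mapping and the function name below are hypothetical and not taken from any existing dictation product.

```python
import re

# Hypothetical mapping from spoken commands to the characters they stand for.
SPOKEN_PUNCTUATION = {
    "new line": "\n",
    "question mark": "?",
    "comma": ",",
    "period": ".",
}


def apply_spoken_punctuation(transcript: str) -> str:
    """Replace spoken punctuation commands in a transcript with symbols."""
    text = transcript
    # Handle longer phrases first so multi-word commands are matched whole.
    for phrase in sorted(SPOKEN_PUNCTUATION, key=len, reverse=True):
        symbol = SPOKEN_PUNCTUATION[phrase]
        # Also swallow the whitespace before the command, so that
        # "hello comma" becomes "hello," rather than "hello ,".
        pattern = r"\s*\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, lambda _match, s=symbol: s, text, flags=re.IGNORECASE)
    # Drop spaces left at the start of a line after a "new line" command.
    text = re.sub(r"\n[ \t]+", "\n", text)
    return text.strip()


if __name__ == "__main__":
    print(apply_spoken_punctuation("hello comma how are you question mark"))
```

Note that a simple replacement like this cannot tell a command from an ordinary word (for example “period” in “the Jurassic period”), which is the same ambiguity mentioned earlier in the thread; real dictation software would need context or an explicit command mode to handle it.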