Single quotes in Russian

Some Russian sentences have only one French quotation mark (guillemet), either opening («) or closing (»). What should I do with them?


  • «Посмотрим, посмотрим, – сказал он (Should be: «Посмотрим, посмотрим», – сказал он)
  • Отцы родные, отведите меня к Ивану Кузмичу». (Should be: «Отцы родные, отведите меня к Ивану Кузмичу», or with the quote removed)

It’s a punctuation problem, and we usually ignore those (for example, missing commas). But in this case I think incorrect quotes are a problem for people who want to make audio clips, because commas and full stops are small punctuation marks, while French quotes are big.

These sentences were most probably preprocessed by a sentence segmenter that is not aware of quotation marks. I also wrote one, and it is not aware of them either. So I manually post-process the output to correct such occurrences. But I can miss some…
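A minimal sketch of the kind of check that could flag such sentences for manual review. The function name `guillemet_problem` is hypothetical, and this is not the segmenter mentioned above, just an illustration that assumes only « and » need to be balanced:

```python
from typing import Optional


def guillemet_problem(sentence: str) -> Optional[str]:
    """Return a short description of the quote problem, or None if balanced.

    Assumption: only the French quotes « and » are checked; other quote
    styles („ “ " ') are ignored in this sketch.
    """
    depth = 0  # number of currently open « quotes
    for ch in sentence:
        if ch == "«":
            depth += 1
        elif ch == "»":
            if depth == 0:
                # A closing quote appeared before any opening quote.
                return "closing quote without opening quote"
            depth -= 1
    if depth > 0:
        return "opening quote without closing quote"
    return None


# Example: list only the sentences that need manual correction.
sentences = [
    "«Посмотрим, посмотрим, – сказал он",
    "Отцы родные, отведите меня к Ивану Кузмичу».",
    "«Посмотрим, посмотрим», – сказал он",
]
for s in sentences:
    problem = guillemet_problem(s)
    if problem is not None:
        print(f"{problem}: {s}")
```

A check like this only finds candidates. Deciding where the missing quote belongs still needs a human, since the closing quote could go before or after the attribution.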

Usually, during training, all this punctuation gets stripped out by a normalization step. But I prefer to correct such sentences, because punctuation can change how the sentence is read aloud (stress, pauses, etc.), although we tend to accept wrong intonation.
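To illustrate why the quotes usually do not matter for training, here is a rough sketch of such a normalization step. It is an assumption about what a typical normalizer does, not the code of any particular toolkit, and `strip_punctuation` is a hypothetical name:

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Lowercase the text and replace every punctuation character with a space.

    Assumption: "punctuation" means any character whose Unicode category
    starts with "P" (this covers « », commas, full stops and dashes).
    Note that hyphenated words are split too, which a real normalizer
    may handle differently.
    """
    kept = (
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )
    # Collapse the runs of spaces left behind by removed punctuation.
    return " ".join("".join(kept).split()).lower()


broken = "«Посмотрим, посмотрим, – сказал он"
fixed = "«Посмотрим, посмотрим», – сказал он"

# Both variants normalize to the same training transcript.
print(strip_punctuation(broken) == strip_punctuation(fixed))
```

So after a step like this the missing quote disappears from the transcript. It can still influence how a speaker reads the sentence during recording.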

During the Sentence Collector phase, I ask the validators to reject such sentences, so that I can correct them and re-enter them into the text corpus to prevent further problems.

If I see such mishaps during recording or listening and they do not change how the sentence sounds or is read, I disregard them. But that’s me…

OK, thank you. If I understood correctly, you advise skipping them, as with commas or full stops. Then I will do that.

I merely shared how I / we process them, with some insight - and it is context dependent.

The final decision is yours (+ Russian community).

Thank you again, I understand now :)