Choppy audio, looping, cut off early

Just today, I’ve started getting many Listen clips with serious audio problems: loud clicks, portions that loop, missing parts, distortion, etc.

I’ve been hesitantly marking them “No” but this assumes I’m having no technical problems, and am hearing the audio that was actually submitted. I’m a bit reassured this is the right move because they all seem to be from one contributor.

For an example, here’s the printed text, and my good faith attempt to transcribe what it actually sounds like:
shown: He likes to burp and eat cats, and can whistle without opening his mouth.
heard: He likes to burp an-tss -tss wistle wi-thout -thout opening his mouth.

Is “No” the right move? Shame about all their wasted Speak time.

No is the right move, since you didn’t hear what’s displayed on the test. As a general rule, if in doubt, I try to close my eyes and listen without reading, to avoid bias.

Thanks for the reply and confirmation. I do answer No if they don’t match.

I suppose my concern was that I’ve never heard audio problems on Common Voice like the ones I’m getting now—but now I’m getting very many. It is not an error in their speaking; I believe the sentences were spoken correctly (a normal, high fraction anyway), but then the recording process or saved file has technical problems. So I started doubting my browser/connection, since I’d hate to be rejecting clips that are actually correct server-side and merely somehow mangled in transit.

But I no longer think that is the case. I have heard enough good clips from other contributors since posting that I’m reasonably assured it’s just a batch of bad recordings from one “Speak”-er. Thanks.

I’m also getting it, mostly one Australian or New Zealander lady who, I assume, has a web browser or phone that’s buggy or not powerful enough to do the recording; the audio buffer seems to fill up and then overwrite itself causing jumbled and repeated audio. There’s also another lady who has perfect enunciation and patter, but keeps hitting the stop button too early. That’s a pity on both counts.
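To illustrate the kind of failure described above, here is a minimal sketch of a ring buffer with no overrun protection. It is only a toy model, not Common Voice's or any browser's actual recording code; `RingBuffer`, `simulate`, and their parameters are hypothetical names made up for this example. The "microphone" writes a chunk every tick, while a slow "encoder" only reads one chunk every few ticks.

```python
class RingBuffer:
    """Fixed-size circular buffer that deliberately lacks overrun checks."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [None] * capacity
        self.write_pos = 0  # total number of chunks ever written
        self.read_pos = 0   # total number of chunks ever read

    def write(self, chunk):
        # Bug being modelled: unread chunks are silently overwritten.
        self.data[self.write_pos % self.capacity] = chunk
        self.write_pos += 1

    def read(self):
        chunk = self.data[self.read_pos % self.capacity]
        self.read_pos += 1
        return chunk


def simulate(chunks, capacity, writes_per_read):
    """Producer writes every tick; consumer reads once per `writes_per_read` ticks."""
    buf = RingBuffer(capacity)
    out = []
    for tick, chunk in enumerate(chunks, start=1):
        buf.write(chunk)
        if tick % writes_per_read == 0:
            out.append(buf.read())
    # After "recording" stops, the consumer drains whatever it thinks is left.
    while buf.read_pos < buf.write_pos:
        out.append(buf.read())
    return out


chunks = list(range(8))
print(simulate(chunks, capacity=4, writes_per_read=1))  # consumer keeps up
print(simulate(chunks, capacity=4, writes_per_read=2))  # consumer falls behind
```

When the consumer keeps up, the output matches the input. When it falls behind, the writer laps the reader, so some chunks are lost and others come out twice, which is roughly the dropped and repeated audio heard in these clips.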

I think both of these could be solved technically. The former, possibly with a combination of checking the user agent and displaying warnings or forcing them to listen back a couple of times (or maybe introducing a native app with better performance). The latter, by extending the recording time by 200 ms or so (or whatever the average human reaction time is).
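As a rough sketch of the second idea, the arithmetic for a "grace period" after the stop button is pressed could look like the following. The function name, the 200 ms default, and the 48 kHz sample rate are assumptions for illustration only; the real recorder may work quite differently.

```python
def end_sample_with_grace(stop_index, sample_rate, total_samples, grace_ms=200):
    """Return the sample index to cut at, allowing `grace_ms` after the stop press.

    stop_index    -- sample index at which the user pressed stop
    sample_rate   -- samples per second (e.g. 48000)
    total_samples -- how many samples were actually captured
    """
    grace_samples = sample_rate * grace_ms // 1000
    # Never cut past the end of what was actually captured.
    return min(total_samples, stop_index + grace_samples)


# Stop pressed 3 s into a 48 kHz capture that kept running to 4 s.
print(end_sample_with_grace(stop_index=144_000, sample_rate=48_000,
                            total_samples=192_000))
```

At 48 kHz, 200 ms is 9,600 samples, so the clip in this example would end at sample 153,600 instead of 144,000.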

There’s also a guy who is far too quiet! Ideally we’d be able to leave a review comment on specific recordings. Anonymously of course, and with a “be kind and constructive!” instruction on the post comment page.

Can you check from another connection whether that’s still the case?

Some recordings start with a long pause. Can we somehow trim them to remove the silence?

I don’t think we have that capability right now.

We should definitely gather a list of improvements to the record/listen flow and then document them as requests, so we have them on the backlog.
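For reference when writing up such a request, trimming leading silence can be sketched in a few lines. This is only an illustration, assuming samples are floats in the range -1.0 to 1.0; `trim_leading_silence`, the threshold value, and the `keep` parameter are hypothetical and not part of any existing Common Voice code.

```python
def trim_leading_silence(samples, threshold=0.02, keep=0):
    """Drop samples before the first one whose magnitude reaches `threshold`.

    samples   -- list of floats in [-1.0, 1.0]
    threshold -- magnitude below which a sample counts as silence (assumed value)
    keep      -- number of quiet samples to leave in front as a short lead-in
    """
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            start = max(0, i - keep)
            return samples[start:]
    # Every sample was below the threshold: nothing audible to keep.
    return []


clip = [0.0, 0.001, -0.005, 0.3, -0.2, 0.0]
print(trim_leading_silence(clip))
print(trim_leading_silence(clip, keep=1))
```

A fixed per-sample threshold like this is crude: background noise or a click can end the "silence" early, so a real implementation would more likely look at the energy of short windows.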

On the now-rarer occasions when I get them, I’m hearing the choppy contributor with the same symptoms in three browsers, and both while VPNed to multiple countries and directly connected in PA, NJ, and NY, USA. I’ve heard plenty of good recordings since—it’s just her device.

As for silence, I’m new to these projects, but from other threads I understand there’s always going to be some trade-off with clip “processing” for our sake. If we expect The Algorithm to cope with leading silence, disparate volume levels, no in-clip normalization, background noise, pauses, coughing, mic overload, plosive air noises, etc. …then, for now, we must do so as well.

Yup, I also checked in another browser, with and without my VPN. It was the lady’s device or browser, and it had the texture of the sort of synchronization issues you get when a buggy ring buffer is under extreme load: think games in the days of CRT monitors without double buffering, programming your own keyboard routines in MS-DOS, or download routines before HTTP was ubiquitous.

My guess would be it’s a phone with a slow CPU, a heavy browser and crappy audio drivers.