When is quiet too quiet?

I don’t know if it’s possible to look up individual clips, but this is one of the clips that I never know what to do with. It’s quiet but still barely audible if I increase my volume.
I have had a decent number of clips from this user.

clip-id: 30711810
clip-glob: c33d80e8-08d9-438d-a350-50518ca2a68a/301258733088c03b77210df7c27609382782fca17c7cc2e374095b5cc8db0856
sentence-id: 301258733088c03b77210df7c27609382782fca17c7cc2e374095b5cc8db0856
expiry-date: 2023-03-07 02:45:37.218

I’m very new to the project but was also thinking along similar lines. Do you know if each participant has a fixed ID, so that it would be possible to filter out consistently quiet or poor-quality recordists?

No idea. This is a similar clip, but I don’t see any similarities in the IDs:

clip-id: 36120098
clip-glob: 87453714-6c96-49c7-86c3-d342b453fa44/28bf25a76e59017b6db47bb486bb2439bb9be1a16af61581c3c1308b084141de
sentence-id: 28bf25a76e59017b6db47bb486bb2439bb9be1a16af61581c3c1308b084141de
expiry-date: 2023-03-07 02:45:37.221

A contributor keeps the same ID if they are logged in, or if they record from the same device/browser while logged out or not registered. So you can pinpoint problematic voices, but only from the dataset releases (not on the website). The client_id field in the .tsv files is where you should look.
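As a rough illustration of the client_id lookup, here is a minimal sketch that counts how many flagged clips each client_id recorded. It assumes the release .tsv has client_id and path columns; the function name and the set of flagged clip paths are hypothetical and would come from your own notes.

```python
import csv
import io
from collections import Counter


def count_flagged_by_client(tsv_file, flagged_paths):
    """Count, per client_id, how many of the flagged clips they recorded.

    tsv_file is an open text file (or file-like object) in the
    tab-separated layout of a dataset release; flagged_paths is a set
    of clip file names you noted as too quiet (hypothetical input).
    """
    # QUOTE_NONE: sentences may contain quote characters that are not CSV quoting.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    counts = Counter()
    for row in reader:
        if row["path"] in flagged_paths:
            counts[row["client_id"]] += 1
    return counts


# Tiny made-up example standing in for a real validated.tsv.
sample = (
    "client_id\tpath\tsentence\n"
    "abc\tclip1.mp3\tHello\n"
    "abc\tclip2.mp3\tHi\n"
    "def\tclip3.mp3\tHey\n"
)
counts = count_flagged_by_client(io.StringIO(sample), {"clip1.mp3", "clip2.mp3"})
for client_id, n in counts.most_common():
    print(client_id, n)
```

With a real release you would pass something like open("validated.tsv", encoding="utf-8", newline="") instead of the in-memory sample.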

Low voice levels happen sometimes. In the cases I have seen, the person is speaking at night, e.g. in a dorm, so that nobody gets disturbed.

Very low energy levels can be problematic. In one case I had to remove a very quiet recording to be able to create a model. Common Voice is targeting natural speech, and such recordings (a near-silent whisper) are not very natural. Whisper, for example, checks for and eliminates such problematic ones.
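If you want to put a number on "very low energy", one common approach is the RMS level relative to full scale (dBFS). This is only a sketch: it assumes the clip has already been converted to 16-bit PCM WAV, the function names are my own, and the -50 dBFS cutoff is an illustrative guess, not a Common Voice rule.

```python
import array
import math
import sys
import wave


def rms_dbfs(samples, full_scale=32768):
    """Return the RMS level of integer PCM samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    # 10*log10 of the power ratio equals 20*log10 of the amplitude ratio.
    return 10 * math.log10(mean_square / (full_scale ** 2))


def read_wav_samples(path):
    """Read all samples from a 16-bit PCM WAV file (channels stay interleaved)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples


# Illustrative cutoff only (assumption); tune it by listening to real clips.
QUIET_THRESHOLD_DBFS = -50.0


def is_too_quiet(samples, threshold=QUIET_THRESHOLD_DBFS):
    """Flag a clip whose overall RMS level falls below the threshold."""
    return rms_dbfs(samples) < threshold
```

Usage would be along the lines of is_too_quiet(read_wav_samples("clip.wav")). Note that a whole-clip RMS also counts leading and trailing silence, so a clip with long pauses can look quieter than the speech in it really is.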

In my opinion, if you have problems understanding a clip at a reasonable volume setting, you should invalidate it. If you have problems understanding it, the DL model will also have problems. But that’s me…