I think I’m being fairly lenient about what I accept when doing listening tests, but my approach seems to differ from that of other listeners:
I will only accept a sample if I hear all the words in the displayed text, and if I could recognize all of them on the first listen with my eyes closed.
With this approach, I accept about 85% of the English samples, while the top hundred or so listeners seem to have acceptance rates in the ballpark of 95%. I’m not sure how much quality the validation step can guarantee if around 95% of these samples are accepted. Many of the samples are inaudible or full of buffer underruns, and many of the speakers clearly do not recognize the words they are attempting to pronounce, or are not fluent in any English dialect.
Maybe the listeners should be coached or prompted to have at least some standards for accepting samples.