Validating the dataset

What are your experiences validating the dataset?
I’m helping with the validation and I think it’s quite cool to do. I’m improving my listening skills a lot. I’ve encountered just a few files that were empty; some are impossible to hear, some have a ‘the’ where there shouldn’t be one, and some have too much noise. But in general the quality is quite good.