Clips with no audible voice in the Common Voice German dataset, version de_538h_2019-12-10

Dear Support,

This is just to share something I found while using version de_538h_2019-12-10: approximately 35 files contain no voice at all, which causes trouble when processing the data. The files are:

clips/common_voice_de_17486238.mp3
clips/common_voice_de_18235786.mp3
clips/common_voice_de_18235793.mp3
clips/common_voice_de_18235796.mp3
clips/common_voice_de_18235804.mp3
clips/common_voice_de_18235807.mp3
clips/common_voice_de_18235808.mp3
clips/common_voice_de_18235810.mp3
clips/common_voice_de_18235811.mp3
clips/common_voice_de_18235812.mp3
clips/common_voice_de_18235813.mp3
clips/common_voice_de_18235814.mp3
clips/common_voice_de_18235815.mp3
clips/common_voice_de_18235816.mp3
clips/common_voice_de_18235817.mp3
clips/common_voice_de_18235818.mp3
clips/common_voice_de_18235819.mp3
clips/common_voice_de_18235820.mp3
clips/common_voice_de_18235821.mp3
clips/common_voice_de_18235822.mp3
clips/common_voice_de_18235823.mp3
clips/common_voice_de_18235824.mp3
clips/common_voice_de_18235825.mp3
clips/common_voice_de_18235826.mp3
clips/common_voice_de_18235827.mp3
clips/common_voice_de_19411969.mp3
clips/common_voice_de_19411970.mp3
clips/common_voice_de_19411971.mp3
clips/common_voice_de_19411972.mp3
clips/common_voice_de_19411973.mp3
clips/common_voice_de_19411979.mp3
clips/common_voice_de_19411980.mp3
clips/common_voice_de_19411981.mp3
clips/common_voice_de_19411982.mp3
clips/common_voice_de_19411983.mp3

I hope this helps you and others and saves some time. It may be an extraction problem only on my end, but I cross-checked at least twice and got the same result each time.
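Until the release is fixed, one workaround is to skip the listed clips when loading a split file. The following is only a minimal sketch, assuming the split file is tab-separated with a header row that contains a `path` column. The names `SILENT_CLIPS`, `clip_id`, and `drop_silent_rows` are made up for illustration, and the blocklist shows just three of the clips listed above.

```python
import os

# Hypothetical blocklist: three of the clip names reported above.
# Extend it with the full list before real use.
SILENT_CLIPS = {
    "common_voice_de_17486238",
    "common_voice_de_18235786",
    "common_voice_de_19411983",
}


def clip_id(path):
    """Return the clip name without directory or file extension."""
    return os.path.splitext(os.path.basename(path))[0]


def drop_silent_rows(lines, blocklist):
    """Filter the lines of a TSV split file.

    Assumes the first line is a header containing a `path` column.
    Returns (kept_lines, number_of_dropped_rows); the header is always kept.
    """
    lines = iter(lines)
    header = next(lines)
    path_col = header.rstrip("\n").split("\t").index("path")

    kept = [header]
    dropped = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if clip_id(fields[path_col]) in blocklist:
            dropped += 1  # skip rows that point at a flagged clip
        else:
            kept.append(line)
    return kept, dropped
```

For a real file, the function could be called as `drop_silent_rows(open("train.tsv", encoding="utf-8"), SILENT_CLIPS)` and the kept lines written to a new file. Matching on the bare clip name means it should not matter whether the `path` column includes the `clips/` prefix or the `.mp3` extension.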

Are these files marked as validated in the description file? They should be marked as rejected.

Not all of them, but most are, yes! They also appear in the train file.
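For anyone who wants to repeat this check, here is a rough sketch that reports which of the flagged clips appear in a given split file. It assumes tab-separated files with a `path` column. The names `FLAGGED` and `flagged_in_split` are hypothetical, and only two of the reported clips are included in the example set.

```python
import csv
import os

# Hypothetical subset of the reported clips; use the full list in practice.
FLAGGED = {"common_voice_de_18235786", "common_voice_de_19411969"}


def flagged_in_split(lines, flagged):
    """Return the sorted flagged clip names found in one split file.

    `lines` is any iterable of TSV lines (for example an open file)
    whose header row contains a `path` column.
    """
    # QUOTE_NONE keeps quote characters inside sentences from confusing the parser.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    found = set()
    for row in reader:
        name = os.path.splitext(os.path.basename(row["path"]))[0]
        if name in flagged:
            found.add(name)
    return sorted(found)
```

Running it once per file, for example on `validated.tsv` and `train.tsv`, and comparing the results would show whether a clip is marked as validated, used for training, or both.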

OK, let me ping the team to investigate, thanks for reporting!
