How accurate are the statistics of Recorded/Validated clips per language?

Note that the statistics in the Sentence Collector only include what the tool itself knows about. Anything added outside of it is not counted. The numbers are therefore missing sentences extracted from Wikipedia through the Sentence Extractor, as well as bulk uploads such as the Europarl corpus in several languages.