I have several ideas for cross-language and/or time-axis (version-to-version) analysis of the CV datasets. But because the datasets include the recordings, they are huge if you download many languages and many versions, in terms of both bandwidth and disk space. If you are not doing any actual training, you do not need the recordings…
If a “clips.tsv” file (the DB dump before Corpora Creator), or packages containing only the .tsv files, were available for download, that would be extremely helpful for such analysis.
Is this available/possible?