Topic | Replies | Views | Activity
Dataset split best practices? | 23 | 4722 | December 23, 2019
German dataset doesn't work for training | 3 | 1045 | October 22, 2019
Generate voice command dataset | 2 | 806 | September 2, 2019
Common Voice mid-year release - more data, more languages! | 20 | 2490 | August 12, 2019
Portuguese dataset | 1 | 1152 | August 1, 2019
Is dataset of acoustic model subset of dataset of language model? | 1 | 433 | August 1, 2019
How can one download the German dataset? | 3 | 1025 | June 12, 2019
Downloading 20gb is tough on weak networks. Alt download method? | 2 | 887 | June 12, 2019
Add Basque to the dataset page | 6 | 1092 | June 12, 2019
Dataset downloads Dutch | 4 | 1317 | June 12, 2019
Dataset releases - What's more valuable for you? | 9 | 2301 | June 12, 2019
Subpar data uses | 7 | 1466 | June 5, 2019
Importing large annotated database of CC0 speech data in Swedish? | 2 | 638 | May 28, 2019
Common Voice datasets (Mandarin zh-tw) | 2 | 906 | May 23, 2019
Privacy concerns about dataset metadata | 7 | 2761 | May 16, 2019
What is the ideal decibel? Do we need to adjust volume of datasets? | 2 | 508 | May 11, 2019
Fine tuning data requirements | 5 | 2386 | May 11, 2019
Add in dataset Sakha language | 5 | 1278 | April 25, 2019
Rejected audio dataset | 2 | 704 | April 5, 2019
Pre Release Data vs Latest Release Data | 1 | 464 | April 2, 2019
Gender breakdown of English language dataset | 5 | 1739 | March 25, 2019
En 22G dataset, problems about 'path' in .tsv files | 9 | 763 | March 15, 2019
What are the rules behind 'path' ID generation? | 0 | 425 | March 11, 2019
Sharing Common Voice Through peer-to-peer | 16 | 1750 | March 11, 2019
How are the dev/test/train datasets split? | 4 | 2661 | March 7, 2019
Zero byte files in German language set (new official release) | 2 | 526 | March 2, 2019
Filtering a specific word / sentence | 0 | 395 | February 28, 2019
Multi-Language-Dataset (Beta) is gone | 5 | 631 | February 20, 2019
Speaker ID split between train/test/dev | 4 | 1001 | February 15, 2019
Stats about Common Voice: Kabyle Corpus | 2 | 587 | February 7, 2019