Topic | Replies | Views | Activity
Multi-language Dataset Beta Release | 24 | 4687 | April 6, 2020
How to add sentences and recordings. Kyrgyz. 10000 samples | 6 | 671 | March 26, 2020
Why train.tsv includes a few files (just 3% of validated set)? | 22 | 3724 | February 26, 2020
Missing data info in common-voice german dataset ver de_538h_2019-12-10 | 4 | 394 | January 24, 2020
Problem decoding Breton audio | 3 | 385 | January 9, 2020
Russian sentences are low quality | 4 | 916 | December 26, 2019
Dataset split best practices? | 24 | 2620 | December 23, 2019
Common Voice mid-year release - more data, more languages! | 21 | 1687 | August 12, 2019
Portuguese dataset | 2 | 732 | August 1, 2019
How can one download the German dataset? | 4 | 700 | June 12, 2019
Downloading 20gb is tough on weak networks. Alt download method? | 3 | 606 | June 12, 2019
Add Basque to the dataset page | 7 | 678 | June 12, 2019
Dataset downloads Dutch | 5 | 850 | June 12, 2019
Dataset releases - What's more valuable for you? | 10 | 1527 | June 12, 2019
Subpar data uses | 8 | 960 | June 5, 2019
Importing large annotated database of CC0 speech data in Swedish? | 3 | 496 | May 28, 2019
Common Voice datasets (Mandarin zh-tw) | 3 | 584 | May 23, 2019
Privacy concerns about dataset metadata | 8 | 1804 | May 16, 2019
Add in dataset Sakha language | 6 | 687 | April 25, 2019
Rejected audio dataset | 3 | 490 | April 5, 2019
Pre Release Data vs Latest Release Data | 2 | 359 | April 2, 2019
Gender breakdown of English language dataset | 6 | 1170 | March 25, 2019
En 22G dataset, problems about 'path' in .tsv files | 10 | 565 | March 15, 2019
What are the rules behind 'path' ID generation? | 1 | 299 | March 11, 2019
Sharing Common Voice Through peer-to-peer | 17 | 1335 | March 11, 2019
How are the dev/test/train datasets split? | 5 | 1632 | March 7, 2019
Zero byte files in German language set (new official release) | 3 | 368 | March 2, 2019
Filtering a specific word / sentence | 1 | 262 | February 28, 2019
Multi-Language-Dataset (Beta) is gone | 6 | 509 | February 20, 2019
Speaker ID split between train/test/dev | 5 | 671 | February 15, 2019