Sentence Collector - Cleanup before export vs. cleanup on upload

mkohler · September 17, 2022, 10:18am

This already exists: common-voice/cv-sentence-extractor on GitHub ("Scraping Wikipedia for fair use sentences"). The problem is that when it was run back in 2019, many of the rule options available today did not exist, so the French rule file is very minimal. Of course, this can be fixed now, in case an extraction of articles created since then is ever run.

In a perfect world…
