This already exists: GitHub - common-voice/cv-sentence-extractor: Scraping Wikipedia for fair use sentences. The problem is that when it was run back in 2019, many of these rule options did not exist yet, and the French rule file is very minimal. Of course, this can be fixed now, in case an extraction of the articles created since then is ever run.
In a perfect world…