A couple of years ago, speech data from a bankrupt speech research company was made available by the National Library of Norway. It contains audio files and annotated text files from a multitude of speakers and is CC0-licensed. It is close to 100 GB of data. Would it be possible to import this data into Common Voice to improve Swedish ASR?
Similar data exists for Norwegian and Danish. An overview (in Norwegian) is available in this document.