Our current data is what’s called a “read speech corpus”, because each sentence is read from a prompt. It can also be very useful to have a “spontaneous speech corpus”. In this case, the speech is produced spontaneously, and the transcription is created later on by listening to it.
I have long thought that the creation of a spontaneous corpus would also be an ideal application for crowd-sourcing, though I'm not sure whether it could one day fall within the scope of this project. You would initially contribute a recording of your voice, and other users would then transcribe and verify it. Your Alexa voice history might be a good source for that, or simply recordings of your side of (phone) conversations, etc.
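The record-then-transcribe-then-verify loop described above could be modeled with a very small data structure. This is only a sketch of one possible design; the class name, fields, and the two-verification threshold are all my own assumptions, not anything this project has specified:

```python
from dataclasses import dataclass, field
from typing import Optional, List

@dataclass
class SpontaneousClip:
    """One crowd-sourced recording awaiting transcription and verification.

    Hypothetical schema: a contributor uploads audio, another user adds a
    transcript, and further users verify it before it enters the corpus.
    """
    audio_path: str
    contributor: str
    transcript: Optional[str] = None              # filled in later by another user
    verified_by: List[str] = field(default_factory=list)

    def is_ready(self, min_verifications: int = 2) -> bool:
        """A clip joins the corpus once transcribed and independently verified."""
        return self.transcript is not None and len(self.verified_by) >= min_verifications

# Example lifecycle of one clip (all values invented for illustration):
clip = SpontaneousClip("clips/0001.wav", contributor="alice")
clip.transcript = "well I was just saying that um"   # added by a second user
clip.verified_by += ["bob", "carol"]                 # confirmed by two others
```

The point of separating `transcript` from `verified_by` is that, unlike a read corpus, spontaneous speech has no prompt to check against, so the transcription itself needs independent review before it can be trusted.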