We want to train a machine learning model to predict a speaker's age, gender, and ethnicity from their voice. Common Voice seems like the best dataset for this!
Are there any pre-trained model files we can use for transfer learning?
Or is there a simple command-line tool for this task?