Hi, I have a well-trained model for Greek and now I want to expand it with more voice types (female / male / elderly / etc.).
I have the audio recordings in WAV format and the transcripts produced while using the current model, which I will use as input.
According to the documentation, I have to provide train, dev, and test CSV files.
Do I have to split the data into train, dev, and test sets, or can I use a single CSV for the fine-tuning?
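To make the question concrete, this is roughly the split I have in mind if three files are required. It is only a sketch: the function names, the 80/10/10 ratio, and the fixed seed are my own assumptions, not something taken from the documentation, and it simply copies the header row into each output file without assuming any particular column names.

```python
import csv
import random


def split_rows(rows, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle rows and split them into (train, dev, test) lists.

    The fractions and the seed are illustrative defaults, not recommendations.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for a given seed
    n_dev = int(len(shuffled) * dev_frac)
    n_test = int(len(shuffled) * test_frac)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


def split_csv(src_path, train_path, dev_path, test_path, **kwargs):
    """Read one CSV with a header row and write three CSVs with the same header."""
    with open(src_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    parts = split_rows(rows, **kwargs)
    for path, part in zip((train_path, dev_path, test_path), parts):
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(part)
    # Return the number of rows written to (train, dev, test)
    return tuple(len(p) for p in parts)
```

I would call it as `split_csv("all.csv", "train.csv", "dev.csv", "test.csv")`, where the file names are placeholders.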
My other question: if I want to expand the vocabulary or create a new one (I have a vocabulary and a model for medical use and I want to create a new one for lawyer use), do I have to create a new scorer, or do I have to make a new training set?