Improving a custom STT model + testing strategies

buxbaum · September 12, 2020, 2:19pm

Hello,

I’m currently working on a custom DeepSpeech model for German in a specific domain.
I was wondering what else I can do to improve the model, besides fine-tuning it with domain-specific and augmented data and building a domain-specific language model.

Is it adding more data? I have prepared 86 hours of data for fine-tuning.
Some relevant parameters? (I have been tuning alpha and beta until now.)
Improving the lexicon? Are there any other options?

Also, I was wondering whether there are any best practices for testing an STT model. Until now I have been implementing unit tests for this.
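
For illustration, one common way to test an STT model beyond plain unit tests is to compare its transcripts against reference transcripts using word error rate (WER). The following is only a minimal sketch under assumptions: `word_error_rate` is a hypothetical helper, not part of DeepSpeech, and the German sentence pair is made up. In a real test the hypothesis would come from the model's output.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example pair (assumption, for illustration only).
reference = "bitte öffnen sie das fenster"
hypothesis = "bitte öffne sie fenster"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2f}")  # one substitution + one deletion over 5 words -> 0.40
```

A regression-style test could then assert that the average WER over a fixed held-out set stays below a chosen threshold after each fine-tuning run or change to alpha and beta.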

Thanks in advance!