Multi-Accent S2T Model

Hi everyone,

Can we use DeepSpeech to build a multi-accent S2T model?
For example: I want to build an ASR model that can transcribe both US-accented and Indian-accented English. I have 3-4k hours of labelled data for each accent. Is it possible to build a single model for these accents that gives me a good enough WER (< 15%)?

Any suggestions would be of great help!

Thanks

This is just my view; if any of you have another view, please go ahead :slight_smile:

Yes, but it will take a lot of data, as the model needs to “store” more information to cover both accents.

The current model should already be fine for US English, shouldn’t it? You therefore want to fine-tune it on Indian English without losing too much of what it already knows. Use a low learning rate, and build a good test set that covers both accents, which you can use to work out how many epochs, what learning rate and what dropout you need to achieve that.
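For concreteness, here is a minimal sketch of how one might merge the two accent datasets into combined train/dev/test manifests so a single fine-tuning run sees both accents. It assumes the usual DeepSpeech CSV manifest format (`wav_filename`, `wav_filesize`, `transcript`); the file names such as `us_train.csv` are just placeholders for your own data.

```python
import csv
import random

# Hypothetical per-accent manifests in DeepSpeech's CSV format:
# wav_filename, wav_filesize, transcript
ACCENT_FILES = {
    "us": {"train": "us_train.csv", "dev": "us_dev.csv", "test": "us_test.csv"},
    "in": {"train": "in_train.csv", "dev": "in_dev.csv", "test": "in_test.csv"},
}

def merge(split):
    """Concatenate the per-accent CSVs for one split and shuffle the rows."""
    rows = []
    header = None
    for files in ACCENT_FILES.values():
        with open(files[split], newline="") as f:
            reader = csv.reader(f)
            header = next(reader)      # keep the CSV header from the manifests
            rows.extend(reader)
    random.shuffle(rows)               # mix accents within the split
    out_path = f"combined_{split}.csv"
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return out_path

if __name__ == "__main__":
    paths = {split: merge(split) for split in ("train", "dev", "test")}
    print(paths)
    # The combined manifests can then be passed to DeepSpeech.py when
    # fine-tuning from a released checkpoint, along the lines of
    # (flag names as in the DeepSpeech training docs; double-check them
    # against your version):
    #   python DeepSpeech.py \
    #     --train_files combined_train.csv --dev_files combined_dev.csv \
    #     --test_files combined_test.csv \
    #     --checkpoint_dir deepspeech-checkpoint \
    #     --epochs 3 --learning_rate 0.00001 --dropout_rate 0.2
```

Keeping the per-accent test CSVs around (not just the combined one) also lets you evaluate each accent separately, so you can see whether fine-tuning on Indian English degrades the US WER.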