How to adapt DeepSpeech to work with Indian-accented English

Hi,
I would like some help with how to build an STT model for Indian English.
The model DeepSpeech ships is trained mainly on American English.
Is there a way to build on top of that model?
Or do I need to train from scratch?

Thanks in advance for your help :slight_smile:

If you have a data set of Indian English, then you can continue training on the checkpoints we released.

This should use some of the English knowledge in the checkpoint we released and “fine tune” the model to Indian English.
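As a rough sketch of the data preparation step: the DeepSpeech importers produce CSV files with the columns `wav_filename`, `wav_filesize` and `transcript`, so an Indian English data set would need to be put in that shape before continuing training. The helper name `write_manifest` and the lowercasing of transcripts are assumptions made for illustration, not part of DeepSpeech itself.

```python
import csv
import os


def write_manifest(samples, csv_path):
    """Write (wav_path, transcript) pairs to a DeepSpeech-style CSV file.

    `samples` is an iterable of (path_to_wav, transcript_text) tuples.
    Lowercasing is an assumption: check it against the alphabet you train with.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav_path, transcript in samples:
            writer.writerow([
                os.path.abspath(wav_path),   # absolute path to the audio file
                os.path.getsize(wav_path),   # file size in bytes
                transcript.lower().strip(),  # normalised transcript text
            ])
```

You would write one such file each for the training, validation and test portions of your data.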

@kdavis How do I access this checkpoint? Is it the same as the "Continuing training from a frozen graph" approach described in the README?
Also, how would I know what English knowledge is present in the released checkpoint?

Thanks in advance :slight_smile:

The checkpoint is in the compressed file deepspeech-0.1.1-checkpoint.tar.gz that’s part of the current release[1].

Alternatively you could continue from the frozen graph, a slightly different process.

I’m not sure I understand your last question. Do you mean:

How do I know it works with Indian English?

If that’s the question, then the answer is that you have to create a test set of Indian English that you don’t train on, and evaluate the system on it once you have finished training on your Indian English training data set.
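To make that concrete, here is a minimal sketch of holding out a test set and scoring transcripts with word error rate (WER), the usual metric for this kind of evaluation. The function names `split_holdout` and `word_error_rate` and the 10% default are illustrative assumptions, not anything DeepSpeech provides under these names.

```python
import random


def split_holdout(samples, test_fraction=0.1, seed=0):
    """Shuffle samples and return (train, test) with no overlap."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)
```

For example, `word_error_rate("call the driver now", "call a driver")` counts one substitution and one deletion against four reference words. For accent work it is probably also worth keeping the test speakers separate from the training speakers, which this simple random split does not do.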

Thanks @kdavis, I will try this out and come back with the results.

Thanks for taking the time to reply to my query :slight_smile:

You are welcome! :grinning:

@kdavis Yes, I was able to get much better accuracy with the frozen model approach.
But I have a small question: how do I go about adding more words to this frozen model? Is there a way to do that?
For example, if I wanted to add the word “driver” to my vocabulary, how would I do that?

Thanks

There’s no notion of known words in the checkpoint or frozen model.

Both the checkpoint and frozen model contain a so-called acoustic model, which holds the information on how to go from audio to letters. Basically it learns how to go from how a word sounds to a spelling of the word.

This acoustic model does not have a fixed vocabulary. So, for example, if you started speaking Spanish to the engine it would do its best to spell out what it heard as if it were an English speaker.

The vocabulary is “baked” into the lm.binary and trie. In particular the trie holds “known words”, in the sense you are thinking of. So, for your case it looks like you want to create a trie using the generate_trie tool.
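To illustrate what “known words” means here, below is a toy prefix tree in Python. It is only a conceptual sketch: it is not the on-disk trie format DeepSpeech uses, nor what generate_trie does internally, and the class name `WordTrie` is made up for the example.

```python
class WordTrie:
    """Toy prefix tree: a decoder can check whether a partial spelling
    could still grow into a known word."""

    _END = None  # key marking that a complete word ends at this node

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self._END] = True

    def _walk(self, text):
        """Return the node reached by following `text`, or None."""
        node = self.root
        for ch in text:
            if ch not in node:
                return None
            node = node[ch]
        return node

    def has_prefix(self, prefix):
        return self._walk(prefix) is not None

    def __contains__(self, word):
        node = self._walk(word)
        return node is not None and self._END in node
```

With a trie built from `["drive", "dry"]`, the spelling “driver” has nowhere to go after “drive”, so a decoder constrained by it would tend to steer away from that word. Once “driver” is added, the path exists. That is the sense in which the vocabulary lives in the trie rather than in the acoustic model.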

@kdavis

Thanks for explaining this in detail.

But here are the points I am still confused about:

To form sentences correctly, I might need to add words that are specific to my use case and may not even be valid English words (for example, DSP might be a word I want the model to recognise).

For this I would want to add to the existing trie; I would not want to rebuild the entire trie or lm.binary from scratch.

Please let me know how i can go about doing this.

Thanks

@kausthubnaarayan
Hi, it seems that you and I must wait until Kelly sends the complete vocabulary file as a txt file.
They don’t have a complete one yet, but he is working on it.
I have been waiting for it for some days too.
:slight_smile:

Hey @kausthubnaarayan, which datasets from YouTube did you use for Indian-accented English? I am trying to train the model too, so I am looking for suggestions on where I can get Indian-accented English datasets and how much data I will need to feed it.

@kdavis

When can the vocabulary file be shared? Is it expected soon?

Thanks

In the next release, 0.2.0, we’re building the trie and language model from open corpora.

0.2.0 should happen within the next 3-4 weeks.

Thanks for the update

Waiting for the new release.

@kdavis When is the 0.2.0 release scheduled? I am waiting for the trie and language model, along with the vocabulary files.

Thanks

Hey, how’s your speech model for Indian English working?

Any update on this issue?