Subword piece approach with DeepSpeech

Hi,

I’m starting a new project on reading assessment with a small vocabulary (~1000 words). Before digging into DeepSpeech, I would like to ask whether anybody has already tried a subword piece approach (“reading” “assessment” -> “rea ding ass ess ment”) with DeepSpeech. If you have, any feedback would be welcome.

thx

Could you elaborate? I don’t get what you want to do in the end.

Hi Lissyx,

Thank you for replying. The goal is oral reading assessment, meaning that the sequence decoded by the ASR will be compared to the text that was supposed to be read.
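To make the comparison step concrete, here is a minimal sketch using only Python's standard `difflib`. The function name `compare_reading` and the sample sentences are made up for illustration; a real assessment would probably need smarter text normalisation and scoring.

```python
from difflib import SequenceMatcher


def compare_reading(expected_text, decoded_text):
    """Align expected and decoded word sequences and list the differences.

    Returns a list of (operation, expected_words, decoded_words) tuples,
    where operation is "replace", "delete" or "insert".
    """
    expected = expected_text.lower().split()
    decoded = decoded_text.lower().split()
    matcher = SequenceMatcher(a=expected, b=decoded, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, expected[i1:i2], decoded[j1:j2]))
    return differences


if __name__ == "__main__":
    # Hypothetical passage and hypothetical ASR output.
    print(compare_reading("the reading assessment is short",
                          "the reading assessent is short"))
    # → [('replace', ['assessment'], ['assessent'])]
```

This only works if the decoder actually outputs the misread word, which is exactly the problem described next.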

The main problem is that if a word is not read correctly (“assessment” -> “assessent”), decoding with a usual language model will probably still give us “assessment”, so we can’t detect the error except with a probability threshold. In the best case (maybe by adjusting the LM weight) we will get an OOV.

I read a paper that suggests working on subwords like (ass ess ment), but that involves reworking the labels, and it becomes much more of a “phonetic” labelling, so I feel it goes a bit beyond the standard objective of DeepSpeech.
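As a rough illustration of what subword labels could look like, here is a greedy longest-match splitter. The piece inventory `PIECES` is invented for this example (it is not taken from the paper), and real systems usually learn their pieces from data.

```python
def segment(word, pieces):
    """Greedy longest-match-first split of `word` into known pieces.

    Falls back to a single character whenever no known piece matches
    at the current position.
    """
    result = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate until it is a known piece or one character.
        while end > start + 1 and word[start:end] not in pieces:
            end -= 1
        result.append(word[start:end])
        start = end
    return result


# Hypothetical piece inventory, for illustration only.
PIECES = {"rea", "ding", "ass", "ess", "ment"}

if __name__ == "__main__":
    print(segment("reading", PIECES))     # → ['rea', 'ding']
    print(segment("assessment", PIECES))  # → ['ass', 'ess', 'ment']
    print(segment("assessent", PIECES))   # → ['ass', 'ess', 'e', 'n', 't']
```

With labels like these, the misread “assessent” differs from “assessment” only in its last piece, which is the kind of local error a word-level language model tends to smooth away.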

I hope I’ve made my question clearer :slight_smile:

Yes. I’m going to ask a dumb question, but what results do you get without any language model at all? That should get you a more phonetic, raw output.
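For reference, decoding without a language model boils down to something like greedy CTC decoding: take the best label per frame, collapse repeats, and drop blanks. The sketch below is not DeepSpeech's own decoder; the alphabet, the blank index and the frame probabilities are all made-up assumptions.

```python
def greedy_ctc_decode(frame_probs, alphabet, blank_index):
    """Best-path CTC decoding with no language model.

    frame_probs: one list of probabilities per audio frame, covering every
    character in `alphabet` plus the blank label at `blank_index`.
    """
    # Most likely label index for each frame.
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    chars = []
    previous = None
    for idx in best:
        # Keep a label only when it starts a new run and is not the blank.
        if idx != previous and idx != blank_index:
            chars.append(alphabet[idx])
        previous = idx
    return "".join(chars)


if __name__ == "__main__":
    alphabet = ["a", "s", "e"]  # hypothetical 3-letter alphabet
    blank = 3                   # assumed: blank is the last index
    frames = [
        [0.90, 0.05, 0.03, 0.02],  # a
        [0.80, 0.10, 0.05, 0.05],  # a (repeat, collapsed)
        [0.10, 0.10, 0.10, 0.70],  # blank
        [0.10, 0.70, 0.10, 0.10],  # s
        [0.10, 0.60, 0.20, 0.10],  # s (repeat, collapsed)
        [0.10, 0.10, 0.10, 0.70],  # blank separates the double letter
        [0.10, 0.70, 0.10, 0.10],  # s
        [0.10, 0.10, 0.70, 0.10],  # e
    ]
    print(greedy_ctc_decode(frames, alphabet, blank))  # → asse
```

Because nothing here pulls the output toward known words, a misreading has a better chance of surviving into the transcript, at the cost of noisier text overall.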

I will give it a try and come back here :slight_smile: I still have work to do on the data before that.
Thank you

Hello, @kezakool
How is your experiment going? :slight_smile:

Hi,
I saw your topic on the forum. For now I have trained the model at the character level, and I get a good WER of about 3.5% on my test set (the dataset is a small vocabulary of about 150 distinct short sentences from children, 50h), so I haven’t explored the subword piece solution yet.
My next step for reading assessment is to analyse inference performance on specific sets with misreadings. It will be at that step (if it doesn’t work well enough) that I will know whether I need to try a subword piece approach.
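In case it helps anyone reading along, WER is the word-level edit distance divided by the number of reference words. A minimal sketch (the function name and the sample sentences are only illustrative, and DeepSpeech's own evaluation scripts may normalise text differently):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Note that a low WER on correctly read sentences says little about how well misreadings are transcribed, which is why the dedicated misreading sets matter.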

I get it. Thanks a lot for your reply!

@kezakool Could you share the paper you read? I’m interested.

Hi,

This one should help you: http://www.cs.cmu.edu/~fmetze/interACT/Publications_files/publications/main.pdf

You could refer to ESPnet’s LibriSpeech recipe, which uses a BPE approach.
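For anyone unfamiliar with BPE (byte pair encoding), the idea is to start from characters and repeatedly merge the most frequent adjacent pair. The toy learner below is only a sketch of that idea, not ESPnet's tooling; the function name and the word counts are invented.

```python
from collections import Counter


def learn_bpe(word_counts, num_merges):
    """Learn up to `num_merges` BPE merges from a {word: count} mapping.

    Returns (merges, vocab), where vocab maps each word, as a tuple of
    symbols, to its count after all merges have been applied.
    """
    vocab = {tuple(word): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite each word with the best pair fused into one symbol.
        new_vocab = {}
        for symbols, count in vocab.items():
            merged = []
            i = 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            key = tuple(merged)
            new_vocab[key] = new_vocab.get(key, 0) + count
        vocab = new_vocab
    return merges, vocab


if __name__ == "__main__":
    merges, vocab = learn_bpe({"abab": 3, "ab": 2}, num_merges=1)
    print(merges)  # → [('a', 'b')]
    print(vocab)   # → {('ab', 'ab'): 3, ('ab',): 2}
```

The learned pieces follow frequency, not pronunciation, so they will not necessarily line up with syllables the way “ass ess ment” does.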

@kezakool I’m also interested in DeepSpeech for reading assessment! Did you attempt the subword approach or raw phonemes? How is the experiment going?

Cheers!

Hi @jpsb ,

I finally chose to work with rule-based language models for each text, which showed some promising results, but the project is frozen because of COVID. I’m hoping to restart it this spring :slight_smile:
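To give a rough idea of what “rule-based, per text” can mean, here is one hypothetical reading: for each passage, only allow the next word, a repeated word, or a short skip. The rules, function names and the `max_skip` parameter are assumptions made up for this sketch, not the exact rules used in the project.

```python
def build_transitions(text, max_skip=1):
    """Allowed word-to-word transitions for one reading passage."""
    words = text.lower().split()
    allowed = set()
    for i, word in enumerate(words):
        allowed.add((word, word))  # the reader repeats a word
        # The reader moves to the next word, or skips up to max_skip words.
        for step in range(1, max_skip + 2):
            if i + step < len(words):
                allowed.add((word, words[i + step]))
    return allowed


def is_allowed(decoded, allowed):
    """True if every adjacent word pair in `decoded` is a permitted transition."""
    words = decoded.lower().split()
    return all(pair in allowed for pair in zip(words, words[1:]))


if __name__ == "__main__":
    rules = build_transitions("the cat sat down")
    print(is_allowed("the cat cat sat down", rules))  # → True (repetition)
    print(is_allowed("the sat down", rules))          # → True (one word skipped)
    print(is_allowed("the down", rules))              # → False (two words skipped)
```

In a real decoder such rules would constrain the search itself instead of filtering the output afterwards.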

Raw decoding didn’t bring good results. The subword approach is still my favourite, but I felt it was a bit far from the DeepSpeech solution, which is grapheme-based and short-circuits much of the phonemic interpretation.

Hope it helps

@kezakool, Definitely helpful, and I hope you get to restart in the spring :slight_smile:

I’ll investigate rule-based language models! If you have any other pointers or know of any GitHub repos that have an example, I’d be super grateful!

Thanks again!