Improve accuracy and make it more context aware

recorded audio file: https://gofile.io/d/A59IgZ

Input #0, wav, from 'dewitest9.wav':
  Metadata:
    encoder         : Lavf58.45.100
  Duration: 00:00:20.09, bitrate: 768 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 1 channels, s16, 768 kb/s

deepspeech-0.9.3-models.pbmm
deepspeech-0.9.3-models.scorer

deepspeech transcription:
by mail from the shop all my dad ring my dad what is the weather to day play the radio

what I actually said (corrections are in capitals):
BUY MILK from the shop. CALL my dad. ring my dad. what is the weather TODAY. play the radio

What are the best ways to improve accuracy?

Apologies for the noob question. I am completely new to this.
Many thanks for your help


Hi!

A quick way to improve the results is to train a new scorer with the kind of sentences you expect to use. If you have specific things that you want to say then those should be in the language model. I believe the provided model is trained from the transcripts.
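To make that concrete, here is a minimal sketch of preparing the text for a custom scorer, assuming the language model wants lowercase text with one sentence per line and no punctuation. The `normalize` and `build_corpus` helpers and the `corpus.txt` file name are made up for illustration; the sentences are the ones from the original post. As far as I know, the resulting file is what you would then feed to the `generate_lm.py` and `generate_scorer_package` tools from the DeepSpeech repository, but check the scorer documentation for the exact options.

```python
import re


def normalize(sentence: str) -> str:
    """Lowercase the sentence and keep only a-z, apostrophes and spaces."""
    text = sentence.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)
    # Collapse any runs of whitespace left behind by the substitution.
    return " ".join(text.split())


def build_corpus(sentences):
    """Return normalized, de-duplicated sentences in their original order."""
    seen = set()
    lines = []
    for sentence in sentences:
        line = normalize(sentence)
        if line and line not in seen:
            seen.add(line)
            lines.append(line)
    return lines


# Example sentences taken from the original post.
EXPECTED = [
    "Buy milk from the shop.",
    "Call my dad.",
    "Ring my dad.",
    "What is the weather today?",
    "Play the radio.",
]

if __name__ == "__main__":
    # Hypothetical output file: one normalized sentence per line.
    with open("corpus.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(build_corpus(EXPECTED)) + "\n")
```

A scorer built from only five sentences will be heavily biased towards them, which is fine for a fixed command set but will hurt free-form dictation.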

Another possibility, if that doesn’t help, is to fine-tune on your own recordings. If you have contributed to Common Voice, you can get a copy of your recordings by emailing the team. Otherwise you can record them yourself.
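If you do record your own clips, a rough sketch of collecting them into a training CSV could look like the one below. It assumes the DeepSpeech training CSV layout with the columns `wav_filename`, `wav_filesize` and `transcript`; the `write_manifest` helper and the file names in the usage comment are hypothetical. As far as I know the released English model expects 16 kHz mono audio, so 48 kHz recordings would need resampling first.

```python
import csv
import os


def write_manifest(pairs, csv_path):
    """Write (wav_path, transcript) pairs as a DeepSpeech-style training CSV."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav_path, transcript in pairs:
            writer.writerow([
                os.path.abspath(wav_path),   # absolute path to the clip
                os.path.getsize(wav_path),   # file size in bytes
                transcript.lower(),          # transcripts are kept lowercase
            ])


# Hypothetical usage, assuming these recordings exist on disk:
# write_manifest(
#     [("clips/buy_milk.wav", "buy milk from the shop"),
#      ("clips/call_dad.wav", "call my dad")],
#     "train.csv",
# )
```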

Also feel free to join us on Mozilla’s Matrix for faster responses.

@ftyers
From my limited knowledge a custom scorer won’t allow for wildcards?
You can state sentences in the scorer such as:
Add milk to my shopping

But I would want anything to be in place of milk.
I want to state in the scorer something like:
Add * to my shopping

Is that possible?

Thank you very much

You could of course generate those from a list of things available in the shops. So yeah, that would be possible.
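A minimal sketch of that generation step, assuming the templates use `{item}`-style placeholders instead of `*`: every template is expanded with every entry from an item list, and the resulting sentences can go into the scorer's training text. The `expand` function, the templates and the item list are invented for illustration; a real list would come from a shop catalogue.

```python
from itertools import product


def expand(templates, slots):
    """Yield every sentence obtained by filling the {slot} placeholders."""
    for template in templates:
        # Only the slots that actually occur in this template matter.
        names = [name for name in slots if "{" + name + "}" in template]
        for combo in product(*(slots[name] for name in names)):
            yield template.format(**dict(zip(names, combo)))


# Hypothetical templates and items.
TEMPLATES = [
    "add {item} to my shopping",
    "buy {item} from the shop",
    "play the radio",
]
SLOTS = {"item": ["milk", "bread", "eggs"]}

if __name__ == "__main__":
    for sentence in expand(TEMPLATES, SLOTS):
        print(sentence)
```

The number of sentences grows as templates times items, so even a few thousand items is still a small text file by language-model standards.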

You're welcome!

F.

Hmmmmm I don’t think it would be possible to generate all possible items.
Is there any way to have wildcard templates in the scorer?
Sorry for the noob questions

The shops have a finite number of items, so it should be possible to generate them in a finite amount of time. As for the wildcards, I can imagine how they could be supported, but it would involve changing the scoring code.

@ftyers Any tips on how to achieve this by changing the scorer code? :grinning:
thanks

Maybe look into factored language models with KenLM. I’m not sure if KenLM supports them out of the box, but you might be able to interpolate.

Looks like I have lots of reading to do @ftyers
I am completely new to all of this :cry:

Many thanks though
