Hello everyone,
I’m using DeepSpeech to build speech-to-text (STT) software for a specific domain, so I’m trying to build my own scorer as described in https://mozilla.github.io/deepspeech-playbook/SCORER.html
On this page we can read:
Preparing the text file
…
These phrases should not be copied from test.tsv, train.tsv or validated.tsv as you will bias the resultant model.
I don’t understand why giving the scorer phrases from the training or validation set would bias the model, and I can’t find a clear explanation online or in the DeepSpeech documentation. Can you help me?
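For reference, the quoted rule seems to amount to filtering the scorer text roughly as in the sketch below. This is only an illustration under assumptions: the TSV files are assumed to be Common Voice style with a `sentence` column, and the helper names (`normalize`, `read_sentences`, `drop_overlapping`) are hypothetical, not part of DeepSpeech.

```python
import csv
from typing import Iterable, List, Set


def normalize(sentence: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(sentence.lower().split())


def read_sentences(tsv_path: str, column: str = "sentence") -> Set[str]:
    """Collect normalized sentences from one TSV file.

    The column name "sentence" is an assumption about the file layout.
    """
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return {normalize(row[column]) for row in reader if row.get(column)}


def drop_overlapping(corpus: Iterable[str], held_out: Set[str]) -> List[str]:
    """Keep only corpus lines whose normalized form is not in held_out."""
    return [line for line in corpus if normalize(line) not in held_out]
```

With this, the held-out set would be the union of `read_sentences(...)` over test.tsv, train.tsv and validated.tsv, and only the lines returned by `drop_overlapping` would go into the scorer text file.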
I also wonder why the LibriSpeech corpus text file used by default for the DeepSpeech scorer contains phrases like:
A
A A
A A A
A A A A
A A A A A
A A A A A A A A A A A A A A
A A A A A AH
A A A A A AH THE CRY WAS WRUNG FROM JOHNNIE
A A A A A BOVE SECOND SINGER DIMINUENDO
A A A A A MEN
Why are there so many A’s before every sentence?
Best regards