Can I restrict the scorer to a given list at runtime?

tarekeldeeb · April 27, 2021, 10:45am

I have built an Arabic model which runs just fine. The output is acceptable in most cases (WER = 0.1).

In my application I need to restrict the output to a list, maybe to reach WER = 0.01. Example use case:

App: Do you accept the agreement?
User-voice: Yes / No / Read (Give this list to the scorer)
App: What do you want to order?
User-voice: Breakfast / Lunch / Dinner (Give this list to the scorer)
…
…
And the business logic continues, and the dialog goes on.

How can I implement such functionality with the current scorer?

ftyers · April 27, 2021, 2:20pm

Use the hotword boosting functionality?
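A minimal sketch of what that could look like per dialog step, assuming the Python bindings, where the loaded model object exposes `addHotWord(word, boost)` and `clearHotWords()`. The helper name `boost_choices` and the default boost of 10.0 are made up for illustration; the boost value needs tuning on your own data.

```python
def boost_choices(model, choices, boost=10.0):
    """Replace the model's hot-words with the answers expected for the current prompt.

    `model` is assumed to be an already loaded STT model object that
    exposes clearHotWords() and addHotWord(word, boost).
    `boost_choices` itself is a hypothetical helper, not part of any library.
    """
    model.clearHotWords()              # drop the words boosted for the previous prompt
    for word in choices:
        model.addHotWord(word, boost)  # a positive boost makes the word more likely
```

You would call something like `boost_choices(model, ["yes", "no", "read"])` before transcribing the answer to the first prompt, then call it again with the next list. As far as I remember from the docs, hot-words that do not occur in the scorer or that contain spaces are ignored, so single words already in the scorer's vocabulary work best.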

tarekeldeeb · April 28, 2021, 10:25pm

Thanks @ftyers for your reply.

I think hotword only increases the probability of a given word, but does not restrict the output to those words … right?

Is there a max probability that can give the same restrictive effect?

ftyers · April 28, 2021, 10:27pm

Correct, it only increases the probability of those words. You can try making a closed-vocabulary language model/scorer and using that: just a text file with your words in it, then build a scorer from it with the generate_scorer tool. But I don’t think there is a way to guarantee that you will only get words from that set out.
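The first step of the closed-vocabulary route could be sketched like this: one text file per prompt, one allowed answer per line. The helper name `write_closed_vocab` and the file names are hypothetical. Turning each file into a scorer (a language-model build step followed by `generate_scorer_package`) is not shown, since the exact commands depend on your setup.

```python
from pathlib import Path


def write_closed_vocab(words, path):
    """Write one allowed answer per line and return the cleaned list.

    Whitespace is normalized, empty entries and duplicates are dropped.
    The resulting file is meant as input for building a small scorer.
    """
    cleaned = []
    for word in words:
        word = " ".join(word.split())      # collapse and trim whitespace
        if word and word not in cleaned:   # skip blanks and repeats, keep order
            cleaned.append(word)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned
```

For example, `write_closed_vocab(["Yes", "No", "Read"], "agreement.txt")` and `write_closed_vocab(["Breakfast", "Lunch", "Dinner"], "order.txt")` would give one vocabulary file per dialog step. Once each file has been built into a scorer, the matching scorer could be switched in at runtime with `enableExternalScorer(path)` before each prompt, assuming the Python bindings.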

If you wanted to do that you could probably build a multiclass classifier, trained for your specific vocabulary, on top of the softmax output of the acoustic model. But I’d suggest trying the simpler stuff first.
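To make the classifier idea concrete, here is a toy sketch under loud assumptions: each utterance is given as a list of per-frame softmax vectors (how you get those out of the acoustic model is left open), the frames are mean-pooled into one vector, and a nearest-centroid rule stands in for a properly trained classifier. The names `mean_pool` and `NearestCentroid` are made up for illustration. Mean pooling throws away frame order, so a real system would likely need something stronger.

```python
import math


def mean_pool(frames):
    """Average per-frame softmax vectors into one fixed-length feature vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


class NearestCentroid:
    """Toy multiclass classifier: one centroid per word in the closed list."""

    def fit(self, examples):
        """examples: iterable of (label, frames) pairs, one per training utterance."""
        sums, counts = {}, {}
        for label, frames in examples:
            vec = mean_pool(frames)
            if label not in sums:
                sums[label] = [0.0] * len(vec)
                counts[label] = 0
            sums[label] = [a + b for a, b in zip(sums[label], vec)]
            counts[label] += 1
        # Centroid = average pooled vector over all utterances of that label.
        self.centroids = {
            label: [v / counts[label] for v in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, frames):
        """Return the label whose centroid is closest (Euclidean) to the utterance."""
        vec = mean_pool(frames)
        return min(self.centroids, key=lambda label: math.dist(vec, self.centroids[label]))
```

Because `predict` can only return a label seen in `fit`, the output is restricted to the list by construction, which is the guarantee the scorer cannot give. The price is that you need labelled recordings of every word in every list.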

ftyers · April 28, 2021, 10:28pm

You also might try joining us on Matrix.