Deep Speech outputs only letters, never digits or symbols. So let’s say I want the output to be “I want 2000$ now” instead of “I want two thousand dollars now”, which is what Deep Speech produces. What approaches have people tried for this? Do people use rule-based heuristics to post-process the output after speech recognition? Or do they extend the alphabet during training to include digits and the dollar sign (not a good fit in my case, since I want to use a pretrained model rather than train from scratch)? Or do people apply some kind of language model at the end for this purpose?
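To make the rule-based option concrete, here is a rough sketch of the kind of post-processing I have in mind. It is only an illustration and not part of Deep Speech: the names `words_to_number` and `normalize` are my own, and it assumes lowercase English transcripts with simple cardinal numbers (no "and", no "a hundred", no ordinals or decimals).

```python
# Hypothetical post-processing step run on the transcript string,
# after speech recognition. Nothing here is a Deep Speech API.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"hundred": 100, "thousand": 1000, "million": 1000000}
NUMBER_WORDS = set(UNITS) | set(TENS) | set(SCALES)
CURRENCY_WORDS = {"dollar", "dollars"}


def words_to_number(words):
    """Convert a run of number words, e.g. ['two', 'thousand'], to an int."""
    total = 0    # value of the groups already closed by thousand/million
    current = 0  # value of the group being built (always below 1000)
    for word in words:
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        else:  # "thousand" or "million" closes the current group
            total += max(current, 1) * SCALES[word]
            current = 0
    return total + current


def normalize(text):
    """Replace runs of number words with digits; fold 'dollars' into '$'."""
    tokens = text.split()
    out = []
    i = 0
    while i < len(tokens):
        # Find the longest run of consecutive number words starting at i.
        j = i
        while j < len(tokens) and tokens[j] in NUMBER_WORDS:
            j += 1
        if j > i:
            value = str(words_to_number(tokens[i:j]))
            # Attach the currency symbol if the run is followed by "dollars".
            if j < len(tokens) and tokens[j] in CURRENCY_WORDS:
                value += "$"
                j += 1
            out.append(value)
            i = j
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


print(normalize("i want two thousand dollars now"))  # i want 2000$ now
```

The obvious weakness is ambiguity: a sentence like "no one knows" would come out as "no 1 knows", which is part of why I am asking whether people rely on rules like this or on something learned.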
This question is not specific to Deep Speech, so I apologise if this is the wrong place to post. I am posting it here because I am working with Deep Speech and want to solve this problem for it. I’d really appreciate any help.