Post-transcription correction methods

buxbaum · October 11, 2020, 3:06pm

Hi,

I’m currently looking for the best post-transcription correction methods for DeepSpeech / ASR.

I implemented a re-scoring step that takes the n-best transcripts from DeepSpeech and uses fuzzy matching to find the most similar sequence from the in-domain language model. I also tried cosine similarity on TF-IDF and count-vectorizer features. Still, I think these methods are very naive.
Can someone recommend some solutions, or give me some hints on what I could try to make the post-transcription step valuable for the system?
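
For reference, the n-best re-scoring described above could look roughly like the sketch below. It is only an illustration built on the Python standard library, not the actual code: `rescore`, `fuzzy_ratio`, `count_cosine`, the example phrases and the 0.8 threshold are all hypothetical, and the plain phrase list stands in for whatever sequences come out of the in-domain language model.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def fuzzy_ratio(a, b):
    """Character-level similarity in [0, 1] from difflib."""
    return SequenceMatcher(None, a, b).ratio()


def count_cosine(a, b):
    """Cosine similarity between bag-of-words count vectors."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
        math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def rescore(nbest, domain_phrases, similarity=fuzzy_ratio, min_score=0.8):
    """Return (score, candidate, phrase) for the closest pair.

    If no pair reaches min_score (an assumed, untuned threshold),
    fall back to the top candidate and report no matching phrase.
    """
    best = (0.0, nbest[0], None)
    for cand in nbest:
        for phrase in domain_phrases:
            score = similarity(cand.lower(), phrase.lower())
            if score > best[0]:
                best = (score, cand, phrase)
    if best[0] < min_score:
        return (best[0], nbest[0], None)
    return best


# Made-up n-best list and in-domain phrases, for illustration only.
nbest = ["turn of the lights", "turn off the light", "turn of the light"]
domain_phrases = ["turn off the lights", "turn on the lights", "play music"]

score, candidate, phrase = rescore(nbest, domain_phrases)
print(candidate, "->", phrase, round(score, 3))
```

Passing `similarity=count_cosine` switches to the word-count variant. Both only compare surface strings and ignore the acoustic confidence of each candidate.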

I would be very grateful for your advice!

othiele · October 11, 2020, 7:05pm

I guess it depends on the use case. What is yours?

The standard CTC beam search isn’t bad if you use a custom language model. It looks like you are currently trying to reverse a search already done by the beam search …