I’m currently looking for a good post-transcription correction method for DeepSpeech / ASR output.
I implemented a re-scoring step that takes the n-best transcripts from DeepSpeech and uses fuzzy matching to find the most similar sequence from the in-domain language model. I also tried cosine similarity over TF-IDF and count-vectorizer features. Still, I think these methods are very naive.
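For context, here is a minimal sketch of that kind of fuzzy-matching re-scoring, using only Python's standard `difflib`. The function name `rescore_nbest`, the example hypotheses, and the flat list `domain_phrases` standing in for the in-domain language model are all assumptions made for illustration, not the exact code in use.

```python
from difflib import SequenceMatcher


def rescore_nbest(nbest, domain_phrases):
    """Compare every n-best hypothesis with every in-domain phrase and
    return (phrase, score, hypothesis) for the most similar pair.

    The score is SequenceMatcher.ratio(), a character-level similarity
    in the range [0, 1].
    """
    best_phrase, best_score, best_hyp = None, 0.0, None
    for hyp in nbest:
        for phrase in domain_phrases:
            score = SequenceMatcher(None, hyp.lower(), phrase.lower()).ratio()
            if score > best_score:
                best_phrase, best_score, best_hyp = phrase, score, hyp
    return best_phrase, best_score, best_hyp


# Hypothetical n-best list and in-domain phrases (illustrative only).
nbest = ["turn of the lights", "turn off the light", "turn of the lite"]
domain_phrases = ["turn off the lights", "turn on the lights", "set a timer"]

phrase, score, hyp = rescore_nbest(nbest, domain_phrases)
print(phrase, round(score, 3), hyp)
```

This only works when the set of valid in-domain sequences is small enough to enumerate, and it ignores the acoustic scores of the hypotheses, which is part of why it feels naive.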
Can someone recommend some solutions, or give some hints on what I could try to make the post-transcription step valuable for the system?
I would be very grateful for your advice!