I’m wondering how to build an LM (language model) for human names that can be used by DeepSpeech.
Option #1: Build a database of “First Last” names
Pros: works with KenLM directly
Cons: hard to find sources with valid “First Last” names.
Option #2: Build a database of individual “First” and “Last” names, mixed together.
Pros: easy to build such a database
Cons: KenLM doesn’t support unigram (order-1) models
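A possible middle ground between the two options, as a rough sketch only: pair entries from separate first-name and last-name lists to synthesize “First Last” lines, so the corpus has bigrams for KenLM to train on. The function name `build_name_corpus`, the sample names, and the lowercasing step (assuming a lowercase DeepSpeech alphabet) are illustrative assumptions, not anything confirmed to work well in practice.

```python
import itertools
import random


def build_name_corpus(first_names, last_names, max_lines=None, seed=0):
    """Return 'first last' lines built from two separate name lists.

    Hypothetical helper: every first name is paired with every last name.
    If max_lines is given and the full product is larger, a reproducible
    random sample of that size is returned instead.
    """
    # Lowercasing is an assumption about the acoustic model's alphabet.
    pairs = [
        f"{first.strip().lower()} {last.strip().lower()}"
        for first, last in itertools.product(first_names, last_names)
    ]
    if max_lines is not None and len(pairs) > max_lines:
        rng = random.Random(seed)
        pairs = rng.sample(pairs, max_lines)
    return pairs
```

The resulting lines could be written one per line to a text file (say `names.txt`) and passed to KenLM with something like `lmplz -o 2 < names.txt > names.arpa`. Synthetic pairs treat every combination as equally likely, which real name data would not, so this is only a starting point.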
Has anybody done this before? If so, could you please share your experience? Thanks!