Text produced has long strings of words with no spaces

Hi, how can I train it without a language model?

Pure training never uses a language model.

The language model is used only when is_display_step is active, to create a WER report indicating training progress.
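
For anyone unsure what a WER (word error rate) figure measures, here is a minimal sketch: the word-level edit distance between the reference transcript and the decoded text, divided by the number of reference words. The function name `word_error_rate` is made up for illustration and this is not the project's own reporting code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count.

    Hypothetical helper for illustration only; not taken from DeepSpeech.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # reference word was dropped
                curr[j - 1] + 1,                  # extra word in the hypothesis
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    print(word_error_rate("the cat sat", "the cat sat"))
    print(word_error_rate("the cat sat", "thecatsat"))
```

Because the metric compares whole words, output that runs several words together with no spaces counts as one substitution plus a deletion for each missing word, so it scores badly even when every character is correct.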

Hi @bradneuberg, I am also working on a project that uses speech-to-text as one of the first stages of the product architecture, and unfortunately this issue of illegible output is driving us away from Mozilla’s solution. I’m searching for alternatives that can get the job done, but I haven’t found any other implementation that is this comprehensive and detailed. Would you be kind enough to suggest some of the other open-source solutions you evaluated that seem to mitigate this problem?