Early overfit

@lissyx Thank you very much for your help so far. I tried generating a language model by following the steps in data/generate_lm.py. The result was basically the same: early stopping after 11 epochs. I also tried training on CommonVoice with the LM that ships in the DS package, but I keep getting a segfault similar to the one in "Segmentation fault - #11 by lissyx", which also seems to be LM-related, if I understand it correctly. I don’t know what to try next.

[UPDATE]: Aha, the lm.binary and trie files are just 133 bytes each. Something’s wrong, I’ll dig into it.
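
For anyone hitting the same thing, a quick size check along these lines would flag such files before training starts. This is only a sketch: the `data/lm/` paths, the helper name `find_suspicious_files`, and the 1 KiB threshold are my own assumptions, not anything taken from DeepSpeech.

```python
import os

# Assumed threshold: a usable lm.binary or trie should be far larger than this.
MIN_PLAUSIBLE_BYTES = 1024


def find_suspicious_files(paths, min_bytes=MIN_PLAUSIBLE_BYTES):
    """Return (path, size) pairs for files that are missing or smaller than min_bytes.

    A missing file is reported with size -1.
    """
    suspicious = []
    for path in paths:
        size = os.path.getsize(path) if os.path.isfile(path) else -1
        if size < min_bytes:
            suspicious.append((path, size))
    return suspicious


if __name__ == "__main__":
    # Hypothetical locations; adjust to wherever the LM files actually live.
    lm_files = ["data/lm/lm.binary", "data/lm/trie"]
    for path, size in find_suspicious_files(lm_files):
        if size < 0:
            print(f"{path}: missing")
        else:
            print(f"{path}: only {size} bytes, probably not a real LM file")
```

It only looks at file sizes, so it cannot tell whether a large file is actually a valid LM.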

[UPDATE 2]: No, I keep getting the segmentation fault early in the 1st epoch of training, even with the correct downloaded LM. :frowning: