Vocabulary.txt: what text should it contain?

Good morning,

Could anyone tell me about vocabulary.txt? What should it contain?

  1. Only train.csv text
  2. Train.csv + validation.csv text
  3. Train.csv + validation.csv + test.csv

Thank you in advance

  1. None of the above?

:smiley: Please, can you tell me what it should contain?
This was a surprising answer for me.

I guess I found the answer. Where is the Vocab.txt file?

Thank you

Think a bit about how the LM is used. If you feed it the same text your acoustic model was trained on, it’s not going to help much, because it adds no new information. If you feed it the test set, you are optimizing your WER on the test set and not for general use.
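
To make that concrete, here is a minimal sketch of how the LM text could be assembled from an external corpus while keeping test transcripts out of it. The names `normalize` and `build_vocabulary` are hypothetical, and the allowed character set (lowercase a-z, apostrophe, space) is an assumption; adjust it to match the alphabet your acoustic model uses.

```python
import re

# Assumption: the alphabet is lowercase a-z, apostrophe, and space.
DISALLOWED = re.compile(r"[^a-z' ]+")


def normalize(line: str) -> str:
    """Lowercase, replace disallowed characters with spaces, collapse whitespace."""
    cleaned = DISALLOWED.sub(" ", line.lower())
    return " ".join(cleaned.split())


def build_vocabulary(external_lines, test_transcripts):
    """Return normalized external sentences, skipping any that also occur in the test set."""
    held_out = {normalize(t) for t in test_transcripts}
    sentences = []
    for line in external_lines:
        sentence = normalize(line)
        # Skip empty lines and anything that would leak the test set into the LM.
        if sentence and sentence not in held_out:
            sentences.append(sentence)
    return sentences
```

The returned list can then be written one sentence per line, for example with `"\n".join(sentences)`, to produce the text file the LM is built from. The external corpus itself is up to you; the point is only that it should be broader than train.csv and should not include the test transcripts.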