Hi, thanks for your work.
When I use deepspeech to extract audio features for other tasks, I find that the output dimension is 256. This confuses me, because I expected the output dimension to equal the number of classes. For example, I got 5700 characters from 2000 hours of the WenetSpeech dataset.
Another question: could you share your Chinese alphabet.txt?
Thanks for your reply!