Please bear with me, I’m a total noob with ASR. I’m trying to go through the guide on training your own model.
After I ran import_cv2, it generated several files (dev, other, test, train and validated), but is it OK that their sizes are far smaller than those of the original tsv files?
The only thing I changed from the tutorial’s command was adding a filter alphabet consisting of the letters a-z, because without it I was getting errors caused by characters from the German language. Comparing the sizes of the results, did I lose a lot of data? Maybe I used filter_alphabet incorrectly?
From corpora
6M dev.tsv
43M invalidated.tsv
37M other.tsv
264K reported.tsv
3.5M test.tsv
32M train.tsv
273M validated.tsv
Files generated after running import_cv2
81K dev.csv
505K other.csv
789K test.csv
1.9K train-all.csv
1.7K train.csv
1.3M validated.csv
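Since file size alone may be misleading (the tsv and csv files don't necessarily have the same columns), comparing row counts is probably a fairer check of how much data was dropped. Below is a minimal sketch of how I would do that. The function names count_rows and kept_fraction are just made up for illustration, and it assumes each file has one header line and one sample per line.

```python
def count_rows(path):
    """Count non-empty data lines in a delimited text file, excluding the header."""
    with open(path, encoding="utf-8") as f:
        non_empty = sum(1 for line in f if line.strip())
    # Subtract the header line; never go below zero for an empty file.
    return max(non_empty - 1, 0)


def kept_fraction(tsv_path, csv_path):
    """Return (tsv_rows, csv_rows, fraction of rows that survived the import)."""
    tsv_rows = count_rows(tsv_path)
    csv_rows = count_rows(csv_path)
    fraction = csv_rows / tsv_rows if tsv_rows else 0.0
    return tsv_rows, csv_rows, fraction
```

For example, kept_fraction("train.tsv", "train.csv") would give the share of training samples that made it through the import (the paths here are placeholders for wherever the files actually are).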
I also noticed that in the resulting csv files every transcript consists of only one word. Should I include a space character in my alphabet.txt?
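To make the question concrete: assuming the filter drops any sentence that contains a character outside the alphabet, the check would behave roughly like the sketch below. This is only my guess at the behaviour, not the importer's actual code, and passes_alphabet and the sample sentences are made up for illustration.

```python
import string


def passes_alphabet(sentence, alphabet):
    """Return True if every character of the sentence is in the alphabet."""
    return all(ch in alphabet for ch in sentence)


# Assumed alphabets: a-z only, and a-z plus the space character.
letters_only = set(string.ascii_lowercase)
letters_and_space = letters_only | {" "}

# Made-up sample sentences: one word, two words, and one with an umlaut.
samples = ["hallo", "guten morgen", "schön"]

kept_without_space = [s for s in samples if passes_alphabet(s, letters_only)]
kept_with_space = [s for s in samples if passes_alphabet(s, letters_and_space)]
```

If that assumption holds, an alphabet without a space would reject every sentence of more than one word, which would match what I am seeing in the csv files.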
My alphabet.txt looks like this:
Another question: I would like to train using our own audio files, mostly calls from clients. Do I need to cut the audio files into smaller pieces, about the length of a phrase, before I can use them in DeepSpeech?
Also, on training: the command requires train, dev and test csv files. What is the difference between those files? I don’t think it was mentioned in the tutorial, but please correct me if I’m wrong. Will they have the same contents? (I’m also using this as a reference, but it’s too high-level and a bit hard to grasp for a beginner: https://medium.com/visionwizard/train-your-own-speech-recognition-model-in-5-simple-steps-512d5ac348a5)
Any help would be appreciated.
Thanks