Hi, one area where DeepSpeech could be used is phone conversations. However, those files are not in the format DeepSpeech requires (16kHz, WAV, signed PCM, etc.).
Their format is:
Format: ADPCM
Format profile: A-Law
Sampling rate: 8kHz
Bit Depth: 8 bits
So I was wondering whether there is a way to train models on this kind of file, or is converting to signed PCM and then upsampling the only way?
Thanks in advance
lissyx
You can train on whatever format you want, though you might have to make changes to DeepSpeech.py and other files.
But please note that we only work and test with 16kHz signed PCM.
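For reference, the conversion route raised in the question (decode A-law to signed PCM, then upsample to 16kHz) can be sketched with the Python standard library alone. This is only an illustration under stated assumptions, not part of DeepSpeech: the input is assumed to be raw, headerless G.711 A-law bytes, mono, at 8kHz; the function names are made up for this example; and the 2x upsampling is naive linear interpolation, where a dedicated resampler such as sox or ffmpeg would normally give better quality. Upsampling also does not restore any content above 4kHz, so the result will still sound like telephone audio.

```python
import struct
import wave


def alaw_to_linear(byte):
    """Decode one G.711 A-law byte to a signed 16-bit PCM sample."""
    byte ^= 0x55                 # undo the even-bit inversion used by A-law
    t = (byte & 0x0F) << 4       # 4-bit mantissa
    seg = (byte & 0x70) >> 4     # 3-bit segment (exponent)
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    # After the XOR, a set top bit means a positive sample.
    return t if byte & 0x80 else -t


def upsample_2x(samples):
    """Naive 8kHz -> 16kHz: insert the midpoint between neighbouring samples."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)
        out.append((s + nxt) // 2)
    return out


def alaw_to_wav16k(alaw_bytes, out_path):
    """Write raw A-law bytes (assumed mono, 8kHz) as a 16kHz 16-bit PCM WAV."""
    pcm_8k = [alaw_to_linear(b) for b in alaw_bytes]
    pcm_16k = upsample_2x(pcm_8k)
    with wave.open(out_path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)        # 16-bit samples
        w.setframerate(16000)
        w.writeframes(struct.pack("<%dh" % len(pcm_16k), *pcm_16k))
```

Usage would look like `alaw_to_wav16k(open("call.alaw", "rb").read(), "call_16k.wav")`, where `call.alaw` is a hypothetical file name. If the recordings are A-law inside a WAV container, the header has to be stripped or parsed first, since Python's `wave` module only reads PCM.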
Do you think it’s possible to take the 16kHz model that you trained and fine-tune it on 8kHz data, or should I train a model from scratch? (I don’t know how much data is required, but I can gather quite a lot of samples.)
lissyx