I am working on training a new DeepSpeech model for German…
I have downloaded the data-sets from the official site and followed the steps at https://www.npmjs.com/package/deepspeech to convert the mp3 files into a WAV format compatible with DeepSpeech training.
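For reference, DeepSpeech training expects 16 kHz, 16-bit, mono PCM WAV files. A minimal stdlib sketch (file names and the test tone are illustrative, not part of the actual pipeline) that writes audio in that format and verifies the header fields:

```python
import math
import os
import struct
import tempfile
import wave

EXPECTED = (1, 2, 16000)  # channels, sample width in bytes, frame rate

def write_test_wav(path, rate=16000, seconds=0.1):
    """Write a short 440 Hz tone as 16-bit PCM at the given rate, mono."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)       # mono
        w.setsampwidth(2)       # 16-bit samples
        w.setframerate(rate)    # sample rate in Hz
        n = int(rate * seconds)
        samples = [int(10000 * math.sin(2 * math.pi * 440 * i / rate))
                   for i in range(n)]
        w.writeframes(struct.pack("<%dh" % n, *samples))

def is_deepspeech_compatible(path):
    """Check channels / sample width / rate against what training expects."""
    with wave.open(path, "rb") as w:
        return (w.getnchannels(), w.getsampwidth(), w.getframerate()) == EXPECTED

path = os.path.join(tempfile.gettempdir(), "ds_format_check.wav")
write_test_wav(path)
print(is_deepspeech_compatible(path))  # True
```

Running a check like this over converted clips before training can catch files that sox/ffmpeg wrote at the wrong rate or in stereo.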
I am executing the following command to start training:
Thanks for your quick responses. I just completed the flow with one training file without any errors; hopefully I will not get any when I train with the full data-set.
I am trying to use the newly trained model from Node.js and I am getting the errors below:
Error: Trie file version mismatch (4 instead of expected 3). Update your trie file.
Error running session: Not found: PruneForTargets: Some target nodes not found: initialize_state
Segmentation fault (core dumped)
The above command will write output_graph.pb to the mentioned export dir, i.e. ./test/export/destination.
Test with the newly exported model:
python3 ./native_client/python/client.py --model ./test/export/destination/output_graph.pb --alphabet ./data/alphabet.txt --lm ./data/lm/lm.binary --trie ./data/lm/trie --audio …/Data-sets/german/clips/common_voice_de_17300571.wav
I am getting the error below after step 8, i.e. when trying to use the newly trained model.
I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
and then:
I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
Is this related to my system configuration? Please provide your inputs.
Thanks!!
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
This is not an error, and it is not related to your models. Those are warnings; you can ignore them.
I made a copy-paste error, sorry about that. Below is the actual error message I am getting:
Error running session: Invalid argument: Tensor input_lengths:0, specified in either feed_devices or fetch_devices was not found in the Graph
While looking around on the net, I found the following explanation on one site:
Although the model has a Session and Graph, in some TensorFlow methods the default Session and Graph are used. To fix this I had to explicitly say that I wanted to use both my Session and my Graph as the default:
but I am not understanding this properly. Please let me know your inputs.
Thanks!!
lissyx
That feels strange, but you are running client.py directly and you don't share the start of the output, so we cannot check what libdeepspeech.so is actually running.
Please test properly, as documented: set up a virtualenv, install with pip install deepspeech==0.5.1, and run inference with the deepspeech binary rather than calling client.py directly.
I tried creating a new virtual environment and am still facing the same error.
Could it be because I have trained the model with a very small data-set (2-3 files of 10 seconds each)? Currently I am just doing a complete POC, which is why I have not trained with a large data-set. Please let me know your inputs.
Thanks!!
lissyx
No, that’s something else.
Like …
lissyx
And yes @laxmikant04.yadav, you shared that earlier, but since you kept posting without proper code formatting, your Python command line was unreadable to me and I missed that information.
I can see it is trying to recognise the speech, but the accuracy is not good for me.
I am working on Ubuntu 16.04 on a desktop.
Currently it is only able to recognise one word, and only when spoken very loudly and very clearly; it fails otherwise.
Could you please suggest what else I should try, or where I can look, to improve its accuracy?
Our expectation is that it should be able to recognise simple sentences like “Welcome to speech recognition”. This works perfectly when I try with clean audio files.
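One way to make "accuracy is not good" concrete is to measure word error rate (WER) against a reference transcript. A minimal stdlib sketch using word-level edit distance (the example sentences are just illustrations):

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference word count."""
    r, h = reference.lower().split(), hypothesis.lower().split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(r)][len(h)] / max(len(r), 1)

print(wer("welcome to speech recognition", "welcome to speech recognition"))  # 0.0
print(wer("welcome to speech recognition", "welcome speech recognition"))     # 0.25
```

Tracking WER over a fixed set of test clips makes it easy to tell whether changes to the audio pipeline or language model actually help.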
Thanks!!!
lissyx
Looks like you’ve got some hints yourself. Though, you don’t say whether those clean audio files were produced by you or come from some other origin.
It looks like we have not updated that to 0.5.1; it may be worth testing whether it improves, since that model was trained to be more robust to noise.
Make sure your system can actually capture mono 16 kHz audio; otherwise resampling may get in the way.
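To illustrate the resampling point: if the sound card only captures at 44.1 or 48 kHz, something must resample to 16 kHz before the model sees the audio, and a poor resampler degrades recognition. A naive linear-interpolation resampler as a sketch (illustration only; real pipelines should resample with sox or ffmpeg):

```python
def resample(samples, src_rate, dst_rate=16000):
    """Naive linear-interpolation resampler over a list of PCM samples.
    Illustration only -- use sox or ffmpeg for real conversions."""
    if src_rate == dst_rate:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate   # fractional position in the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(int(samples[lo] * (1 - frac) + samples[hi] * frac))
    return out

# A 48 kHz capture downsampled to the 16 kHz the model expects:
print(len(resample(list(range(4800)), 48000)))  # 1600
```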
It could also just be a side-effect of your mic capturing poor-quality sound. Besides improving the model, there’s hardly anything we can easily improve.