I’m sorry but I absolutely don’t understand your question here.
Again, without more context on how you get that error, it’s impossible for us to help you.
I have changed the script as you said, to 44100 Hz and stereo channels.
This is the log I got.
Help me remove this error:
```
WER: 1.000000, CER: 0.985000, loss: 29.460112
WER: 1.000000, CER: 0.985000, loss: 34.642292
WER: 1.000000, CER: 0.985000, loss: 43.823837
WER: 1.000000, CER: 0.988000, loss: 829.879456
WER: 1.000000, CER: 0.988000, loss: 830.506531
WER: 1.000000, CER: 0.988000, loss: 831.406311
WER: 1.000000, CER: 0.988000, loss: 833.888916
WER: 1.000000, CER: 0.988000, loss: 834.442749
WER: 1.000000, CER: 0.988000, loss: 843.434326
WER: 1.000000, CER: 0.988000, loss: 846.850525
WER: 1.000000, CER: 0.992000, loss: 906.634338
WER: 1.000000, CER: 0.992000, loss: 917.082153
WER: 1.000000, CER: 0.992000, loss: 931.485535
WER: 1.000000, CER: 0.992000, loss: 971.765442
WER: 1.000000, CER: 0.992000, loss: 980.901733
WER: 1.000000, CER: 0.996000, loss: 985.703308
WER: 1.000000, CER: 0.992000, loss: 1011.776367
WER: 1.000000, CER: 0.996000, loss: 1016.469482
WER: 1.000000, CER: 0.996000, loss: 1033.371948
WER: 1.000000, CER: 0.986667, loss: 1068.499268
WER: 1.000000, CER: 0.986667, loss: 1087.635254
WER: 1.000000, CER: 0.986667, loss: 1108.907349
WER: 1.000000, CER: 0.986667, loss: 1111.774170
WER: 1.000000, CER: 0.986667, loss: 1114.400024
WER: 1.000000, CER: 0.986667, loss: 1125.432251
WER: 1.000000, CER: 0.986667, loss: 1149.028564
WER: 1.000000, CER: 0.986667, loss: 1159.635742
WER: 1.000000, CER: 0.986667, loss: 1185.972534
WER: 0.980000, CER: 0.983333, loss: 569.788208
WER: 0.980000, CER: 0.984000, loss: 654.026611
WER: 0.980000, CER: 0.985000, loss: 656.827332
WER: 0.980000, CER: 0.985000, loss: 659.727051
WER: 0.980000, CER: 0.984000, loss: 661.934265
WER: 0.980000, CER: 0.985000, loss: 666.090271
WER: 0.980000, CER: 0.984000, loss: 671.630920
WER: 0.980000, CER: 0.985000, loss: 672.173218
WER: 0.980000, CER: 0.985000, loss: 673.630249
WER: 0.980000, CER: 0.984000, loss: 675.070740
WER: 0.980000, CER: 0.984000, loss: 682.800720
WER: 0.980000, CER: 0.984000, loss: 692.452820
WER: 0.980000, CER: 0.984000, loss: 695.382751
WER: 0.980000, CER: 0.985000, loss: 696.252136
WER: 0.980000, CER: 0.984000, loss: 713.148743
WER: 0.980000, CER: 0.984000, loss: 775.913818
WER: 0.980000, CER: 0.984000, loss: 822.966125
WER: 0.980000, CER: 0.983333, loss: 935.471558
WER: 0.980000, CER: 0.983333, loss: 966.865479
WER: 0.980000, CER: 0.983333, loss: 988.343689
WER: 0.980000, CER: 0.983333, loss: 1004.328613
WER: 0.980000, CER: 0.983333, loss: 1031.770996
WER: 0.980000, CER: 0.983333, loss: 1040.531250
WER: 0.980000, CER: 0.983333, loss: 1045.897217
WER: 0.980000, CER: 0.983333, loss: 1050.801758
WER: 0.980000, CER: 0.983333, loss: 1061.078735
WER: 0.980000, CER: 0.983333, loss: 1063.963989
WER: 0.980000, CER: 0.983333, loss: 1065.088745
WER: 0.980000, CER: 0.983333, loss: 1067.910645
WER: 0.980000, CER: 0.983333, loss: 1143.155151
WER: 0.980000, CER: 0.983333, loss: 1165.391235
WER: 0.980000, CER: 0.983333, loss: 1183.412720
WER: 0.980000, CER: 0.983333, loss: 1239.415771
WER: 0.980000, CER: 0.983333, loss: 1263.958374
WER: 0.980000, CER: 0.983333, loss: 1275.644775
WER: 0.980000, CER: 0.983333, loss: 1326.395996
```
```
I Exporting the model...
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 145, in make_shape
    shape = tensor_shape.as_shape(v)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 1125, in as_shape
    return TensorShape(shape)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 690, in __init__
    self._dims = [as_dimension(d) for d in dims_iter]
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 690, in <listcomp>
    self._dims = [as_dimension(d) for d in dims_iter]
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 632, in as_dimension
    return Dimension(value)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 188, in __init__
    raise ValueError("Ambiguous dimension: %s" % value)
ValueError: Ambiguous dimension: 1411.2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "DeepSpeech.py", line 836, in <module>
    tf.app.run(main)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "DeepSpeech.py", line 828, in main
    export()
  File "DeepSpeech.py", line 687, in export
    inputs, outputs, _ = create_inference_graph(batch_size=FLAGS.export_batch_size, n_steps=FLAGS.n_steps, tflite=FLAGS.export_tflite)
  File "DeepSpeech.py", line 568, in create_inference_graph
    input_samples = tf.placeholder(tf.float32, [Config.audio_window_samples], 'input_samples')
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2077, in placeholder
    return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 5789, in placeholder
    shape = _execute.make_shape(shape, "shape")
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 150, in make_shape
    e))
ValueError: Error converting shape to a TensorShape: Ambiguous dimension: 1411.2.
```
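For what it's worth, the `1411.2` in the traceback looks like the audio feature window size: DeepSpeech derives `Config.audio_window_samples` from `audio_sample_rate` multiplied by the feature window length (32 ms by default, if I read `util/flags.py` correctly), and at 44100 Hz that product is not an integer, so it cannot be used as a placeholder dimension. A quick sanity check (the 32 ms default is an assumption on my side):

```python
# Hypothetical re-computation of how DeepSpeech derives audio_window_samples.
# Assumes the default feature window length of 32 ms from util/flags.py.
def audio_window_samples(sample_rate_hz, feature_win_len_ms=32):
    return sample_rate_hz * feature_win_len_ms / 1000

print(audio_window_samples(16000))  # 512.0  -> a valid integer dimension
print(audio_window_samples(44100))  # 1411.2 -> "Ambiguous dimension: 1411.2"
```

So either keep `audio_sample_rate` at 16000, or choose a feature window length whose product with 44100 is a whole number of samples.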
@lucifera678 Can you use proper code formatting for console/log output? It's complicated to read otherwise.
Again, can you share your changes?
```
(.virtualenv) kdavis-19htdh:DeepSpeech kdavis$ find . -name "*.py" -exec grep 16000 {} /dev/null \;
./util/flags.py: f.DEFINE_integer('audio_sample_rate', 16000, 'sample rate value expected by model')
./bin/import_cv2.py:SAMPLE_RATE = 16000
./bin/import_fisher.py: origAudios = [librosa.load(wav_file, sr=16000, mono=False) for wav_file in wav_files]
./bin/import_swb.py: audioData, frameRate = librosa.load(temp_wav_file, sr=16000, mono=True)
./bin/import_ts.py:SAMPLE_RATE = 16000
./bin/import_cv.py:SAMPLE_RATE = 16000
./bin/import_gram_vaani.py:SAMPLE_RATE = 16000
./bin/import_lingua_libre.py:SAMPLE_RATE = 16000
./bin/import_aishell.py: durations = (df['wav_filesize'] - 44) / 16000 / 2
./examples/vad_transcriber/wavTranscriber.py: audio_length = len(audio) * (1 / 16000)
./examples/vad_transcriber/wavTranscriber.py: assert sample_rate == 16000, "Only 16000Hz input WAV files are supported for now!"
./examples/vad_transcriber/wavSplit.py: assert sample_rate in (8000, 16000, 32000)
./examples/mic_vad_streaming/mic_vad_streaming.py: RATE_PROCESS = 16000
./examples/mic_vad_streaming/mic_vad_streaming.py: """Return a block of audio data resampled to 16000hz, blocking if necessary."""
./examples/mic_vad_streaming/mic_vad_streaming.py: DEFAULT_SAMPLE_RATE = 16000
./stats.py: parser.add_argument("--sample-rate", type=int, default=16000, required=False, help="Audio sample rate")
./native_client/python/client.py: sox_cmd = 'sox {} --type raw --bits 16 --channels 1 --rate 16000 --encoding signed-integer --endian little --compression 0.0 --no-dither - '.format(quote(audio_path))
./native_client/python/client.py: return 16000, np.frombuffer(output, np.int16)
./native_client/python/client.py: if fs != 16000:
./native_client/python/client.py: audio_length = fin.getnframes() * (1/16000)
./native_client/python/__init__.py: def setupStream(self, pre_alloc_frames=150, sample_rate=16000):
```
I changed all of these files as you mentioned above.
Please, this is absolutely not useful. Can't you `git diff`
and share the changes appropriately, using code formatting?
@lucifera678,
40-second training sequences are very long!
You took a lot of time to train your model, yet you got a WER of 0.99, i.e. 99% error!
It's no surprise that the results are poor.
You should first try training with only 10 sentences, 16 kHz mono, max 15 s. Look at the results, correct the parameters, understand what is happening...
...and restart later with all your data.
Good luck
Vincent
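To check whether your clips actually match that recommendation (16 kHz, mono, at most 15 s), a small stdlib-only script along these lines might help; the thresholds are just the suggested values above, not anything DeepSpeech enforces itself:

```python
import wave

def check_clip(path, want_rate=16000, want_channels=1, max_seconds=15.0):
    """Return a list of problems with a training WAV; empty means usable."""
    problems = []
    with wave.open(path, "rb") as w:
        if w.getframerate() != want_rate:
            problems.append("sample rate %d != %d" % (w.getframerate(), want_rate))
        if w.getnchannels() != want_channels:
            problems.append("%d channels, expected mono" % w.getnchannels())
        duration = w.getnframes() / float(w.getframerate())
        if duration > max_seconds:
            problems.append("%.1f s long, max %.1f s" % (duration, max_seconds))
    return problems
```

Running it over every file listed in your import CSVs would quickly show which clips need resampling or splitting.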
```
WER: 0.980000, CER: 0.983333, loss: 1326.395996
 - src: "three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three three "
 - res: "three"
```
```
I Exporting the model...
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 145, in make_shape
    shape = tensor_shape.as_shape(v)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 1125, in as_shape
    return TensorShape(shape)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 690, in __init__
    self._dims = [as_dimension(d) for d in dims_iter]
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 690, in <listcomp>
    self._dims = [as_dimension(d) for d in dims_iter]
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 632, in as_dimension
    return Dimension(value)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 188, in __init__
    raise ValueError("Ambiguous dimension: %s" % value)
ValueError: Ambiguous dimension: 1411.2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "DeepSpeech.py", line 836, in <module>
    tf.app.run(main)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "DeepSpeech.py", line 828, in main
    export()
  File "DeepSpeech.py", line 687, in export
    inputs, outputs, _ = create_inference_graph(batch_size=FLAGS.export_batch_size, n_steps=FLAGS.n_steps, tflite=FLAGS.export_tflite)
  File "DeepSpeech.py", line 568, in create_inference_graph
    input_samples = tf.placeholder(tf.float32, [Config.audio_window_samples], 'input_samples')
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2077, in placeholder
    return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 5789, in placeholder
    shape = _execute.make_shape(shape, "shape")
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 150, in make_shape
    e))
ValueError: Error converting shape to a TensorShape: Ambiguous dimension: 1411.2.
```
I have formatted the code. Is there anything else I can do?
Can you share the diff of your changes?
```
./util/flags.py: f.DEFINE_integer('audio_sample_rate', 44100, 'sample rate value expected by model')
./bin/import_cv2.py:SAMPLE_RATE = 44100
./bin/import_fisher.py: origAudios = [librosa.load(wav_file, sr=44100, mono=False) for wav_file in wav_files]
./bin/import_swb.py: audioData, frameRate = librosa.load(temp_wav_file, sr=44100, mono=True)
./bin/import_ts.py:SAMPLE_RATE = 44100
./bin/import_cv.py:SAMPLE_RATE = 44100
./bin/import_gram_vaani.py:SAMPLE_RATE = 44100
./bin/import_lingua_libre.py:SAMPLE_RATE = 44100
./bin/import_aishell.py: durations = (df['wav_filesize'] - 44) / 44100 / 2
./examples/vad_transcriber/wavTranscriber.py: audio_length = len(audio) * (1 / 44100)
./examples/vad_transcriber/wavTranscriber.py: assert sample_rate == 16000, "Only 16000Hz input WAV files are supported for now!"
./examples/vad_transcriber/wavSplit.py: assert sample_rate in (8000, 16000, 32000, 44100)
./examples/mic_vad_streaming/mic_vad_streaming.py: RATE_PROCESS = 44100
./examples/mic_vad_streaming/mic_vad_streaming.py: """Return a block of audio data resampled to 16000hz, blocking if necessary."""
./examples/mic_vad_streaming/mic_vad_streaming.py: DEFAULT_SAMPLE_RATE = 44100
./stats.py: parser.add_argument("--sample-rate", type=int, default=44100, required=False, help="Audio sample rate")
./native_client/python/client.py: sox_cmd = 'sox {} --type raw --bits 16 --channels 1 --rate 44100 --encoding signed-integer --endian little --compression 0.0 --no-dither - '.format(quote(audio_path))
./native_client/python/client.py: return 44100, np.frombuffer(output, np.int16)
./native_client/python/client.py: if fs != 44100:
./native_client/python/client.py: audio_length = fin.getnframes() * (1/44100)
./native_client/python/__init__.py: def setupStream(self, pre_alloc_frames=150, sample_rate=44100):
```
The lines above show all the files where I changed 16000 to 44100, as you told me.
I’m sorry, that’s still not a diff as I asked. It’s completely unusable.
@lucifera678 maybe you aren't aware of what a diff is? If not, this might give you a bit of background:
https://www.git-tower.com/learn/git/ebook/en/command-line/advanced-topics/diffs
Could you upload your data somewhere? I could try to train a model and give you the config you need. Feels like that’s easier than debugging such specific issues.
@lucifera678
OK, listen here: go to `util/flags.py` and change `audio_sample_rate` back to 16000 (you set it to 44100). Then you'll see that you can export your model.
Do I know if that screws up your model? I do not. But can you export it? Yes. Good luck.
Is there a restriction on WAV file length for training?
You want to use data with a max length of 15 seconds, optimally even less than that (the Mozilla model is trained on files with a max length of 8 seconds, if I recall correctly).
More on how to deal with it here: Longer audio files with Deep Speech
Moreover, the examples in the DeepSpeech source code include scripts demonstrating a VAD transcriber.
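If some recordings are longer than that, one crude alternative to the VAD approach is to cut them into fixed-size chunks before import. A rough stdlib sketch; unlike the VAD transcriber, this can cut words in half, so treat it as a stopgap only:

```python
import wave

def split_wav(path, chunk_seconds=10.0):
    """Split a WAV into consecutive chunks of at most chunk_seconds each.
    Returns the list of files written (path.0.wav, path.1.wav, ...)."""
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(chunk_seconds * src.getframerate())
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = "%s.%d.wav" % (path, index)
            with wave.open(out_path, "wb") as dst:
                dst.setparams(params)  # wave fixes the frame count on close
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

You would then need to split the matching transcripts by hand, which is exactly the problem the VAD-based scripts try to avoid.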
```
(venv) sehar@sehar-HP-Z220-CMT-Workstation:~/DeepSpeech$ python speech.py /home/sehar/urdu-models/output_graph1.pb /home/sehar/urdu-models/alphabet1.txt /home/sehar/urdu-models/sent6urd.wav
TensorFlow: v1.13.1-10-g3e0cc53
DeepSpeech: v0.5.1-0-g4b29b78
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2019-11-26 11:35:11.022262: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-26 11:35:11.023504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.759
pciBusID: 0000:01:00.0
totalMemory: 5.93GiB freeMemory: 5.59GiB
2019-11-26 11:35:11.023541: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-11-26 11:35:15.973514: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-26 11:35:15.973566: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0
2019-11-26 11:35:15.973593: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N
2019-11-26 11:35:16.001763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5371 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-11-26 11:35:17.972698: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "CPU"') for unknown op: UnwrapDatasetVariant
2019-11-26 11:35:17.972759: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: WrapDatasetVariant
2019-11-26 11:35:17.972786: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "CPU"') for unknown op: WrapDatasetVariant
2019-11-26 11:35:17.972954: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: UnwrapDatasetVariant
Error running session: Not found: PruneForTargets: Some target nodes not found: initialize_state
Segmentation fault (core dumped)
```
After training my model, I tested it and it's giving me this error.
@sehar_capricon Please please please, can you really make an effort and USE CODE FORMATTING? Your output is hard to read, which makes it DIFFICULT for people to help you.
You are running binary v0.5.1, but your error suggests you trained from current master, which targets v0.6.x binaries.
Thanks for your prompt response.
I have trained my model on DeepSpeech v0.5.1.
Can you triple-check that? How did you perform the export?
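One way to triple-check which `deepspeech` wheel is actually installed in the virtualenv (as opposed to the version of the git checkout you trained from) is to ask the package metadata. `importlib.metadata` needs Python 3.8+; on older Pythons, `pkg_resources.get_distribution` gives the same information:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(package="deepspeech"):
    """Return the installed package's version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Compare this against `git describe` in the DeepSpeech checkout:
# the installed wheel and the checkout should be the same release series.
print(installed_version("deepspeech"))
```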