Error getting feature_cache

I am training DeepSpeech in Colab. Since the corpus is saved in Google Drive, I use --feature_cache to avoid re-reading the dataset on every trial run. I ran it for 1 epoch:

!python -u DeepSpeech.py \
--train_files data/CV/ur/train.csv \
--test_files data/CV/ur/test.csv \
--dev_files data/CV/ur/dev.csv \
--feature_cache data/CV/ur/feature_cache/ \
--learning_rate 0.001 \
--train_batch_size 128 \
--dev_batch_size 128 \
--test_batch_size 128 \
--checkpoint_dir data/CV/ur/checkpoint_dir \
--max_to_keep 10 \
--export_dir data/CV/ur/export_dir \
--export_language "ur" \
--summary_dir data/CV/ur/summary_dir \
--n_hidden 1024 \
--dropout_rate 0.2 \
--epochs 1 \
"$@"

This produced a .data-00000-of-00001 file in the feature_cache dir.

However, when I continue training, I get the following error: instead of fetching the cached data, it says the file was not found.

tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
(0) Not found: data/CV/ur/feature_cache/.data-00000-of-00001; No such file or directory
[[node tower_0/IteratorGetNext (defined at DeepSpeech.py:222) ]]
[[Mean_8/_105]]
(1) Not found: data/CV/ur/feature_cache/.data-00000-of-00001; No such file or directory
[[node tower_0/IteratorGetNext (defined at DeepSpeech.py:222) ]]
0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node tower_0/IteratorGetNext:
IteratorV2 (defined at DeepSpeech.py:445)

Input Source operations connected to node tower_0/IteratorGetNext:
IteratorV2 (defined at DeepSpeech.py:445)

I could find the files in Google Drive. Also, when I run

!ls .* 
.data-00000-of-00001.tempstate      .index

but when I run a plain !ls, I don't see any file.

Is it Colab that can't access the file? If so, is there a workaround?

We don't know; we don't use Colab. Have you verified you are calling the feature cache properly? Default behavior is in-memory, as far as I remember.

I am calling it using this flag, and I do get the data saved in that dir, yet I get this error:

tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
(0) Not found: data/CV/ur/feature_cache/.data-00000-of-00001; No such file or directory

Here is a screenshot of the data available in the specified dir in Google Drive.

The problem is that I can see the file in the specified dir, yet I get the error that the file was not found, which is weird.

In my analysis, I have found that a file whose name begins with a dot isn't accessible. Is there a way to save it under some other name instead of .data-00000-of-00001? I may be completely wrong in my analysis; I'm just hoping it helps with debugging.
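To illustrate the hidden-file part (this is standard Unix ls behaviour rather than anything Colab-specific; the listing below is what I'd expect for the cache folder, given the !ls .* output above):

!ls data/CV/ur/feature_cache/      # plain ls hides names that start with a dot, so this prints nothing
!ls -a data/CV/ur/feature_cache/   # -a lists the dot-files as well
.  ..  .data-00000-of-00001.tempstate  .index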

I can just tell you we have no problem on an actual Linux system, and we don't use Colab. So, no idea.

I am attaching the complete error when I run the training command with --feature_cache:

/content/drive/My Drive/PhD/DS/deepspeech
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py:494: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
options available in V2.
- tf.py_function takes a python function which manipulates tf eager
tensors instead of numpy arrays. It’s easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means tf.py_functions can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
- tf.numpy_function maintains the semantics of the deprecated tf.py_func
(it is not differentiable, and manipulates numpy arrays). It drops the
stateful argument making all functions stateful.

W0420 19:02:01.935480 140644369827712 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py:494: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
options available in V2.
- tf.py_function takes a python function which manipulates tf eager
tensors instead of numpy arrays. It’s easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means tf.py_functions can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
- tf.numpy_function maintains the semantics of the deprecated tf.py_func
(it is not differentiable, and manipulates numpy arrays). It drops the
stateful argument making all functions stateful.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py:348: Iterator.output_types (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.data.get_output_types(iterator).
W0420 19:02:02.003367 140644369827712 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py:348: Iterator.output_types (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.data.get_output_types(iterator).
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py:349: Iterator.output_shapes (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.data.get_output_shapes(iterator).
W0420 19:02:02.003594 140644369827712 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py:349: Iterator.output_shapes (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.data.get_output_shapes(iterator).
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py:351: Iterator.output_classes (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.data.get_output_classes(iterator).
W0420 19:02:02.003748 140644369827712 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py:351: Iterator.output_classes (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.data.get_output_classes(iterator).
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.init (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0420 19:02:02.963346 140644369827712 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.init (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:Entity <bound method LSTMBlockWrapper.call of <tensorflow.contrib.rnn.python.ops.lstm_ops.LSTMBlockFusedCell object at 0x7fe9d418b0f0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: converting <bound method LSTMBlockWrapper.call of <tensorflow.contrib.rnn.python.ops.lstm_ops.LSTMBlockFusedCell object at 0x7fe9d418b0f0>>: AttributeError: module ‘gast’ has no attribute ‘Num’
W0420 19:02:02.990979 140644369827712 ag_logging.py:145] Entity <bound method LSTMBlockWrapper.call of <tensorflow.contrib.rnn.python.ops.lstm_ops.LSTMBlockFusedCell object at 0x7fe9d418b0f0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: converting <bound method LSTMBlockWrapper.call of <tensorflow.contrib.rnn.python.ops.lstm_ops.LSTMBlockFusedCell object at 0x7fe9d418b0f0>>: AttributeError: module ‘gast’ has no attribute ‘Num’
WARNING:tensorflow:From DeepSpeech.py:236: add_dispatch_support..wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0420 19:02:03.061359 140644369827712 deprecation.py:323] From DeepSpeech.py:236: add_dispatch_support..wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
W0420 19:02:04.852655 140644369827712 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from data/CV/ur/checkpoint_dir/train-89
I0420 19:02:04.855031 140644369827712 saver.py:1280] Restoring parameters from data/CV/ur/checkpoint_dir/train-89
I Restored variables from most recent checkpoint at data/CV/ur/checkpoint_dir/train-89, step 89
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:00:00 | Steps: 0 | Loss: 0.000000 Traceback (most recent call last):
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1356, in _do_call
return fn(*args)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
(0) Not found: data/CV/ur/feature_cache/.data-00000-of-00001; No such file or directory
[[{{node tower_0/IteratorGetNext}}]]
[[Mean_8/_105]]
(1) Not found: data/CV/ur/feature_cache/.data-00000-of-00001; No such file or directory
[[{{node tower_0/IteratorGetNext}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “DeepSpeech.py”, line 974, in
absl.app.run(main)
File “/usr/local/lib/python3.6/dist-packages/absl/app.py”, line 299, in run
_run_main(main, args)
File “/usr/local/lib/python3.6/dist-packages/absl/app.py”, line 250, in _run_main
sys.exit(main(argv))
File “DeepSpeech.py”, line 947, in main
train()
File “DeepSpeech.py”, line 640, in train
train_loss, _ = run_set(‘train’, epoch, train_init_op)
File “DeepSpeech.py”, line 602, in run_set
feed_dict=feed_dict)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 950, in run
run_metadata_ptr)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1173, in _run
feed_dict_tensor, options, run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1350, in _do_run
run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
(0) Not found: data/CV/ur/feature_cache/.data-00000-of-00001; No such file or directory
[[node tower_0/IteratorGetNext (defined at DeepSpeech.py:222) ]]
[[Mean_8/_105]]
(1) Not found: data/CV/ur/feature_cache/.data-00000-of-00001; No such file or directory
[[node tower_0/IteratorGetNext (defined at DeepSpeech.py:222) ]]
0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node tower_0/IteratorGetNext:
IteratorV2 (defined at DeepSpeech.py:445)

Input Source operations connected to node tower_0/IteratorGetNext:
IteratorV2 (defined at DeepSpeech.py:445)

Original stack trace for ‘tower_0/IteratorGetNext’:
File “DeepSpeech.py”, line 974, in
absl.app.run(main)
File “/usr/local/lib/python3.6/dist-packages/absl/app.py”, line 299, in run
_run_main(main, args)
File “/usr/local/lib/python3.6/dist-packages/absl/app.py”, line 250, in _run_main
sys.exit(main(argv))
File “DeepSpeech.py”, line 947, in main
train()
File “DeepSpeech.py”, line 477, in train
gradients, loss, non_finite_files = get_tower_results(iterator, optimizer, dropout_rates)
File “DeepSpeech.py”, line 303, in get_tower_results
avg_loss, non_finite_files = calculate_mean_edit_distance_and_loss(iterator, dropout_rates, reuse=i > 0)
File “DeepSpeech.py”, line 222, in calculate_mean_edit_distance_and_loss
batch_filenames, (batch_x, batch_seq_len), batch_y = iterator.get_next()
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py”, line 426, in get_next
output_shapes=self._structure._flat_shapes, name=name)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_dataset_ops.py”, line 1947, in iterator_get_next
output_shapes=output_shapes, name=name)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py”, line 788, in _apply_op_helper
op_def=op_def)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py”, line 507, in new_func
return func(*args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py”, line 3616, in create_op
op_def=op_def)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py”, line 2005, in init
self._traceback = tf_stack.extract_stack()

I have been trying to resolve this for a week. I have no issues running without --feature_cache, but I have to run in Colab due to resource constraints.

I completely understand your response; I'm just posting the whole error.

Maybe it’s because you created a data/CV/ur/feature_cache/ folder? --feature_cache is a file name prefix, you shouldn’t create a folder at that path. Does it happen if you remove the folder and try again?

This clears everything. Thanks a ton…! @reuben

Passing data/CV/ur/feature_cache creates a cache named feature_cache.data-00000-of-00001, where feature_cache acts as the prefix.

But when I used data/CV/ur/feature_cache/, it created a .data-00000-of-00001 inside the feature_cache folder, which is a mess.
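For anyone landing on this later, a minimal sketch of the two invocations (same paths as in my training command above; the resulting file names are what I'd expect from the behaviour described here):

# trailing slash (with a pre-created folder): the filename prefix is empty, so the cache is written as hidden dot-files
--feature_cache data/CV/ur/feature_cache/
# produces data/CV/ur/feature_cache/.data-00000-of-00001 and data/CV/ur/feature_cache/.index

# no trailing slash, no pre-created folder: "feature_cache" acts as the filename prefix
--feature_cache data/CV/ur/feature_cache
# produces data/CV/ur/feature_cache.data-00000-of-00001 and data/CV/ur/feature_cache.index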

I have read about this in flags.py many times over the past week and made sure I was calling it correctly:

f.DEFINE_string('feature_cache', '', 'cache MFCC features to disk to speed up future training runs on the same data. This flag specifies the path where cached features extracted from --train_files will be saved. If empty, or if online augmentation flags are enabled, caching will be disabled.')

There definitely isn't any mention of a prefix name there. Anyway, thanks a ton for resolving my issue.

Though a prefix isn't mandatory, I suppose that when working in Colab it is necessary to have one, since files starting with a dot are treated as hidden and you end up with a file-not-found error. I hope I'm right about that.