Update:
- Files causing inf loss -
Fixed by converting my transcripts from hex notation to properly UTF-8-encoded characters.
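For reference, the conversion was essentially this (a minimal sketch; the space-separated hex format and the helper name are just how my data happened to look, nothing DeepSpeech-specific):

```python
# Minimal sketch: turn a transcript stored as space-separated hex bytes
# (e.g. "e3 81 82") back into the UTF-8 string those bytes encode.
def hex_to_utf8(hex_transcript: str) -> str:
    raw = bytes(int(token, 16) for token in hex_transcript.split())
    return raw.decode("utf-8")

print(hex_to_utf8("e3 81 82"))  # -> あ (U+3042)
```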
- OOM errors -
Fixed by reducing the batch size to something my GPU could handle, in my case 4.
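In case it helps anyone else hitting OOM: the batch sizes are plain flags to DeepSpeech.py. Roughly what I ran (the CSV paths are placeholders for your own files):

```shell
# Reduce the batch sizes until training fits in GPU memory.
python3 DeepSpeech.py \
  --train_files final-train.csv \
  --dev_files final-dev.csv \
  --test_files final-test.csv \
  --train_batch_size 4 \
  --dev_batch_size 4 \
  --test_batch_size 4
```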
I was able to train for 2 epochs successfully; however, I ran into another issue that I have been stuck on.
When I try to test the model after those 2 epochs, the test run fails with the following error:
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /home/anon/Downloads/jaSTTDatasets/final-test.csv
Test epoch | Steps: 0 | Elapsed Time: 0:00:00 Traceback (most recent call last):
File "DeepSpeech.py", line 12, in <module>
ds_train.run_script()
File "/DeepSpeech/training/deepspeech_training/train.py", line 982, in run_script
absl.app.run(main)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/DeepSpeech/training/deepspeech_training/train.py", line 958, in main
test()
File "/DeepSpeech/training/deepspeech_training/train.py", line 682, in test
samples = evaluate(FLAGS.test_files.split(','), create_model)
File "/DeepSpeech/training/deepspeech_training/evaluate.py", line 132, in evaluate
samples.extend(run_test(init_op, dataset=csv))
File "/DeepSpeech/training/deepspeech_training/evaluate.py", line 114, in run_test
cutoff_prob=FLAGS.cutoff_prob, cutoff_top_n=FLAGS.cutoff_top_n)
File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 228, in ctc_beam_search_decoder_batch
for beam_results in batch_beam_results
File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 228, in <listcomp>
for beam_results in batch_beam_results
File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 227, in <listcomp>
[(res.confidence, alphabet.Decode(res.tokens)) for res in beam_results]
File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 138, in Decode
return res.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 0-1: invalid continuation byte
Naturally, I assumed it was a UTF-8 issue and that I needed to fix my files. However, I have tried almost everything to fix them and nothing seems to work. Note that the same data works for training and validation; it only fails during testing.
anon@anon-Lenovo-Legion-Y540-15IRH-PG0:~$ isutf8 ./Downloads/jaSTTDatasets/final-dev.csv
anon@anon-Lenovo-Legion-Y540-15IRH-PG0:~$ isutf8 ./Downloads/jaSTTDatasets/final-train.csv
anon@anon-Lenovo-Legion-Y540-15IRH-PG0:~$ isutf8 ./Downloads/jaSTTDatasets/final-test.csv
anon@anon-Lenovo-Legion-Y540-15IRH-PG0:~$ isutf8 ./Downloads/jaSTTDatasets/newutf.csv
anon@anon-Lenovo-Legion-Y540-15IRH-PG0:~$ iconv -f UTF-8 ./Downloads/jaSTTDatasets/newutf.csv -o /dev/null; echo $?
0
anon@anon-Lenovo-Legion-Y540-15IRH-PG0:~$ iconv -f UTF-8 ./Downloads/jaSTTDatasets/final-test.csv -o /dev/null; echo $?
0
anon@anon-Lenovo-Legion-Y540-15IRH-PG0:~$ iconv -f UTF-8 ./Downloads/jaSTTDatasets/final-train.csv -o /dev/null; echo $?
0
anon@anon-Lenovo-Legion-Y540-15IRH-PG0:~$ iconv -f UTF-8 ./Downloads/jaSTTDatasets/final-dev.csv -o /dev/null; echo $?
0
As you can see, the files appear to be properly UTF-8 encoded: isutf8 prints nothing and iconv exits with 0 for all of them.
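For completeness, here is the same byte-level validity check done in Python, line by line (a minimal sketch, nothing DeepSpeech-specific; in my case it agrees with the isutf8/iconv results above):

```python
# Minimal sketch: report every line of a file whose raw bytes are not
# valid UTF-8, with the line number and the exact decode error.
def find_invalid_utf8_lines(path):
    bad = []
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as err:
                bad.append((lineno, err))
    return bad
```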
Here is the test CSV file: final-test.zip (349 bytes)
And here is my Dockerfile, in case you are curious about the environment I am building in:
# Please refer to the TRAINING documentation, "Basic Dockerfile for training"
FROM tensorflow/tensorflow:1.15.4-gpu-py3
ENV DEBIAN_FRONTEND=noninteractive
ENV DEEPSPEECH_REPO=https://github.com/mozilla/DeepSpeech.git
ENV DEEPSPEECH_SHA=origin/master
RUN apt-get update && apt-get install -y --no-install-recommends \
apt-utils \
bash-completion \
build-essential \
cmake \
curl \
git \
libboost-all-dev \
libbz2-dev \
locales \
python3-venv \
unzip \
wget
# We need to remove it because it's breaking deepspeech install later with
# weird errors about setuptools
RUN apt-get purge -y python3-xdg
# Install dependencies for audio augmentation
RUN apt-get install -y --no-install-recommends libopus0 libsndfile1
# Try and free some space
RUN rm -rf /var/lib/apt/lists/*
WORKDIR /
RUN git clone $DEEPSPEECH_REPO DeepSpeech
WORKDIR /DeepSpeech
RUN git checkout $DEEPSPEECH_SHA
# Build CTC decoder first, to avoid clashes on incompatible versions upgrades
RUN cd native_client/ctcdecode && make NUM_PROCESSES=$(nproc) bindings
RUN pip3 install --upgrade native_client/ctcdecode/dist/*.whl
# Prepare deps
RUN pip3 install --upgrade pip==20.2.2 wheel==0.34.2 setuptools==49.6.0
# Install DeepSpeech
# - No need for the decoder since we did it earlier
# - There is already correct TensorFlow GPU installed on the base image,
# we don't want to break that
RUN DS_NODECODER=y DS_NOTENSORFLOW=y pip3 install --upgrade -e .
# Tool to convert output graph for inference
RUN python3 util/taskcluster.py --source tensorflow --branch r1.15 \
--artifact convert_graphdef_memmapped_format --target .
# Build KenLM to generate new scorers
WORKDIR /DeepSpeech/native_client
RUN rm -rf kenlm && \
git clone https://github.com/kpu/kenlm && \
cd kenlm && \
git checkout 87e85e66c99ceff1fab2500a7c60c01da7315eec && \
mkdir -p build && \
cd build && \
cmake .. && \
make -j $(nproc)
WORKDIR /DeepSpeech
ENV TF_FORCE_GPU_ALLOW_GROWTH=true
RUN apt-get update
RUN apt-get install vim -y
RUN sed -i 's/tfv1.nn.ctc_loss(labels=batch_y, inputs=logits, sequence_length=batch_seq_len)/tfv1.nn.ctc_loss(labels=batch_y, inputs=logits, sequence_length=batch_seq_len, ignore_longer_outputs_than_inputs=True)/g' training/deepspeech_training/train.py
RUN sed -i 's/sequence_length=batch_x_len)/sequence_length=batch_x_len, ignore_longer_outputs_than_inputs=True)/g' training/deepspeech_training/evaluate.py