Custom dataset training


(Hema Sree543) #1

I have the importer ready for my custom dataset. Could somebody assist me with the training part?

I run ./bin/myscript.sh and it says the file does not exist, though it’s present.

Am I doing it right?


(kdavis) #2

Could you show us myscript.sh and the error you are getting?


(Hema Sree543) #3

I was able to fix that, but now I’m getting a new error when I run the same command:

+ [ ! -f DeepSpeech.py ]
+ [ ! -f data/ldc93s1/finalimport.csv ]
+ [ -d ]
+ python -c from xdg import BaseDirectory as xdg; print(xdg.save_data_path("deepspeech/ldc93s1"))
+ checkpoint_dir=/home/ubuntu/.local/share/deepspeech/ldc93s1
+ python -u DeepSpeech.py --train_files data/ldc93s1/finalimport.csv --dev_files data/ldc93s1/finalimport.csv --test_files data/ldc93s1/finalimport.csv --train_batch_size 1 --dev_batch_size 1 --test_batch_size 1 --n_hidden 494 --epoch 50 --checkpoint_dir /home/ubuntu/.local/share/deepspeech/ldc93s1
Traceback (most recent call last):
  File "DeepSpeech.py", line 1838, in <module>
    tf.app.run()
  File "/home/ubuntu/saii/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "DeepSpeech.py", line 1790, in main
    initialize_globals()
  File "DeepSpeech.py", line 180, in initialize_globals
    COORD = TrainingCoordinator()
  File "DeepSpeech.py", line 1130, in __init__
    self._httpd = BaseHTTPServer.HTTPServer((FLAGS.coord_host, FLAGS.coord_port), TrainingCoordinator.TrainingCoordinationHandler)
  File "/usr/lib/python2.7/SocketServer.py", line 417, in __init__
    self.server_bind()
  File "/usr/lib/python2.7/BaseHTTPServer.py", line 108, in server_bind
    SocketServer.TCPServer.server_bind(self)
  File "/usr/lib/python2.7/SocketServer.py", line 431, in server_bind
    self.socket.bind(self.server_address)
  File "/usr/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use

My run-finalimport.sh content:

#!/bin/sh
set -xe
if [ ! -f DeepSpeech.py ]; then
    echo "Please make sure you run this from DeepSpeech's top level directory."
    exit 1
fi;

if [ ! -f "data/ldc93s1/finalimport.csv" ]; then
    echo "Downloading and preprocessing LDC93S1 example data, saving in ./data/ldc93s1."
    python -u bin/import_ldc93s1.py ./data/ldc93s1
fi;

if [ -d "${COMPUTE_KEEP_DIR}" ]; then
    checkpoint_dir=$COMPUTE_KEEP_DIR
else
    checkpoint_dir=$(python -c 'from xdg import BaseDirectory as xdg; print(xdg.save_data_path("deepspeech/ldc93s1"))')
fi

python -u DeepSpeech.py \
  --train_files data/ldc93s1/finalimport.csv \
  --dev_files data/ldc93s1/finalimport.csv \
  --test_files data/ldc93s1/finalimport.csv \
  --train_batch_size 1308 \
  --dev_batch_size 1308 \
  --test_batch_size 1308 \
  --n_hidden 494 \
  --epoch 50 \
  --checkpoint_dir "$checkpoint_dir" \
  "$@"


(kdavis) #4

It looks like the default port 2500 is in use.

Try using another port, say 3247, by passing the --coord_port 3247 argument to DeepSpeech.py.
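
If it helps to confirm which port is actually free before rerunning, here is a minimal sketch that tries the same kind of bind the traceback failed on. The helper names port_is_free and pick_free_port are made up for illustration and are not part of DeepSpeech, and the 127.0.0.1 default is only an assumption about the coordinator host.

```python
import errno
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error as exc:
        # Errno 98 on Linux: something else already holds the port.
        if exc.errno == errno.EADDRINUSE:
            return False
        raise  # other failures (e.g. permission denied) should stay visible
    finally:
        sock.close()


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused port by binding to port 0."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))
        return sock.getsockname()[1]
    finally:
        sock.close()


if __name__ == "__main__":
    # 2500 and 3247 are the two ports mentioned in this thread.
    for candidate in (2500, 3247):
        state = "free" if port_is_free(candidate) else "in use"
        print("port %d is %s" % (candidate, state))
```

Since the script ends with "$@", the extra argument can simply be appended when calling the script, and it will be forwarded to DeepSpeech.py. Note that a port reported as free can still be taken by another process before training starts, so treat the result as a hint only.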


(kdavis) #5

As a follow-on comment, are you sure all old runs of DeepSpeech.py have completed?
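
One way to check for leftover runs, sketched under the assumption of a Linux machine (it reads /proc, so it returns nothing elsewhere). The function name find_running is hypothetical, and it only sees processes whose command line mentions the script name.

```python
import os


def find_running(script_name):
    """Return [(pid, command line)] for other processes mentioning script_name.

    Linux-only: reads /proc/<pid>/cmdline; returns [] where /proc is missing.
    """
    matches = []
    if not os.path.isdir("/proc"):
        return matches
    for entry in os.listdir("/proc"):
        # Skip non-process entries and the process doing the search.
        if not entry.isdigit() or int(entry) == os.getpid():
            continue
        try:
            with open(os.path.join("/proc", entry, "cmdline"), "rb") as handle:
                raw = handle.read()
        except (IOError, OSError):
            continue  # process exited or is not readable by this user
        # Arguments are NUL-separated in /proc/<pid>/cmdline.
        args = [a.decode("utf-8", "replace") for a in raw.split(b"\0") if a]
        if any(script_name in arg for arg in args):
            matches.append((int(entry), " ".join(args)))
    return matches


if __name__ == "__main__":
    for pid, cmdline in find_running("DeepSpeech.py"):
        print("%d\t%s" % (pid, cmdline))
```

If this prints an old DeepSpeech.py process, that process is a likely holder of the coordination port, and stopping it should free the default port again.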