How can I change the bottom dense layer to conv1d layer?

beomgon.yu · January 29, 2020, 12:01am

Hi,
Thanks in advance for any reply.
I am trying to train a Korean speech recognition model with DeepSpeech.

I want to add a conv1d layer as the bottom layer.

In create_model:
…
batch_x = tf.reshape(batch_x, [-1, Config.n_input + 2*Config.n_input*Config.n_context])
layers['input_reshaped'] = batch_x

# code for conv1d
W1 = tf.Variable(tf.random_normal([3, 1, 128], stddev=0.01))
conv = tf.nn.conv1d(batch_x, W1, stride=[1,1,1], padding='SAME')

But I get a tensor shape error like the one below:
ValueError: Shape must be rank 4 but is rank 3 for 'tower_0/conv1d_1' (op: 'Conv2D') with input shapes: [?,1,494], [1,3,1,128].

In the log above, where does 494 come from?
I don't know the input data shape.
In feeding.py, n_input (the MFCC feature dimension) is 26, so where does 19 come from?

I am reading through the code, but is there any documentation or explanation of the data format and how it is transformed?

Thanks.

beomgon.yu · January 29, 2020, 12:49am

I found the comment below.

Input shape: [batch_size, n_steps, n_input + 2*n_input*n_context]

So it seems that 494 comes from the equation above, applied when the MFCC features are built.
Can I find out n_steps, the length of the RNN cell?

reuben · January 29, 2020, 6:15am

19 comes from the context window around each timestep: https://github.com/mozilla/DeepSpeech/blob/master/doc/Geometry.rst#n_context
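
To make the numbers concrete, here is a small sketch of the arithmetic. It assumes n_context = 9, which is not stated in this thread but is the value that gives a window of 19; conv1d_input_shape is a hypothetical helper written only for illustration, not part of DeepSpeech.

```python
# n_input = 26 is the MFCC feature count quoted from feeding.py.
# n_context = 9 is an assumption: it is the value for which 2 * 9 + 1 = 19.
n_input = 26
n_context = 9

# 9 past frames + the current frame + 9 future frames
window_size = 2 * n_context + 1

# Formula from the create_model comment quoted above
flat_width = n_input + 2 * n_input * n_context

# The same number, seen as "window_size frames of n_input features each"
assert flat_width == n_input * window_size
print(window_size, flat_width)  # 19 494


def conv1d_input_shape(n_rows):
    """Hypothetical helper: rank-3 shape [batch, width, channels] for n_rows flat rows."""
    return (n_rows, window_size, n_input)


print(conv1d_input_shape(4))  # (4, 19, 26)
```

This may also explain the error in the first post: tf.nn.conv1d expects a rank-3 input [batch, in_width, in_channels] and a filter [filter_width, in_channels, out_channels], while batch_x after the reshape is rank 2 ([?, 494]). One possible approach, assuming each row of 494 values really is laid out as 19 frames of 26 features, would be to reshape batch_x to [-1, 19, 26] and use a filter of shape [3, 26, 128].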