Problem including BatchNorm after Dense layers

I am trying to include BatchNormalization layers after the dense layers, much like the DeepSpeech2 paper does (next I plan to implement BN on the LSTMs). In summary, I am using the following batchnorm call inside the BiRNN definition:

    layer_1 = tf.layers.batch_normalization(inputs=layer_1, momentum=FLAGS.batchnorm_momentum, training=training_ph)

where training_ph is a boolean placeholder set to True during training and False otherwise, and batchnorm_momentum=0.9. I am also passing the update ops to session.run during training (at https://github.com/mozilla/DeepSpeech/blob/master/DeepSpeech.py#L1637):

    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    _, _, current_step, batch_loss, batch_report = session.run([train_op, update_ops, global_step, loss, report_params], **extra_params)
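
For reference, the pattern the tf.layers.batch_normalization docs suggest is to make the update ops a control dependency of the train op instead of fetching them in session.run. I assume the two should be equivalent, but this is the variant I have not tried yet (just a sketch; optimizer, loss and global_step stand in for the actual DeepSpeech graph objects):

    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        # moving_mean/moving_variance updates now run before each weight update
        train_op = optimizer.minimize(loss, global_step=global_step)

    _, current_step, batch_loss = session.run([train_op, global_step, loss], **extra_params)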

Now to the problem: training seems to work (the loss goes down), but validation seems broken. Since batchnorm behaves differently in training and validation, the training_ph appears to be wired correctly, so I suspect the batchnorm parameters (moving_mean and moving_variance) are not being updated properly.
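
To check that suspicion, I'm planning to dump the moving statistics before and after a few training steps, roughly like this sketch (it filters variables by name, which I assume matches the names tf.layers.batch_normalization creates):

    # collect the moving statistics of every batchnorm layer in the graph
    bn_stats = [v for v in tf.global_variables()
                if 'moving_mean' in v.name or 'moving_variance' in v.name]

    before = session.run(bn_stats)
    # ... run a few training steps with training_ph fed as True ...
    after = session.run(bn_stats)

    # if these deltas stay at zero, the update ops are not actually running
    for v, b, a in zip(bn_stats, before, after):
        print(v.name, 'mean absolute change:', abs(a - b).mean())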

PS: I checked the 'bnlstm' branch, but it didn't help me fix this. It seems @reuben was having some trouble making BN work too; was he able to get it working in the end?