Checkpoints with changed Loss Function still useful?

Hey all,
I’ve changed the code a bit because I wanted to use tf.nn.ctc_loss_v2 without a sparse representation of the labels. In feeding.py:

def generate_values():
    for _, row in df.iterrows():
        # yield the dense transcript plus its length instead of a sparse representation
        yield row.wav_filename, row.transcript, len(row.transcript)

def entry_to_features(wav_filename, transcript, transcript_len):
    features, features_len = audiofile_to_features(wav_filename)
    return features, features_len, transcript, transcript_len

def batch_fn(features, features_len, transcripts, transcript_len):
    # pad the audio features and keep their true lengths alongside
    features = tf.data.Dataset.zip((features, features_len))
    features = features.padded_batch(batch_size,
                                     padded_shapes=([None, Config.n_input], []))
    # pad the dense transcripts and keep their true lengths alongside
    trans = tf.data.Dataset.zip((transcripts, transcript_len))
    trans = trans.padded_batch(batch_size, padded_shapes=([None], []))
    return tf.data.Dataset.zip((features, trans))

dataset = tf.data.Dataset.from_generator(generate_values,
                                         output_types=(tf.string, tf.int64, tf.int64))
dataset = dataset.map(entry_to_features)
dataset = dataset.window(batch_size, drop_remainder=True)
dataset = dataset.flat_map(batch_fn)
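
For reference, here is a minimal standalone sketch (toy data, not the actual feeding.py code) of what I understand .window plus flat_map to be doing: .window(batch_size) groups batch_size consecutive elements into per-component Datasets, and flat_map hands those component Datasets to batch_fn, which then does the real (padded) batching:

import tensorflow as tf

batch_size = 2

# toy stand-ins for (features, lengths)
ds = tf.data.Dataset.from_tensor_slices(
    (tf.constant([1, 2, 3, 4]), tf.constant([10, 20, 30, 40])))

def toy_batch_fn(values, lengths):
    # each argument arrives as a Dataset holding batch_size elements
    return tf.data.Dataset.zip((values.batch(batch_size),
                                lengths.batch(batch_size)))

batched = ds.window(batch_size, drop_remainder=True).flat_map(toy_batch_fn)
# each element of `batched` is now a pair of tensors of shape [batch_size]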

And for DeepSpeech.py

(batch_x, batch_x_seq_len), (batch_y, batch_y_seq_len) = iterator.get_next()
logits, _ = create_model(batch_x, batch_x_seq_len, dropout, reuse=reuse)
total_loss = tf.nn.ctc_loss_v2(labels=batch_y, logits=logits,
                               label_length=batch_y_seq_len,
                               logit_length=batch_x_seq_len,
                               blank_index=None)
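
To make sure I’m reading the dense-label variant correctly, this is a small standalone sketch (made-up shapes, not the actual training code) of how I understand tf.nn.ctc_loss_v2 to be called with dense labels and explicit lengths:

import tensorflow as tf

batch, max_time, num_classes, max_label_len = 2, 50, 29, 10

logits = tf.zeros([max_time, batch, num_classes])          # time-major by default (logits_time_major=True)
labels = tf.ones([batch, max_label_len], dtype=tf.int32)   # dense, padded transcripts (here just class 1)
label_length = tf.constant([max_label_len, max_label_len - 3], dtype=tf.int32)
logit_length = tf.constant([max_time, max_time], dtype=tf.int32)

loss = tf.nn.ctc_loss_v2(labels=labels, logits=logits,
                         label_length=label_length,
                         logit_length=logit_length,
                         blank_index=None)  # None keeps TF's default blank handling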

My question now: since I didn’t change anything in create_model, can I still make reasonable use of the existing checkpoints? They do load at the beginning of training, but my results are worse after training for a while with this ‘new loss’. So I assume that either the weights are not actually being used, or the new loss differs from the old one more than I expected, or I made a mistake in how the dataset is now created. I also wonder why we need the .window call at all, since we are already batching in ‘batch_fn’. Any suggestions? Thanks in advance.
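
To check my first suspicion (that the weights are not actually being restored), I was thinking of comparing the variable names in the checkpoint against the graph, roughly like this (sketch only; checkpoint_dir is a placeholder for my actual checkpoint directory, and it assumes the model graph has already been built):

import tensorflow as tf

checkpoint_dir = '/path/to/checkpoints'   # placeholder
ckpt = tf.train.latest_checkpoint(checkpoint_dir)

# variable names/shapes stored in the checkpoint
ckpt_vars = dict(tf.train.list_variables(ckpt))

# variable names/shapes in the current graph
graph_vars = {v.op.name: v.shape.as_list() for v in tf.global_variables()}

missing = set(graph_vars) - set(ckpt_vars)
print('graph variables not found in checkpoint:', missing)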