audio_sample_rate 48000
Use 16 kHz like the rest of us.
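If you are not sure which clips are at the wrong rate, a quick check like the one below may help. It is only a sketch using Python's standard `wave` module; the function name `find_wrong_rate` and the idea that your clips sit in one folder of `.wav` files are my assumptions, not something from your setup.

```python
import wave
from pathlib import Path

TARGET_RATE = 16000  # Hz, the rate suggested above


def find_wrong_rate(folder, target_rate=TARGET_RATE):
    """Return (path, rate) pairs for WAV files whose sample rate differs from target_rate.

    `folder` is a hypothetical directory of clips; subfolders are searched too.
    """
    mismatches = []
    for path in sorted(Path(folder).rglob("*.wav")):
        # Only the header is read here; no audio data is decoded.
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
        if rate != target_rate:
            mismatches.append((str(path), rate))
    return mismatches
```

Calling `find_wrong_rate("clips")` (with `clips` replaced by your own folder) returns the files that still need resampling. A tool such as SoX can then convert them, for example `sox input.wav -r 16000 output.wav`.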
This means you have bad data; get rid of it.
With this little material, your results are actually quite good. Use a lower learning rate for transfer learning, and maybe try the 0.8 checkpoint. Some people had problems with the newer release.