Hi all,
I have trained a Chinese model with version v0.7.4, using 33 hours of MP3 audio from Common Voice. Recognition of the trained sentences is very accurate, even when the same sentence comes from a different source. For example,
source: “對外界提出的問題第一時間交代” (a.mp3)
result: “對外界提出的問題第一時間交代” (recognized from the microphone via WebSocket)
However, it performs badly on new sentences and on combinations of the trained sentences. For example:
source: “對外界提出的問題第一時間交代” (a.mp3)
source: “尊敬的陳冬副主任” (c.mp3)
I tried to combine parts of the above two sentences:
The sentence I want: “陳冬副主任對外界提出的問題”
The result: “陳姑吉列任對外界提出答左我” (recognized from the microphone via WebSocket)
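My suspicion is that the 5-gram LM only covers character sequences that appear in the training transcripts, so a recombined sentence contains n-grams the LM has never seen. A simplified, hypothetical sketch (plain character 5-gram counting, not the actual LM) shows this:

```python
# Hypothetical illustration, NOT the real language model: collect the
# character 5-grams seen in the two training sentences, then count how
# many 5-grams of the recombined sentence were never seen in training.

def char_ngrams(text, n=5):
    """Return the set of all length-n character substrings of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

train = ["對外界提出的問題第一時間交代", "尊敬的陳冬副主任"]
seen = set()
for sentence in train:
    seen |= char_ngrams(sentence)

combined = "陳冬副主任對外界提出的問題"
unseen = [g for g in char_ngrams(combined) if g not in seen]
print(f"{len(unseen)} of {len(char_ngrams(combined))} 5-grams are unseen")
```

The 5-grams that bridge the two sentence fragments never occur in training, so the LM would assign them very low probability, which would explain why the decoder falls back to garbled output around the join point.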
I am sorry that I am unable to provide the training command and code. I can only provide the information below:
Language: Chinese
Accent: Hong Kong (Cantonese)
LM: generated by myself, 2904 characters, 5-grams
MP3 source: Common Voice, 33 hours
Dropout rate: 0.22
Checkpoint step: 222195
best_dev step: 222195
I expect my model to handle combinations (mixing words from two trained sentences).
Q1. What is the problem with my model? Is the dataset not large enough? Any solution?
Q2. Besides, I will continue training the model with new data. If the new data makes the loss larger, will the model ignore the new training and keep using the old checkpoint with the lower loss? I ask because the testing step always says “I Loading best validating checkpoint from ./path/best_dev-222195”, and as I understand it, training runs with a larger loss are never saved to best_dev.
Thank you