French model 0.3 for DeepSpeech v0.6

Hello,

I took the previously shared Docker setup and updated it to the current version of DeepSpeech (v0.6).

The model was trained with the Docker setup available at https://github.com/Common-Voice/commonvoice-fr/blob/master/DeepSpeech/CONTRIBUTING.md

  • trained from scratch
  • LinguaLibre import
  • TrainingSpeech import
  • Common Voice import

The language model is used, and it has been extended: it is now built from a Wikipedia dump plus the debates of the Assemblée nationale.

As for quality, here is the test output:

Testing model on /mnt/extracted/data/lingualibre/lingua_libre_Q21-fra-French_test.csv
Test epoch | Steps: 113 | Elapsed Time: 0:00:55                                                                                                                                                                                                                                                                                                                                          
Test on /mnt/extracted/data/lingualibre/lingua_libre_Q21-fra-French_test.csv - WER: 0.483298, CER: 0.153861, loss: 7.202296                                                                 
--------------------------------------------------------------------------------                                                                                                                                                                                                                                                                                                         
WER: 4.000000, CER: 0.500000, loss: 15.194188                                                                                                                                               
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/Ltrlg/endosmomètre.wav                                                                                                                                                                                                                                                                                        
 - src: "endosmomètre"                                                                                                                                                                                                                                                                                                                                                                   
 - res: "en dos mon mettre"                                                                                                                                                                                                                                                                                                                                                              
--------------------------------------------------------------------------------                                                                                                            
WER: 4.000000, CER: 0.333333, loss: 22.200766                                                                                                                                                                                                                                                                                                                                            
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/Lyokoï/butyrupapiphiliste.wav                                                                                    
 - src: "butyrupapiphiliste"                                                                                                                                                                                                                                                                                                                                                             
 - res: "buti ru papic liste"                                                                                                                                                               
--------------------------------------------------------------------------------                                                                                                                                                                                                                                                                                                         
WER: 4.000000, CER: 0.388889, loss: 27.049189                                                                                                                                               
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/WikiLucas00/psycholinguistique.wav                                                                                                                                                                                                                                                                            
 - src: "psycholinguistique"                                                                                                                                                                
 - res: "si colin qui stique"                                                                                                                                                                                                                                                                                                                                                            
--------------------------------------------------------------------------------                                                                                                                                                                                                                                                                                                         
WER: 4.000000, CER: 2.000000, loss: 34.208519                                                                                                                                                                                                                                                                                                                                            
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/Ltrlg/SMTPS.wav
 - src: "smtps"
 - res: "elle me et se"
--------------------------------------------------------------------------------
WER: 4.000000, CER: 0.440000, loss: 44.713158
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/DSwissK/anticonstitutionnellement.wav
 - src: "anticonstitutionnellement"
 - res: "anti comité ce element"
--------------------------------------------------------------------------------
WER: 3.000000, CER: 0.200000, loss: 2.525115
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/Lyokoï/vivrensemblisme.wav
 - src: "vivrensemblisme"
 - res: "vivre ensembl isme"
--------------------------------------------------------------------------------
WER: 3.000000, CER: 0.307692, loss: 5.106504
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/Lyokoï/pleurogynique.wav
 - src: "pleurogynique"
 - res: "pleu rogi mique"
--------------------------------------------------------------------------------
WER: 3.000000, CER: 0.266667, loss: 5.884517
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/WikiLucas00/cuthomiurophile.wav
 - src: "cuthomiurophile"
 - res: "cu tomi rochile"
--------------------------------------------------------------------------------
WER: 3.000000, CER: 0.555556, loss: 6.203856
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/WikiLucas00/remiauler.wav
 - src: "remiauler"
 - res: "remis au la"
--------------------------------------------------------------------------------
WER: 3.000000, CER: 0.333333, loss: 6.232978
 - wav: file:///mnt/extracted/data/lingualibre/lingua_libre/Q21-fra-French/WikiLucas00/cicéronesque.wav
 - src: "cicéronesque"
 - res: "si ce ronesque"
--------------------------------------------------------------------------------
Testing model on /mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR_test.csv
Test epoch | Steps: 193 | Elapsed Time: 0:08:48                                                                                                                                                                                                                                                                                                                                          
Test on /mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR_test.csv - WER: 0.195013, CER: 0.063836, loss: 20.937897
--------------------------------------------------------------------------------
WER: 3.000000, CER: 1.000000, loss: 15.624549
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/LeComteDeMonteCristoT1Chap5_0237.converted.wav
 - src: "espoir"
 - res: "est ce pour"
--------------------------------------------------------------------------------
WER: 3.000000, CER: 1.000000, loss: 17.766432
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/MonsieurLecoqP1C16_0188.converted.wav
 - src: "continuez"
 - res: "quand il ne"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.500000, loss: 6.433061
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/MonsieurLecoqT2P04_0012.converted.wav
 - src: "hola"
 - res: "o la"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.750000, loss: 8.783286
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/LesMysteresDeParisT3P5C12_0281.converted.wav
 - src: "cici"
 - res: "si si"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.300000, loss: 9.000151
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/LesMysteresDeParisT2P3C2_0146.converted.wav
 - src: "infortunee"
 - res: "un fortune"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.714286, loss: 14.999763
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/LaGloireDuComacchio_0276.converted.wav
 - src: "orfevre"
 - res: "or fait "
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.750000, loss: 19.292809
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/LesMysteresDeParisT4P8C13_0002.converted.wav
 - src: "punition"
 - res: "une fission"
--------------------------------------------------------------------------------
WER: 1.666667, CER: 0.681818, loss: 104.470360
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/LesMysteresDeParisT1P1C5_0129.converted.wav
 - src: "diminution de fourloir"
 - res: "diminution de de fourreur a sa fin"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.333333, loss: 7.173564
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/madamebovaryC18_0315.converted.wav
 - src: "parle moi"
 - res: "par le moins"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.500000, loss: 9.321317
 - wav: file:///mnt/extracted/data/trainingspeech/ts_2019-04-11_fr_FR/madamebovaryC24_0131.converted.wav
 - src: "ah bonjour"
 - res: "a pour jour"
--------------------------------------------------------------------------------
Testing model on /mnt/extracted/data/cv-fr/clips/test.csv
Test epoch | Steps: 225 | Elapsed Time: 0:04:54                                                                                                                                                                                                                                                                                                                                          
Test on /mnt/extracted/data/cv-fr/clips/test.csv - WER: 0.381664, CER: 0.186872, loss: 37.832821
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.333333, loss: 12.047651
 - wav: file:///mnt/extracted/data/cv-fr/clips/890d482adb285fdfcdf1ad5e877d175be48cfe9544834136b7ef094f7252083211f6293c49b49bdc5fcbebeccc9b3eab11f0883358648441015127d2b05e902f.wav
 - src: "bienvenue"
 - res: "bien menu"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.375000, loss: 12.625672
 - wav: file:///mnt/extracted/data/cv-fr/clips/a75444979e340102e37ccacd9244f3391bb060497716b11c67159339b4b9c99179e4f33424b348905d615e3470e6bfffaf621ccc9441db79f170a2a91aef3d79.wav
 - src: "immaculé"
 - res: "il macule"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.857143, loss: 17.255438
 - wav: file:///mnt/extracted/data/cv-fr/clips/e7fcb32b8cac6133b8f84803d6300f7035143dde2720549af428e2528a972eec51defa3564ee54f360f20d272fd9f8c37af4eaa2d3e2ddffb12f2015ac9ee212.wav
 - src: "anglais"
 - res: "un gré"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.600000, loss: 19.866590
 - wav: file:///mnt/extracted/data/cv-fr/clips/52bd675747b86ed0817c91351f4b861448c3b1b364850c378af89ca2e4f03a94a6e17b111c4161fe91364b45c5294bd9ab745e3f8d20b4689332da76cafd99a6.wav
 - src: "scandaleux"
 - res: "en alu"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.285714, loss: 21.647202
 - wav: file:///mnt/extracted/data/cv-fr/clips/27a4b648313e1daa05708c74af8d0f68d010e34b949735a1ea85a58ebe1057d416a22a1f21f9e4913a8f556f57534833c39658d3fd87a554c003903ae07552e2.wav
 - src: "dommage"
 - res: "de mage"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.533333, loss: 22.436478
 - wav: file:///mnt/extracted/data/cv-fr/clips/6adb27aca9f87c8c97d3c37107a7f2ae8121b9ed005ec1a93ea1daaa317f082866bcf615e4d776080292cc3f32ee2a5476e2bef75c1487d9dfb2b37eabd12e8f.wav
 - src: "en substitution"
 - res: "en su qui de son"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.545455, loss: 22.594717
 - wav: file:///mnt/extracted/data/cv-fr/clips/e7cfa56b14f04aa3ef3199fb21e9500e257a8c99784de8f785143007bacf3c17f1c64325235afe1112775b1385d0a15647d69594b6c012107266f8637b794cf8.wav
 - src: "défavorable"
 - res: "des érables"
--------------------------------------------------------------------------------
WER: 1.666667, CER: 0.789474, loss: 44.397202
 - wav: file:///mnt/extracted/data/cv-fr/clips/67057db57aa991f9045132389966a7b3326f373c8870aa126d743e24d4f61613d8ac382c63fc7341dd429c1f5ec126205b19ecbe53027130bd5b684d87cd8200.wav
 - src: "maintenant il pêche"
 - res: "vingt ans est ce tas"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.500000, loss: 27.554390
 - wav: file:///mnt/extracted/data/cv-fr/clips/f482c8c54dcf5f84dbe8b158ccc7f9c31644ff91622a9323da1361f353c3406794c962599a9f745d4a9f39c1c539a200885a5af7683ae63e47880d882d9c74f4.wav
 - src: "ayez pitié"
 - res: "il y a pitié"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.666667, loss: 27.840151
 - wav: file:///mnt/extracted/data/cv-fr/clips/2723b11f49442307f855fa8a2f08ad11bd8a983b73502eec38b9d986ef4d9e9cf9c0c34898f42642522c0598ef3d6bbe5f8478bc7682bde492dcc922cdc1a31e.wav
 - src: "ok indigo"
 - res: "once de go"
--------------------------------------------------------------------------------
Testing model on /mnt/extracted/data/M-AILABS/fr_FR/fr_FR_test.csv
Test epoch | Steps: 222 | Elapsed Time: 0:16:53                                                                                                                                                                                                                                                                                                                                          
Test on /mnt/extracted/data/M-AILABS/fr_FR/fr_FR_test.csv - WER: 0.090007, CER: 0.027703, loss: 15.271018
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.250000, loss: 4.159900
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/female/ezwa/monsieur_lecoq/wavs/monsieur_lecoq_1_18_f000067.wav
 - src: "lesquels"
 - res: "les quel"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.166667, loss: 4.634198
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/female/ezwa/monsieur_lecoq/wavs/monsieur_lecoq_2_36_f000179.wav
 - src: "dubois"
 - res: "du bois"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.625000, loss: 11.090318
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/male/gilles_g_le_blanc/lupin_contre_holmes/wavs/lupin_contre_holmes_13_f000184.wav
 - src: "personne"
 - res: "la sont"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 5.000000, loss: 13.735187
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/female/nadine_eckert_boulet/les_mysteres_de_paris/wavs/les_mysteres_de_paris_4_13_f000027.wav
 - src: "m"
 - res: "on ne"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.500000, loss: 3.678611
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/male/gilles_g_le_blanc/lupin_contre_holmes/wavs/lupin_contre_holmes_07_f000165.wav
 - src: "m destange"
 - res: "mais des tang"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.333333, loss: 16.762291
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/male/gilles_g_le_blanc/lupin_contre_holmes/wavs/lupin_contre_holmes_14_f000218.wav
 - src: "langlais ricana"
 - res: "l'anglaise et cana"
--------------------------------------------------------------------------------
WER: 1.333333, CER: 0.137931, loss: 7.862315
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/female/ezwa/monsieur_lecoq/wavs/monsieur_lecoq_1_06_f000057.wav
 - src: "lecoq trépignait d'impatience"
 - res: "le coq trépignait d'un patience"
--------------------------------------------------------------------------------
WER: 1.125000, CER: 0.708333, loss: 117.493599
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/female/nadine_eckert_boulet/les_mysteres_de_paris/wavs/les_mysteres_de_paris_2_09_f000184.wav
 - src: "m césar bradamanti qui l'a guéri d'un rhumatisme"
 - res: "pour ve monsieur césar amani qu'il a gris d'amati"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.333333, loss: 0.081756
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/male/gilles_g_le_blanc/lupin_contre_holmes/wavs/lupin_contre_holmes_12_f000053.wav
 - src: "oui"
 - res: "ou"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.333333, loss: 0.090137
 - wav: file:///mnt/extracted/data/M-AILABS/fr_FR/female/nadine_eckert_boulet/madame_bovary/wavs/madame_bovary_1_08_f000046.wav
 - src: "oui"
 - res: "ou"
--------------------------------------------------------------------------------
Testing model on /mnt/extracted/data/African_Accented_French/African_Accented_French/African_Accented_French_test.csv
Test epoch | Steps: 20 | Elapsed Time: 0:00:19                                                                                                                                                                                                                                                                                                                                           
Test on /mnt/extracted/data/African_Accented_French/African_Accented_French/African_Accented_French_test.csv - WER: 0.492108, CER: 0.271749, loss: 47.807568
--------------------------------------------------------------------------------
WER: 1.875000, CER: 2.068966, loss: 309.880615
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell2-24/ctell2-24-146.wav
 - src: "quand est ce qu' on l' a volé"
 - res: "cet impossible de savoir quand il que en la volée parce que ce n'est pas objet volé"
--------------------------------------------------------------------------------
WER: 1.750000, CER: 1.555556, loss: 174.814621
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell1-01/ctell1-01-004.wav
 - src: "quand êtes vous né"
 - res: "mais je suis ne amincissant et un"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.800000, loss: 25.109167
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell4-57/ctell4-57-131.wav
 - src: "bonne nuit"
 - res: "bonne ni mais si"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.750000, loss: 50.073940
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell2-29/ctell2-29-275.wav
 - src: "saignez vous"
 - res: "ce que vous ne"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 1.040000, loss: 137.309784
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell3-51/ctell3-51-098.wav
 - src: "quelle est sa nationalité"
 - res: "ces personnes paient ou mon entier"
--------------------------------------------------------------------------------
WER: 1.375000, CER: 1.043478, loss: 265.540131
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell2-20/ctell2-20-046.wav
 - src: "pendant combien de temps croyez vous rester là"
 - res: "selon comment le pot était métadonnée ses lointains que je pouvais"
--------------------------------------------------------------------------------
WER: 1.333333, CER: 1.233333, loss: 197.459869
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell3-51/ctell3-51-084.wav
 - src: "de quelle couleur est sa barbe"
 - res: "il n'y a pas de barre massamaesso comme sa peau"
--------------------------------------------------------------------------------
WER: 1.250000, CER: 0.500000, loss: 35.691940
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/read/ctell5-82/ctell5-82-1103.wav
 - src: "autre pays autre coutume"
 - res: "au fait payer aux coutu"
--------------------------------------------------------------------------------
WER: 1.250000, CER: 1.266667, loss: 76.770599
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell4-55/ctell4-55-093.wav
 - src: "que mesure t il"
 - res: "monsieur un maître sur sa "
--------------------------------------------------------------------------------
WER: 1.250000, CER: 0.840000, loss: 90.877655
 - wav: file:///mnt/extracted/data/African_Accented_French/African_Accented_French/speech/train/yaounde/answers/ctell2-26/ctell2-26-192.wav
 - src: "souffrez vous de vertiges"
 - res: "que vous avez a souffert de neige"
--------------------------------------------------------------------------------
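A note on reading the log above: per-sample WER values greater than 1 (for example 4.0 for "endosmomètre") are expected when a single reference word is split into several hypothesis words. Here is a minimal sketch of how such figures can be reproduced, assuming the usual definitions (edit distance divided by the reference length, counted in words for WER and in characters for CER). The function names are mine, not DeepSpeech's, and the sketch assumes a non-empty reference.

```python
import unicodedata


def levenshtein(ref, hyp):
    """Edit distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference
                            curr[j - 1] + 1,      # insertion into the reference
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(src, res):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = src.split()
    return levenshtein(ref_words, res.split()) / len(ref_words)


def cer(src, res):
    """Character error rate: char-level edit distance / reference length."""
    # Normalise so that accented letters count as one character each.
    src = unicodedata.normalize("NFC", src)
    res = unicodedata.normalize("NFC", res)
    return levenshtein(src, res) / len(src)


if __name__ == "__main__":
    # First LinguaLibre sample from the log: 1 reference word, 4 output words.
    print(wer("endosmomètre", "en dos mon mettre"))  # 4.0
    print(cer("endosmomètre", "en dos mon mettre"))  # 0.5
```

With one reference word and four hypothesis words, the word-level distance is one substitution plus three insertions, hence 4/1 = 4.0. This is why long, rare words dominate the "worst samples" list even when the CER stays moderate.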

You can find it (along with the training parameters) at:

FYI, I made a mistake when updating for v0.6.0-alpha.10, so I have to redo the training and the export: the models did not contain the required metadata, so DS_CreateModel() failed with this error:

E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] model_pruner failed: Invalid argument: Invalid input graph.
Unable to fetch metadata: Invalid argument: Tensor metadata_feature_win_len:0, specified in either feed_devices or fetch_devices was not found in the Graph
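For anyone who wants to check an exported file before loading it, here is a crude sketch. It assumes that node names are stored as plain UTF-8 strings in the serialized graph, so a simple byte search can tell whether `metadata_feature_win_len` (the only name taken from the error above) is present. This is a heuristic of mine and not a DeepSpeech or TensorFlow API. The helper names and the example path are hypothetical.

```python
from pathlib import Path

# Only this node name comes from the error message; extend the tuple if you
# know which other metadata nodes your DeepSpeech version expects.
DEFAULT_NODES = ("metadata_feature_win_len",)


def missing_nodes_in_bytes(data, node_names=DEFAULT_NODES):
    """Return the node names whose UTF-8 encoding does not occur in `data`."""
    return [name for name in node_names if name.encode("utf-8") not in data]


def missing_nodes_in_file(model_path, node_names=DEFAULT_NODES):
    """Same check on a model file; reads the whole file into memory."""
    return missing_nodes_in_bytes(Path(model_path).read_bytes(), node_names)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python check_metadata.py output_graph.pb
    if len(sys.argv) > 1:
        missing = missing_nodes_in_file(sys.argv[1])
        print("missing metadata nodes:", missing or "none")
```

An empty result only means the name occurs somewhere in the file. It does not prove that the graph is valid, so loading the model with the matching DeepSpeech client remains the real test.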

I should be able to put them back online in a few hours.

It took a bit longer than planned, but it's done.
