I played about with the 0.5 model today, using the latest code on master. The results on my test files are a mixed bag.
Here’s what improved:
Transcript:
transparent pricing streamlined purchase a three day worry free exchange and test drives that come to you
0.4.1:
anarchising stream line purchase a three day worry for exchange test drives that come to you
0.5:
transparent rising stream line purchase a three day worry free exchange and test drives that come to you
Transcript:
before we begin, we’re very excited to announce that our new book is coming out it’s available for pre order right now it comes out october the eighteenth we would love it if you bought it
0.4.1:
before we begin were very excited to announce that our new book is coming out its available for pre order right now it comes out all torothee we would love it if you boat it
0.5:
before we begin were very excited to announce that our new book is coming out it’s available for pre order right now he comes out alterity eighteenth we would love it if you bought it
Transcript:
if you wanna know for example why they banned false beards in copenhagen that’s in the book if you wanna find out what method was used to get those thai football team kids out of that cave
0.4.1:
if ye want to know for example why they band pulsebeats in copenhagen that’s in the book if you want to find now what method was used to get those tifoon kids out of that cave
0.5:
if you want to know for example why they band false beards in copenhagen that’s in the book if you want to find out what method was used to get those tie football team kids out of that cave
What got worse:
Transcript:
and now i am joined live in the studio by the prime minister theresa may good morning prime minister morning andrew um can we agree to start with that the one thing that voters deserve in what you yourself has said is going to be a very very important election is no sound bites
0.4.1:
and now i am joined life in the studio by the prime minister teresa make good morning from lin enter and can we agree to start with it the one thing the voters deserving what you yourself he said is going to be a very very important election is no son to bite
0.5:
and now i am joined live in the studio by the prime minister to resume or morning from going into and can we agreed to start with the one thing that vote has deserved in what you yourself he said is going to be a very very important election is no son to bits
Transcript:
in five days from now, MPs will vote on the brexit deal the vote that will not only decide britain’s future in europe but the future of the prime minister
0.4.1:
in five days from now and peace will vote on the break it deal the boat the will not only decide britons future in europe of the future of the prime minister
0.5:
in five days from now this will boat on the break it deal the vote that will not only decide briton’s future in europe at the future of the prime minister
And this one just turned into gobbledegook:
Transcript:
your royal highness meghan markle congratulations to you both thank you can we start with the proposal and the actual moment of your engagement when did it happen how did it happen
0.4.1:
for all highness than i can well coagulations to both and you care i was sold with the proposal and the actual woman your engagement wanted at an how to the happen
0.5:
moral highness and me amalgamations to both and her oneself with the proposal and the acumen for ingagement wended out on honedale
So in summary, the 0.5 model does seem better at distinguishing similar-sounding words (e.g. “bought” vs “boat”), but in some cases it produces worse output than 0.4.1.
(Note: when comparing two incorrect transcripts, I consider one “better” if it is phonetically closer to the correct word, e.g. “and peace” is closer to “MPs” than “this” is.)
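For anyone who wants to put a number on comparisons like these, here is a minimal sketch of a word error rate (WER) calculation in plain Python. The function names `word_edit_distance` and `word_error_rate` are my own and are not part of the DeepSpeech code. It counts word-level substitutions, insertions and deletions only, so it does not capture the phonetic closeness described in the note above.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two lists of words."""
    # prev[j] = distance between the ref words seen so far and hyp_words[:j]
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference transcript is empty")
    return word_edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# First sample from this post.
reference = ("transparent pricing streamlined purchase a three day worry free "
             "exchange and test drives that come to you")
output_041 = ("anarchising stream line purchase a three day worry for "
              "exchange test drives that come to you")
output_05 = ("transparent rising stream line purchase a three day worry free "
             "exchange and test drives that come to you")

print("0.4.1 WER:", round(word_error_rate(reference, output_041), 3))
print("0.5   WER:", round(word_error_rate(reference, output_05), 3))
```

On this sample the reference has 17 words, and the sketch counts 5 word errors for 0.4.1 against 3 for 0.5, which matches the impression that 0.5 improved here. Note that it compares words exactly, so differences such as “it’s” vs “its” count as errors unless the text is normalised first.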