We are a small but growing group of intrepid community members (staff and volunteers) who believe Mozilla should help protect the web through machine translation.
This looks really interesting. I hope to get involved with the project as soon as I know enough about machine learning. These innovations are very inspiring!
Look at slides 16, 17, and 18 for pathways for involvement and relevant resources and starting points. Also, come hang out in the #intellego room in MozIRC (irc.mozilla.org). We’re always idling there. Hope to see you soon!
That was a very cool talk!
Thanks for sharing the slides. The contribution pathways are a little clearer now, though still not entirely clear to me.
Is anything happening on the idea of scanning Wikipedia to build a statistical model?
I would be quite interested in that.
Scanning Wikipedia to build a statistical model was an idea that one of our contributors came up with. We haven’t explored it or fleshed it out in more detail yet because that’s working towards the long-term goal of building our own termbase. For the immediate future, our priority is getting a working tool in an upcoming release of Firefox as fast as possible. Google spent 10 years building their termbase. We’re up for it, and we’d welcome your efforts along that line, but it’s unlikely to yield quick results.
That being said, feel free to take on that part of the project! You can start an Etherpad and begin specifying what is required to do the scanning and statistical modeling. I’m sure other interested developers will join in! Please let us know in #intellego if we can help you along.
@mekki Well, to build it quickly (and get going) you still need something to base the work on. From the specs, it seemed to me that the primary objective at this stage is to integrate the available options and let the user choose which one gives the better result.
But I could not find any specific details on work that might have been done in that regard. Is there anywhere I can read more about it, or is it still in the planning stage?
I am not part of the MT team at IBM (or even close to it), but we do somewhat similar work here for training our Watson pipeline (not for MT, though). So I would be interested in taking you up on the offer (the Wikipedia part).
I will probably create an Etherpad, but before that I will try my hand at it a little to see how feasible it is.