Django-ftl and developer workflow

pmac · March 7, 2019, 8:54pm

Hi all,

I’m looking at django-ftl for use on bedrock (www.mozilla.org) since we’ll be moving away from our current l10n system this year. I understand that this means that all strings (not just the non-English ones) will live in the .ftl files. So my question is how this is envisioned to be set up for use with our l10n team as well as developers. Would we have a repo where all of these strings are housed and the bedrock developers would check in new strings, or would we have the default locale strings (en-US) in the bedrock repo and others in another, or some combination thereof?

Also, I see that by default it only looks for .ftl files in apps, but given that the translations will most likely be in an external repo we’d want them in a top level directory instead of in an installed django app. I see in the Bundle class that a custom finder can be defined, but adding a folder config a la django staticfiles would be nice.
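
A folder setting in the style of staticfiles could look roughly like the sketch below. This is not django-ftl's API: `DirectoryFinder`, its `load` method, and the `<dir>/<locale>/<path>` layout are all assumptions made for illustration.

```python
import os


class DirectoryFinder:
    """Hypothetical finder that searches top-level directories for FTL files.

    Assumes the layout <dir>/<locale>/<relative path>.
    """

    def __init__(self, dirs):
        # Directories are searched in the order given, like STATICFILES_DIRS.
        self.dirs = list(dirs)

    def load(self, locale, path):
        # Return the text of the first matching file.
        for root in self.dirs:
            full_path = os.path.join(root, locale, path)
            if os.path.isfile(full_path):
                with open(full_path, encoding="utf-8") as f:
                    return f.read()
        raise FileNotFoundError(f"{path} not found for locale {locale}")
```

With a checkout of an external translations repo at some path, `DirectoryFinder([that_path]).load("de", "bedrock/main.ftl")` would return the German file's contents without the files living inside an installed app.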

Pike · March 7, 2019, 9:30pm

Interesting questions. @spookylukey, do you have encoded visions for that?

From how we do workflows in other projects at mozilla, I’d prefer for the en-US sources to be part of the bedrock repository. I’d also like to have flexibility in how we stage files for deployment.

In a bigger picture that extends beyond bedrock, I can see a couple of different stories:

1. Localization commits back into the product repository, a single source of truth.
2. There’s a separate repository for localizations, with regularly scheduled uplifts into the product repository. Say, on an hourly or daily basis.
3. L10n is deployed from a separate root, hosted in a different repository than the product repository. Deployments can be triggered by either repository or by a schedule.

For a Mozilla workflow, we’d be interested in a really quick deployment to stage at least, which would favor either the first or the third option.

There is the option of project engineers committing to the l10n repository, but personally, I find that workflow bewildering, so I’d rather have that be seen as a counter example.

Axel

pmac · March 7, 2019, 9:49pm

I agree that it would be best to keep at least en-US strings in the project repo as that is how it works currently (though they’re in the templates and python files directly). But the string extraction process that put the en-US strings into the l10n repo will no longer be required (HOORAY!). So once the en-US strings are ready to go out to the translators, will we need a step where we “sync” them or otherwise move them into the l10n repo so that they can be exposed to translators?

As for quick dev/stage deployment I did see in the docs that dynamic reloading is a setting, so I was figuring that such a thing could be on for dev/stage and off for production and there could be some kind of scheduled process to periodically update the files on dev/stage (via git probably). Does that sound reasonable?

spookylukey · March 8, 2019, 7:24am

@pmac - thanks for your interest in django-ftl - this is the first time I’ve become aware that anyone else was looking at it! In fact I’m not using it in production yet, though hopefully soon. FYI I’ve just pushed some more changes to the repo that were languishing on my computer.

Obviously I have very little clue how Mozilla works in this regard. The current design is based around my own needs and what I perceive to be the normal way things work in Django projects, getting inspiration from the current de-facto standard of GNU gettext and other Django conventions, but with significant changes according to how it seems Fluent works.

I’d be very happy to add features that make it more friendly to Mozilla’s workflow, but it is very difficult for me to know what that looks like from the outside, so you’ll have to be really specific. Adding a folder config setting for searching for files would be fine (which might be useful for quite a lot of other people) and so would formalizing an interface for defining your own ‘finder’ (what I’ve implemented so far may not be at all appropriate, I was just trying to extract some logic out of the Bundle class).

The autoreload logic I’ve made was designed for development, I don’t know how it will stand up in a staging environment. It has improved a lot since my first version, which was based around threading and caused no end of issues - the new version is not threaded and uses django_signals.request_started as a hook on which to do work, which seems to work well with the Django development server. So it might work fine in a staging environment, I just don’t know.
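
The non-threaded idea, doing the check from a per-request hook, can be sketched as follows. This is a simplified illustration and not django-ftl's implementation: it compares file modification times on each call, and `ReloadingFile` and `on_request_started` are hypothetical names. In a Django project the hook would be connected to the request_started signal.

```python
import os


class ReloadingFile:
    """Cache a file's contents and re-read them when the file changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._text = None

    def on_request_started(self, **kwargs):
        # Runs at the start of each request, so no background thread is needed.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as f:
                self._text = f.read()
            self._mtime = mtime

    @property
    def text(self):
        # Load lazily on first access.
        if self._text is None:
            self.on_request_started()
        return self._text
```

The cost is one `stat` call per watched file per request, which is usually acceptable for development but is one reason to keep such a check switchable by a setting.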

pmac · March 8, 2019, 4:56pm

seems to work well with the Django development server. So it might work fine in a staging environment, I just don’t know.

Ah! So it probably wouldn’t work with a real WSGI server then. That’s no big deal as we can always just have the l10n file update process send SIGHUP if new translations were fetched.

And thanks for being so open and helpful! I was very pleased to find your project and see how far along it was. I’m working with the l10n team on moving bedrock to something and I think it would be quite nice if that thing were Fluent and django-ftl, but it will be quite a large project to move and an adjustment for the development team.

Another question: what happens if multiple apps or folders contain the same FTL file names? Are the translations combined or is the last one found the winner? I ask because I’m trying to imagine the best developer experience for creating or editing pages. We’d need to be able to add and test strings locally without pulling in the translations repo, but we might also want to be able to have both active as well I think so that we could spot issues. What is your envisioned workflow for translations? Are all FTL files kept in the project repo?

spookylukey · March 11, 2019, 9:04am

The reload mechanism is independent of the development server features, it only uses the request_started signal and pyinotify features. So I expect that it should work, and after testing with uwsgi it seems to work fine.

My hesitations regarding use in staging/production were (1) I hadn’t tested it and had only thought about it as a development feature and (2) I’m not sure what pyinotify/inotify is doing at a system level and whether there might be problems for long running servers e.g. are there resources that might not be cleaned up properly in some circumstances? Are there some system calls that might block sometimes? But this is just my lack of knowledge, and for myself I think for staging it would be OK.

Current behaviour is that first one found is the winner. The app folders are searched in the order found in INSTALLED_APPS, as per the Django convention for templates, static files etc. (I just pushed this change to django-ftl, I had it reversed before).

The idea is that apps would normally namespace their FTL files using the same convention that Django apps usually have e.g. for templates, the Django admin change_form.html template is found at:

django/contrib/admin/templates/admin/change_form.html
                               ^^^^^

With this convention there would normally not be clashes, but the mechanism can be used for overriding.

My idea was that often projects would have multiple independent django_ftl.Bundle instances. For example, if an app like django-wiki used Fluent and django-ftl it would have its own Bundle instance which would list only FTL files for that app, inside a locales/wiki/ directory, and use that bundle in all its templates and code. (This is different from the current Django gettext implementation where all the messages from different apps get combined).

This method doesn’t provide an easy way for a project to override just a few of an app’s messages.

For that, we could change the behaviour so that in the case of multiple FTL files with the same relative path we would combine them all in order (using the same INSTALLED_APPS ordering). Messages/terms defined in earlier files would get preference, because fluent.runtime throws out the later messages in FluentBundle.add_messages. I think this would be a helpful change, and not hard to do.
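
The precedence rule in this proposal can be illustrated with plain dictionaries standing in for parsed FTL files. This is only a sketch of the ordering, not fluent.runtime or django-ftl code, and `combine_messages` is a hypothetical helper.

```python
def combine_messages(sources):
    """Merge message dicts given in search order; the first definition wins."""
    combined = {}
    for messages in sources:
        for message_id, value in messages.items():
            # Keep an existing entry so later files cannot override earlier ones.
            combined.setdefault(message_id, value)
    return combined


# The project's file is searched first and overrides one message of the app.
project = {"welcome": "Welcome to our wiki!"}
app = {"welcome": "Welcome", "edit": "Edit page"}
merged = combine_messages([project, app])
```

Here `merged` takes `welcome` from the project and `edit` from the app, which is the "override just a few messages" case.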

My idea was that all FTL files would live in the repo, based on the fact that this seems to be how current Django apps do it with gettext.

I was then hoping to get translators to use Pontoon to edit their translations - it seems to be set up to deal with repositories in that way, although I struggled a bit with understanding it. I also found that the editor on Pontoon often didn’t work well with more complex FTL messages (bug list). Another option would be integration with something like transifex, but I don’t know if they have even heard of Fluent.

For the time being I may manage everything using VCS and a manual process for sending files back and forward to translators, which will be OK for my very small project (compared to Mozilla standards).

I’m definitely a beginner in this regard - I’m needing to i18n an app for the first time, for a side project, and that’s why I’m working on Fluent. In fact, if I had used gettext etc. extensively in the past, or this was a project for a client, I probably would have stuck with gettext. I’m using Fluent instead because for a long time I’ve been aware of the limitations of gettext and I thought if I’m going to invest in an i18n solution it should be a good one!

So, your input on translator/developer workflow is definitely appreciated!

spookylukey · March 14, 2019, 11:26am

@pmac - I’ve added a new status section to the django-ftl README that may be of use to you.

pmac · March 14, 2019, 5:25pm

Excellent! Yes. Very interesting. Do you have any idea when or if your modifications to the python-fluent project might land upstream?

spookylukey · March 15, 2019, 9:01am

I’ve no idea how long it will take at the moment - though it may depend on whether you will be able to influence your fellow Mozillians Axel and Stas.

Most of the work has been around in some form for about 9 months; so far we have managed to land the first FluentBundle implementation. I’ve had to rebase the patches many times, and realized that now the best thing is to tackle them one by one.

So:

1. The first is PR 92. To move forward I think Axel would like a spec for the resolver, probably in the form of a test suite that includes what errors are produced for various error conditions. I think this is also blocking PR 104.
2. Then there is the compiler branch, which is a large bit of code that may require some time to review. At the time of writing it is pretty much up to date with master and in a good shape, because I’ve updated it recently and it is relatively independent of the other things going on.
3. Finally the escapers. This is currently incomplete - I have only implemented it for the compiler FluentBundle (which is what django-ftl uses), not the resolver. I did have an implementation for the resolver too, but with lots of changes on master it will be easiest to start over, and I’m not going to attempt that again until everything else is done and other bugs with the resolver are fixed. The whole subject of whether this is the right way to handle HTML will also need some design discussion - I think the last communication we’ve had on how to do this correctly is my comments here (see stas’s post that I quote).

Pike · April 3, 2019, 6:11pm

I’ve unblocked some of the topics in fluent.runtime today. Sorry, I’ve had that paged out of my brain for a while, in favor of other stuff.

I started paging in the functions stuff, but that’s the trickiest for me.

The warnings-vs-errors stuff needs some experiments on the details, but at a high level, I think we’re OK.

And today I’ve tackled two regressions in the resolver code that I introduced.

And I haven’t even started thinking about the escaper design, sorry.

spookylukey · April 4, 2019, 8:42am

From my perspective no worries here @Pike - I’m not blocked on anything I need to do for my own use of these projects. I also now have very little time, having taken on a new client, so I will only be able to work on these issues here and there, hopefully enough to keep making some progress.