Semantics of Bundles & fallbacks

Pike · July 3, 2020, 3:43pm

In bug 1642415, we’re hitting a problem with our understanding of what a Bundle is. This is in a way related to the message and term references thread here, but also not strictly identical.

Problem recap: We have 15 resources, each in two locations. If we have missing strings, we iterate over 2**15 combinations, and cache each. And all it takes for that is a single missing string in the last resource.
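
To make the scale concrete, here is a minimal sketch that counts the candidate combinations. Only the numbers (15 resources, two locations each) come from the recap above; the resource names and the location labels are made up for illustration.

```python
from itertools import product

# Hypothetical names: 15 resources, each available from two locations.
resources = [f"res{i}.ftl" for i in range(15)]
locations = ("app", "langpack")


def source_combinations(resources, locations):
    """Yield one resource -> location mapping per candidate bundle."""
    for choice in product(locations, repeat=len(resources)):
        yield dict(zip(resources, choice))


# Count the candidates a naive fallback iterator could walk through.
count = sum(1 for _ in source_combinations(resources, locations))
print(count == 2 ** len(resources))
```

Each extra resource doubles the count, which is why the number of bundles, and cache entries, grows so quickly.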

In that bug, I’m proposing to only have one Bundle per language. We do more IO in fallback scenarios, but add all Resources for that one language to the same bundle instance. We just re-yield the same instance each time. There are a few interesting upsides to that, in the scenario above in particular:

We only iterate over two versions of the bundle, and when we reiterate, we find all strings on the first iteration.
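
A rough sketch of that idea, under stated assumptions: SharedBundle, LocaleBundles and lookup are hypothetical toy names, not the API of any Fluent implementation, and resources are modeled as plain dicts. Earlier sources win, and duplicate ids from later sources are skipped silently.

```python
class SharedBundle:
    """Toy stand-in for a bundle: one locale, message id -> pattern."""

    def __init__(self, locale):
        self.locale = locale
        self.messages = {}

    def add_resource(self, resource):
        # Later sources only fill gaps; ids that already exist are kept.
        for msg_id, pattern in resource.items():
            self.messages.setdefault(msg_id, pattern)


class LocaleBundles:
    """Re-yield one bundle per language, loading one more source per step."""

    def __init__(self, locale, sources):
        self.bundle = SharedBundle(locale)
        self._sources = list(sources)
        self._loaded = 0

    def __iter__(self):
        # On reiteration, start with everything that is already loaded.
        if self._loaded:
            yield self.bundle
        while self._loaded < len(self._sources):
            for resource in self._sources[self._loaded]:
                self.bundle.add_resource(resource)
            self._loaded += 1
            yield self.bundle


def lookup(bundles, msg_id):
    """Return (pattern, number of bundle versions inspected)."""
    tries = 0
    for bundle in bundles:
        tries += 1
        if msg_id in bundle.messages:
            return bundle.messages[msg_id], tries
    return None, tries


# Hypothetical data: the first location is missing "goodbye".
app = [{"hello": "Hello"}, {}]
langpack = [{"hello": "Hello (pack)"}, {"goodbye": "Bye"}]

bundles = LocaleBundles("en-US", [app, langpack])
first = lookup(bundles, "goodbye")   # needs the second source
second = lookup(bundles, "goodbye")  # already loaded, found right away
```

With two locations, the consumer sees at most two versions of the bundle, and any later lookup hits the fully loaded instance on its first step.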

We also have better chances to resolve message and term references correctly, at least within a single language. There’s an option to extend this to work across languages, if we moved the language from bundle to pattern.

There’s also a caveat, and that is that we lose the ability to warn about conflicting resources in a bundle. Note, only the fluent-rs implementation actually gives feedback on that at this point, AFAIK.

Now, I see a few ways forward, but figured an open and documented conversation here would be best:

Find some other way to fix bug 1642415.
Be fine with no runtime checks on conflicting resources, either by no longer producing them in fluent-rs, or by not reporting them in the Gecko bindings.
Add another option to addResource like ignoreDuplicates. We wouldn’t ignore duplicates in the first resource suite, only in the following ones.
Something I missed.

The ignoreDuplicates option sounds easy on the JS/Python front, but Rust doesn’t seem to like that at all from an API POV?
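
For the sake of discussion, the ignoreDuplicates idea could look roughly like this. CheckedBundle and the ignore_duplicates parameter are hypothetical; no current implementation is claimed to have this option, and the error text is invented.

```python
class CheckedBundle:
    """Toy bundle that reports conflicting ids unless told to ignore them."""

    def __init__(self):
        self.messages = {}

    def add_resource(self, resource, ignore_duplicates=False):
        """Add messages; return a list of error strings for conflicts."""
        errors = []
        for msg_id, pattern in resource.items():
            if msg_id in self.messages:
                if not ignore_duplicates:
                    errors.append(f"duplicate message id: {msg_id}")
                continue  # the existing definition is kept either way
            self.messages[msg_id] = pattern
        return errors


# Hypothetical suites: the first one contains an accidental conflict.
first_suite = [{"hello": "Hello"}, {"hello": "Hi"}]
fallback_suite = [{"hello": "Hello (fallback)", "bye": "Bye"}]

bundle = CheckedBundle()
errors = []
for resource in first_suite:
    # Strict: conflicts inside the first suite are reported.
    errors += bundle.add_resource(resource)
for resource in fallback_suite:
    # Lenient: fallback resources are expected to repeat ids.
    errors += bundle.add_resource(resource, ignore_duplicates=True)
```

Here the conflict inside the first suite is still reported, while the expected overlap with the fallback suite stays quiet.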

Any opinions or concerns from other implementations?

macabeus · July 15, 2020, 11:28pm

I don’t have the full context of this discussion, but the idea “only have one Bundle per language” caught my attention.

In my case, I have a couple of frontend projects.
To improve code reuse, I have one UI library that is imported as a lib by the other projects.
So I have two namespaces: one for the UI library, and another one for the main project.

I’m saying namespace because I’m using i18next, but the idea would be very similar in fluent.js.

How would the idea of just one bundle per language work in this situation (one lib + main project)? Or would I still have two bundles, since I’m working with two different projects?
At the moment, we can import two resources into the same bundle, like that. Is that not enough?

Thanks for listening.

Pike · July 16, 2020, 11:42am

The library example is actually pretty close to what we’re doing in Firefox: you have a library with strings (toolkit in our case), and a couple of apps (Firefox, Thunderbird, etc. in our case).

If you want app-specific strings, you have the choice of providing a good default in the library layer. We used to do this in the pre-Fluent days of Firefox quite extensively, so it’s built into the Fluent architecture.

So, now you have a string hello-world in the library, and in the app, you want to create a custom version. Which means that for one language, you have two definitions. You can do that by using a bundle per language and app/library, or by using a single bundle.

The contrasting scenario is that you actually want two strings in your app, one being Hello World and the other Hello App, and you accidentally give both the same ID.

Some of our bundle implementations favor the latter (Rust, Firefox), while the JS/Python implementations claim they should, but don’t really, and thus leave room for liberal overloading.

All of this logic lives in the localization class layer, btw. The js impl isn’t very vocal about that. In python and rust, there are explicit library components for that.

The other part that’s making this hard is that we overload lazily. I.e., we load the app strings, and only once we don’t find something do we load the library strings. If we weren’t keen on doing so, we could just load the library strings into the bundle, and then add the app strings with allowOverrides: true. That is excessive IO, and it also won’t give developer feedback if the app and the library strings conflict by mistake.
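
The trade-off can be sketched like this. ToyBundle, FILES, load, lazy_lookup and eager_bundle are all hypothetical names for illustration; allow_overrides only mimics the override option mentioned above, and file IO is faked with a log list.

```python
class ToyBundle:
    """Toy stand-in for a bundle: message id -> pattern."""

    def __init__(self):
        self.messages = {}

    def add_resource(self, resource, allow_overrides=False):
        """Add messages; return the ids that were rejected as conflicts."""
        rejected = []
        for msg_id, pattern in resource.items():
            if msg_id in self.messages and not allow_overrides:
                rejected.append(msg_id)
                continue
            self.messages[msg_id] = pattern
        return rejected


# Hypothetical string tables standing in for files on disk.
FILES = {
    "app": {"hello-world": "Hello App"},
    "library": {"hello-world": "Hello World", "close": "Close"},
}


def load(name, io_log):
    """Pretend to read a resource, recording the IO."""
    io_log.append(name)
    return FILES[name]


def lazy_lookup(msg_id, io_log):
    """Try the app strings first; touch the library only on a miss."""
    for name in ("app", "library"):
        messages = load(name, io_log)
        if msg_id in messages:
            return messages[msg_id]
    return None


def eager_bundle(io_log):
    """Load everything up front, letting app strings override the library."""
    bundle = ToyBundle()
    rejected = bundle.add_resource(load("library", io_log))
    rejected += bundle.add_resource(load("app", io_log), allow_overrides=True)
    return bundle, rejected


lazy_io, eager_io = [], []
lazy_value = lazy_lookup("hello-world", lazy_io)
bundle, rejected = eager_bundle(eager_io)
```

Both paths resolve hello-world to the app version, but the eager one always reads both files, and with overrides allowed it has no way to tell an intended override from an accidental ID clash.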

stas · July 16, 2020, 11:59am

That’s a good summary, @Pike. One question: can you explain what you meant by all of this logic below?

Pike · July 16, 2020, 3:12pm

Which Resources to put into which Bundle for which locale, and which options to use there.

macabeus · July 17, 2020, 12:37am

In i18next, which I mentioned before, we don’t have the scenario of accidentally overwriting a message with the same ID, because each resource has a namespace.

But I’m wondering if just adding a namespace concept to fluent.js, similar to i18next, would be a simpler approach to solving this same problem. We could still have a generic message bundle (in toolkit, for example) and another bundle for the app-specific strings.
The app-specific namespace is the first place to look, and the generic one is the fallback.
We’d still have multiple bundles for the same language, but each one named, and with an explicit preference order for finding each message.

Since we’d need to search each namespace until we find the message, and I don’t think we’d have many namespaces (maybe 2 or 3?), I think it’s a fast approach.
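
To illustrate what I mean, here is a small sketch of the lookup. It is not i18next or fluent.js code; the namespace names, the messages and the resolve helper are made up for the example.

```python
# Hypothetical namespaces: one bundle-like table per namespace.
namespaces = {
    "app": {"hello-world": "Hello App"},
    "toolkit": {"hello-world": "Hello World", "close": "Close"},
}

# Explicit preference order: app-specific first, generic as the fallback.
order = ["app", "toolkit"]


def resolve(msg_id, order, namespaces):
    """Return (namespace, pattern) for the first namespace that has msg_id."""
    for name in order:
        messages = namespaces[name]
        if msg_id in messages:
            return name, messages[msg_id]
    return None, None


overridden = resolve("hello-world", order, namespaces)
inherited = resolve("close", order, namespaces)
missing = resolve("unknown", order, namespaces)
```

The cost per message is at most one dictionary check per namespace, so with 2 or 3 namespaces the lookup stays cheap.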

Do you think that “multiple bundles per language, each one with a namespace” could be simpler?
Do we already have something like this that didn’t solve the problem?

Thank you!