Planned changes to the FluentBundle API

stas · May 28, 2019, 12:08pm

Late in April I started working on a new FluentBundle API. I’d like to move forward with the new design soon. Since there has been little discussion lately in the related issues and PRs (mostly due to my absence), I’d like to bring this discussion to the attention of this forum.

The summary of the new design can be found in the following GitHub issue:

Store messages as FluentMessage in the runtime #365.

I’d like to get your thoughts by next Tuesday. If there’s an agreement to proceed in the proposed direction, I’d like to land the new API next week. If there are unaddressed concerns, we’ll discuss further. Thanks!

stas · May 28, 2019, 12:09pm

I’m looking for feedback from @eemeli, @Pike @spookylukey, and @zbraniecki.

eemeli · May 28, 2019, 4:16pm

I strongly prefer the compound() API of #208, mostly due to its simplicity and directness. It feels optimised for the most common use cases, and does not require processing an intermediate object.

Furthermore, if it’s a design target that “attributes shouldn’t really be used as dictionaries, or keyed collections”, that’s far more clear with compound(), which is communicating that you’re not supposed to access attributes individually, as FluentMessage#formatAttribute() allows you to do.

The only clear benefit that I see with the FluentMessage API is how it enables caching the results of a single map lookup for re-resolving a message with different parameters, but even then it still requires carrying around a reference to the bundle itself. That tradeoff does not seem worth it, when the cost is having three functions in place of just compound(), and needing to call at least two of them in order to get a string result.

And sure, the new format() would be non-backwards-compatible with the current one, but I’d argue that’s okay, and in any case the same format("key") pattern is a part of the FluentMessage API as well.

Pike · May 29, 2019, 10:53am

To me, the point of critique on FluentMessage is that the format methods are back on the bundle.

You tied this back to the error reporting, which I’ll comment on once more.

Which makes me wonder if this is best discussed with all proposals at once, instead of trying to make individual changes?

We might need to talk about the big-big picture, too. As I’ll mention in the error reporting issue, I think that we should do error fallback on runtime errors. That we don’t do that in gecko is a shortcut, but I see that as a bug, not a feature.

stas · May 31, 2019, 2:57pm

Thank you for sharing your thoughts, @eemeli.

FluentBundle is a very low-level API, meant to be used to build higher-level, more convenient APIs, optimized for their specific use-cases. The goal of the redesign is to support a wide range of different use-cases. The compound API has two shortcomings, which I think seriously hinder this goal:

It currently doesn’t allow the pull model; instead it formats all attributes up-front. This could be mitigated by changing the result’s attributes field to some kind of a “lazy Map” structure.
It doesn’t allow re-use or a message which has already been retrieved. I don’t think getMessage is a major performance bottleneck, but it would be nice to avoid the cost of the lookup by the identifier. In particular, this comes in handy for interfaces using long lists, like the list of emails in an email client, or the list of visited websites in a browser, and many more.

That’s a good point. To counter it: as soon as we want to extend the API to allow the pull approach, there needs to be a way to request a specific attribute. My goal with the proposal was to allow it, but not make it too easy, which is why format(id) doesn’t support it. (If it did, it would also need to split the id on the dot for every attribute of the same message, which again isn’t optimal performance wise.)

Yes, this was one of the goals of this proposal. While I don’t think it’s a must-have, I also believe that the low-level API should cater to a wide range of possible use-cases. The way in which I’d like to achieve it, is by creating single-purpose abstractions (bundle, scope, message). I realize that it makes the API slightly more complex by making the users juggle a few more objects. Is it an acceptable cost?

stas · May 31, 2019, 3:00pm

I did it this way to allow for full encapsulation in languages which support it. If format was on the message, than bundle would need a way to publicly expose its internals, such as the map of terms, or the cached Intl formatters. I would consider this a design bug.

zbraniecki · May 31, 2019, 3:50pm

I see the value in Stas’ proposal mainly as:

1) It moves the caching problem to the consumer.

If your app has a list of 100 items to be displayed using the same Message, but different args, you solve it above FluentBundle level.

For me, FluentBundle is the lowest level, non-consumer-friendly, API that should allow higher level APIs such as one called currently Localization or some bindings to do what they want.

With the current model, we basically enforce any optimizations of that type to requested out of the implementation of FluentBundle itself.

2) It allows for separation of Message retrieval from formatting

Currently, those two actions are combined into a single operation, indivisible.
If FluentBundle was meant to be used by consumers, I’d understand such API.

But since it is not, having ability to perform each operation separately has a value in composability and flexibility we offer to the higher level APIs.
Caching, listed above, is one, but I can imagine scenarios where one would want to retrieve a Message and format it using multiple separate FluentBundle instances.

I don’t think there’s any reason to disallow such operation, and Stas’ proposal seems to offer the right set of APIs to allow for such operations.

@eemeli’s suggestion to return an instance that has internal reference to the bundle feels off to me. It’s more hidden magic that doesn’t seem to be necessary, and locks us down in multiple concepts that don’t hold.

The Message in Fluent exists independently of any Bundle. It’s actually part of the Resource, not Bundle. Bundle on a data model level is just a list of references to Resource objects plus a cache of intl formatters used by the bundle.

We could also just allow for Message to be retrieved from Resource, and then be formatted by the Bundle.

3) It puts the error fallbacking in place

There are two classes of errors we’re discussing here: a) errors that prevent you from retrieving the Message or value you want. b) errors that happen during Message resolution.

The former type is a clear breaking scenario and the main result of FluentBundle.retrieveMessage(id) should be that it either succeeded and returns what you asked for, or failed with a single error.
This is the base of for a higher level API and bindings that do error fallbacking, and maybe also last-resort option of displaying l10n-id in place of a message if that fails.

The latter is very different, because we want to resolve as much of the message as possible, while accumulating errors on the way and in the end always return some string and a list of errors it accumulated.
Once again, the higher level API may decide what to do in case of errors and it may decide to fallback.

In particular, it’s perfectly possible for the same Message to fail when resolved in the context of one FluentBundle, but succeed when resolved in the context of another.

I’d like the low-level API to reflect that.

Dislike

On the dislike side, Stas’ proposal adds abstraction allocation and logic.
I never saw the value of “pull” approach, but that’s likely because I see compound messages as tightly related to UI widgets and in such scenarios I’ve never seen a use case of retrieving some, but not all, attributes.

Stas’ also in several issues lists “performance characteristics” of different approaches as something that can change in time (his last comment in issue 364).
Since we talk about low-level generic Fluent API, I believe the common denominator should be algorithmic and cyclical complexity, not benchmark performance, especially not our API should not be designed based on snapshot performance of some JIT engine.
For that reason, I believe Rust and C++ are much better languages to consider when designing the API for all languages. If we need to allocate more, call more functions, hold more references and perform more operations, we’re moving away from a performant API, even if SpiderMonkey 69 happens to JIT away it better.

eemeli · May 31, 2019, 5:33pm

How about merging the two proposals? I mean something like this:

bundle.addMessages(`
foo = Foo
    .bar = Bar
`)

const msg = bundle.getMessage('foo')
bundle.attributeNames(msg) // Set { 'bar' }
bundle.format(msg) // 'Foo'
bundle.compound(msg)
// { value: 'Foo', attributes: Map { 'bar' => 'Bar' } }

bundle.attributeNames('foo') // Set { 'bar' }
bundle.format('foo') // 'Foo'
bundle.compound('foo')
// { value: 'Foo', attributes: Map { 'bar' => 'Bar' } }

In other words, allow the first parameter of format or compound to be either a string identifier of a message, or a pre-retrieved message object. Wouldn’t that allow the strengths of both proposals to be combined, while keeping the API surface rather minimal?

Edit: Modified a bit to move the attribute name getter to the bundle, so that the message can be considered a completely opaque blob.

spookylukey · May 31, 2019, 5:43pm

Just a quick reply to say I don’t really have any thoughts on this. My usage of the API doesn’t really have a need for the performance points you need in the browser DOM bindings.

stas · May 31, 2019, 7:00pm

Thanks, @zbraniecki, that’s a nice summary.

Good point. I’d like to point out that in both proposals (compound and FluentMessage) the complexity is approximately the same, with the exception of it being hidden in the case of the compound API, and visible in FluentMessage. The same work needs to be done in both, but with FluentMessage, it’s the responsibility of the consumer to do it. This allows for greater flexibility, if it’s required. I see this as an advantage of the FluentMessage proposal.

stas · May 31, 2019, 7:02pm

Interesting. compound from your example could be easily implemented using the FluentMessage API Perhaps compound is a good candidate for a method on a slightly higher-level abstraction? E.g. fluent-dom and fluent-react bindings?

zbraniecki · May 31, 2019, 8:16pm

I concur. My comment was mainly referring to the error collecting bits of the API where not allocating is better than allocating.

eemeli · June 1, 2019, 7:24am

If that’s the case, then doesn’t the same apply to format as well? I think I’m starting to get what you want with this API level (not that I necessarily agree with it), but what threw me off is that the proposal still includes format('foo'), which with this argument would belong to a higher-level API as well.

Edit: Dropping format, the whole API could be just this:

bundle.addMessages(`
foo = Foo
    .bar = Bar
`)

const msg = bundle.getMessage('foo')
bundle.attributeNames(msg) // Set { 'bar' }
bundle.formatValue(msg) // 'Foo'
bundle.formatAttribute(msg, 'bar') // 'Bar'

Edit 2: And if this really is meant to be a lower-level API, addMessages probably ought to be dropped as well, in favour of addResource. It’s doing the same sort of two-step processing as format and compound.

eemeli · June 1, 2019, 7:33am

As context for my position and arguments, I’m looking at Resource, Bundle and Message from the point of view of someone that’s writing an alternative implementation for these “interfaces” in fluent-compiler. From that point of view I’m interested in minimising the API surface that will need to be duplicated by my implementation. Hence the desire to keep Resource and Message as opaque blobs, and having a small but sufficient set of methods on Bundle for working with them.

Pike · June 1, 2019, 12:24pm

I’m afraid that I have a completely opposite perspective on optimization opportunities, and how they’re helped or hindered in this API design.

In a pre-evaluating resolver, things like message or term references can be optimized out. You could also inspect if GLOBALs are static, and optimize them out. Both only make sense on the bundle level, though. I can see those being frequently used.

OTH, I don’t see that a program would go through the intermediate API, requesting a hot message, and then also the corresponding bundle, and then call it a ton of times. That seems niche at best, and quite awkward for the programmer.

Another thing that bothers me about tearing messages off of their bundle is that all locale data is in the bundle (at least in the current implementations). What would you expect if a German message was evaluated in a Russian bundle? That smells like a footgun to me.

stas · June 3, 2019, 1:28pm

Yes! That’s a great point. My original thinking was to keep format and addMessage as two entry-level methods suitable for quickly getting started with FluentBundle. Use-cases: a quick prototype, a README, just trying to get something show up on the screen. However, I’m in favor of removing them right now to keep the API to the absolute functional minimum. We can always consider re-adding them later.

eemeli:

the whole API could be just this:

const msg = bundle.getMessage('foo')
bundle.attributeNames(msg) // Set { 'bar' }
bundle.formatValue(msg) // 'Foo'
bundle.formatAttribute(msg, 'bar') // 'Bar'

Yeah, something minimal like that is what I’ve been thinking about With a small change: I think that we also need hasValue so that higher-level packages can decide not to format the value of messages which only have attributes. This can be considered a built-in optimization: checking hasValue allows the consumer to skip a lot of logic in formatValue.

And since the presence of the value, as well as the names of the attributes, are inherent to the message rather than the bundle, I still think hasValue and attributeNames should be public fields on the FluentMessage.

(They would have to be anyways in the bundle.attributeNames(msg) approach.)

Agreed! This would make @zbraniecki happy He had previously pointed out that addMessages was ambiguous wrt. to ownership of the input: https://github.com/projectfluent/fluent.js/issues/328. Even if this inconsistency isn’t harmful at all in JavaScript, it’s nice to see that the proposed API does away with it.

Thanks for the feedback, @eemeli! I’ll update the API design issue (#365) with the suggestions to trim format and addMessages.

stas · June 7, 2019, 2:40pm

These are good points, but the proposal is specifically tailored to the runtime implementation in @fluent/bundle, which uses an interpreting resolver. I’m not sure if we should influence it by other resolver designs, like the pre-evaluating one.

On the contrary, I can see the appeal and I can imagine this being easily done on the bindings level in fluent-react for instance. My earlier point is more important however: I think that a low-level API should be, to some extent, less guided by concrete use-cases, and more open to anything that the higher-level abstractions built on top of it wish to do. I don’t want to dismiss a use-case (here: caching a hot message) without good reasons to do so.

I’d expect it to produce a mixed translation.

I agree that it’s a trade-off. If we want 1) to allow to cache messages, and 2) to make them stateless and immutable (by not storing the reference to the bundle inside of them), then we need to accept it. Both 1) and 2) are very important pillars of the proposed design, and I see them as big benefits over the current implementation in fluent.js.

BTW. the current fluent.js API would also produce a mixed translation. I don’t think we’ve ever seen a problem caused by it?

Pike · June 12, 2019, 12:23pm

We’re sadly at a point where you have things that are important to you, and I don’t know why they would be. Or even think they’re dangerous.

Taking a step back and picking my hat for the Mozilla ecosystem, with timely context: I was just forced to look at all the wrapper implementations we have around the string bundle API, and to me, that’s enough evidence that at our scale, we need to expose good APIs, and only those. To be frank, the prospect of 1000 Mozilla engineers looking at an API and read “you can cache these” scares the shits out of me.

zbraniecki · June 17, 2019, 3:08am

I think there’s a misunderstanding. FluentBundle is not a public facing API. We do not expect users to ever make direct calls to it. Comparing it to how StringBundle is used is, to me, a mistake.
Fluent operates very differently from StringBundle and we spent a lot of time explaining the paradigm shift from imperative API calls to declarative. I’d be very interested in comparing the number of StringBundle API calls in Preferences prior to the switch to the number of Fluent API calls (minus setAttributes) after the switch.
I’d suspect a ~80% drop.

On top of that, the imperative API that we ask users to work with is called Localization, not FluentBundle. And I completely agree with you that we should be very careful with what methods we design for it.

FluentBundle‘s API should be designed to allow for easy development of different Localization, fluent-react, fluent-dom etc. bindings. And I believe Stas’ proposal to be a valuable proposal for that purpose.

stas · June 18, 2019, 7:56pm

Thanks, @zbraniecki. You’ve basically outlined my thinking, to a point where a feel like I have nothing more to add