Planned changes to the FluentBundle API

I strongly prefer the compound() API of #208, mostly due to its simplicity and directness. It feels optimised for the most common use cases, and does not require processing an intermediate object.

Furthermore, if it’s a design target that “attributes shouldn’t really be used as dictionaries, or keyed collections”, that’s far more clear with compound(), which is communicating that you’re not supposed to access attributes individually, as FluentMessage#formatAttribute() allows you to do.

The only clear benefit that I see with the FluentMessage API is how it enables caching the results of a single map lookup for re-resolving a message with different parameters, but even then it still requires carrying around a reference to the bundle itself. That tradeoff does not seem worth it, when the cost is having three functions in place of just compound(), and needing to call at least two of them in order to get a string result.

And sure, the new format() would be non-backwards-compatible with the current one, but I’d argue that’s okay, and in any case the same format("key") pattern is a part of the FluentMessage API as well.

To me, the point of critique on FluentMessage is that the format methods are back on the bundle.

You tied this back to the error reporting, which I’ll comment on once more.

Which makes me wonder if this is best discussed with all proposals at once, instead of trying to make individual changes?

We might need to talk about the big-big picture, too. As I’ll mention in the error reporting issue, I think that we should do error fallback on runtime errors. That we don’t do that in gecko is a shortcut, but I see that as a bug, not a feature.

Thank you for sharing your thoughts, @eemeli.

FluentBundle is a very low-level API, meant to be used to build higher-level, more convenient APIs, optimized for their specific use-cases. The goal of the redesign is to support a wide range of different use-cases. The compound API has two shortcomings, which I think seriously hinder this goal:

  • It currently doesn’t allow the pull model; instead it formats all attributes up-front. This could be mitigated by changing the result’s attributes field to some kind of a “lazy Map” structure.

  • It doesn’t allow re-use or a message which has already been retrieved. I don’t think getMessage is a major performance bottleneck, but it would be nice to avoid the cost of the lookup by the identifier. In particular, this comes in handy for interfaces using long lists, like the list of emails in an email client, or the list of visited websites in a browser, and many more.

That’s a good point. To counter it: as soon as we want to extend the API to allow the pull approach, there needs to be a way to request a specific attribute. My goal with the proposal was to allow it, but not make it too easy, which is why format(id) doesn’t support it. (If it did, it would also need to split the id on the dot for every attribute of the same message, which again isn’t optimal performance wise.)

Yes, this was one of the goals of this proposal. While I don’t think it’s a must-have, I also believe that the low-level API should cater to a wide range of possible use-cases. The way in which I’d like to achieve it, is by creating single-purpose abstractions (bundle, scope, message). I realize that it makes the API slightly more complex by making the users juggle a few more objects. Is it an acceptable cost?

I did it this way to allow for full encapsulation in languages which support it. If format was on the message, than bundle would need a way to publicly expose its internals, such as the map of terms, or the cached Intl formatters. I would consider this a design bug.

I see the value in Stas’ proposal mainly as:

1) It moves the caching problem to the consumer.

If your app has a list of 100 items to be displayed using the same Message, but different args, you solve it above FluentBundle level.

For me, FluentBundle is the lowest level, non-consumer-friendly, API that should allow higher level APIs such as one called currently Localization or some bindings to do what they want.

With the current model, we basically enforce any optimizations of that type to requested out of the implementation of FluentBundle itself.

2) It allows for separation of Message retrieval from formatting

Currently, those two actions are combined into a single operation, indivisible.
If FluentBundle was meant to be used by consumers, I’d understand such API.

But since it is not, having ability to perform each operation separately has a value in composability and flexibility we offer to the higher level APIs.
Caching, listed above, is one, but I can imagine scenarios where one would want to retrieve a Message and format it using multiple separate FluentBundle instances.

I don’t think there’s any reason to disallow such operation, and Stas’ proposal seems to offer the right set of APIs to allow for such operations.

@eemeli’s suggestion to return an instance that has internal reference to the bundle feels off to me. It’s more hidden magic that doesn’t seem to be necessary, and locks us down in multiple concepts that don’t hold.

The Message in Fluent exists independently of any Bundle. It’s actually part of the Resource, not Bundle. Bundle on a data model level is just a list of references to Resource objects plus a cache of intl formatters used by the bundle.

We could also just allow for Message to be retrieved from Resource, and then be formatted by the Bundle.

3) It puts the error fallbacking in place

There are two classes of errors we’re discussing here: a) errors that prevent you from retrieving the Message or value you want. b) errors that happen during Message resolution.

The former type is a clear breaking scenario and the main result of FluentBundle.retrieveMessage(id) should be that it either succeeded and returns what you asked for, or failed with a single error.
This is the base of for a higher level API and bindings that do error fallbacking, and maybe also last-resort option of displaying l10n-id in place of a message if that fails.

The latter is very different, because we want to resolve as much of the message as possible, while accumulating errors on the way and in the end always return some string and a list of errors it accumulated.
Once again, the higher level API may decide what to do in case of errors and it may decide to fallback.

In particular, it’s perfectly possible for the same Message to fail when resolved in the context of one FluentBundle, but succeed when resolved in the context of another.

I’d like the low-level API to reflect that.

Dislike

On the dislike side, Stas’ proposal adds abstraction allocation and logic.
I never saw the value of “pull” approach, but that’s likely because I see compound messages as tightly related to UI widgets and in such scenarios I’ve never seen a use case of retrieving some, but not all, attributes.

Stas’ also in several issues lists “performance characteristics” of different approaches as something that can change in time (his last comment in issue 364).
Since we talk about low-level generic Fluent API, I believe the common denominator should be algorithmic and cyclical complexity, not benchmark performance, especially not our API should not be designed based on snapshot performance of some JIT engine.
For that reason, I believe Rust and C++ are much better languages to consider when designing the API for all languages. If we need to allocate more, call more functions, hold more references and perform more operations, we’re moving away from a performant API, even if SpiderMonkey 69 happens to JIT away it better.

How about merging the two proposals? I mean something like this:

bundle.addMessages(`
foo = Foo
    .bar = Bar
`)

const msg = bundle.getMessage('foo')
bundle.attributeNames(msg) // Set { 'bar' }
bundle.format(msg) // 'Foo'
bundle.compound(msg)
// { value: 'Foo', attributes: Map { 'bar' => 'Bar' } }

bundle.attributeNames('foo') // Set { 'bar' }
bundle.format('foo') // 'Foo'
bundle.compound('foo')
// { value: 'Foo', attributes: Map { 'bar' => 'Bar' } }


In other words, allow the first parameter of format or compound to be either a string identifier of a message, or a pre-retrieved message object. Wouldn’t that allow the strengths of both proposals to be combined, while keeping the API surface rather minimal?

Edit: Modified a bit to move the attribute name getter to the bundle, so that the message can be considered a completely opaque blob.

Just a quick reply to say I don’t really have any thoughts on this. My usage of the API doesn’t really have a need for the performance points you need in the browser DOM bindings.

1 Like

Thanks, @zbraniecki, that’s a nice summary.

Good point. I’d like to point out that in both proposals (compound and FluentMessage) the complexity is approximately the same, with the exception of it being hidden in the case of the compound API, and visible in FluentMessage. The same work needs to be done in both, but with FluentMessage, it’s the responsibility of the consumer to do it. This allows for greater flexibility, if it’s required. I see this as an advantage of the FluentMessage proposal.

Interesting. compound from your example could be easily implemented using the FluentMessage API :slight_smile: Perhaps compound is a good candidate for a method on a slightly higher-level abstraction? E.g. fluent-dom and fluent-react bindings?

I concur. My comment was mainly referring to the error collecting bits of the API where not allocating is better than allocating.

1 Like

If that’s the case, then doesn’t the same apply to format as well? I think I’m starting to get what you want with this API level (not that I necessarily agree with it), but what threw me off is that the proposal still includes format('foo'), which with this argument would belong to a higher-level API as well.

Edit: Dropping format, the whole API could be just this:

bundle.addMessages(`
foo = Foo
    .bar = Bar
`)

const msg = bundle.getMessage('foo')
bundle.attributeNames(msg) // Set { 'bar' }
bundle.formatValue(msg) // 'Foo'
bundle.formatAttribute(msg, 'bar') // 'Bar'

Edit 2: And if this really is meant to be a lower-level API, addMessages probably ought to be dropped as well, in favour of addResource. It’s doing the same sort of two-step processing as format and compound.

As context for my position and arguments, I’m looking at Resource, Bundle and Message from the point of view of someone that’s writing an alternative implementation for these “interfaces” in fluent-compiler. From that point of view I’m interested in minimising the API surface that will need to be duplicated by my implementation. Hence the desire to keep Resource and Message as opaque blobs, and having a small but sufficient set of methods on Bundle for working with them.

I’m afraid that I have a completely opposite perspective on optimization opportunities, and how they’re helped or hindered in this API design.

In a pre-evaluating resolver, things like message or term references can be optimized out. You could also inspect if GLOBALs are static, and optimize them out. Both only make sense on the bundle level, though. I can see those being frequently used.

OTH, I don’t see that a program would go through the intermediate API, requesting a hot message, and then also the corresponding bundle, and then call it a ton of times. That seems niche at best, and quite awkward for the programmer.

Another thing that bothers me about tearing messages off of their bundle is that all locale data is in the bundle (at least in the current implementations). What would you expect if a German message was evaluated in a Russian bundle? That smells like a footgun to me.

Yes! That’s a great point. My original thinking was to keep format and addMessage as two entry-level methods suitable for quickly getting started with FluentBundle. Use-cases: a quick prototype, a README, just trying to get something show up on the screen. However, I’m in favor of removing them right now to keep the API to the absolute functional minimum. We can always consider re-adding them later.

Yeah, something minimal like that is what I’ve been thinking about :slight_smile: With a small change: I think that we also need hasValue so that higher-level packages can decide not to format the value of messages which only have attributes. This can be considered a built-in optimization: checking hasValue allows the consumer to skip a lot of logic in formatValue.

And since the presence of the value, as well as the names of the attributes, are inherent to the message rather than the bundle, I still think hasValue and attributeNames should be public fields on the FluentMessage.

(They would have to be anyways in the bundle.attributeNames(msg) approach.)

Agreed! This would make @zbraniecki happy :slight_smile: He had previously pointed out that addMessages was ambiguous wrt. to ownership of the input: https://github.com/projectfluent/fluent.js/issues/328. Even if this inconsistency isn’t harmful at all in JavaScript, it’s nice to see that the proposed API does away with it.

Thanks for the feedback, @eemeli! I’ll update the API design issue (#365) with the suggestions to trim format and addMessages.

These are good points, but the proposal is specifically tailored to the runtime implementation in @fluent/bundle, which uses an interpreting resolver. I’m not sure if we should influence it by other resolver designs, like the pre-evaluating one.

On the contrary, I can see the appeal and I can imagine this being easily done on the bindings level in fluent-react for instance. My earlier point is more important however: I think that a low-level API should be, to some extent, less guided by concrete use-cases, and more open to anything that the higher-level abstractions built on top of it wish to do. I don’t want to dismiss a use-case (here: caching a hot message) without good reasons to do so.

I’d expect it to produce a mixed translation.

I agree that it’s a trade-off. If we want 1) to allow to cache messages, and 2) to make them stateless and immutable (by not storing the reference to the bundle inside of them), then we need to accept it. Both 1) and 2) are very important pillars of the proposed design, and I see them as big benefits over the current implementation in fluent.js.

BTW. the current fluent.js API would also produce a mixed translation. I don’t think we’ve ever seen a problem caused by it?

1 Like

We’re sadly at a point where you have things that are important to you, and I don’t know why they would be. Or even think they’re dangerous.

Taking a step back and picking my hat for the Mozilla ecosystem, with timely context: I was just forced to look at all the wrapper implementations we have around the string bundle API, and to me, that’s enough evidence that at our scale, we need to expose good APIs, and only those. To be frank, the prospect of 1000 Mozilla engineers looking at an API and read “you can cache these” scares the shits out of me.

I think there’s a misunderstanding. FluentBundle is not a public facing API. We do not expect users to ever make direct calls to it. Comparing it to how StringBundle is used is, to me, a mistake.
Fluent operates very differently from StringBundle and we spent a lot of time explaining the paradigm shift from imperative API calls to declarative. I’d be very interested in comparing the number of StringBundle API calls in Preferences prior to the switch to the number of Fluent API calls (minus setAttributes) after the switch.
I’d suspect a ~80% drop.

On top of that, the imperative API that we ask users to work with is called Localization, not FluentBundle. And I completely agree with you that we should be very careful with what methods we design for it.

FluentBundle‘s API should be designed to allow for easy development of different Localization, fluent-react, fluent-dom etc. bindings. And I believe Stas’ proposal to be a valuable proposal for that purpose.

1 Like

Thanks, @zbraniecki. You’ve basically outlined my thinking, to a point where a feel like I have nothing more to add :slight_smile:

I was traveling last week and it was a good opportunity to take a step back and think about the larger picture. I’ve also had a chance to talk to @Pike and @zbraniecki in person about this topic; thanks for offering your feedback.

tl;dr: I’m proposing another alternative of the API design. It’s almost the same as the current API, and based on the realization that hiding the runtime representation of message is no longer a relevant goal.

The goal of the API re-design has been to hide the internal representation of messages as stored in a FluentBundle. But that already sounds like a solution, rather than the problem to solve. The actual problem is that for me, as the maintainer of fluent.js, the way FluentBundle stores messages is an implementation detail. However that’s not clear to the consumers of the API. We’ve already seen people create tools and scripts which run on the runtime AST.

As an implementation detail, I’d like to be able to change the internal representation of messages when working on the runtime parser and the runtime resolver. The representation is not intended to be human readable; it’s a derivative of the parser/resolver duo.

So far, I’ve presented two alternatives to achieve this goal:

  • Introduce FluentBundle.compound(id). Issue #208, PR #360.
  • Store messages as FluentMessage in the runtime. Issue #365, PR #366.

They both offer different approaches to hiding the internal representation from consumers, and they both come with different trade-offs.

Last week I started to think that perhaps rather than looking for a technical solution, it would be worthwhile to reconsider the original goal and intent. Do we actually gain that much by hiding the internal representation? And what happens if we don’t?

A hidden internal representation makes it possible to iterate on the parser/resolver code faster. This was a reasonable goal a year ago, when I first proposed changing the API. The syntax spec was still in flux, and Syntax 1.0 was still many moths away. With the stable 1.0 release of the syntax earlier this year, the need to change the internal representation is now much less pronounced: we know that the syntax won’t be changing as fast as it used to. Furthermore, FluentBundle is close to a 1.0 as well, and any changes to the internal representation would at this point likely come from changes to the parser/resolver code which should probably be reflected in a major version bump anyways.

OTOH, if the internal representation was public and documented as such, it would open up a whole new world of possibilities. There are many good things that can happen when data is just data, and when logic is separated from it. The runtime representation can be easily serialized and structure-cloned. The implementation stays simple, because it doesn’t have to worry about hiding things anymore. Testing becomes trivial with fixtures. Etc.

This is how I’ve arrived at yet another proposal for the API design:

  • Clean-up the runtime representation, and keep FluentBundle.getMessage and FluentBundle.format (renamed to .formatPattern). Work-in-progress PR #380.

The proposal could be viewed as a fix to #358 which @eemeli filed to document the access to values and attributes of messages stored in a FluentBundle. Rather than come up with elaborate schemes of encapsulating messages, it doubles down on the fact that one can getMessage a message and look inside of it. Here’s how I’d imagine a canonical way of formatting a message value:

let message = bundle.getMessage(id);
if (message.value) {
    bundle.formatPattern(message.value, args, errors);
}

The proposal also has the benefit of changing very little of how FluentBundle works today, which removes much of the friction related to migrating consumers to the new API. I’d still consider removing addMessages to keep the API lean, so at least some slight work would be required.

Thanks to everyone for your continued interest in this discussion. I know it’s taking a while and I hope to be able, with your help, to settle on the final design soon.

1 Like

The discussion about the formatPattern proposal happened largely on GitHub: #380. It is my understanding that we’re now in agreement to move forward with this design.

The discussion on GitHub focused on how much of the message (as returned by getMessage) should be considered public API.

The outcome is that the {value: Pattern, attributes: Record<string, Pattern>} shape will be part of the public API of @fluent/bundle. The Pattern type will remain “private” and implementation-specific, and other implementations will be free to use any structure they see fit for the purpose. This should enable much more experimentation and creativity wrt. the resolver’s implementation.

Thanks, everyone, for your feedback and patience. I’ll update the PR and close the other ones, corresponding to the previous iterations of the redesign. My plan is prepare the changes for a review by the end of this week, and merge and publish next week.