Train of thought for Fluent and DOM overlays

As we’re talking about bundle APIs, I think it’d be good to share my thoughts on how I envision DOM overlays to eventually work, and how escaping would play into it.

TL;DR: I suggest post-processing of formatToParts, creating DOM mutation tokens. Maybe pre-processing of the runtime message structure, too.

A couple of assertions:

  1. We can’t use an HTML5 parser. That’s because these parsers do a lot more in terms of HTML support and markup fixups than is good for Fluent.
  2. We can’t use an XML parser. That’s because XML parsing is way more strict than we want, and because XML parsing is full of actual XML namespaces, which we don’t expect localizers to understand.
  3. We want DOM overlay logic only on Fluent message values, and not on attributes.
  4. We want DOM overlays to look like HTML markup.
  5. The need for escaping DOM fragments comes from external arguments. Localization content itself just needs sanitization, but no escaping.
  6. We want to use HTML-like markup both for inline styling like bold or italics, and for external DOM arguments such as <a href="/do_not_localize.html">.

Theorem 1

We need to create our own parsing algorithm.

Theorem 2

We don’t want to parse the output of formatPattern, i.e., a fully serialized string.

Theorem 1 follows from the lack of an existing parser infrastructure (at least in the realm of W3C (X)HTML) that we can just use.

Theorem 2 is more of an actual opinion. The idea here is that hooking up DOM fragments at a point where we know the escaping story is the most secure way of doing it. It's also the way that allows us to build on existing specified behavior as much as possible.

The algorithm I have in mind is like so:

For attribute value patterns, serialize the parts to text, and set the resulting DOM attribute to that text. In server-side implementations, that requires escaping the text as attribute value.
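To make the attribute case concrete, here is a minimal sketch of what a server-side implementation might do. The `Part` shape and the function names are made up for illustration and are not an existing Fluent API; the escaping assumes the value ends up inside a double-quoted HTML attribute.

```typescript
// Hypothetical shape of one resolved part; not the real Fluent API.
type Part = { kind: "text" | "variable"; value: string };

// Escape text for use inside a double-quoted HTML attribute value.
function escapeAttribute(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Attributes get no overlay treatment: all parts are joined as plain text.
function serializeAttribute(parts: Part[]): string {
  return parts.map((part) => part.value).join("");
}

// Server-side rendering: emit name="value" with the value escaped.
function renderAttribute(name: string, parts: Part[]): string {
  return `${name}="${escapeAttribute(serializeAttribute(parts))}"`;
}
```

In a browser, the escaping step would be dropped and the result of `serializeAttribute` would be passed to `element.setAttribute` directly.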

For message value patterns, each Pattern is analyzed, and all text nodes are parsed as DOM mutation tokens. Text content can create element start tag fragments, attributes, and element end tags. In each pattern, start and end tags need to be balanced (error handling TBD). Attribute values can be expressions; attribute names and element names cannot.
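A rough sketch of what such a tokenizer could look like, just to show the shape of the tokens. Everything here is an assumption on my side: the token names, the regular expression, and the simplification that a tag sits entirely inside one text node (so attribute values that are expressions are not covered). Attributes on end tags are silently ignored.

```typescript
// Hypothetical DOM mutation tokens; names are illustrative only.
type Token =
  | { type: "text"; value: string }
  | { type: "start"; name: string }
  | { type: "attr"; name: string; value: string }
  | { type: "end"; name: string };

// Split one text node into text, start tag, attribute, and end tag tokens.
function tokenize(text: string): Token[] {
  const tokens: Token[] = [];
  const tagRe =
    /<(\/?)([a-zA-Z][a-zA-Z0-9-]*)((?:\s+[a-zA-Z][a-zA-Z0-9-]*="[^"<>]*")*)\s*>/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = tagRe.exec(text)) !== null) {
    if (m.index > last) {
      tokens.push({ type: "text", value: text.slice(last, m.index) });
    }
    const [, slash, name, attrs] = m;
    if (slash) {
      tokens.push({ type: "end", name });
    } else {
      tokens.push({ type: "start", name });
      const attrRe = /([a-zA-Z][a-zA-Z0-9-]*)="([^"<>]*)"/g;
      let a: RegExpExecArray | null;
      while ((a = attrRe.exec(attrs)) !== null) {
        tokens.push({ type: "attr", name: a[1], value: a[2] });
      }
    }
    last = tagRe.lastIndex;
  }
  if (last < text.length) {
    tokens.push({ type: "text", value: text.slice(last) });
  }
  return tokens;
}

// Start and end tags must nest properly within a single pattern.
function isBalanced(tokens: Token[]): boolean {
  const stack: string[] = [];
  for (const token of tokens) {
    if (token.type === "start") stack.push(token.name);
    else if (token.type === "end" && stack.pop() !== token.name) return false;
  }
  return stack.length === 0;
}
```

Anything that doesn't match the tag pattern, like a stray `<`, just stays text, which is the kind of leniency neither an HTML5 nor an XML parser would give us on our terms.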

This works for each level in select expressions, which requires pre-processing. For tooling and build-time, this might be OK, but I think it’s a fair runtime optimization to only do this on the linearized FluentValue list as a post-processing step.

For message reference expressions, the algorithm just continues. Term references are interesting, but may or may not work the same way.

Variable references as well as call expressions are excluded, as are string and number literals.
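As a sketch of that exclusion on a linearized part list: only parts that come from pattern text are handed to the markup tokenizer, everything else becomes an opaque text token. The `source` tags and the `Token` shape are hypothetical, and the tokenizer is passed in as a parameter so this piece makes no assumption about how the markup itself is parsed.

```typescript
// Hypothetical linearized parts, tagged by where they came from.
type Part =
  | { source: "pattern-text"; value: string }
  | { source: "variable" | "call" | "literal"; value: string };

// Simplified token shape for this sketch.
type Token =
  | { type: "text"; value: string }
  | { type: "markup"; raw: string };

function partsToTokens(
  parts: Part[],
  tokenizeText: (text: string) => Token[],
): Token[] {
  const tokens: Token[] = [];
  for (const part of parts) {
    if (part.source === "pattern-text") {
      // Localizer-authored text may create markup tokens.
      tokens.push(...tokenizeText(part.value));
    } else {
      // Variables, call results, and literals never create markup.
      tokens.push({ type: "text", value: part.value });
    }
  }
  return tokens;
}
```

With this, a variable whose value happens to be `<b>` stays text, while the same characters in the localized pattern become markup.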

Sanitization of the resulting fragment is part of the actual overlay logic. Escaping would work as part of the serialization step on all remaining text nodes, incl. serialized variable references and call expressions.
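For a server-side flavor of this, a minimal sketch might look like the following. It folds sanitization and escaping into one serialization function for brevity, which is a simplification of what I describe above. The token shape and the allowlists are assumptions for illustration, not what fluent-dom actually permits.

```typescript
// Hypothetical DOM mutation tokens; names are illustrative only.
type Token =
  | { type: "text"; value: string }
  | { type: "start"; name: string }
  | { type: "attr"; name: string; value: string }
  | { type: "end"; name: string };

// Illustrative allowlists; the real sets would be up to the overlay logic.
const ALLOWED_ELEMENTS = new Set(["a", "b", "i", "em", "strong"]);
const ALLOWED_ATTRS = new Set(["title"]);

function escapeText(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function escapeAttr(s: string): string {
  return escapeText(s).replace(/"/g, "&quot;");
}

function serialize(tokens: Token[]): string {
  let out = "";
  let inTag = false; // true while a kept start tag can still take attributes
  for (const token of tokens) {
    if (token.type === "attr") {
      // Attributes survive only on a kept element and only if allowlisted.
      if (inTag && ALLOWED_ATTRS.has(token.name)) {
        out += ` ${token.name}="${escapeAttr(token.value)}"`;
      }
      continue;
    }
    if (inTag) {
      out += ">";
      inTag = false;
    }
    if (token.type === "text") {
      // All remaining text, including serialized variables, is escaped here.
      out += escapeText(token.value);
    } else if (ALLOWED_ELEMENTS.has(token.name)) {
      if (token.type === "start") {
        out += `<${token.name}`;
        inTag = true;
      } else {
        out += `</${token.name}>`;
      }
    }
    // Tags of disallowed elements are dropped; their text content is kept.
  }
  if (inTag) out += ">";
  return out;
}
```

The point is that by the time we escape, we already know which tokens are markup and which are text, so there's no guessing about what a `<` was meant to be.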

That’s quite a reduction of what’s currently possible with fluent-dom, but I think this should give us all the flexibility we need, and all the safety we want.

There are a lot of open questions, but I’d like to put this part out there, 'cause it impacts how I think about our APIs, and also how we should phrase the spec work for the resolver.


I think we want a bundle API that allows different post-processing of the resolver results, prior to the serialization to result strings. This API should allow for different handling of message values and attributes.
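To illustrate the shape I have in mind, here is a purely hypothetical sketch. None of these names exist in the current bundle API; `ResolvedMessage` stands in for whatever the resolver hands back before serialization.

```typescript
// Hypothetical shape of one resolved part; not the real Fluent API.
type Part = { kind: "text" | "variable"; value: string };

// A post-processor turns resolver output into whatever the binding needs.
type PostProcessor<T> = (parts: Part[]) => T;

// Hypothetical resolver result, prior to any serialization.
interface ResolvedMessage {
  value: Part[];
  attributes: Record<string, Part[]>;
}

// Message values and attributes each get their own post-processing.
function formatMessage<V, A>(
  message: ResolvedMessage,
  processValue: PostProcessor<V>,
  processAttribute: PostProcessor<A>,
): { value: V; attributes: Record<string, A> } {
  const attributes: Record<string, A> = {};
  for (const [name, parts] of Object.entries(message.attributes)) {
    attributes[name] = processAttribute(parts);
  }
  return { value: processValue(message.value), attributes };
}
```

A DOM binding could pass an overlay-building function for values and a plain-text joiner for attributes, while a plain string consumer would pass the same joiner for both.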
