String.normalize needs info on normalization forms

The article String.normalize needs a (brief) description of, or comparison between, the four supported normalization forms. Reading the specs on this is hard, to say the least. It would help to have a brief comparative description of each form, in one or two sentences.

The difference between composition and decomposition is sort of clear, if you have some understanding of how (de)composed characters work. People who don’t know about this might easily choose the wrong form for their use case, or skip normalizing altogether.

The difference between canonical and compatibility is complete abracadabra to me, although there surely are people who do get what they mean. Again, it needs a brief comparison to clear it up.

Since I don’t feel qualified to describe the differences myself, can I hereby request that someone with a bit more knowledge of normalization do it for us? It’d be very much appreciated. Thanks.

Thanks for reporting this! I’ve filed

It’s not very brief, but I’ve written an expanded description of the concepts of canonical and compatibility normalization: and how they can be applied using normalize().
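For anyone skimming this thread, here is a rough sketch of the distinction, not the text from the article itself. The sample strings ("é" and the "ﬁ" ligature) and the variable names are just my picks for illustration. Canonical forms (NFC, NFD) only change how the same character is encoded; compatibility forms (NFKC, NFKD) additionally replace characters such as ligatures with their plain equivalents.

```javascript
// "é" written two ways: one precomposed code point (U+00E9)
// versus "e" followed by a combining acute accent (U+0301).
const composed = "\u00E9";
const decomposed = "e\u0301";

// They render the same but are different strings until normalized.
console.log(composed === decomposed); // false

// NFC composes, NFD decomposes; both are canonical forms.
console.log(decomposed.normalize("NFC") === composed); // true
console.log(composed.normalize("NFD").length); // 2

// The "ﬁ" ligature (U+FB01) is only compatibility-equivalent to "fi".
const ligature = "\uFB01";

// Canonical normalization leaves the ligature untouched.
console.log(ligature.normalize("NFC") === ligature); // true

// Compatibility normalization replaces it with the two plain letters.
console.log(ligature.normalize("NFKC")); // "fi"
console.log(ligature.normalize("NFKD")); // "fi"
```

As a rule of thumb, and this is my own reading rather than anything official: a canonical form is enough for comparing or storing text as typed, while a compatibility form suits search-style matching, at the cost of losing distinctions like the ligature.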

Let me know if it makes sense and looks correct, either here or in the issue. Thanks!

I’m also going to replace the interactive example for this method.
