Note that this implementation might not reflect the exact semantics proposed in the paper.
It is derived from the implementation here
The implementation in clang was straightforward; it took a couple hours to integrate.
However, the bulk of the work here is wrangling the database into a sufficiently compressed state, and that work was already done.

I have not found a significant increase of size in my build.
The clang release build is over 100MB depending on options (although the Debian package manages to make it under 50MB).
This adds ~500K which does not seem significant.
The overall size of clang is more likely to be affected by which compiler it is compiled with, whether LTO was used and so forth.
GCC seems to be a bit smaller, around 30MB or so, if I accounted for all the files correctly.

So the answer on size is that, depending on compiler and compilation options, this would result in an increase of
roughly 0.5% to 2% in the size of the compiler binaries.
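
For reference, the arithmetic behind that range can be sketched as below. This is only an illustration: the helper `percent_increase` and the constants `KB` and `MB` are hypothetical names, "500K" is assumed to mean 500,000 bytes, and the base sizes are the approximate figures quoted above (100MB, 50MB, 30MB).

```cpp
#include <cassert>

// Assumption: decimal units (1K = 1000 bytes); the message does not say which unit is meant.
constexpr double KB = 1000.0;
constexpr double MB = 1000.0 * KB;

// Relative growth, in percent, when `added_bytes` are added to a binary of `base_bytes`.
constexpr double percent_increase(double added_bytes, double base_bytes) {
    return 100.0 * added_bytes / base_bytes;
}

// ~500K on top of a ~100MB clang release build.
static_assert(percent_increase(500 * KB, 100 * MB) == 0.5, "0.5% for a 100MB binary");
// ~500K on top of a ~50MB packaged clang.
static_assert(percent_increase(500 * KB, 50 * MB) == 1.0, "1% for a 50MB binary");
// ~500K on top of a ~30MB GCC comes out between 1.6% and 1.7%.
static_assert(percent_increase(500 * KB, 30 * MB) > 1.6 &&
              percent_increase(500 * KB, 30 * MB) < 1.7, "about 1.7% for a 30MB binary");
```

The smaller the baseline binary, the larger the relative cost of the same fixed-size name table.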

Is that too significant? I do not know.

On Fri, Sep 17, 2021 at 4:32 PM Steve Downey via SG16 <> wrote:
And I was surprised at how well it worked out of the box. That was just `htmldiff p2071r0.html d2071r1.html > diff.html`

On Fri, Sep 17, 2021 at 9:35 AM Tom Honermann <> wrote:
Thanks, Steve!  What did you use to generate the marked up diff?  Whatever it was produced a great result!


On 9/16/21 5:31 PM, Steve Downey wrote:
I've attached a marked up rich difference between the two versions on github. It does look like it's a sensible state, although yes, wording will need updates. 

On Wed, Sep 15, 2021 at 3:38 PM Tom Honermann <> wrote:
I attached a draft R1 with changes I had previously worked on. I briefly looked at it and I don't think I left it in a half-baked state, but it would be worth diffing it against the R0 revision with a reasonable HTML diffing tool to make sure.  The "Changes since P2071R0" section suggests I addressed the issues raised in Prague.

The todo list I have includes:
  • Add discussion regarding the use of \N{...} in identifiers.
  • Add a proposal option to allow use of \N{...} in identifiers.
  • Rebase wording on the current WD; particularly due to the adoption of P2029.
  • Implement the proposal.
Richard Smith had requested that \N{...} be allowed in identifiers for consistency with \u and \U.  We should, of course, just acknowledge that Richard is always right and do that :)
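
To make the consistency argument concrete, here is a small sketch of what \u and \U already allow today, in literals and in identifiers. The names `e_acute`, `caf\u00E9` and `read_cafe` are made up for illustration, and the \N{...} spellings appear only in comments because not every compiler accepts them.

```cpp
#include <cassert>

// \u (4 hex digits) and \U (8 hex digits) denote the same code point in a literal.
constexpr char32_t e_acute = U'\u00E9';
static_assert(e_acute == U'\U000000E9', "\\u and \\U spell the same character");

// Universal character names are also accepted in identifiers today.
constexpr int caf\u00E9 = 42;

int read_cafe() {
    return caf\u00E9;
}

// With named escapes, the literal above could presumably be spelled
//   U'\N{LATIN SMALL LETTER E WITH ACUTE}'
// and, if \N{...} were allowed in identifiers as requested, the identifier as
//   caf\N{LATIN SMALL LETTER E WITH ACUTE}
```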

Wording changes may additionally be needed for P2314.  Maybe for one or more of Corentin's recent papers as well.


On 9/15/21 3:21 PM, Steve Downey wrote: has notes from JF

EWG Prague Thursday afternoon:

We’re interested in supporting named universal character escapes.

14 5 0 0 0

This should further support aliases.

18 2 1 0 0

It should further be case insensitive.

0 6 6 9 2

It should further support UAX44-LM2 with arbitrary spaces and dashes.

1 4 5 8 5

The paper is not tentatively ready yet. We want to see the updated paper before marking it as tentatively ready.
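
For context on the last poll, the UAX44-LM2 loose matching rule ignores case, whitespace, underscores, and medial hyphens when comparing character names. The following is a rough sketch of that rule, not the paper's wording or any compiler's implementation: `loose_key` and `loosely_equal` are hypothetical helpers, a hyphen is treated as medial when it sits directly between two alphanumeric characters, and the rule's special case for U+1180 HANGUL JUNGSEONG O-E is deliberately left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Reduce a character name to a comparison key in the spirit of UAX44-LM2:
// drop whitespace, underscores, and medial hyphens, then uppercase the rest.
// Simplification (assumption): the U+1180 HANGUL JUNGSEONG O-E exception is not handled.
std::string loose_key(const std::string& name) {
    std::string key;
    for (std::size_t i = 0; i < name.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (std::isspace(c) || c == '_') {
            continue;
        }
        if (c == '-') {
            // Medial: a letter or digit on both sides of the hyphen.
            const bool medial =
                i > 0 && i + 1 < name.size() &&
                std::isalnum(static_cast<unsigned char>(name[i - 1])) &&
                std::isalnum(static_cast<unsigned char>(name[i + 1]));
            if (medial) {
                continue;
            }
        }
        key.push_back(static_cast<char>(std::toupper(c)));
    }
    return key;
}

// Two spellings match loosely when their keys are identical.
bool loosely_equal(const std::string& a, const std::string& b) {
    return loose_key(a) == loose_key(b);
}
```

Under this sketch `latin small letter a` matches `LATIN SMALL LETTER A`, while `TIBETAN LETTER -A` stays distinct from `TIBETAN LETTER A` because its hyphen is not medial.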

I missed Prague, but this might be enough, if you don't have any more detailed notes. I can check the wiki as well. 

On Wed, Sep 15, 2021 at 2:46 PM Tom Honermann <> wrote:
On 9/15/21 2:31 PM, Steve Downey wrote:
> If I am reading the github correctly, EWG would like to see some
> revision before picking it up again. Is there something I can
> help with? This looks like it's really close and desired and quite
> possible for 23?

Yes. I recall there not being much to do, but I need to find my list of
what that is. I would very much appreciate the help. I'll hunt down that
list and try to get it to you later today or tomorrow.


SG16 mailing list