std-proposals

Re: [std-proposals] Formatting code points to character names

From: Jeremy Rifkin <rifkin.jer_at_[hidden]>
Date: Fri, 23 May 2025 00:37:56 -0500
It might be worth posting this to https://github.com/fmtlib/fmt/issues first.

Jeremy

On Fri, May 23, 2025 at 12:27 AM Jan Schultke via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> I think it would be useful if you were able to std::format a char32_t to
> its character name. That is:
>
> char32_t c = U'\N{NO-BREAK SPACE}';
> std::string s = std::format("{?????}", c);
> // s is now "NO-BREAK SPACE"
>
> Software that deals with Unicode frequently has to print out its text
> input, possibly for the purpose of error messages, logging, and all sorts
> of things. When encountering a code point that is non-ASCII, there is a
> decent chance that it won't be displayed properly because the font is
> missing the necessary glyphs, or because the character has no visual
> representation (e.g. ZERO WIDTH JOINER).
>
> To cover that eventuality, software often prints out the "U+NNNN"
> representation of code points, but this is very difficult for humans to
> comprehend, and unless you happen to know the specific code point number (very
> few people do), you will have to look it up on the internet to comprehend
> what's going on. This is a waste of productivity.
>
> Therefore, I think the ability to print out code point names is something
> universally useful and a good fit for standardization.
>
> Is this feasible? I know very little about std::format, so I'm not sure
> if one could even retroactively add such a formatting option to char32_t.
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>
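
To make the idea in the quoted message concrete, here is a minimal sketch of a name lookup with a "U+NNNN" fallback. It assumes a hand-written three-entry table in place of the real Unicode Character Database. The names NameEntry, kNameTable, and describe_code_point are hypothetical and belong to neither std::format nor {fmt}; a real implementation would need generated name data covering all of Unicode.

```cpp
#include <array>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical excerpt of the Unicode name table (illustration only).
// A real implementation would generate this from the Unicode Character Database.
struct NameEntry {
    char32_t code_point;
    std::string_view name;
};

constexpr std::array<NameEntry, 3> kNameTable{{
    {0x00A0, "NO-BREAK SPACE"},
    {0x200B, "ZERO WIDTH SPACE"},
    {0x200D, "ZERO WIDTH JOINER"},
}};

// Returns the character name if the code point is in the table above,
// otherwise the conventional "U+NNNN" notation (at least four hex digits).
std::string describe_code_point(char32_t c) {
    for (const NameEntry& entry : kNameTable) {
        if (entry.code_point == c) {
            return std::string(entry.name);
        }
    }
    char buffer[16];  // "U+" plus up to 8 hex digits plus the terminator fits
    std::snprintf(buffer, sizeof buffer, "U+%04lX",
                  static_cast<unsigned long>(c));
    return buffer;
}
```

With this sketch, describe_code_point(0x00A0) yields "NO-BREAK SPACE", and a code point missing from the table, such as U'A', falls back to "U+0041". One possible way to reach this from std::format without touching char32_t itself would be a std::formatter specialization for a small wrapper type that forwards to such a function; whether a dedicated format specifier is feasible is the open question in the thread.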

Received on 2025-05-23 05:38:24