On Wed, Dec 8, 2021, 23:40 Tom Honermann <tom@honermann.net> wrote:

On 12/5/21 2:26 PM, Jens Maurer wrote:
On 05/12/2021 01.04, Tom Honermann wrote:
On 12/4/21 6:05 AM, Jens Maurer wrote:
If we impose a requirement for a code unit -> code point decoder for the
literal encoding at compile-time, we should make such a facility generally
available instead of hiding it in the guts of the std::format parser.
I think JeanHeyd's work on P1629 <https://wg21.link/p1629> will fill this niche. It would be nice if the features he proposes in N2730 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2730.htm> were usable at compile-time as well, but that will likely have to await some kind of constexpr support in C.
Why?  We've made functions constexpr that are inherited from C
before.
Sure, we have, and we could do so again. In this case, however, there are behaviors we would have to specify, and those should be decided in conjunction with WG14. For example, the N2730 "mc" and "mwc" function variants operate on the locale-dependent execution encoding. We would have to specify what that means for compile-time evaluation. The obvious answer is, of course, that it means the ordinary/wide literal encoding. Since that encoding may differ from the run-time execution encoding, this presumably means defining a locale (or at least the LC_CTYPE locale category) for use at compile time. We would then have to tie the behavior to std::is_constant_evaluated() (so that the separation of compile time vs. run time is rigorously defined), for which there is presently no corresponding C facility.

These are not necessarily simple functions that can readily be inlined or made builtins. As we've previously discussed, EBCDIC code pages do not all consistently encode '{' and '}'. An ISO-2022 escape mechanism that allows switching character sets presumably would require the implementation to track shift state and have access to character set tables in order to recognize all encodings of these characters. Though, perhaps such an encoding is disallowed by [lex.charset]p6? It isn't clear to me how to apply that wording to shift-state encodings.
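
As a rough illustration of the shift-state tracking mentioned above, here is a simplified C++ sketch for ISO-2022-JP. The function name count_open_braces is hypothetical, and the scanner is an assumption-laden toy: it recognizes only the escapes ESC ( B (ASCII), ESC ( J (JIS X 0201 Roman), ESC $ @ and ESC $ B (JIS X 0208), and does no validation.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: count bytes that really encode '{' in an
// ISO-2022-JP byte string. Byte values are written in hex so the sketch
// does not depend on the compiler's own literal encoding.
constexpr std::size_t count_open_braces(std::string_view s) {
    bool two_byte = false;  // true while a JIS X 0208 set is selected
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size();) {
        if (s[i] == '\x1b' && i + 2 < s.size()) {
            const char a = s[i + 1];
            const char b = s[i + 2];
            if (a == '\x24' && (b == '\x40' || b == '\x42')) {  // ESC $ @, ESC $ B
                two_byte = true;
                i += 3;
                continue;
            }
            if (a == '\x28' && (b == '\x42' || b == '\x4a')) {  // ESC ( B, ESC ( J
                two_byte = false;
                i += 3;
                continue;
            }
        }
        if (two_byte) {
            i += 2;  // skip one two-byte character; 0x7B here is not '{'
            continue;
        }
        if (s[i] == '\x7b') {  // '{' in both ASCII and JIS X 0201 Roman
            ++count;
        }
        ++i;
    }
    return count;
}

// One real '{', then a two-byte character whose second byte is 0x7B,
// then a return to ASCII and a '}'.
static_assert(count_open_braces("\x7b\x1b\x24\x42\x30\x7b\x1b\x28\x42\x7d") == 1);
```

A plain byte search for 0x7B would report two matches on that input, which is why a format-string parser could not simply compare code units under such an encoding.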

Nothing precludes shift-state literal encodings; see the note in the same paragraph.