On Fri, Apr 16, 2021 at 7:08 PM Peter Brett via SG16 <sg16@lists.isocpp.org> wrote:
Hi all (esp. Victor),

We discussed adding C++23 support for homogeneous formatting in UTF-8,
UTF-16 and UTF-32.  For C++23, we would like to allow UTF-8 format
strings with UTF-8 substitutions, UTF-16 format strings with UTF-16
substitutions, etc.  In a future version of the standard (where UTF
transcoding is guaranteed to be available) we would like to extend this
to allow, e.g., UTF-32 substitutions into UTF-8 format strings.
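
To make "homogeneous" concrete, here is a minimal sketch, assuming a
hypothetical helper named `substitute` (not a standard or {fmt} API) that
replaces the first `{}` in a format string.  The format string and the
argument share one code unit type, so mixing encodings does not compile:

```cpp
#include <string>
#include <string_view>

// Hypothetical helper for illustration only: replaces the first "{}" in
// `fmt` with `arg`. Both parameters use the same code unit type CharT,
// which is what makes the substitution homogeneous.
template <class CharT>
std::basic_string<CharT> substitute(std::basic_string_view<CharT> fmt,
                                    std::basic_string_view<CharT> arg) {
    for (std::size_t i = 0; i + 1 < fmt.size(); ++i) {
        if (fmt[i] == CharT('{') && fmt[i + 1] == CharT('}')) {
            std::basic_string<CharT> out(fmt.substr(0, i));  // text before "{}"
            out += arg;                                      // the substitution
            out += fmt.substr(i + 2);                        // text after "{}"
            return out;
        }
    }
    return std::basic_string<CharT>(fmt);  // no placeholder found
}
```

For example, `substitute<char16_t>(u"a{}b", u"X")` yields `u"aXb"`, while
passing a `char32_t` argument to a `char16_t` format string is rejected at
compile time because no transcoding is attempted.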

Victor: {fmt} already does exactly this, right?

As far as I can tell, most of the wording is already in place for this,
and it will only be necessary to mandate the addition of specific
overloads and template specialisations.

My current sticking point is the way we have specified the
locale-specific form (with the `L` option).  Take the `{:L}` substitution
for bool, for example.  In P1892 I chose to specify this in terms of
std::numpunct<charT>, but the standard only requires the standard
library to provide numpunct<char> and numpunct<wchar_t> specializations.
Similar problems arise for `L` with other standard format specifiers.
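
As a rough illustration of the lookup involved, the sketch below fetches
the bool spelling from std::numpunct<CharT> in a given locale.  The names
`french_bool` and `localized_bool` are hypothetical and exist only for
this example; this is not the wording in P1892 nor any library's
implementation:

```cpp
#include <locale>
#include <string>

// Hypothetical facet for illustration: a numpunct<char> that spells bools
// in French. It inherits numpunct<char>::id, so it replaces that facet.
struct french_bool : std::numpunct<char> {
protected:
    std::string do_truename() const override { return "vrai"; }
    std::string do_falsename() const override { return "faux"; }
};

// Hypothetical helper: roughly the lookup a locale-specific bool
// substitution needs. It relies on numpunct<CharT> being present in the
// locale, which is only guaranteed for CharT = char and wchar_t.
template <class CharT>
std::basic_string<CharT> localized_bool(bool value, const std::locale& loc) {
    const auto& np = std::use_facet<std::numpunct<CharT>>(loc);
    return value ? np.truename() : np.falsename();
}
```

With `std::locale fr(std::locale::classic(), new french_bool);`,
`localized_bool<char>(true, fr)` returns "vrai", and the classic locale
returns "true".  The same call with char8_t, char16_t or char32_t has no
guaranteed facet to look up, which is the gap described above.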

How can we word this so as to make `{:L}` substitutions for UTF-8/16/32
formatting conditionally-supported, depending on whether the
implementation provides the necessary specializations of <locale>
facilities?

Let me ask a few more questions:
  • Do we think we *need* a scheduling dependency between transcoding and format?
  • The same question applies to UTF format strings with non-UTF arguments.
  • Do we think we can't provide UTF-8 support for print without worrying about locale (which does the wrong thing anyway since, UTF-8 or not, the whole interface assumes 1 code unit == 1 code point)?
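
To illustrate the "1 code unit == 1 code point" assumption in the last
question, here is a small sketch with a hypothetical helper,
`count_code_points`, that assumes well-formed UTF-8 input and shows how
the two counts diverge:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts code points in well-formed UTF-8 by counting
// every byte that is not a continuation byte (bit pattern 10xxxxxx).
inline std::size_t count_code_points(std::string_view utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

For the UTF-8 bytes "caf\xC3\xA9" (the word "café"), `size()` reports 5
code units while `count_code_points` reports 4 code points, so anything
that sizes or pads output by code units will be off by one here.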
 

Advice appreciated.

                            Peter
--
SG16 mailing list
SG16@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg16