Even with a locale (e.g., following a call to std::setlocale(LC_ALL, "")), the conversion may be silently lossy ;)

Tom.

On 5/3/19 11:32 AM, Steve Downey wrote:
Also keep in mind that unless the locale uses a multibyte charset (i.e., it is not the default "C" locale), the conversion will be silently lossy.

On Fri, May 3, 2019, 11:26 Tom Honermann <tom@honermann.net> wrote:
On 5/3/19 7:12 AM, Lyberta wrote:
So GCC 9 has been released and I'm starting migration to char8_t. The
first question is how to print std::u8string? Since std::cout works in
execution character set I need a way to convert string from UTF-8 to ECS.
I wish I had a better suggestion for you than std::c8rtomb.  JeanHeyd is working on a proposal for modern transcoding interfaces.
It looks like the only way is to use std::c8rtomb but the API is very
cryptic and I don't understand it. Can someone provide an example code?

As far as I know, no one has actually implemented std::c8rtomb yet.  It has been on my todo list for a long time to contribute an implementation to glibc, but I haven't found the time yet.

c8rtomb is intended to match the existing c16rtomb and c32rtomb functions, so it inherits its design and wording from them.  The wording in the C++ standard for c8rtomb is a lightly edited copy of the C standard's wording for c16rtomb.  That wording could be improved.  A lot.  I intentionally chose to keep it aligned with the C standard so as not to cause confusion for implementors (presumably, they have already come to an understanding of C's wording).

I suggest looking at the examples for c16rtomb on cppreference.com.
- https://en.cppreference.com/w/cpp/string/multibyte/c16rtomb
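
For illustration, here is a minimal sketch of the calling pattern using std::c16rtomb, since c8rtomb is meant to follow the same design. The helper name to_multibyte is hypothetical, not a standard function, and the sketch assumes the input is well-formed UTF-16. It treats a return value of (size_t)-1 as a hard failure rather than substituting a replacement character.

```cpp
#include <climits>
#include <cuchar>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the standard library): converts UTF-16
// input to the multibyte encoding of the current C locale via std::c16rtomb.
// Returns std::nullopt if the conversion reports an encoding error.
std::optional<std::string> to_multibyte(std::u16string_view in)
{
    std::mbstate_t state{};   // zero-initialized: the initial conversion state
    std::string out;
    char buf[MB_LEN_MAX];     // large enough for any single conversion result

    for (char16_t c : in) {
        std::size_t n = std::c16rtomb(buf, c, &state);
        if (n == static_cast<std::size_t>(-1))
            return std::nullopt;   // encoding error reported by c16rtomb
        // n may be 0, e.g. after the first code unit of a surrogate pair;
        // the pending code unit is remembered in 'state'.
        out.append(buf, n);
    }
    return out;
}
```

A caller would typically call std::setlocale(LC_ALL, "") once at startup and then write the returned string to std::cout. Whether characters outside the locale's charset produce an error or are silently altered depends on the implementation and the locale, as discussed elsewhere in this thread.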

Tom.


_______________________________________________
SG16 Unicode mailing list
Unicode@isocpp.open-std.org
http://www.open-std.org/mailman/listinfo/unicode

