Re: [SG16-Unicode] [ #embed_str ] Unicode Input

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Wed, 6 Nov 2019 13:23:11 -0700

Agreed, it's a byte thing. Actually, have you considered std::byte?
You showed that it is trivial to programmatically add a null terminator,
which seems sufficient to cover all use cases.
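
To make the point concrete, here is a rough C++17 sketch of what I mean by
adding the terminator programmatically. It does not use any real embed
directive: `file_bytes` is a hypothetical stand-in for the three bytes of a
file containing "foo", and `with_null_terminator` is a helper name I made up
for illustration, not anything from the paper.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the embedded contents of a file holding "foo".
// The values are the ASCII code units for 'f', 'o', 'o'.
inline constexpr std::array<std::byte, 3> file_bytes{
    std::byte{0x66}, std::byte{0x6F}, std::byte{0x6F}};

// Illustrative helper (not a proposed API): copy N bytes into an array of
// N + 1 bytes, leaving the final element as the zero terminator.
template <std::size_t N>
constexpr std::array<std::byte, N + 1>
with_null_terminator(const std::array<std::byte, N>& in) {
    std::array<std::byte, N + 1> out{};  // value-initialized: every byte is 0
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = in[i];
    }
    return out;
}

inline constexpr auto terminated = with_null_terminator(file_bytes);

// Same shape as the array backing the string literal "foo":
// four elements, the last of which is the terminator.
static_assert(sizeof("foo") == 4);
static_assert(terminated.size() == 4);
static_assert(terminated[3] == std::byte{0});
```

Everything happens at compile time, and the element type stays std::byte,
so nothing here claims the data is text in any particular encoding.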

We need a (completely orthogonal) feature which should have no bearing on
your paper: std::trust_me_these_bytes_are_utf8

On Wed, 6 Nov 2019 at 13:12, JeanHeyd Meneide <phdofthehouse_at_[hidden]> wrote:

> Dear SG16,
> I presented #embed_str to EWG. They said I need to take a trip
> through SG16.
> The semantics of #embed_str are that the contents of the file are
> loaded up as individual entries in a regular string literal. For example,
> if the contents of a file were
> foo
> Then the array that backs the string literal would be loaded as
> [f | o | o | \0]
> [0 | 1 | 2 | 3]
> A better name for it might be #embed_null_terminated. I don't think
> this has much to do with Unicode at the end of the day, because it deals
> with code units. There would be no embed_u8str or embed_u16str or
> embed_u32str, because there's no guarantee the contents of the file would
> be valid UTFX and I am not about to get into the mess that is "source
> resource encoding" and "compile-time resource encoding" conversions.
> If #embed_str is too suggestive of text, I will be more than happy to
> put an axe through it. Do let me know how to proceed.
> Sincerely,
> JeanHeyd Meneide
> _______________________________________________
> SG16 Unicode mailing list
> Unicode_at_[hidden]
> http://www.open-std.org/mailman/listinfo/unicode

Received on 2019-11-06 21:23:24