On Wed, Nov 6, 2019 at 10:28 PM Thiago Macieira <thiago@macieira.org> wrote:
On Wednesday, 6 November 2019 12:34:23 PST JeanHeyd Meneide wrote:
> It is not exactly trivial for #embed or #embed_str. #embed generates a
> brace-delimited list of the bytes. It's as if the contents are directly
> replaced by:
>
>      { 102, 111, 111 }
>
>      You cannot "just append" a null terminator in there, so it would
> require a copy. If that's okay (copying things), then we can throw
> #embed_str out the window. As far as requiring bytes, you would need to
> generate a brace-delimited list with all of the entries cast to the right
> type, because each of those entries is not trivially convertible to a
> std::byte: https://godbolt.org/z/NRkSfK

It's easy to add the terminating null with constexpr. And that function should
be provided. Similarly, it should be easy to concatenate such arrays.

Arrays in C++ (and C) have no syntax or behavior for compile-time concatenation. String literals get away with it because adjacent literals such as "foo" "bar" are accepted and joined by the compiler, meaning someone could add a null terminator with "\0" for #embed_str, but not for #embed.
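
For illustration, here is a minimal sketch of the constexpr route suggested above. The helper names append_null and concat are hypothetical (they are not part of any proposal or of the standard library), and the array `raw` is only a stand-in for what #embed would expand to. Note that both helpers return a new std::array, so this is exactly the "copy" mentioned earlier, just done at compile time. It assumes C++17.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: copy N elements into a new array and append a zero.
template <typename T, std::size_t N>
constexpr std::array<T, N + 1> append_null(const T (&src)[N]) {
    std::array<T, N + 1> out{};  // value-initialized, so out[N] is already 0
    for (std::size_t i = 0; i < N; ++i) out[i] = src[i];
    return out;
}

// Hypothetical helper: concatenate two arrays into a new one.
template <typename T, std::size_t N, std::size_t M>
constexpr std::array<T, N + M> concat(const std::array<T, N>& a,
                                      const std::array<T, M>& b) {
    std::array<T, N + M> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = a[i];
    for (std::size_t j = 0; j < M; ++j) out[N + j] = b[j];
    return out;
}

// Stand-in for the brace-delimited list #embed would produce for "foo".
constexpr unsigned char raw[] = { 102, 111, 111 };

constexpr auto terminated = append_null(raw);
constexpr auto doubled = concat(terminated, terminated);

static_assert(terminated.size() == 4);
static_assert(terminated[3] == 0);
static_assert(doubled.size() == 8);
```

The point of the sketch is only that the terminator and concatenation can be had without new preprocessor syntax, at the cost of a second array object.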

It should be easy to import non-terminated byte data, null-terminated byte
data and UTF-8 text.

SG16 should also provide a way to constexpr-time convert UTF-8 text to UTF-16
or UTF-32.

That is something I am already working on (as a separate proposal): all of the UTF-8/16/32 encoding objects are constexpr, and one of Corentin's upcoming papers proposes a consteval way to detect the compile-time literal encoding. That should be enough.
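
To make the idea concrete, here is a rough sketch of what a constexpr-time UTF-8 to UTF-32 conversion could look like. This is not the interface from either proposal: count_code_points and utf8_to_utf32 are hypothetical names, the input array stands in for embedded data, and the decoder assumes well-formed UTF-8 and performs no validation. It assumes C++17.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: count code points by counting every byte that is
// not a continuation byte (10xxxxxx). Assumes well-formed UTF-8.
constexpr std::size_t count_code_points(const unsigned char* s, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if ((s[i] & 0xC0) != 0x80) ++count;
    return count;
}

// Hypothetical helper: decode N UTF-8 bytes into Out UTF-32 code units.
// No validation is done; malformed input gives unspecified code points.
template <std::size_t Out, std::size_t N>
constexpr std::array<char32_t, Out> utf8_to_utf32(const unsigned char (&s)[N]) {
    std::array<char32_t, Out> out{};
    std::size_t o = 0;
    for (std::size_t i = 0; i < N && o < Out;) {
        const unsigned char lead = s[i];
        // Sequence length from the lead byte: 0xxxxxxx, 110xxxxx, 1110xxxx, 11110xxx.
        const std::size_t len = lead < 0x80        ? 1
                              : (lead >> 5) == 0x6 ? 2
                              : (lead >> 4) == 0xE ? 3
                                                   : 4;
        // Keep only the payload bits of the lead byte.
        char32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len && i + k < N; ++k)
            cp = (cp << 6) | (s[i + k] & 0x3F);
        out[o++] = cp;
        i += len;
    }
    return out;
}

// Stand-in for embedded UTF-8 data: U+0061, U+00E9, U+20AC.
constexpr unsigned char utf8[] = { 0x61, 0xC3, 0xA9, 0xE2, 0x82, 0xAC };

constexpr auto text32 =
    utf8_to_utf32<count_code_points(utf8, sizeof utf8)>(utf8);

static_assert(text32.size() == 3);
static_assert(text32[0] == 0x61);
static_assert(text32[1] == 0xE9);
static_assert(text32[2] == 0x20AC);
```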

I think this highlights that #embed is the only thing we need, that #embed_str's only real benefit is a null terminating code unit, and that there should be better ways to provide that to the user.