std-proposals

Re: Avoid copies when using string_view with APIs expecting null terminated strings

From: Thiago Macieira <thiago_at_[hidden]>
Date: Wed, 23 Dec 2020 16:27:24 -0300
On Wednesday, 23 December 2020 13:48:51 -03 Arthur O'Dwyer via Std-Proposals
wrote:
> Nobody's mentioned it yet for some reason, but this is far from the first
> time that the idea of "null-terminated string_view" has come up. The
> traditional name for it is "zstring_view", and you can find more
> information by googling that name.

I've seen both zstring_view and cstring_view. The difference between them is
that the former usually carries an explicit size whereas the latter does not.
Yes, it's duplicated information, but the size is very commonly needed. Storing
it also makes zstring_view layout-compatible with string_view, whereas
cstring_view is layout-compatible with a plain pointer.
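
To make the distinction concrete, here is a rough sketch of the two shapes.
The type and member names below are made up for illustration and are not taken
from any proposal or library; real implementations would be templates with a
much larger interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical "zstring_view" shape: pointer plus cached size.
// Typically two words wide, like most std::string_view implementations.
struct zstring_view_sketch {
    const char *ptr;
    std::size_t len;

    explicit zstring_view_sketch(const char *s) : ptr(s), len(std::strlen(s))
    {
        assert(s != nullptr);
    }

    const char *c_str() const { return ptr; }
    std::size_t size() const { return len; }            // O(1): size is stored
    std::string_view view() const { return {ptr, len}; }
};

// Hypothetical "cstring_view" shape: pointer only.
struct cstring_view_sketch {
    const char *ptr;

    explicit cstring_view_sketch(const char *s) : ptr(s)
    {
        assert(s != nullptr);
    }

    const char *c_str() const { return ptr; }
    std::size_t size() const { return std::strlen(ptr); } // O(n): recomputed
    std::string_view view() const { return {ptr, size()}; }
};

// Holds on mainstream ABIs; the pointer-only type is just a wrapped pointer.
static_assert(sizeof(cstring_view_sketch) == sizeof(const char *),
              "pointer-only view should be the size of a pointer");
```

The trade-off is the usual one: the first shape pays one extra word per object
to avoid calling strlen() every time the size is needed.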

> An interesting runtime variation is bev::string_view —
> https://github.com/lava/string_view
> bev::string_view remembers which constructor was used to construct it — the
> pointer-length constructor, or the std::string/const char* constructor? —
> and therefore remembers whether it's safe to access the char at
> this.data()[this.size()]. This means it knows whether it's safe to test
> this.data()[this.size()] against '\0', which means it can frequently detect
> its own null-terminated-ness at runtime. (But if it was constructed with
> the pointer-length constructor, and never .remove_suffix'ed, then it would
> conservatively report that it didn't know itself to be null-terminated and
> couldn't find out without UB.)

It can't report, period. Even if it could determine that the NUL is there
after the string data, it can't guarantee that it will *remain* there. That
byte may belong to some other object, so it may be modified asynchronously.

The only time when it's legal to access that NUL is when doing strlen() in the
constructor. And even then, this class as you described must have as part of
its contract that all size()+1 bytes remain unchanged while the object lives.
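
As a minimal sketch of that idea, assuming a hypothetical class named
terminated_view (not an existing or proposed type): the only way in is a
constructor that runs strlen(), there is no pointer-length constructor and no
remove_suffix(), so the terminator is never assumed without having been seen.
The contract that the bytes stay unchanged cannot be enforced in code, so it
is only stated in a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical illustration only.
// Contract: the size()+1 bytes starting at c_str() must not be modified
// for as long as this object (or any copy of it) is alive.
class terminated_view {
    const char *ptr_;
    std::size_t len_;

public:
    // strlen() is what locates the NUL, so reading it here is legitimate.
    explicit terminated_view(const char *s) : ptr_(s), len_(std::strlen(s)) {}

    // Deliberately absent: a (pointer, length) constructor and
    // remove_suffix(), since both would break the "ends in NUL" invariant.

    const char *c_str() const { return ptr_; }
    std::size_t size() const { return len_; }
    std::string_view view() const { return {ptr_, len_}; }

    // Dropping a prefix keeps the same terminator, so it is safe.
    void remove_prefix(std::size_t n)
    {
        assert(n <= len_);
        ptr_ += n;
        len_ -= n;
    }
};

// Stand-in for an API expecting a null-terminated string: no copy is needed.
inline std::size_t length_via_c_api(terminated_view v)
{
    return std::strlen(v.c_str());
}
```

Note that nothing here re-checks the terminator later; the class simply relies
on the caller honouring the contract.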

> Microsoft GSL used to provide (or at least, used to document) a
> `zstring_span` with zstring_view semantics. However, these days, GSL
> provides `czstring` only as a type alias for `const char*`, and has gotten
> rid of its pre-C++17 `string_span` and `zstring_span` types.

Ok, that's new to me. cstring, zstring, czstring... since it's MS, I suppose
they also had lpszwstring? :-)
-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel DPG Cloud Engineering

Received on 2020-12-23 13:27:30