Re: Avoid copies when using string_view with APIs expecting null terminated strings

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Date: Wed, 23 Dec 2020 11:48:51 -0500
Nobody's mentioned it yet for some reason, but this is far from the first
time that the idea of "null-terminated string_view" has come up. The
traditional name for it is "zstring_view", and you can find more
information by googling that name.

An interesting runtime variation is bev::string_view:
https://github.com/lava/string_view
bev::string_view remembers which constructor was used to construct it (the
pointer-length constructor, or the std::string/const char* constructor?)
and therefore remembers whether it's safe to access the char at
this->data()[this->size()]. This means it knows whether it's safe to test
this->data()[this->size()] against '\0', which means it can frequently detect
its own null-terminated-ness at runtime. (But if it was constructed with
the pointer-length constructor, and never .remove_suffix'ed, then it would
conservatively report that it doesn't know itself to be null-terminated and
can't find out without UB.)
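
To make the idea concrete, here is a minimal sketch of a view that tracks
whether the char one past its end is readable. It is not bev::string_view's
actual interface; the names tracking_view, can_peek_ and
known_null_terminated() are made up for illustration, and only a handful of
members are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical sketch; not the real bev::string_view API.
class tracking_view {
public:
    // Pointer-length constructor: nothing is known about p[n],
    // so reading it might be out of bounds.
    tracking_view(const char* p, std::size_t n)
        : data_(p), size_(n), can_peek_(false) {}

    // C-string constructor: p[strlen(p)] exists and is '\0'.
    tracking_view(const char* p)
        : data_(p), size_(std::strlen(p)), can_peek_(true) {}

    // std::string constructor: s.data()[s.size()] exists and is '\0'.
    tracking_view(const std::string& s)
        : data_(s.data()), size_(s.size()), can_peek_(true) {}

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

    // Precondition: n <= size(). After dropping at least one char,
    // data()[size()] lies inside the original range, so it is readable.
    void remove_suffix(std::size_t n) {
        size_ -= n;
        if (n > 0) can_peek_ = true;
    }

    // True only if the terminator can be checked without UB and is there.
    // False means "not known to be null-terminated", not "definitely not".
    bool known_null_terminated() const {
        return can_peek_ && data_[size_] == '\0';
    }

private:
    const char* data_;
    std::size_t size_;
    bool can_peek_;
};

// Small self-check showing the three interesting cases.
inline void tracking_view_demo() {
    std::string s = "image.png";
    assert(tracking_view(s).known_null_terminated());
    assert(!tracking_view(s.data(), 5).known_null_terminated());
    tracking_view v(s.data(), s.size());
    v.remove_suffix(4);                      // now "image", next char is '.'
    assert(!v.known_null_terminated());
}
```

The extra bool is exactly the kind of state that makes this an ABI change if
bolted onto std::string_view, which is the cost raised elsewhere in this
thread.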

Microsoft GSL used to provide (or at least, used to document) a
`zstring_span` with zstring_view semantics. However, these days, GSL
provides `czstring` only as a type alias for `const char*`, and has gotten
rid of its pre-C++17 `string_span` and `zstring_span` types.

HTH,
–Arthur


On Wed, Dec 23, 2020 at 5:17 AM Tom Mason via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> I do think it's useful to use the one type. For example you might want to
> pass a file path to some image loading code. You support several different
> image formats, and some of the loaders are plain C libraries that expect a
> null-terminated path. You check the file extension by iterating from the
> end of the string until you find a ".", then pass the string into the
> appropriate library. If you didn't have the length available too, then you
> would need a strlen before you start that backwards iteration.
> I know that you could use std::filesystem::path for that, but I mean it as
> a representative sample (also you might not want to force callers to copy
> into a path).
>
> Another example is if you're writing some normal modern C++ code and you
> just want to printf something along the way because printf is still pretty
> convenient.
>
> 23 Dec 2020 5:22:30 am Jason McKesson via Std-Proposals <
> std-proposals_at_[hidden]>:
>
> > On Tue, Dec 22, 2020 at 8:13 PM Thiago Macieira via Std-Proposals
> > <std-proposals_at_[hidden]> wrote:
> >>
> >> On Tuesday, 22 December 2020 21:33:35 -03 Tom Mason via Std-Proposals wrote:
> >>> The problem is that there is no way to know if a given string_view is
> >>> null terminated. My proposal is to add a method bool null_terminated() to
> >>> string_view. When a string_view is constructed, the caller can tell the
> >>> implementation that the pointer it is providing is null terminated. The
> >>> conversion from std::string to std::string_view would set this flag.
> >>
> >> This is an ABI-changing proposal. Please provide strong enough
> >> motivation to warrant it.
> >>
> >> Given the cost, it might be worth having a separate class to indicate
> >> zero-terminated string views.
> >
> > It should also be noted that if you have an API that needs a
> > NUL-terminated string view... then it *needs* a NUL-terminated string
> > view. That strongly suggests a different type, as that's how we
> > typically spell such needs in APIs. That way, users using your API can
> > see what it needs and can provide what you need. They can even
> > propagate that need up the call graph.
> >
> > In any case, the thing about such a view is that it should just be a
> > `char const *` wrapped in an object. It shouldn't have a size member,
> > and it shouldn't be convertible to/from `string_view`. Also, its range
> > begin/end functions should be an iterator/sentinel pair, not a pair of
> > iterators. The sentinel would do the NUL check. I wrote a skeletal
> > version of such a class and found that compilers were pretty good
> > about optimizing uses of it down to the equivalent of common C-string
> > loops.
> > --
> > Std-Proposals mailing list
> > Std-Proposals_at_[hidden]
> > https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

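For reference, the iterator/sentinel design described in the quoted message
might look roughly like the following. This is only a sketch under my own
assumptions, not the skeletal class mentioned there: the names cstring_view,
cstring_iterator and nul_sentinel are invented, the iterator is not a full
standard-conforming iterator, and it relies on C++17's range-based for
accepting different begin/end types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; a sketch, not a proposed interface.
struct nul_sentinel {};

class cstring_iterator {
public:
    explicit cstring_iterator(const char* p) : p_(p) {}

    char operator*() const { return *p_; }
    cstring_iterator& operator++() { ++p_; return *this; }

    // "Reached the end" means "pointing at the terminating NUL".
    friend bool operator==(cstring_iterator it, nul_sentinel) {
        return *it.p_ == '\0';
    }
    friend bool operator!=(cstring_iterator it, nul_sentinel s) {
        return !(it == s);
    }

private:
    const char* p_;
};

// Just a `char const *` wrapped in an object: no size member and
// no conversion to or from std::string_view.
class cstring_view {
public:
    explicit cstring_view(const char* s) : s_(s) {}

    const char* c_str() const { return s_; }
    cstring_iterator begin() const { return cstring_iterator(s_); }
    nul_sentinel end() const { return {}; }

private:
    const char* s_;
};

// Counts chars by walking to the sentinel, like a hand-written C loop.
inline std::size_t count_chars(cstring_view v) {
    std::size_t n = 0;
    for (char c : v) {   // C++17: begin() and end() may differ in type
        (void)c;
        ++n;
    }
    return n;
}
```

Because end() carries no pointer, constructing the view never calls strlen;
the length is only discovered if something actually iterates.
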
Received on 2020-12-23 10:49:05