Date: Sat, 31 Jan 2026 22:03:05 -0800
On Saturday, 31 January 2026 21:42:33 Pacific Standard Time Jan Schultke wrote:
> Opinions on this are obviously a bit torn, but LEWG essentially decided to
> have this already. In Sofia, there was no consensus to ban nested nulls,
> and the status quo is to allow them. P3655R3 also has this constructor.
Indeed. I think it's a mistake.
> The point of this constructor is to permit zero-overhead construction from
> an existing null-terminated C-string, with size information available. This
> case is extremely common when initializing from strings returned by C APIs
> or from user-defined containers in the style of std::string.
If you ban embedded NULs (or, phrased better, pretend they don't exist), then
conversion from basic_string is fine.
I don't buy the argument on conversion from existing C strings. Those usually
don't have a length passed in the first place. And where an strlen() has been
done, as I said in my reply, compilers are good at remembering the value.
> From your other comments, it sounds like you would want a lack of nested
> nulls to be an invariant of std::cstring_view. I strongly disagree with
> this direction due to overhead. It's also not like string views can enforce
> invariants regarding their content anyway. A user could always construct a
> std::cstring_view and then embed a null in the original content.
Right. If we pretend that embedded NULs don't exist, then the problem goes
away.
Indeed we can add the requirement that the underlying string not be modified
once the cstring_view has been constructed. But unlike simply requiring that
the string exist for string_view, that's a much more stringent requirement.
In any case, my point to FPE was that they should argue the benefit vs cost of
keeping the size in the paper. P3655 must do the same, summarising the
discussion with LEWG.
> While one
> could equally argue that the string view cannot guarantee that its null
> terminator remains present, this can at least be asserted on debug builds
> whenever the type is used. That is, whenever performing an operation, one
> could assert that data[size()] is still null. Rescanning the entire string
> is not realistic due to performance cost, so this invariant is not only
> unenforceable, it also cannot be sanity-checked. That sounds like adding
> more undiagnosable UB to the library; we don't need that.
True.
cstring_view might solve one class of problems, that of using non-terminated
strings with APIs that require termination. A runtime check for the presence
of the NUL helps; but as violations go, all it can do is abort the program
early before further damage happens. It can't insert the NUL to let the
program continue. And as I argued, the act of verifying can crash the
application, so the failure to comply is UB.
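As a rough sketch of the kind of check being discussed (the type and member names below are hypothetical and are not taken from P3655), the terminator test reads a single byte and can be asserted wherever the view is used:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a cstring_view-like type; NOT the P3655 interface.
struct cstring_view_sketch {
    const char* ptr;
    std::size_t len;

    // O(1) check: reads only the byte at ptr[len].
    // If ptr[len] lies outside the underlying object, this read is itself
    // undefined behaviour, which is why the check cannot be fully relied on.
    bool terminator_intact() const noexcept { return ptr[len] == '\0'; }

    // Assert in debug builds that the terminator is still present
    // before handing the pointer to a C API.
    const char* c_str() const noexcept {
        assert(terminator_intact() && "NUL terminator missing");
        return ptr;
    }
};
```

A failed assertion here can only abort; it cannot repair the string.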
But I don't agree it suffices. The bigger problem I see is one of security. We
have had security issues in the past where an embedded NUL was passed and caused a
string to be terminated short of where the caller expected it to be. An example
was X.509 certificates issued with CN=bank.com%00.attacker.net: the
percent-decoder operating in C++ calculated the full length of the string and
placed a null terminator at the end, then the decoded result was passed to a C
API that stopped short.
So, as with the embedded NULs, I think a design that does not at minimum do a
strlen() at construction stops short.
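To make the mismatch concrete, here is a minimal sketch with hypothetical helper names, assuming the decoded name is held in a std::string. It compares the length the C++ side tracks with the length a C API would see, which is the comparison a strlen() at construction would perform:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by a C API: scanning stops at the first NUL.
inline std::size_t c_visible_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Hypothetical construction-time check: the claimed size must equal
// what strlen() reports, i.e. there is no NUL before position n.
inline bool size_matches_strlen(const char* p, std::size_t n) {
    return std::strlen(p) == n;
}
```

For a 22-character string holding "bank.com", an embedded NUL, and ".attacker.net", size() reports 22 while c_visible_length() reports 8, so size_matches_strlen() returns false. The check costs a full scan of the string.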
-- Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org Principal Engineer - Intel Data Center - Platform & Sys. Eng.
Received on 2026-02-01 06:03:16
