std-proposals

Re: [std-proposals] Pre-proposal: Constraining accidental scalar append to std::basic_string

From: Qirong ZHANG <jiuwoyoubing_at_[hidden]>
Date: Mon, 18 May 2026 23:08:47 +0800
On Mon, May 18, 2026 at 4:38 PM Jan Schultke via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> You should also consider that it's probably not good to deprecate the
> case where you append char to std::u8string and append char8_t to
> std::string.
>
> I imagine the warnings would be too noisy to pull that off in existing
> code bases that make heavy use of these types and don't catch such
> conversions already.
>
> There is also the case of appending unsigned char to std::string I
> suppose.

Thanks, that is a good point.

I agree that `char`/`char8_t` and `unsigned char` cases should probably not
be lumped together with obviously suspicious scalar conversions such as
`bool`, floating point, arbitrary `int`, or enum values.
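
For concreteness, here is a minimal sketch of the suspicious appends I have in mind. All of them compile today because the operand converts implicitly to `char` and selects `operator+=(char)`. The function name `suspicious_appends` and the enum `Color` are hypothetical, chosen only for illustration, and the resulting characters assume an ASCII-compatible execution character set.

```cpp
#include <string>

// Hypothetical unscoped enum, used only to show the enum -> char conversion.
enum Color { Red = 82 };

// Every append below is accepted silently by the current library interface.
std::string suspicious_appends() {
    std::string s;

    s += true;      // bool -> char: appends the byte 0x01

    int i = 65;
    s += i;         // int -> char: appends 'A' (ASCII 65)

    double d = 66.9;
    s += d;         // double -> char: truncates to 66, appends 'B'

    s += Red;       // unscoped enum -> char: appends 'R' (ASCII 82)

    return s;
}
```

None of these is likely to be what the author meant, which is why I would put them in a different bucket from the code-unit cases.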

In particular, an exact-`CharT` rule is attractive as a clean model, but it
would also diagnose cases like:

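```cpp
std::u8string u;
u += 'x';          // char appended where CharT is char8_t

std::string s;
s += u8'x';        // char8_t (since C++20) appended where CharT is char

unsigned char b = ...;
s += b;            // unsigned char appended where CharT is char
```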

Those are likely to be common in code that treats `std::string` as a byte
or UTF-8 code-unit sequence, or in code gradually migrating to `char8_t`.

So I think the corpus/checker work should classify these cases separately,
rather than treating all non-`CharT` conversions as one bucket. A useful
initial checker might have at least two modes:

1. a conservative/default mode that diagnoses high-confidence cases such as
`bool`, floating point, enum, and perhaps non-character integer literals;
2. a strict mode that also diagnoses non-exact character-code-unit
conversions such as `char` <-> `char8_t` and `unsigned char` -> `char`.

That would let us measure how noisy the exact-`CharT` model would be before
proposing any standard wording.
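
As a rough sketch of the classification (not proposed wording, and not an existing tool), the buckets and the two modes could be modelled on operand types alone. The names `AppendKind`, `Mode`, `is_byte_code_unit_v`, `classify`, and `should_diagnose` are all hypothetical. I am also assuming that only the byte-sized character types mentioned in this thread count as code units, and that integer operands outside the high-confidence set go into a separate bucket, since a type-based check cannot tell a literal from a variable; a real checker would need to look at the AST for that.

```cpp
#include <type_traits>

// Hypothetical buckets for the operand type T in `basic_string<CharT> += T`.
enum class AppendKind { Exact, CodeUnitMismatch, HighConfidence, OtherInteger };

// Hypothetical checker modes, mirroring the two modes described above.
enum class Mode { Conservative, Strict };

// Assumption: only the byte-sized character types named in the thread
// are treated as code units. char8_t exists only from C++20 on.
template <class T>
inline constexpr bool is_byte_code_unit_v =
    std::is_same_v<T, char> || std::is_same_v<T, unsigned char>
#ifdef __cpp_char8_t
    || std::is_same_v<T, char8_t>
#endif
    ;

// Classify an operand type T appended to basic_string<CharT>.
// Assumes T is a scalar type that converts implicitly to CharT.
template <class CharT, class T>
constexpr AppendKind classify() {
    using U = std::remove_cv_t<T>;
    if constexpr (std::is_same_v<U, CharT>) {
        return AppendKind::Exact;
    } else if constexpr (is_byte_code_unit_v<U> && is_byte_code_unit_v<CharT>) {
        return AppendKind::CodeUnitMismatch;   // e.g. char <-> char8_t
    } else if constexpr (std::is_same_v<U, bool> ||
                         std::is_floating_point_v<U> ||
                         std::is_enum_v<U>) {
        return AppendKind::HighConfidence;     // bool, floating point, enum
    } else {
        return AppendKind::OtherInteger;       // e.g. int; needs corpus data
    }
}

// Conservative mode reports only high-confidence cases;
// strict mode reports everything that is not an exact CharT.
constexpr bool should_diagnose(AppendKind kind, Mode mode) {
    if (kind == AppendKind::Exact) return false;
    if (mode == Mode::Strict) return true;
    return kind == AppendKind::HighConfidence;
}
```

Counting how many hits land in each bucket across a corpus would give the noise estimate directly, for example by comparing the `CodeUnitMismatch` count against the `HighConfidence` count.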

Thanks again.

--
Qirong Zhang

Received on 2026-05-18 15:08:54