std-proposals

Re: [std-proposals] Pre-proposal: Constraining accidental scalar append to std::basic_string

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Date: Sun, 17 May 2026 13:41:14 -0400
On Sun, May 17, 2026 at 11:44 AM 牧羊少年 via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> Today, std::basic_string<CharT>::operator+=(CharT) makes expressions such
> as the following well-formed:
>
> std::string s;
> s += 65; // appends char(65), commonly 'A', not "65"
> s += true; // appends char(1)
>
> [...] This is closely related to P2037R1, “String’s gratuitous
> assignment”, which discussed the analogous issue for:
>
> std::string s;
> s = 50; // assigns one character, not "50"
>

Even more surprising, assignment and copy-initialization work differently
for `string`!
For *most* STL types, copy-initialization and assignment are equivalent in
effect (if often not in performance):
    std::vector<int> v = {1,2,3}; // OK
    v = {1,2,3}; // OK
    std::vector<int> v = 42; // invalid
    v = 42; // invalid
But for `string`, we have this instead:
    std::string s = 42; // invalid
    s = 42; // *OK*
    std::string s = 'x'; // *invalid*
    s = 'x'; // OK

And, as you point out, we also have this:
    s = s + 'x'; // OK
    s += 'x'; // OK
    s = s + 42; // invalid
    s += 42; // *OK*

That is, `string` has a lot of unnatural asymmetries.
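
The asymmetries listed above can be checked at compile time. The following is a minimal sketch, assuming C++11 or later; `can_plus` and `can_plus_assign` are hypothetical helper traits written for this illustration, not standard library names.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical detection traits (not standard library names).
// can_plus<T, U>: is `t + u` well-formed for an lvalue t of type T?
template <class T, class U, class = void>
struct can_plus : std::false_type {};
template <class T, class U>
struct can_plus<T, U,
                decltype(void(std::declval<T&>() + std::declval<U>()))>
    : std::true_type {};

// can_plus_assign<T, U>: is `t += u` well-formed for an lvalue t of type T?
template <class T, class U, class = void>
struct can_plus_assign : std::false_type {};
template <class T, class U>
struct can_plus_assign<T, U,
                       decltype(void(std::declval<T&>() += std::declval<U>()))>
    : std::true_type {};

using S = std::string;
using V = std::vector<int>;

// vector: copy-initialization and assignment agree.
static_assert(!std::is_convertible<int, V>::value, "V v = 42; should be invalid");
static_assert(!std::is_assignable<V&, int>::value, "v = 42; should be invalid");

// string: copy-initialization and assignment disagree.
static_assert(!std::is_convertible<int, S>::value, "S s = 42; should be invalid");
static_assert(std::is_assignable<S&, int>::value, "s = 42; should be OK");
static_assert(!std::is_convertible<char, S>::value, "S s = 'x'; should be invalid");
static_assert(std::is_assignable<S&, char>::value, "s = 'x'; should be OK");

// string: + and += agree for char but disagree for int.
static_assert(can_plus<S, char>::value, "s + 'x' should be OK");
static_assert(can_plus_assign<S, char>::value, "s += 'x' should be OK");
static_assert(!can_plus<S, int>::value, "s + 42 should be invalid");
static_assert(can_plus_assign<S, int>::value, "s += 42 should be OK");
```

The traits use a non-constant `int` operand, so they sidestep the separate question of how a literal `0` interacts with the `const char*` overloads.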

> 1. Is the asymmetry between s += 65 and s + 65 considered a defect worth
> addressing, or only an unfortunate but acceptable historical artifact?


Both. Personally I would love to see all the asymmetries above fixed in one
swoop, i.e., change the behaviors of all three cases marked with asterisks
above. But at the same time, I doubt you'll find much appetite for it.
It would be great if one or two compilers would begin warning on these
cases today, the same way Clang today warns about

    warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
        5 | t = "s" + 1;
          |     ~~~~^~~
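
For context, a small sketch of what the diagnosed expression does. The function name `plus_int_demo` is made up for this illustration, and the literal is routed through a named pointer only so that the sketch itself does not trigger the warning.

```cpp
#include <string>

// Hypothetical demo function: shows that adding an int to a string
// literal is pointer arithmetic, not concatenation.
std::string plus_int_demo() {
    const char* lit = "s";  // what the literal "s" decays to
    std::string t;
    t = lit + 1;            // points one past 's', i.e. at the terminating '\0'
    return t;               // empty string, not "s1"
}
```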

> 2. Would a deleted-overload approach be acceptable in principle?


Yes. Although it wouldn't surprise me if you can do better than that.
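
One possible shape of a deleted-overload approach, sketched on a user-level wrapper rather than on `std::basic_string` itself. Everything here (`checked_string`, `can_append`) is hypothetical and is not taken from any proposal wording; it assumes C++11 or later and covers only arithmetic and enumeration arguments.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper used only to illustrate the technique.
class checked_string {
    std::string data_;

public:
    // Appending a genuine code unit or a C string stays valid.
    checked_string& operator+=(char c) { data_.push_back(c); return *this; }
    checked_string& operator+=(const char* s) { data_ += s; return *this; }

    // Any other arithmetic or enumeration argument is an exact match for
    // this template, so overload resolution prefers it to the char
    // overload, and the call is ill-formed because the template is deleted.
    template <class T,
              typename std::enable_if<
                  (std::is_arithmetic<T>::value || std::is_enum<T>::value) &&
                      !std::is_same<T, char>::value,
                  int>::type = 0>
    checked_string& operator+=(T) = delete;

    const std::string& str() const { return data_; }
};

// Hypothetical detection trait: is `s += u` well-formed for checked_string?
template <class U, class = void>
struct can_append : std::false_type {};
template <class U>
struct can_append<U, decltype(void(std::declval<checked_string&>() +=
                                   std::declval<U>()))>
    : std::true_type {};

static_assert(can_append<char>::value, "s += 'A' should stay valid");
static_assert(can_append<const char*>::value, "s += \"abc\" should stay valid");
static_assert(!can_append<int>::value, "s += 65 should be rejected");
static_assert(!can_append<bool>::value, "s += true should be rejected");
```

A design note: the deleted template wins only because it is an exact match, so a class type with a conversion operator to `char` would still reach the `char` overload in this sketch.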


> 3. Should the rejection cover all non-CharT types convertible to CharT,
> or only arithmetic and enumeration types?


Personally I don't think adding any non-CharT type to a basic_string<CharT>
should be permitted.
But then, personally I would also be totally happy to
- remove `operator+=(char)`: make people use `s.push_back(ch)` if that's what
they mean
- remove `operator=(char)`: make people use `s = std::string(1, ch)` if
that's what they mean
- remove `operator+(char)`: make people use `s + std::string(1, ch)` if
that's what they mean
These are rare operations, and easy to optimize in today's world of
small-string optimization and high-powered optimizers.
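
The three explicit spellings above already work today. A minimal sketch, where `explicit_spellings` is a made-up function name for this illustration:

```cpp
#include <string>

// Hypothetical demo function: uses only the explicit spellings,
// never operator+=(char), operator=(char), or operator+(char).
std::string explicit_spellings(char ch) {
    std::string s = "ab";
    s.push_back(ch);                         // instead of s += ch
    std::string t = s + std::string(1, ch);  // instead of s + ch
    s = std::string(1, ch);                  // instead of s = ch
    return s + "|" + t;
}
```

Note the parentheses in `std::string(1, ch)`: with braces, `std::string{1, ch}` would select the initializer-list constructor and produce a two-character string instead.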


> 4. Should this be limited to operator+=, or should a future paper also
> revisit operator=(CharT), as discussed in P2037R1?


I think any paper should handle both cases at once. It was a flaw in P2037
that it tried to handle `=` without `+=`, and it would be an even bigger
flaw in a new paper if it tried to handle `+=` without `=`.

> 5. Should push_back(CharT) remain unconstrained, since its name already
> clearly denotes single-code-unit append?


Yes, certainly. I see nothing dubious about `for (int k; (k = getchar()) !=
EOF; ) s.push_back(k);`.

> 6. Would implementers be willing to experiment with a warning or extension
> mode to gather compatibility data?


Personally I think they *should*. In practice they probably won't. It might
be easiest to try getting something like that into `clang-tidy`.


> My current position is that operator+=(CharT) should remain valid, but
> accidental scalar append through operator+= should become ill-formed or
> at least diagnosable. The goal is to make the programmer choose explicitly
> between “append a code unit” and “append the textual representation of a
> value”.
>
FWIW, I'm not at all worried that a C++ programmer might think `s += 42;`
means "append the two chars '4' '2' to s." That's never going to happen,
and C++ programmers know that.
I'm much more concerned with the symmetry of the language. We don't want a
language where
    T t;
    t = u;
means something significantly different from
    T t = u;
or where
    t += u;
means something significantly different from
    t = t + u;
Having types like that in the STL just makes it difficult to write generic
code — which is supposed to be C++'s strong suit.
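
To make the generic-code concern concrete, here is a small sketch of two function templates that a generic author might reasonably treat as interchangeable. The names `append_compound` and `append_binary` are hypothetical.

```cpp
#include <string>

// Hypothetical generic helper written with the compound operator.
template <class T, class U>
T append_compound(T t, const U& u) {
    t += u;
    return t;
}

// Hypothetical generic helper written with the binary operator.
template <class T, class U>
T append_binary(T t, const U& u) {
    t = t + u;
    return t;
}
```

For `int` operands, or for `std::string` with `char`, both templates compile and agree. For `std::string` with `int`, `append_compound(s, 65)` compiles and appends `char(65)`, while `append_binary(s, 65)` fails to compile, so the two spellings are not interchangeable for that pair of types.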

HTH,
Arthur

Received on 2026-05-17 17:41:29