Breaking change - std::string should implicitly accept literals but not general c strings (Matthew Fioravante)

From: Walt Karas <wkaras_at_[hidden]>
Date: Sun, 16 Aug 2020 22:13:22 +0000 (UTC)
Date: Sun, 16 Aug 2020 17:07:47 -0400
From: Matthew Fioravante <fmatthew5876_at_[hidden]>
To: std-proposals_at_[hidden]
Subject: [std-proposals] Breaking change - std::string should
    implicitly accept literals but not general c strings
Message-ID:
    <CALM+jCxuTUckeRpr+1y0CX93ofL6FK32Phwp8x0LLntni888EA_at_[hidden]>
Content-Type: text/plain; charset="utf-8"

The idea is a simple, but breaking change:

Make explicit this constructor:
basic_string( const CharT* s,const Allocator& alloc = Allocator() );

Add this implicit constructor:
template <size_t N>
basic_string( const CharT (&s)[N],const Allocator& alloc = Allocator() );
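
To make the overload behaviour concrete, here is a minimal sketch using a hypothetical stand-in class `Str` (not the real `basic_string`; the allocator parameter is omitted, and `length_of` is an illustrative helper). It only shows how an explicit pointer constructor and an implicit array-reference constructor interact during implicit conversion:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for basic_string<char>; not the real std::string.
class Str {
public:
    // General C strings: the conversion must be spelled out.
    explicit Str(const char* s) : data_(s) {}

    // Character arrays (which includes string literals): implicit conversion allowed.
    template <std::size_t N>
    Str(const char (&s)[N]) : data_(s) {}

    const std::string& str() const { return data_; }

private:
    std::string data_;
};

// Illustrative helper taking Str by value, to exercise conversion at a call site.
inline std::size_t length_of(Str s) { return s.str().size(); }

// A literal such as "hello" has type const char[6] and converts implicitly.
static_assert(std::is_convertible<const char (&)[6], Str>::value,
              "array converts implicitly");
// A plain pointer no longer converts implicitly...
static_assert(!std::is_convertible<const char*, Str>::value,
              "pointer does not convert implicitly");
// ...but explicit construction from a pointer still works.
static_assert(std::is_constructible<Str, const char*>::value,
              "pointer converts explicitly");
```

With this sketch, `length_of("hello")` compiles, while `length_of(p)` for a `const char* p` is rejected unless written as `length_of(Str(p))`.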

Rationale
----
C++ has a bad habit of making copies behind your back if you're not
carefully examining your code. For a bare metal, speed oriented language
this is simply unacceptable. Ideally, idiomatic code should be fast code,
and slow operations such as copies should be explicit and easy to recognize in
your code.
Assigning a const char* to a string results in a copy, and possibly memory
allocations.
Because we don't want to break the beautiful idiomatic code like:
std::string x = "foo";
the char* constructor was historically made implicit.
So the idea here is to require explicit conversions for any general C
string -> std::string copying conversion, but still allow the idiomatic
implicit literal usage.
Alternative - make all C strings explicit, even literals
--------------
Then we need to start writing this
auto x = std::string("foo");
The above is not so bad, but this is horrid:
std::array<std::string,2> a = {{ std::string("aaa"), std::string("bb") }};
Arrays of strings are super common, it would be a sin to add any more noise
to these constructs.
User defined literals are supposed to be the answer here:
auto x = "foo"s;
std::array<std::string,2> a = {{ "aaa"s, "bb"s }};
But unfortunately I don't think it's viable to force everyone to start
using udl's in all their code, especially header files.
Examples
--------
char* s = strdup("hello");
std::string x = s; //Error - constructor is explicit
std::string y = std::string(s); //Ok
std::string z = "hello"; //Ok
Justification for API break
--------------------
This is an API-breaking change, but this is not a problem. Changing the
above invalid code to valid code is fully backwards compatible, and is
trivial for compilers to detect with a warning ahead of time. A clang tool
could easily automate this for you.
There is no ABI break.
Character array problem
---------------
One problem is that by doing this, you still get implicit conversion of
character arrays, whether local buffers or array members:
struct S {
char a[4];
};
void f(int x, int y) {
char buf[128];
snprintf(buf, sizeof(buf), "%d %d", x, y);
std::string s1 = buf; // Ok
S s;
std::string s2 = s.a; // Ok
}
Unless the standard committee is willing to consider a unique and better
way to capture string literals as function arguments, this is the best we have
in the C++ language right now.
At least in this particular case, it is also pretty rare. The most common
usage that comes to mind is the snprintf idiom above, but that is going to
be obsolete when we get fmt in the standard.
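
The reason the array case slips through can be shown with the same kind of hypothetical stand-in class `Str` (again not the real `basic_string`; `format_pair` is an illustrative helper). A mutable `char[128]` binds to `const char (&)[N]` just as a literal does, so the template constructor cannot tell the two apart. This sketch assumes the constructor scans for the null terminator rather than trusting `N`; the proposal text does not say which it would do:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <type_traits>

// Hypothetical stand-in for basic_string<char>; not the real std::string.
class Str {
public:
    explicit Str(const char* s) : data_(s) {}

    // Assumption: copy up to the null terminator, not all N characters.
    template <std::size_t N>
    Str(const char (&s)[N]) : data_(s) {}

    const std::string& str() const { return data_; }

private:
    std::string data_;
};

// A non-const buffer is accepted implicitly, exactly like a literal.
static_assert(std::is_convertible<char (&)[128], Str>::value,
              "mutable char array converts implicitly");

// Illustrative helper reproducing the snprintf idiom from the message.
inline std::string format_pair(int x, int y) {
    char buf[128];
    std::snprintf(buf, sizeof(buf), "%d %d", x, y);
    Str s = buf;  // compiles: char[128] binds to const char (&)[128]
    return s.str();
}
```

So the implicit copy from `buf` still happens silently, which is the residual hole described above.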
WK:  In general, I'm skeptical of seeking to define, in the Standard, the set of static checks that's right for everyone.  I think what works best is very aggressive static checking, either integrated into the compiler or using stand-alone tools, but with convenient, traceable mechanisms to override the flagging of potential problems.  It would be desirable if we could come up with standardized override mechanisms that would work well with all commonly-used static checking tools.

Received on 2020-08-16 17:17:11