Re: [SG16] Towards a better description of the execution encoding

From: Steve Downey <sdowney_at_[hidden]>
Date: Tue, 2 Mar 2021 10:41:20 -0500
On Tue, Mar 2, 2021 at 4:35 AM Corentin via SG16 <sg16_at_[hidden]> wrote:

> On Mon, Mar 1, 2021 at 10:32 PM Hubert Tong <
> hubert.reinterpretcast_at_[hidden]> wrote:
>> On Mon, Mar 1, 2021 at 10:24 AM Corentin via SG16 <sg16_at_[hidden]>
>> wrote:
>>> Hey folks!
>> Also, having '\x5c' included in the UB is presumably unintended. The
>> user intent is expressed by the numeric escape. Furthermore, consideration
>> should be given (and documented) about whether the UB should apply to
>> "unparsed" strings (like the argument given to printf for %s) for
>> "locale sensitive" functions. Again though, I think that we're really
>> talking about there being "natural consequences" of defined behaviour when
>> "bad input" is involved.
> Are you saying that Undefined Behavior would be too big of a hammer?
> The way I see it, mojibake is not something that should happen in a
> well-behaved program, aka it is a precondition violation of locale-specific
> functions.
> And I think not having that precondition stated makes it harder to specify
> the behavior of std::print for example.
> My intent is to say "The standard assumes that all strings are
> interpreted by locale-specific functions as being encoded in the execution
> encoding, and if that's not the case, you will get mojibake or any other
> behavior that may be the result of your input not being interpreted
> correctly."

Undefined behavior is definitely too big a hammer for something that's
well defined. I don't want to imply that there are trap representations in
any encoding. There's no sense in giving implementations permissions they
don't need.
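
To illustrate the "well defined" point, here is a minimal sketch, assuming ISO-8859-1 as the (wrongly) assumed encoding and UTF-8 as the actual one. The helper name latin1_to_utf8 is hypothetical and is not part of any proposal or library. It shows that bytes read under the wrong encoding still decode deterministically: the result is mojibake, not a trap.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (an assumption for illustration only): treat every
// byte of `bytes` as one ISO-8859-1 character and re-encode it as UTF-8.
// Every byte value 0x00..0xFF has a defined mapping, so no input can trap.
std::string latin1_to_utf8(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            // ASCII range: one byte, unchanged.
            out.push_back(static_cast<char>(b));
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence.
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}

// Feeding the UTF-8 bytes of U+00E9 (0xC3 0xA9) through the Latin-1 reading
// yields the UTF-8 for U+00C3 followed by U+00A9: wrong text, same every time.
void example() {
    const std::string utf8_e_acute = "\xC3\xA9";
    assert(latin1_to_utf8(utf8_e_acute) == "\xC3\x83\xC2\xA9");
}
```

The output is not what the user meant, but it is a natural consequence of defined behavior on bad input rather than anything an implementation needs extra latitude for.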

Received on 2021-03-02 09:41:29