Subject: Re: Towards a better description of the execution encoding
From: Steve Downey (sdowney_at_[hidden])
Date: 2021-03-02 09:41:20

On Tue, Mar 2, 2021 at 4:35 AM Corentin via SG16 <sg16_at_[hidden]> wrote:

> On Mon, Mar 1, 2021 at 10:32 PM Hubert Tong <
> hubert.reinterpretcast_at_[hidden]> wrote:
>> On Mon, Mar 1, 2021 at 10:24 AM Corentin via SG16 <sg16_at_[hidden]>
>> wrote:
>>> Hey folks!
>> Also, having '\x5c' included in the UB is presumably unintended. The
>> user intent is expressed by the numeric escape. Furthermore, consideration
>> should be given (and documented) about whether the UB should apply to
>> "unparsed" strings (like the argument given to printf for %s) for
>> "locale sensitive" functions. Again though, I think that we're really
>> talking about there being "natural consequences" of defined behaviour when
>> "bad input" is involved.
> Are you saying that Undefined Behavior would be too big of a hammer?
> The way I see it, mojibake is not something that should happen in a
> well-behaved program, aka it is a precondition violation of locale-specific
> functions.
> And I think not having that precondition stated makes it harder to specify
> the behavior of std::print for example.
> My intent is to say "The standard assumes that all strings are
> interpreted by locale-specific functions as being encoded in the execution
> encoding, and if that's not the case, you will get mojibake or any other
> behavior that may be the result of your input not being interpreted
> correctly."

Undefined behavior is definitely too big a hammer for something that's
well defined. I don't want to imply that there are trap representations in
any encoding. There's no sense in giving implementations permissions they
don't need.
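
As a rough illustration of mojibake being a well-defined consequence of bad input, here is a minimal sketch that is not taken from the thread. It assumes UTF-8 bytes are handed to a decoder that expects ISO-8859-1; `latin1_to_utf8` is a hypothetical helper, not a standard library function. Every byte value maps to some character, so the result is wrong text rather than a trap.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (an assumption for illustration): treat every input
// byte as one ISO-8859-1 character and re-encode it as UTF-8.
// Each of the 256 byte values has a defined mapping, so no input is invalid.
std::string latin1_to_utf8(std::string_view bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            // ASCII range: identical in both encodings.
            out.push_back(static_cast<char>(b));
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence.
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}
```

Passing the UTF-8 encoding of U+00E9 (bytes C3 A9) through this helper yields the UTF-8 encoding of U+00C3 followed by U+00A9, i.e. the classic two-character mojibake for "é". The output is garbage from the user's point of view, but the behavior of the function is fully specified.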
