SG16

Subject: Re: Wide characters with multiple c-chars
From: Tom Honermann (tom_at_[hidden])
Date: 2020-07-02 11:47:46


On 7/2/20 12:10 PM, Peter Brett via SG16 wrote:
>
> Is there scope for a non-normative note with a recommendation that
> implementations do not implement the feature?
>
No, ISO rules prohibit normative encouragement in notes (as I've been
recently informed by Jens).

Tom.

> Peter
>
> *From:*SG16 <sg16-bounces_at_[hidden]> *On Behalf Of *Tom
> Honermann via SG16
> *Sent:* 02 July 2020 16:50
> *To:* sg16_at_[hidden]
> *Cc:* Tom Honermann <tom_at_[hidden]>; Corentin
> <corentin.jabot_at_[hidden]>
> *Subject:* Re: [SG16] Wide characters with multiple c-chars
>
> On 7/2/20 3:31 AM, Corentin via SG16 wrote:
>
> On Thu, 2 Jul 2020 at 09:09, Corentin <corentin.jabot_at_[hidden]> wrote:
>
> Hello,
>
> As part of https://wg21.link/p2178r0,
>
> I would like to make *wide* character literals with multiple
> c-chars (i.e., L'abc') ill-formed
>
> https://compiler-explorer.com/z/MHExrk
>
>
> I forgot to mention that it's extra fun with combining characters
> https://compiler-explorer.com/z/ndyyAD
> (the value of b is equivalent to L'e\u0301')
>
> In both of the compiler-explorer examples, the options to the MSVC
> compiler are incorrect; they should be '/O2' and '/utf-8' respectively
> (though I don't think it affects the results for these cases).
>
> All compilers but MSVC emit a warning by default, some
> implementations pick the first c-char, others pick the last.
>
> There is no known use of this feature (no occurrence in any of
> the packages in vcpkg).
>
> Did you check for cases like _TEXT('xx'), _T('xx'), and __T('xx') in
> addition to L'xx'?
>
> Regardless, the packages in the vcpkg ecosystem can prove that a
> feature is used, but cannot prove that a feature is not used.
>
> So why do this?
>
> - Someone unfamiliar with C++ might do auto str = L'abc'
> instead of L"abc"
>
> - Things that are not useful should not linger for 40 more
> years in the standard;  Tom and I talked too much about how to
> word this "feature" as part of P2029, so it's not free.
>
> - It's part of a wider position: bogus conversions in phase 5
> should be ill-formed rather than having implementations do their
> best to compile *something*
>
> P2029 makes these explicitly conditionally-supported in order to, as I
> understand it, match the C standard (in the C standard, my
> understanding is that implementation-defined semantics can include
> rejecting the code, but in the C++ standard, we use
> conditionally-supported for the same allowance).  Therefore,
> implementations are not (will not be) required to accept them with
> current direction.
>
> Please note that I am not proposing to make (narrow) multi
> character literals ill-formed or deprecated at this point,
> there are some uses, and these uses are intended.
>
> I would really like your opinion so we can propose that change
> to EWG and make the change without taking too much of anyone's
> time (the process really isn't tuned for very small changes,
> which is why that is part of a larger paper)
>
> I am weakly against for three reasons:
>
> 1. I think there are more valuable efforts for us to spend our time
> on.  I don't find the motivation above compelling.  These are
> legacy features that, as far as I am aware, don't actively cause
> problems.  If evidence can be found to demonstrate that they do
> actively cause problems, then I might change my mind.
> 2. Existing compilers are unlikely to change their behavior. Clang,
> gcc, and icc already issue warnings for use of multicharacter
> literals (both ordinary and wide) and I suspect they would
> continue to accept these as extensions (even in their strict
> compliance modes).  The cumulative effect of the proposed change
> seems to be to require MSVC to issue a warning.
> 3. Requiring a diagnostic creates an unnecessary divergence from C.
>
> Tom.
>
> Thanks a lot,
>
> Corentin
>
>
>
>
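
For readers following the thread, here is a minimal sketch of the distinction under discussion. It assumes C++17 and a Unicode wide execution encoding. The helper code_units is hypothetical and appears in neither paper. The multi-c-char form itself is left in a comment, because whether it compiles and what value it yields vary by implementation, as noted above.

```cpp
#include <cstddef>
#include <type_traits>

// A wide character literal with a single c-char has type wchar_t.
static_assert(std::is_same_v<decltype(L'a'), wchar_t>);

// A wide string literal is an array of const wchar_t, terminator included.
static_assert(std::is_same_v<decltype(L"abc"), const wchar_t (&)[4]>);

// What the author of `auto str = L'abc'` most likely meant: with double
// quotes, auto deduces a pointer to the first element of the array.
constexpr auto str = L"abc";
static_assert(std::is_same_v<decltype(str), const wchar_t* const>);

// Hypothetical helper: number of wchar_t code units in a wide string
// literal, not counting the terminating null.
template <std::size_t N>
constexpr std::size_t code_units(const wchar_t (&)[N]) {
    return N - 1;
}

static_assert(code_units(L"abc") == 3);

// 'e' followed by U+0301 COMBINING ACUTE ACCENT renders as one glyph but is
// two c-chars, so it cannot fit in a single wchar_t ...
static_assert(code_units(L"e\u0301") == 2);
// ... whereas the precomposed U+00E9 is a single c-char (assuming a Unicode
// wide execution encoding).
static_assert(code_units(L"\u00E9") == 1);

// The form discussed in this thread, left commented out on purpose:
// auto c = L'abc';   // multiple c-chars in a wide character literal
```

Everything here is checked at compile time, so the sketch makes no claim about which c-char a given compiler picks for L'abc'.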



SG16 list run by sg16-owner@lists.isocpp.org