Re: libu8ident 0.1 released

From: Aaron Ballman <aaron_at_[hidden]>
Date: Wed, 26 Jan 2022 07:38:55 -0500
On Wed, Jan 26, 2022 at 1:51 AM Reini Urban via SG16
<sg16_at_[hidden]> wrote:
> On Tue, Jan 25, 2022 at 7:38 PM Jens Maurer via SG16 <sg16_at_[hidden]> wrote:
>> On 25/01/2022 17.13, Tom Honermann via SG16 wrote:
>> > On 1/25/22 3:13 AM, Corentin Jabot via SG16 wrote:
>> > The standard could (I think) also provide normative encouragement to implementors to emit a diagnostic for identifiers that are not in line with TR39 guidance. I'm not sure if we already have examples of encouragement for additional diagnostics elsewhere.
>> I'm not sure SG16 is the right place to discuss such fundamental matters.
>> For example, some people like to compile their code with -Werror, and
>> thus a recommended warning that they cannot possibly avoid (because e.g.
>> it is inevitably caused by a third-party library) is indistinguishable
>> from "ill-formed" for them.
> True, but it's still a security issue, not just a style issue. Security concerns should be handled upfront, else they leak in,
> esp. via potentially insecure third-party libraries.

This suggests the paper also needs to be seen by the SG12 study group
on undefined behavior and vulnerabilities (likely with SG16 experts in
the room to help answer questions).


>> Back to the paper at hand: Its unit of consideration is the
>> "translation unit", which might be formed from header files
>> from various third-party sources. In a world permeated by
>> Unicode, it seems very reasonable that each third party would
>> choose their own script for e.g. identifiers of local
>> variables in inline functions, yet that inevitably would
>> cause conflicts under Reini's suggestion.
> nope, I made 3 suggestions for the "context" in which to check, not only the translation unit.
> 1. before-cpp (really in-cpp)
> 2. private (lexical contexts)
> 3. after-cpp (all in one)
> before-cpp would do the checks in cpp, each header file in its own context, with its own scripts,
> with only the resulting ABI causing potential conflicts. It is the easiest to check and understand for the user.
> after-cpp (i.e. in the compiler lexer, not in the cpp lexer) is the strictest and most secure implementation.
> Also, encouraging headers to go with their own unicode identifiers is the totally wrong way to me. Nobody is using them, thankfully.
> It was a false start from the beginning, and we should not encourage them even more.
> it only leads to balkanization and diversion, as taken literally from the CIA sabotage field manual. only very few developers
> can identify all the scripts. did you see a mixed-language wikipedia encouraging all languages at once? there's none,
> only the respective islands.
>> I don't think it's a good use of SG16's time to discuss this
>> paper until these concerns are addressed.
> these were addressed, since you added these concerns.
> Reini

Received on 2022-01-26 12:39:08
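
To make the "before-cpp" versus "after-cpp" distinction from the thread concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the names `script_of`, `context_ok`, `check_per_file` and `check_translation_unit` are hypothetical, the code point ranges are a crude stand-in for the Unicode Script property, and the rule "at most one script besides Latin per context" is a simplification, not the rule libu8ident or TR39 actually applies.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily simplified classification by code point range.
// A real checker would use the Unicode Script property instead.
enum class Script { Latin, Greek, Cyrillic, Other };

Script script_of(char32_t c) {
    if (c < 0x0250) return Script::Latin;  // ASCII and Latin blocks, lumped together
    if (c >= 0x0370 && c <= 0x03FF) return Script::Greek;
    if (c >= 0x0400 && c <= 0x04FF) return Script::Cyrillic;
    return Script::Other;
}

// The identifiers found in one source or header file.
using Identifiers = std::vector<std::u32string>;

// Assumed rule for this sketch: one context may use at most
// one script besides Latin.
bool context_ok(const Identifiers& ids) {
    std::set<Script> non_latin;
    for (const auto& id : ids) {
        for (char32_t c : id) {
            const Script s = script_of(c);
            if (s != Script::Latin) non_latin.insert(s);
        }
    }
    return non_latin.size() <= 1;
}

// "before-cpp" style: every file is its own context.
bool check_per_file(const std::vector<Identifiers>& files) {
    for (const auto& file : files) {
        if (!context_ok(file)) return false;
    }
    return true;
}

// "after-cpp" style: the whole translation unit is a single context.
bool check_translation_unit(const std::vector<Identifiers>& files) {
    Identifiers all;
    for (const auto& file : files) {
        all.insert(all.end(), file.begin(), file.end());
    }
    return context_ok(all);
}
```

With one header using only Greek identifiers and another using only Cyrillic ones, `check_per_file` accepts both while `check_translation_unit` rejects the combination. That is the conflict between third-party headers raised in the thread, and the reason the choice of context matters.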