std-proposals

Re: [std-proposals] Revising #pragma once

From: Gašper Ažman <gasper.azman_at_[hidden]>
Date: Fri, 30 Aug 2024 12:35:57 +0100
I'm trying to tell you that things that are easy to achieve on the source
side are bloody difficult to do on the build side where you are not
actually building from sources, but from build artifacts deriving from said
sources.

Like, literally, the curl-based filesystems you find are such that, even if
nothing changes, two fopen() calls with the *same path* will result in a
different inode if you make them more than the cache-TTL apart and some
piece of metadata changed. It's not a path uniqueness thing.

Please, believe me when I tell you it's not that simple.
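
Illustration (not from the thread): a minimal sketch of a "once" table keyed on file identity, assuming a POSIX system where identity means the (device, inode) pair reported by stat(). The names `FileId`, `file_id` and `OnceTable` are hypothetical and do not describe any particular compiler. On a filesystem where the same path can report a new inode after the cache TTL expires, `should_include` would return true a second time for the same header, which is the failure mode described above.

```cpp
#include <sys/stat.h>  // POSIX stat()

#include <optional>
#include <set>
#include <string>
#include <utility>

// Identity of a file as the filesystem reports it: (device, inode).
using FileId = std::pair<dev_t, ino_t>;

// Returns the identity of `path`, or nullopt if stat() fails.
std::optional<FileId> file_id(const std::string& path) {
    struct stat st{};
    if (::stat(path.c_str(), &st) != 0) return std::nullopt;
    return FileId{st.st_dev, st.st_ino};
}

// Hypothetical "once" table keyed on file identity rather than on path text.
class OnceTable {
public:
    // Returns true on the first sighting of a file identity, false when a
    // file with the same (device, inode) pair was already recorded.
    bool should_include(const std::string& path) {
        auto id = file_id(path);
        if (!id) return true;  // identity unknown: err on the side of including
        return seen_.insert(*id).second;
    }

private:
    std::set<FileId> seen_;
};
```

Two spellings of one path (for example `/` and `/.`) share an identity and are deduplicated; the table has no way to notice when the filesystem hands out a different inode for an unchanged file.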



On Fri, Aug 30, 2024 at 12:28 PM Tiago Freire <tmiguelf_at_[hidden]> wrote:

>
> > Guess which #pragma once depends on.
>
> paths being globally unique
>
> Your linter is doing the work that pragma once should be doing. You are
> already relying on the paths being unique to handle this situation. You are
> just not translating that to your compilation.
> This is the problem, this has always been the problem, every single time
> I've seen this.
>
>
>
> *From:* Gašper Ažman <gasper.azman_at_[hidden]>
> *Sent:* Friday, August 30, 2024 1:11:23 PM
> *To:* Tiago Freire <tmiguelf_at_[hidden]>
> *Cc:* std-proposals_at_[hidden] <std-proposals_at_[hidden]>
> *Subject:* Re: [std-proposals] Revising #pragma once
>
> Yes, paths being globally unique *semantically* is a lot easier to ensure
> than inodes matching up when you make them available to 3 layers of fuse
> and sandboxing.
> Guess which #pragma once depends on.
>
> On Fri, Aug 30, 2024 at 12:09 PM Tiago Freire <tmiguelf_at_[hidden]>
> wrote:
>
>> > (thanks Lakos) where we do INCLUDED_{company}_{full_path_to_header}
>> where the paths are globally unique,
>>
>> EXACTLY!!! And it's something that you don't afford to ensure for your
>> compiler.
>> That's why it doesn't work for you.
>>
>>
>> > 3p libraries each have their own thing, of course, but they have an
>> interest in keeping it reasonable and deconflicted and I bother them if
>> their include guards aren't unique
>>
>> So, it doesn't fix it?
>>
>> ------------------------------
>> *From:* Gašper Ažman <gasper.azman_at_[hidden]>
>> *Sent:* Friday, August 30, 2024 12:53:05 PM
>> *To:* Tiago Freire <tmiguelf_at_[hidden]>
>> *Cc:* std-proposals_at_[hidden] <std-proposals_at_[hidden]>
>> *Subject:* Re: [std-proposals] Revising #pragma once
>>
>> Well:
>> - 3p libraries each have their own thing, of course, but they have an
>> interest in keeping it reasonable and deconflicted and I bother them if
>> their include guards aren't unique
>> - 1p, we have a modified bloomberg-style scheme (thanks Lakos) where we
>> do INCLUDED_{company}_{full_path_to_header} where the paths are globally
>> unique, and I have a linter that autofixes it so if you do #pragma once it
>> just gets replaced, and if you copy-paste it somewhere else you get a fixit.
>>
>> This solved A LOT of build breakages, and was a prerequisite for getting
>> cloud builds working at all.
>>
>> So, yeah, I will strongly oppose any standardization of #pragma once on a
>> "same file" basis; C++ is more and more a "large systems" language, because
>> small projects can choose Rust and be fine. #once INCLUDE_GUARD_SLUG solves
>> all the annoyance of 3-lines for one feature and doesn't have same-file
>> issues.
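
Illustration (not from the thread): a minimal sketch of how a linter might derive an INCLUDED_{company}_{full_path_to_header} guard from a header's repository path. The function name `make_include_guard` and the exact character mapping (upper-case letters, keep digits, turn everything else into `_`) are assumptions; the message does not spell out the real rules.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Builds an include-guard name of the form INCLUDED_{company}_{path}.
// Assumed rule: letters are upper-cased, digits are kept, and every other
// character (slashes, dots, dashes) becomes '_'.
std::string make_include_guard(std::string_view company, std::string_view path) {
    std::string guard = "INCLUDED_";
    auto append = [&guard](std::string_view part) {
        for (unsigned char c : part) {
            guard += std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_';
        }
    };
    append(company);
    guard += '_';
    append(path);
    return guard;
}
```

With these assumed rules, `make_include_guard("acme", "net/http/client.hpp")` yields `INCLUDED_ACME_NET_HTTP_CLIENT_HPP`. The mapping is lossy (`a/b.h` and `a_b.h` collide), so a real tool would need a stricter path convention or an escape scheme.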
>>
>> On Fri, Aug 30, 2024 at 11:47 AM Tiago Freire <tmiguelf_at_[hidden]>
>> wrote:
>>
>>> And what does "aware of the global namespacing" mean in practice?
>>>
>>> ------------------------------
>>> *From:* Gašper Ažman <gasper.azman_at_[hidden]>
>>> *Sent:* Friday, August 30, 2024 12:22:14 PM
>>> *To:* Tiago Freire <tmiguelf_at_[hidden]>
>>> *Cc:* std-proposals_at_[hidden] <std-proposals_at_[hidden]>
>>> *Subject:* Re: [std-proposals] Revising #pragma once
>>>
>>> Because my linter is aware of the global namespacing of libraries and
>>> can fix stuff *before they get into the source control structure*. My build
>>> tools have to deal with build outputs, not sources. Substantially different
>>> thing.
>>>
>>> On Fri, Aug 30, 2024 at 11:20 AM Tiago Freire <tmiguelf_at_[hidden]>
>>> wrote:
>>>
>>>> > This is why we have linters. At least that's a problem I can fix and
>>>> diagnose; I can't fix #pragma once.
>>>>
>>>>
>>>>
>>>> If your linter could figure that out, why couldn’t your compiler?
>>>>
>>>>
>>>>
>>>> *From:* Gašper Ažman <gasper.azman_at_[hidden]>
>>>> *Sent:* Friday, August 30, 2024 12:16 PM
>>>> *To:* Tiago Freire <tmiguelf_at_[hidden]>
>>>> *Cc:* std-proposals_at_[hidden]
>>>> *Subject:* Re: [std-proposals] Revising #pragma once
>>>>
>>>>
>>>>
>>>> This is why we have linters. At least that's a problem I can fix and
>>>> diagnose; I can't fix #pragma once.
>>>>
>>>>
>>>>
>>>> On Fri, Aug 30, 2024 at 11:13 AM Tiago Freire <tmiguelf_at_[hidden]>
>>>> wrote:
>>>>
>>>> How do you ensure that your “INCLUDE_GUARD_STRING_YOU_CANNOT_OMIT” is
>>>> unique across different libraries?
>>>>
>>>> Given standard practice, if the headers have the same name, they are very
>>>> likely going to have the same “INCLUDE_GUARD_STRING_YOU_CANNOT_OMIT”. And
>>>> we are back to the same problem you just described.
>>>>
>>>>
>>>>
>>>> Now multiply this by, “I renamed the file and forgot to change it”, “I
>>>> copy pasted a template header file and forgot to change the
>>>> INCLUDE_GUARD_STRING_YOU_CANNOT_OMIT, and it wasn’t caught in review because
>>>> nobody looks at that”.
>>>>
>>>>
>>>> You know what they do not have? The same filepath.
>>>>
>>>> If I don’t have to write it, I don’t have to worry about these
>>>> problems.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From:* Gašper Ažman <gasper.azman_at_[hidden]>
>>>> *Sent:* Friday, August 30, 2024 12:06 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Tiago Freire <tmiguelf_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Revising #pragma once
>>>>
>>>>
>>>>
>>>> Simple as in #once INCLUDE_GUARD_STRING_YOU_CANNOT_OMIT?
>>>>
>>>>
>>>>
>>>> On Fri, Aug 30, 2024 at 11:04 AM Tiago Freire via Std-Proposals <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> I agree, bit-wise comparison is not the way to go, that’s why no
>>>> compiler does this.
>>>>
>>>> There’s no need to over-engineer this. Keep it simple.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Gašper Ažman via Std-Proposals
>>>> *Sent:* Friday, August 30, 2024 11:51 AM
>>>> *To:* marcinjaczewski86_at_[hidden]
>>>> *Cc:* Gašper Ažman <gasper.azman_at_[hidden]>;
>>>> std-proposals_at_[hidden]; Tom Honermann <tom_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Revising #pragma once
>>>>
>>>>
>>>>
>>>> If you knew up-front you wouldn't do it :).
>>>>
>>>>
>>>>
>>>> This happens though. There are people who generate code - whole include
>>>> trees - and for plugins they end up looking very very similar or identical.
>>>> And then as the build engineer my users complain that "their build doesn't
>>>> work on the cloud build environment" and I have to somehow *find their bug*.
>>>>
>>>>
>>>>
>>>> Think about it. How the hell do you diagnose this silent noninclusion?
>>>>
>>>>
>>>>
>>>> On Fri, Aug 30, 2024 at 9:30 AM Marcin Jaczewski <
>>>> marcinjaczewski86_at_[hidden]> wrote:
>>>>
>>>> On Fri, 30 Aug 2024 at 01:20, Gašper Ažman via Std-Proposals
>>>> <std-proposals_at_[hidden]> wrote:
>>>> >
>>>> > Darlings,
>>>> >
>>>> > byte-identical is just plain incorrect. Consider.
>>>> >
>>>> > [library1/public.hpp]
>>>> > #pragma once
>>>> > #include "utils.hpp"
>>>> >
>>>> > [library1/utils.hpp]
>>>> > class lib1 {};
>>>> >
>>>> > [library2/public.hpp]
>>>> > #pragma once
>>>> > #include "utils.hpp"
>>>> >
>>>> > [library2/utils.hpp]
>>>> > class lib2 {};
>>>> >
>>>> > [main.cpp]
>>>> > #include "library1/public.hpp"
>>>> > #include "library2/public.hpp" // boom, library2/utils.hpp does not get included
>>>> >
>>>> > same-contents also means same-relative-include-trees. Congratulations.
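
Illustration (not from the thread): a toy model of the example above, assuming `#pragma once` is keyed on file contents. It is not a real preprocessor; `include_file`, `dir_of` and `run_example` are hypothetical helpers that only track which files get expanded, resolving quoted includes relative to the including file.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

using FileTree = std::map<std::string, std::string>;  // path -> contents

// Directory part of a path, including the trailing '/'.
std::string dir_of(const std::string& path) {
    auto slash = path.rfind('/');
    return slash == std::string::npos ? "" : path.substr(0, slash + 1);
}

// Toy preprocessor: records which files get expanded, in order.
// `#pragma once` is keyed on file CONTENTS, the definition under discussion.
void include_file(const FileTree& files, const std::string& path,
                  std::set<std::string>& once_contents,
                  std::vector<std::string>& expanded) {
    const std::string& text = files.at(path);
    bool has_once = text.find("#pragma once") != std::string::npos;
    if (has_once && !once_contents.insert(text).second) return;  // silently skipped
    expanded.push_back(path);
    std::istringstream lines(text);
    std::string line;
    const std::string prefix = "#include \"";
    while (std::getline(lines, line)) {
        if (line.rfind(prefix, 0) != 0) continue;  // not an #include "..." line
        auto end = line.find('"', prefix.size());
        if (end == std::string::npos) continue;    // malformed directive: ignore
        std::string name = line.substr(prefix.size(), end - prefix.size());
        include_file(files, dir_of(path) + name, once_contents, expanded);
    }
}

// The five files from the example; main.cpp includes both public headers.
std::vector<std::string> run_example() {
    FileTree files = {
        {"library1/public.hpp", "#pragma once\n#include \"utils.hpp\"\n"},
        {"library1/utils.hpp", "class lib1 {};\n"},
        {"library2/public.hpp", "#pragma once\n#include \"utils.hpp\"\n"},
        {"library2/utils.hpp", "class lib2 {};\n"},
        {"main.cpp",
         "#include \"library1/public.hpp\"\n#include \"library2/public.hpp\"\n"},
    };
    std::set<std::string> once_contents;
    std::vector<std::string> expanded;
    include_file(files, "main.cpp", once_contents, expanded);
    return expanded;
}
```

In this model `run_example()` expands main.cpp, library1/public.hpp and library1/utils.hpp only: library2/public.hpp is byte-identical to its sibling, so it is skipped and library2/utils.hpp is never reached, with no diagnostic.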
>>>> >
>>>>
>>>> The question is: do you know up front which files are problematic?
>>>> Then we could do something like this:
>>>> a) we only consider normalized paths
>>>> b) users have the option to add path mappings that are used in path comparison
>>>>
>>>> Like:
>>>>
>>>> We have folders `libs/fooV1` and `libs/fooV2`. Both folders are symlinks,
>>>> but by default, when the compiler loads `libs/fooV1/bar.h` and
>>>> `libs/fooV2/bar.h`, it considers them separate files even when they are
>>>> the same file.
>>>> Now we add the compiler flag `-PI libs/fooV1:libs/fooV2` (similar to `-I`),
>>>> and now when the compiler loads `libs/fooV1/bar.h` it treats it as if it
>>>> had loaded `libs/fooV2/bar.h`, and thus the next load of
>>>> `libs/fooV2/bar.h` will be blocked.
>>>>
>>>> And this should be a very dumb process. There could be cases where
>>>> `libs/fooV1/bar.h` and `libs/fooV2/bar.h` are unrelated, but if you add a
>>>> specific mapping it will override even different files. This could be
>>>> useful for hacking around some broken includes like:
>>>>
>>>> ```
>>>> -PI hack/foo/bar.h:libs/fooV2/bar.h
>>>> ```
>>>>
>>>> and we can in our files do:
>>>>
>>>> ```
>>>> #include "hack/foo/bar.h"
>>>> #include "libs/fooV2/foo.h" // it will ignore `#include "bar.h"`
>>>> ```
>>>>
>>>> Could a mechanism like this work on your build system?
>>>>
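
Illustration (not from the thread): a minimal sketch of the suggested path-mapping lookup, assuming purely lexical normalization and a "first matching mapping wins" rule. `-PI` is the hypothetical flag from the message above, not an existing compiler option, and `MappedOnceTable` is an invented name.

```cpp
#include <filesystem>
#include <set>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical model of the suggested `-PI from:to` option.
class MappedOnceTable {
public:
    void add_mapping(const std::string& from, const std::string& to) {
        mappings_.emplace_back(fs::path(from).lexically_normal(),
                               fs::path(to).lexically_normal());
    }

    // Key used for comparison: the normalized path with the first matching
    // mapping applied (exact match or directory-prefix match).
    fs::path key_for(const std::string& raw) const {
        fs::path p = fs::path(raw).lexically_normal();
        for (const auto& [from, to] : mappings_) {
            fs::path rel = p.lexically_relative(from);
            if (rel.empty() || *rel.begin() == "..") continue;  // not under `from`
            return rel == "." ? to : to / rel;
        }
        return p;
    }

    // True on the first sighting of a key, false on repeats.
    bool should_include(const std::string& raw) {
        return seen_.insert(key_for(raw)).second;
    }

private:
    std::vector<std::pair<fs::path, fs::path>> mappings_;
    std::set<fs::path> seen_;
};
```

After `add_mapping("libs/fooV1", "libs/fooV2")`, a first load of `libs/fooV1/bar.h` records the key `libs/fooV2/bar.h`, so a later load of `libs/fooV2/bar.h` is blocked. No file is opened or compared, which matches the "very dumb process" described in the message.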
>>>> > On Fri, Aug 30, 2024 at 12:15 AM Jeremy Rifkin via Std-Proposals <
>>>> std-proposals_at_[hidden]> wrote:
>>>> >>
>>>> >> > In this very thread there are examples showing why taking only the
>>>> content into account doesn't work but it was brushed off as "that can be
>>>> fixed".
>>>> >>
>>>> >> I'm sorry you feel I have brushed any presented examples off. I have
>>>> >> found them all immensely helpful for consideration. It's hard for me
>>>> >> to imagine times when you'd want the same include-guarded content
>>>> >> included twice, however I found the example of a "main header"
>>>> >> compelling. The example of a header that only defines macros you
>>>> undef
>>>> >> is also not impractical.
>>>> >>
>>>> >> However, there are also compelling examples for filesystem identity.
>>>> >> Mainly the multiple mount point issue.
>>>> >>
>>>> >> I think both can be reasonable, however, I have been trying to
>>>> >> understand the most probable failure modes. While I originally
>>>> >> proposed a content-based definition, I do think a filesystem-based
>>>> >> definition is closer to current semantics and expectations.
>>>> >>
>>>> >> Jeremy
>>>> >>
>>>> >> On Thu, Aug 29, 2024 at 5:24 PM Breno Guimarães via Std-Proposals
>>>> >> <std-proposals_at_[hidden]> wrote:
>>>> >> >
>>>> >> > To add to that, the whole idea is to standardize standard
>>>> practice. If the first thing you do is to change spec to something else,
>>>> then you're not standardizing standard practice, you are adding a new
>>>> feature that inconveniently clashes with an existing one.
>>>> >> >
>>>> >> > In this very thread there are examples showing why taking only the
>>>> content into account doesn't work but it was brushed off as "that can be
>>>> fixed".
>>>> >> >
>>>> >> > None of this makes sense to me.
>>>> >> >
>>>> >> >
>>>> >> > On Thu, 29 Aug 2024 at 18:59, Tiago Freire via Std-Proposals <
std-proposals_at_[hidden]> wrote:
>>>> >> >>
>>>> >> >> Again, hashing content... totally unnecessary.
>>>> >> >>
>>>> >> >> There's no need to identify "same content", which as far as I can
see can be defeated by modifications that don't change the interpretation,
like spaces; although that is not technically a violation of "same content",
it clearly defeats the intent.
>>>> >> >>
>>>> >> >> An include summons a resource, a pragma once bars that resource
from being re-summonable. That's it. File paths should be more than enough.
>>>> >> >>
>>>> >> >> I'm unconvinced that the "bad cases" are not just a product of
bad build architecture; if done properly, a compiler should never be
presented with multiple alternatives of the same file. And putting such
requirements on compilers puts an unnecessary burden on developers to
support a scenario that is arguably bad practice.
>>>> >> >>
>>>> >> >> The argument is: "pragma once" is supported everywhere, it is good,
we should make it official in the standard, and effectively no change to a
compiler should occur as a consequence.
>>>> >> >> If a change needs to occur, then in fact "your version" of what
>>>> you mean by "pragma once" is actually "not supported" by all the major
>>>> compilers.
>>>> >> >>
>>>> >> >> Current compiler support of "pragma once" and its usage on cross
>>>> platform projects have a particular way of dealing with dependencies in
>>>> mind. That workflow works. It's pointless to have this discussion if you
>>>> don't understand that flow, and you shouldn't tailor the tool to a workflow
>>>> that doesn't exist to the detriment of all.
>>>> >> >>
>>>> >> >>
>>>> >> >> ________________________________
>>>> >> >> From: Std-Proposals <std-proposals-bounces_at_[hidden]> on
>>>> behalf of Jeremy Rifkin via Std-Proposals <
>>>> std-proposals_at_[hidden]>
>>>> >> >> Sent: Thursday, August 29, 2024 9:56:18 PM
>>>> >> >> To: Tom Honermann <tom_at_[hidden]>
>>>> >> >> Cc: Jeremy Rifkin <rifkin.jer_at_[hidden]>;
>>>> std-proposals_at_[hidden] <std-proposals_at_[hidden]>
>>>> >> >> Subject: Re: [std-proposals] Revising #pragma once
>>>> >> >>
>>>> >> >> Performance should be fine if using a content definition. An
>>>> implementation can do inode/path checks against files it already knows of,
>>>> as a fast path. The first time a file is #included it’s just a hash+table
>>>> lookup to decide whether to continue.
>>>> >> >>
>>>> >> >> Regarding the filesystem definition vs content definition
>>>> question, while I think a content-based definition is robust I can see
>>>> there is FUD about it and also an argument about current practice being a
>>>> filesystem-based definition. It may just be best to approach this as
>>>> filesystem uniqueness to the implementation’s ability, with a requirement
>>>> that symbolic links/hard links are handled. This doesn’t cover the case of
>>>> multiple mount points, but we’ve discussed that that’s impossible with
>>>> #pragma once without using contents instead.
>>>> >> >>
>>>> >> >> Jeremy
>>>> >> >>
>>>> >> >> On Thu, Aug 29, 2024 at 13:06 Tom Honermann <tom_at_[hidden]>
>>>> wrote:
>>>> >> >>>
>>>> >> >>> On 8/28/24 12:32 AM, Jeremy Rifkin via Std-Proposals wrote:
>>>> >> >>>
>>>> >> >>> Another question is whether the comparison should be post
>>>> translation
>>>> >> >>> phase 1.
>>>> >> >>>
>>>> >> >>> I gave this some thought while drafting the proposal. I think it
>>>> comes
>>>> >> >>> down to whether the intent is single inclusion of files or single
>>>> >> >>> inclusion of contents.
>>>> >> >>>
>>>> >> >>> Indeed. The proposal currently favors the "same contents"
>>>> approach and offers the following wording.
>>>> >> >>>
>>>> >> >>> A preprocessing directive of the form
>>>> >> >>> # pragma once new-line
>>>> >> >>> shall cause no subsequent #include directives to perform
>>>> replacement for a file with text contents identical to this file.
>>>> >> >>>
>>>> >> >>> The wording will have to define what it means for contents to be
>>>> identical. Options include:
>>>> >> >>>
>>>> >> >>> - The files must be byte-for-byte identical. This makes source
file encoding observable (which I would be strongly against).
>>>> >> >>> - The files must encode the same character sequence post
translation phase 1. This makes comparisons potentially expensive.
>>>> >> >>>
>>>> >> >>> Note that the "same contents" approach obligates an
>>>> implementation to consider every previously encountered file for every
>>>> #include directive. An inode based optimization can help to determine if a
>>>> file was previously encountered based on identity, but it doesn't help to
>>>> reduce the costs when a file that was not previously seen is encountered.
>>>> >> >>>
>>>> >> >>>
>>>> >> >>> Tom.
>>>> >> >>>
>>>> >> >>> Jeremy
>>>> >> >>>
>>>> >> >>> On Tue, Aug 27, 2024 at 3:39 PM Tom Honermann via Std-Proposals
>>>> >> >>> <std-proposals_at_[hidden]> wrote:
>>>> >> >>>
>>>> >> >>> On 8/27/24 4:10 PM, Thiago Macieira via Std-Proposals wrote:
>>>> >> >>>
>>>> >> >>> On Tuesday 27 August 2024 12:35:17 GMT-7 Andrey Semashev via
>>>> Std-Proposals
>>>> >> >>> wrote:
>>>> >> >>>
>>>> >> >>> The fact that gcc took the approach to compare file contents I
>>>> consider
>>>> >> >>> a poor choice, and not an argument to standardize this
>>>> implementation.
>>>> >> >>>
>>>> >> >>> Another question is whether a byte comparison of two files of
>>>> the same size is
>>>> >> >>> expensive for compilers.
>>>> >> >>>
>>>> >> >>> #once ID doesn't need to compare the entire file.
>>>> >> >>>
>>>> >> >>> Another question is whether the comparison should be post
>>>> translation
>>>> >> >>> phase 1. In other words, whether differently encoded source
>>>> files that
>>>> >> >>> decode to the same sequence of code points are considered the
>>>> same file
>>>> >> >>> (e.g., a Windows-1252 version and a UTF-8 version). Standard C++
>>>> does
>>>> >> >>> not currently allow source file encoding to be observable but a
>>>> #pragma
>>>> >> >>> once implementation that only compares bytes would make such
>>>> differences
>>>> >> >>> observable.
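
Illustration (not from the thread): a minimal sketch of why the two definitions differ, assuming a simplified model of translation phase 1 in which each source file is decoded to Unicode code points. Both decoders are hypothetical helpers: the Windows-1252 one covers only the byte ranges 0x00-0x7F and 0xA0-0xFF, where that code page coincides with Unicode, and the UTF-8 one performs no validation.

```cpp
#include <cstddef>
#include <string>

// Decodes Windows-1252 bytes in the ranges 0x00-0x7F and 0xA0-0xFF, where the
// code page coincides with Unicode. 0x80-0x9F is NOT handled in this sketch.
std::u32string decode_cp1252_subset(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) out += static_cast<char32_t>(b);
    return out;
}

// Decodes well-formed UTF-8 into code points; performs no validation.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    for (std::size_t i = 0; i < bytes.size();) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        // Number of continuation bytes implied by the lead byte.
        int extra = b < 0x80 ? 0 : b < 0xE0 ? 1 : b < 0xF0 ? 2 : 3;
        char32_t cp = extra == 0 ? b : b & (0x3F >> extra);
        for (int k = 1; k <= extra && i + k < bytes.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        out += cp;
        i += extra + 1;
    }
    return out;
}
```

The word "café" is the bytes `63 61 66 E9` in Windows-1252 and `63 61 66 C3 A9` in UTF-8. A byte comparison treats the two files as different, while comparing the decoded sequences treats them as the same, at the cost of decoding both.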
>>>> >> >>>
>>>> >> >>> Tom.
>>>> >> >>>
>>>> >> >>> --
>>>> >> >>> Std-Proposals mailing list
>>>> >> >>> Std-Proposals_at_[hidden]
>>>> >> >>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>>
>>>>
>>>
>>
>

Received on 2024-08-30 11:36:14