Date: Tue, 28 Nov 2023 10:56:01 +0800
Hi Boris,
> I think
> if this is merged into Clang and doesn't require a separate tool for
> querying (clang-named-modules-querier), I can try to support it in build2.
Out of curiosity, when is a separate tool harder for a build system to support? I chose a separate tool because I feel it reduces the complexity of the compiler without being any harder for build systems. For example, I don't think build systems pay much to use the clang-scan-deps tool.
> I also wonder whether you have considered a third approach, which is to
> provide a hash of the module interface. Essentially:
>
> clang-named-modules-querier a.pcm --all | sha256sum
Yeah, this was my first idea, and I described it in https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755/. The key point there is source locations too. If we include source locations in the decls hash, how useful will it be? If we don't include them, how can we preserve the semantics? Both options are discussed there. Maybe I should present this in the paper too. I just thought the methods presented in the paper were more powerful. But maybe, just as Ben mentioned, the `push` mechanism sounds much safer than the `pull` mechanism.
Thanks,
Chuanqi
------------------------------------------------------------------
From:Boris Kolpackov <boris_at_[hidden]>
Send Time:2023 Nov. 27 (Mon.) 19:37
To: SG15 <sg15_at_lists.isocpp.org>
Cc:Chuanqi <chuanqi.xcq_at_[hidden]>
Subject:Re: [SG15] [Modules] [P3057] Two finer-grained compilation models for named modules
Chuanqi Xu via SG15 <sg15_at_[hidden]> writes:
> Feedbacks or concerns are highly appreciated.
My understanding is that the first approach essentially creates distinct
sets of prerequisites for the BMI and for the object file, even though
they are produced with the same compiler invocation. I think this will
be a pretty hard thing to handle for most build systems, including build2,
where we model the BMI+OBJ pair as a target group with a single set of
prerequisites.
While the second approach feels like it will be easier to support, it is
still quite a bit of housekeeping (and a separate process invocation just
for an up-to-date check). But an interesting idea, nevertheless. I think
if this is merged into Clang and doesn't require a separate tool for
querying (clang-named-modules-querier), I can try to support it in build2.
I also wonder whether you have considered a third approach, which is to
provide a hash of the module interface. Essentially:
clang-named-modules-querier a.pcm --all | sha256sum
This approach should be pretty easy to handle for build systems that
don't hard-code mtime-based out-of-date checking semantics. Quite a few
build systems these days also (or instead) support content hashing as
a more accurate (but more expensive) mechanism. For such build systems
supporting this approach should be pretty painless: instead of calling
sha256sum to get the hash, simply call the compiler (or another special
tool) to get the interface hash.
I think the only catch here is that the compiler (or the special tool)
must make sure the hash tracks the relevant changes accurately. In
particular, I believe that if any exported declaration's location changes,
the hash must change as well (since it may affect the diagnostics the user
sees). I think this may make the whole idea a lot less appealing, at least
for certain use-cases. Using the example from your paper:
export module a;
export int a() {
    return 43;
}
namespace nn {
    export int a(int x, int y) {
        return x + y;
    }
}
If I change the implementation of a() by adding another line:
export int a() {
    // TODO
    return 43;
}
The hash will have to change because the exported a(int,int) is now
declared on a different line.
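To make the trade-off concrete, here is a minimal sketch of an interface hash that can either fold in declaration locations or ignore them. It is not how Clang or clang-named-modules-querier computes anything: `ExportedDecl`, `fnv1a` and `interface_hash` are hypothetical names, FNV-1a merely stands in for sha256 so the sketch needs no library, and the line numbers are illustrative.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for one exported declaration (not a Clang type).
struct ExportedDecl {
    std::string signature;  // e.g. "int nn::a(int, int)"
    unsigned line;          // line of the declaration in the interface unit
};

// 64-bit FNV-1a, used here only as a stand-in for sha256.
inline std::uint64_t fnv1a(const std::string& s,
                           std::uint64_t h = 14695981039346656037ull) {
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

// Hash all exported declarations in order. When with_locations is true,
// each declaration's line number is folded into the hash as well.
inline std::uint64_t interface_hash(const std::vector<ExportedDecl>& decls,
                                    bool with_locations) {
    std::uint64_t h = 14695981039346656037ull;
    for (const ExportedDecl& d : decls) {
        h = fnv1a(d.signature + "\n", h);
        if (with_locations)
            h = fnv1a(std::to_string(d.line) + "\n", h);
    }
    return h;
}
```

For the module above, assuming a() is declared on line 2 and nn::a(int,int) moves from line 6 to line 7 after the TODO comment is added, the two hashes are equal with with_locations set to false and differ with it set to true. That is exactly the tension: the location-aware hash stays accurate for diagnostics but changes on edits that do not touch the interface.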
Received on 2023-11-28 02:56:07