Date: Mon, 5 May 2025 20:35:04 -0400
Hi Khalil,
I took a look. Inlining control is actually one of the topics on my
to-learn-and-then-blog list. We generally tell people "The `inline` keyword
has semantic effects; it doesn't have *anything* to do with the inlining
optimization," and to a first approximation that's true; but I'm aware that
some (all?) compilers actually do feed "presence of the `inline` keyword"
into their inlining heuristic. I'd be very interested to see a survey of
existing compilers showing:
- The compiler's exact inlining calculation (which could be given
concretely for, say, GCC 15, even though one couldn't assume it would stay
unchanged forever)
- A concrete example where adding or removing `inline` from a template
affects the compiler's inlining decision
- A concrete example where adding or removing `inline` from an implicitly
inline member/friend affects the compiler's inlining decision
(I say "to see," but I assume I mean "to produce," which is why I say it's
on my to-do list. This would have roughly the same flavor as my post from
2020 <https://quuxplusone.github.io/blog/2020/09/11/unstable-sort-inputs/>
showing concrete examples where `std::sort` was unstable. If you're
interested in helping and/or forcing me to start writing this post, email
me offline.)
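As a starting point for the kind of experiment described in the bullets above, here is a minimal sketch. Every name in it (`plain_helper`, `inline_helper`, `call_plain`, `call_inline`) is hypothetical, and it makes no claim about what any particular compiler will do with it. The idea is only to hold the function body constant, vary the `inline` keyword, and then compare the generated assembly for the two callers (for example with `g++ -O2 -S`, or on Compiler Explorer).

```cpp
// Hypothetical experiment file; all names are made up for illustration.
// The two helpers have identical bodies; the ONLY difference between them
// is the presence of the `inline` keyword.

int plain_helper(int x) {
    int sum = 0;
    for (int i = 0; i < x; ++i) sum += i * x;
    return sum;
}

inline int inline_helper(int x) {
    int sum = 0;
    for (int i = 0; i < x; ++i) sum += i * x;
    return sum;
}

// Inspect the assembly of these two callers. If one contains a call
// instruction and the other does not, the keyword influenced the heuristic
// at that optimization level on that compiler version.
int call_plain(int x)  { return plain_helper(x) + 1; }
int call_inline(int x) { return inline_helper(x) + 1; }
```

A body this small may well be inlined in both cases; a real survey would need to grow the body until the two callers start to differ, and record the compiler version and flags alongside each result.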
So, the first step is to collect information on what compilers' inlining
heuristics look like *today*, and what facilities they offer to adjust
their heuristics. P3676R0 <https://isocpp.org/files/papers/P3676R0.pdf> is
missing that survey. You do allude to [[noinline]], [[always_inline]], and
__forceinline, but I'd want to see an actual table showing (1) *exactly*
who of the Big Four support each of those facilities and (2) *exactly* what
effect each one has.
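For reference, the vendor spellings that such a table would have to cover are the ones codebases already hide behind macros today. The sketch below uses hypothetical macro names (`DEMO_ALWAYS_INLINE`, `DEMO_NOINLINE`) and only spellings I'm confident exist: `__forceinline` and `__declspec(noinline)` on MSVC, `__attribute__((always_inline))` and `__attribute__((noinline))` on GCC and Clang. It says nothing about *exactly* what effect each one has, which is the part the survey would need to fill in.

```cpp
// Sketch only: today's vendor spellings behind hypothetical macro names.
#if defined(_MSC_VER) && !defined(__clang__)
  #define DEMO_ALWAYS_INLINE __forceinline
  #define DEMO_NOINLINE      __declspec(noinline)
#elif defined(__GNUC__)  // GCC and Clang both define __GNUC__
  // `inline` is spelled out here because GCC warns when always_inline is
  // applied to a function that is not also declared inline.
  #define DEMO_ALWAYS_INLINE inline __attribute__((always_inline))
  #define DEMO_NOINLINE      __attribute__((noinline))
#else
  // Unknown compiler: fall back to no optimization hint at all.
  #define DEMO_ALWAYS_INLINE inline
  #define DEMO_NOINLINE
#endif

DEMO_ALWAYS_INLINE int twice(int x) { return 2 * x; }

// Requested to stay out of line, even though its body is trivial.
DEMO_NOINLINE int twice_out_of_line(int x) { return twice(x); }
```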
After that, I also disagree with the paper's direction, which is to make up
a new and inflexible core-language syntax. Just use the existing,
implemented, flexible syntax!
If your problem is that __forceinline isn't portable, then the obvious
solution would be either to ask GCC/Clang to support __forceinline, or to
ask MSVC to support [[always_inline]]. Inventing a third thing that *nobody*
yet supports doesn't help any existing codebase, and C++ is all about existing
codebases.
Why do I say "inflexible"? Well, you propose a three-way toggle: inline(0)
means "do not inline," inline(1) means "no effect," and inline(2) means
"always inline." What should happen when I write inline(3) or inline(99)?
Compare GCC's behavior when you pass -O0, -O1, -O2, -O3, and -O99. IIRC,
-O99 is a synonym for -O3 because "bigger number should mean more." Compare
the behavior of GCC/Clang's [[init_priority(n)]] attribute
<https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/C_002b_002b-Attributes.html#C_002b_002b-Attributes>.
What if I want to bump the heuristic by a little bit, but not force it to
*always* inline? Surely "always inline" should be, like, inline(9999), so
that I can use the intermediate values in a sensible way.
More importantly, what if I want to mark my function as
"never-do-the-inlining-optimization" *without* simultaneously marking it as
"inline" from the linker's point of view? I mean, presumably `inline(0)`
would still have the *semantic* effect of making the function inline,
right? That semantic effect that we tell people is the fundamental purpose
of the "inline" keyword?
Or, if `inline(0)` would *not* have the semantic effect of `inline`, then
will it be legal for me to write `inline(0) inline int f() { return 42; }`
in the case where today I'd simply write `[[noinline]] inline int f() {
return 42; }`?
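To make the distinction concrete, here is roughly how the two properties are kept independent today. This is a sketch, assuming GCC or Clang for the `[[gnu::noinline]]` spelling and falling back to `__declspec(noinline)` on MSVC; the macro name `SKETCH_NOINLINE` is hypothetical.

```cpp
// Sketch only: the optimization hint and the `inline` semantics are two
// separate knobs today, and either can be used without the other.
#if defined(__GNUC__)
  #define SKETCH_NOINLINE [[gnu::noinline]]
#elif defined(_MSC_VER)
  #define SKETCH_NOINLINE __declspec(noinline)
#else
  #define SKETCH_NOINLINE
#endif

// Semantically inline (may be defined in a header included by many
// translation units), AND the optimizer is asked not to inline calls to it.
SKETCH_NOINLINE inline int f() { return 42; }

// NOT semantically inline (exactly one definition in the program), and the
// optimizer is likewise asked not to inline calls to it.
SKETCH_NOINLINE int g() { return 42; }
```

With a single `inline(0)` spelling, it is not obvious which of `f` and `g` the programmer gets, or how to ask for the other one.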
In the same vein, notice that this thing will have to be the right "part of
speech" to fit into everyone's existing macros. So if today people are
writing
    #define NOINLINE __attribute__((noinline))
    NOINLINE [[gnu::pure]] int f();
then tomorrow you'll want them to be able to write
    #define NOINLINE inline(0)
    NOINLINE [[gnu::pure]] int f();
I suspect you're proposing to make that a syntax error instead; which means
your new syntax is D.O.A.
The existing attribute syntax is perfectly capable of expressing everything
in P3676 — and more — *and* it already exists. I see no reason to duplicate
all that effort in the core language grammar.
"Typically the macro is like that of llama.cpp:"
The code snippet seems to be cut off — it ends with the word "#else".
I'd like to see a Tony Table in the paper. On the left side: here's what
llama.cpp does today, i.e., the complete macro definition and several
sample usages drawn from their codebase. On the right side: here's what
they will be able to do tomorrow, i.e., those sample usages with the new
"portable" syntax. And if the usage portion of that table is *not* simply
    - ALWAYS_INLINE void operator()(const Func& f, Args... args) const {
    + [[always_inline]] void operator()(const Func& f, Args... args) const {
and
    - XXH_NO_INLINE XXH_PUREF XXH64_hash_t
    + [[noinline]] XXH_PUREF XXH64_hash_t
      XXH3_hashLong_64b_default(const void* XXH_RESTRICT input, size_t len,
      XXH64_hash_t seed64, const xxh_u8* XXH_RESTRICT secret, size_t secretLen)
then you'll have to have a really really *really* good reason why not.
Procedural nitpick: P3676R0 has your and Stephen's names, but it needs to
have at least one email address as well, for readers to give feedback. (I
would have cc'ed Stephen on *this* message, if I'd known his email address.)
HTH,
Arthur
On Mon, May 5, 2025 at 7:11 PM Khalil Estell via SG14 <sg14_at_[hidden]>
wrote:
> Hello SG14 reflector!
>
> Let me introduce P3676R0 <https://isocpp.org/files/papers/P3676R0.pdf> which
> seeks to bring commonly used attributes like *__force_inline* and
> *[[gnu::always_inline]]* into the standard by enhancing the inline
> keyword. I'm co-authoring this along with Stephen Berry.
>
> Best,
> ----
> Khalil Estell
> Volunteer & Mentor @ SJSU College of Engineering
> <https://www.sjsu.edu/engineering/>
> Voting ISO C++ <https://isocpp.org/> Committee member
> Founder of the libhal <https://github.com/libhal> organization & ecosystem
> _______________________________________________
> SG14 mailing list
> SG14_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg14
>
Received on 2025-05-06 00:35:22