std-proposals

Re: [std-proposals] Constant-time selection primitive following memset_explicit precedent

From: Shivam Kunwar <shivam.kunwar_at_[hidden]>
Date: Sun, 14 Dec 2025 12:05:57 +0530
On 2025-12-11 15:39, Jens Maurer wrote:
> On 12/11/25 07:21, Shivam Kunwar via Std-Proposals wrote:
>> but here is the thing, the C and C++ Standard can't mandate CPU
>> Behavior, what it can do is, guarantee the compiler won't introduce
>> timing dependencies, it can specify the intent (just like
>> memset_explicit does) so implementations know what's needed, and then it
>> leaves room for implementations to use hardware features like Intel
>> DOIT, ARM DIT
>
> In the current framework of the C++ abstract machine, which underlies
> both the core language and the standard library, there is no way to
> prevent compilers from messing with your intended constant-time
> instruction in undesirable ways.
>
> I've tried that 10 years ago; see N4534.
>
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4534.html
>
> and the paper was rejected by LEWG in Lenexa 2015.
> I remember compiler people telling me they have no way to achieve
> the intended outcome within e.g. the current LLVM machinery.
>
>> and afaik this is exactly what memset_explicit does, it states the
>> intent ("make this data inaccessible") without defining "optimization"
>> in the abstract machine, and I am thinking same approach works here.
>
> Similar to memset_explicit, I strongly object to trying to standardize
> something like that without changes to the abstract machine.
>
> (We have memset_explicit because it is inherited from C23, not because
> C++ has standardized it themselves.)
>
> Jens

Hey Jens,

Thank you for pointing me to N4534. It's an astonishing paper, and I now
understand why it was rejected:

"the compiler can move evaluations across the barrier, as long as they
don't access any variables, such as a subexpression taking prvalues and
producing a prvalue"

So even with barriers, pure expressions can be hoisted out of the
protected zone.

But help me understand: does the intrinsic approach have the same
problem?

With __builtin_ct_select (probably coming in LLVM 22), you're creating a
constant-time zone, and the intrinsic is just a black box: the optimizer
evaluates the arguments, hands them to something it can't see through,
and gets a result back, so there's nowhere for expressions to escape to.

And I would say the hide_value idea is even more minimal:
T hide_value(T x) noexcept;
// Returns x unchanged, but the optimizer must forget everything it knows
// about the value.

and then we build select on top:
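template <typename T>  // T is assumed to be an unsigned integer type
T select_explicit(bool cond, T a, T b) {
    // All ones when cond is true, all zeros otherwise.
    T mask = static_cast<T>(-static_cast<T>(cond));
    return (a & hide_value(mask)) | (b & hide_value(static_cast<T>(~mask)));
}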

So what we see here is that there are no zones, no barriers, and no
wrapper types, just "this value is now opaque to you", and the
optimizer can't simplify (a & hide_value(mask)) because it genuinely
doesn't know what the mask is anymore.
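
For illustration only, here is a rough sketch of how hide_value could be
approximated today. It assumes GCC or Clang extended inline asm, which is
a compiler extension and not anything the standard provides. The empty asm
statement claims to read and modify x, so the optimizer has to treat the
result as unknown. The names hide_value and select_explicit are the
hypothetical ones from this mail, and the restriction to unsigned types
that fit in a register is my assumption for the sketch. This says nothing
about what the hardware does with the resulting instructions.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the proposed hide_value (not a standard facility).
// Assumption: GCC/Clang extended asm. The empty asm body emits no instructions,
// but the "+r" constraint tells the compiler x is read and written in a
// register, so value facts about x are not carried across this point.
template <typename T>
T hide_value(T x) noexcept {
    static_assert(std::is_unsigned_v<T>, "sketch covers unsigned integers only");
    static_assert(sizeof(T) <= sizeof(void*), "sketch assumes T fits in a register");
    __asm__ volatile("" : "+r"(x));
    return x;
}

// Mask-based select built on top of hide_value, as described above.
template <typename T>
T select_explicit(bool cond, T a, T b) noexcept {
    // All ones when cond is true, all zeros otherwise. The casts undo
    // integer promotion for types narrower than int.
    const T mask = static_cast<T>(T{0} - static_cast<T>(cond));
    const T inverse = static_cast<T>(~mask);
    return static_cast<T>((a & hide_value(mask)) | (b & hide_value(inverse)));
}
```

Calling select_explicit<std::uint32_t>(cond, a, b) yields a when cond is
true and b otherwise. Whether the generated code stays branch-free is
still up to the compiler and target, which is exactly the part a standard
facility would need to pin down.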

I guess what I'm asking is, does hide_value hit the same wall as
value<T>, or is it simple enough to actually work? It's not trying to
make timing observable in the abstract machine, it's just making the
optimizer forget specific facts about values.

What do you think?

Received on 2025-12-14 06:36:00