C++ Logo

std-proposals

Advanced search

Re: [std-proposals] Proposal interest: constexpr std::static_unordered_set<T, N>

From: Robert Baumgartner <baumgar.robert_at_[hidden]>
Date: Tue, 14 Apr 2026 18:53:58 +0200
Thank you for the detailed feedback — let me address each point.

> Well, if the values are known at compile-time, I'd expect that you
initialize the
> unordered_set during process startup (as global initialization), so this
overhead
> doesn't appear during execution.

> The run-time overhead you do have with std::unordered_set is the
pointer-based data
> structure, which is likely to cause quite a bit of cache footprint.

You are right to say that this can be a global object, but this introduces
other problems, e.g. the
static initialization order fiasco
<https://en.cppreference.com/w/cpp/language/siof.html>. constexpr objects
cannot be accidentally used before initialization,
hence removing a class of bugs altogether. The proposed structure works
fundamentally different,
giving compile time initialization, i.e. much stronger guarantees on when
initialization happens.

Additionally ultra-low latency users, who want as few cycles as possible,
will benefit both from zero-initialization
as well as from removing the pointer based data access, which as you
already stated has an implicit penalty
by being less cache friendly.

> Feel free to define a constexpr or consteval helper function that
> initializes the array appropriately, and ensures duplicates are
> flagged or ignored.

Even if I did so and I sorted my array, lookup will still remain at
O(log(n)), as opposed to O(1) of a proper
hash set data structure.

But on top of that it also pushes complexity onto every user needing this
pattern. The standard library exists to provide correct,
efficient, reusable implementations of common patterns so engineers don't
have to reimplement them. A constexpr sort helper,
duplicate detection, and open addressing hash table are non-trivial to
implement correctly. The proposed solution makes this
accessible behind a familiar interface.

> What would the main data structure of this beast look like?
> Similar to std::unordered_set? Then you're already sub-optimal.

I have in mind a std::array (already constexpr-implemented) with open
addressing.

However, for the case of small integral types/enums a std::bitset could
also serve the proposed purpose
instead of the std::array. This can be considered for an optimization.

> What doesn't work, for your use-case, using each of these approaches?

frozen -> no guaranteed interface, no standard availability, adds an
external dependency. The standard library should
provide guaranteed portable solutions without external dependencies.

flat_map and constexpr std::map -> O(log(n)) lookup time again.

> What's the size (number of enumerators) of your enum?
> A std::bitset might work well for that case, if all bits
> fit in a few cache lines.

This is indeed a valid alternative for small dense enums, elegant and
cache-friendly. The solution becomes inefficient however
under sparse enums or enums with large values. Here the bitset introduces
redundancies and can easily exceed the size of a
single cache line.

The proposed solution handles this naturally through hashing, allocating
only memory that is needed.

> If you use a hash table, you likely want power-of-2 sizing
> (for efficient slot number computation). Ideally, you want
> to precompute the optimal size for the given (constant) values
> such that no collisions appear, and maybe fiddle with the hash
> function a bit to achieve that. (And consider that no-collisions
> might not be achievable in the general case.)

You are right in that no-collisions might not be achievable in the general
case.
I consider perfect hashing as an optimization, not a requirement. The
primary design
would use open addressing with collision handling, just like a runtime
std::unordered_set.
O(1) at runtime is guaranteed through immutability of the array, hence
fixed load factor.

> For simple "int"s (e.g. enum values), CRC32 works fine as a hash
> function for me, and comes with a cost of very few clock cycles
> if you use the compiler intrinsic.

CRC32 is a good candidate for integral types — the hash function would be a
template parameter with a sensible constexpr default, following the same
pattern as
std::unordered_set<T, Hash>, giving users control when needed.

Best regards,

Robert Baumgartner


Am Mo., 13. Apr. 2026 um 23:09 Uhr schrieb Jens Maurer <jens.maurer_at_[hidden]
>:

>
>
> On 4/13/26 22:31, Robert Baumgartner via Std-Proposals wrote:
> > I want to solve the following problem: I have an enum type, potentially
> a large one like a set of actions. A subset of that enum type satisfies
> certain conditions that I want to check at runtime. That would look like
> this
> >
> > if(someSet.contains(x)) {
> > // do something
> > }
> >
> > std::unordered_set<T> does this job efficiently, but has to be
> initialized during run-time. This is suboptimal whenever the values are
> known at compile time, especially in ultra-low latency environments, where
> the run-time initialization adds overhead that can be avoided.
>
> Well, if the values are known at compile-time, I'd expect that you
> initialize the
> unordered_set during process startup (as global initialization), so this
> overhead
> doesn't appear during execution.
>
> The run-time overhead you do have with std::unordered_set is the
> pointer-based data
> structure, which is likely to cause quite a bit of cache footprint.
>
> > The currently best compile solution that I am aware of is using a
> std::array like
> >
> > inline static constexpr array mySet{a, b, c, …};
> >
> > , which is O(log(n)) run-time at best for a lookup operation, but
> requires manual sorting for optimal performance and lacks the semantics of
> a set.
>
> Feel free to define a constexpr or consteval helper function that
> initializes the array appropriately, and ensures duplicates are
> flagged or ignored.
>
> > A combined std::static_unordered_set<T, N> merges both solutions in one,
> combining run-time efficiency with compile-time initialization.
>
> What would the main data structure of this beast look like?
> Similar to std::unordered_set? Then you're already sub-optimal.
>
> > I'm aware of third-party solutions like |frozen|, and
> of |std::flat_map| (C++23) and |constexpr std::map| (C++26),
>
> What doesn't work, for your use-case, using each of these approaches?
>
> > which suggest the committee has appetite for this direction. I'd
> welcome pointers to any prior proposals I may have missed, and feedback on
> whether this is worth developing into a formal paper.
>
> What's the size (number of enumerators) of your enum?
> A std::bitset might work well for that case, if all bits
> fit in a few cache lines.
>
> If you use a hash table, you likely want power-of-2 sizing
> (for efficient slot number computation). Ideally, you want
> to precompute the optimal size for the given (constant) values
> such that no collisions appear, and maybe fiddle with the hash
> function a bit to achieve that. (And consider that no-collisions
> might not be achievable in the general case.)
>
> For simple "int"s (e.g. enum values), CRC32 works fine as a hash
> function for me, and comes with a cost of very few clock cycles
> if you use the compiler intrinsic.
>
> Jens
>

Received on 2026-04-14 16:54:14