Thank you for the detailed feedback — let me address each point.

> Well, if the values are known at compile-time, I'd expect that you initialize the
> unordered_set during process startup (as global initialization), so this overhead
> doesn't appear during execution.

> The run-time overhead you do have with std::unordered_set is the pointer-based data
> structure, which is likely to cause quite a bit of cache footprint.

You are right that this can be a global object, but that introduces other problems, e.g. the 
static initialization order fiasco. constexpr objects cannot accidentally be used before 
initialization, which removes that class of bugs altogether. The proposed structure works 
fundamentally differently: initialization happens at compile time, i.e. with much stronger 
guarantees on when it occurs.

Additionally, ultra-low-latency users, who want as few cycles as possible, benefit both from the 
absence of run-time initialization and from eliminating pointer-based data access, which, as you 
already stated, carries an implicit penalty by being less cache-friendly.

> Feel free to define a constexpr or consteval helper function that
> initializes the array appropriately, and ensures duplicates are
> flagged or ignored.

Even if I did so and sorted my array, lookup would still be O(log n), as opposed to the O(1) of a 
proper hash set.

But beyond that, it pushes complexity onto every user who needs this pattern. The standard library 
exists to provide correct, efficient, reusable implementations of common patterns so that engineers 
don't have to reimplement them. A constexpr sort helper, duplicate detection, and an open-addressing 
hash table are all non-trivial to implement correctly. The proposed solution makes this accessible 
behind a familiar interface.

> What would the main data structure of this beast look like?
> Similar to std::unordered_set?  Then you're already sub-optimal.

I have in mind a std::array (already usable in constant expressions) with open addressing.

However, for small integral types/enums a std::bitset could also serve the proposed purpose 
instead of the std::array. This can be considered as an optimization.
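To make this concrete, here is a minimal sketch of that primary design. The names 
(static_unordered_set, default_constexpr_hash) are illustrative, not the proposed interface, and a 
constexpr-usable hash functor is assumed because std::hash is not required to work in constant 
expressions:

```cpp
#include <array>
#include <cstddef>
#include <initializer_list>

// Placeholder hash usable in constant expressions (illustrative only).
struct default_constexpr_hash {
    template <class T>
    constexpr std::size_t operator()(T v) const {
        return static_cast<std::size_t>(v);  // fine for integral/enum keys
    }
};

// Fixed-capacity hash set over a std::array with open addressing
// (linear probing); fully constructible at compile time.
template <class T, std::size_t N, class Hash = default_constexpr_hash>
class static_unordered_set {
    static constexpr std::size_t Capacity = 2 * N;  // load factor <= 0.5

    std::array<T, Capacity> slots_{};
    std::array<bool, Capacity> used_{};

    constexpr void insert(const T& v) {
        std::size_t i = Hash{}(v) % Capacity;
        while (used_[i]) {
            if (slots_[i] == v) return;  // ignore duplicates
            i = (i + 1) % Capacity;      // linear probe
        }
        slots_[i] = v;
        used_[i] = true;
    }

public:
    constexpr static_unordered_set(std::initializer_list<T> init) {
        for (const T& v : init) insert(v);
    }

    constexpr bool contains(const T& v) const {
        std::size_t i = Hash{}(v) % Capacity;
        while (used_[i]) {
            if (slots_[i] == v) return true;
            i = (i + 1) % Capacity;
        }
        return false;  // an empty slot terminates the probe sequence
    }
};

enum class Action { Buy, Sell, Cancel, Amend, Quote };

// Initialized entirely at compile time, usable in constant expressions:
inline constexpr static_unordered_set<Action, 3> tradable{
    Action::Buy, Action::Sell, Action::Amend};

static_assert(tradable.contains(Action::Buy));
static_assert(!tradable.contains(Action::Cancel));
```

Capping the load factor at 0.5 guarantees that every probe sequence reaches an empty slot, so both 
loops terminate.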

> What doesn't work, for your use-case, using each of these approaches?

frozen -> no guaranteed interface, no standard availability, and an added external dependency. The 
standard library should provide guaranteed portable solutions without external dependencies.

flat_map and constexpr std::map -> O(log n) lookup time again.

> What's the size (number of enumerators) of your enum?
> A std::bitset might work well for that case, if all bits
> fit in a few cache lines.

This is indeed a valid alternative for small, dense enums: elegant and cache-friendly. It becomes 
inefficient, however, for sparse enums or enums with large values, where the bitset wastes space on 
absent values and can easily exceed a single cache line.

The proposed solution handles this naturally through hashing, allocating only the memory that is needed.
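To illustrate the trade-off, here is a sketch using a plain 64-bit mask in place of std::bitset 
(for portability in constant expressions); both enums are hypothetical examples:

```cpp
#include <cstdint>

enum class Dense : std::uint8_t { A, B, C, D };                    // values 0..3
enum class Sparse : std::uint32_t { X = 7, Y = 4096, Z = 1000000 };

// Dense case: the whole value range fits in a single word, so a bitmask
// (or std::bitset) is compact and cache-friendly.
constexpr std::uint64_t dense_mask =
    (1ull << static_cast<unsigned>(Dense::A)) |
    (1ull << static_cast<unsigned>(Dense::C));

constexpr bool dense_contains(Dense d) {
    return ((dense_mask >> static_cast<unsigned>(d)) & 1u) != 0;
}

static_assert(dense_contains(Dense::A));
static_assert(!dense_contains(Dense::B));

// Sparse case: a bitset covering values up to 1'000'000 needs about 122 KiB
// (one bit per possible value), far beyond a cache line, whereas a hash set
// over the three members of Sparse needs only a handful of slots.
```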

> If you use a hash table, you likely want power-of-2 sizing
> (for efficient slot number computation).  Ideally, you want
> to precompute the optimal size for the given (constant) values
> such that no collisions appear, and maybe fiddle with the hash
> function a bit to achieve that.  (And consider that no-collisions
> might not be achievable in the general case.)

You are right that no-collisions might not be achievable in the general case. 
I consider perfect hashing an optimization, not a requirement. The primary design 
would use open addressing with collision handling, just like a run-time std::unordered_set. 
Expected O(1) lookup at run time follows from the fixed load factor, which the immutability of 
the array guarantees.
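The power-of-2 sizing you suggest fits naturally into this design. A sketch of the capacity 
computation (helper names are mine, not proposed API):

```cpp
#include <cstddef>

// Smallest power of two >= n.
constexpr std::size_t next_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p *= 2;
    return p;
}

// Capacity for n elements with the load factor capped at 0.5; a power-of-2
// capacity lets a bitwise AND replace the modulo in slot computation.
constexpr std::size_t capacity_for(std::size_t n) {
    return next_pow2(2 * n);
}

static_assert(capacity_for(3) == 8);
static_assert(capacity_for(8) == 16);

constexpr std::size_t slot(std::size_t hash, std::size_t capacity) {
    return hash & (capacity - 1);  // valid because capacity is a power of two
}

static_assert(slot(13, 8) == 5);
```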

> For simple "int"s (e.g. enum values), CRC32 works fine as a hash
> function for me, and comes with a cost of very few clock cycles
> if you use the compiler intrinsic.

CRC32 is a good candidate for integral types. The hash function would be a template parameter 
with a sensible constexpr default, following the same pattern as std::unordered_set<T, Hash>, 
giving users control when needed.
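One caveat: the SSE4.2 CRC32 intrinsic is generally not usable in constant expressions, so the 
default would need a hash that is, e.g. something FNV-1a-like. The functor below is illustrative, 
not a concrete proposal:

```cpp
#include <cstddef>
#include <cstdint>

// Constexpr FNV-1a over the bytes of a 64-bit value; usable both at compile
// time (building the table) and at run time (lookups).
struct fnv1a_hash {
    constexpr std::size_t operator()(std::uint64_t v) const {
        std::uint64_t h = 14695981039346656037ull;   // FNV offset basis
        for (int i = 0; i < 8; ++i) {
            h ^= (v >> (8 * i)) & 0xffu;
            h *= 1099511628211ull;                   // FNV prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Constant-evaluable and deterministic:
static_assert(fnv1a_hash{}(1) == fnv1a_hash{}(1));
static_assert(fnv1a_hash{}(1) != fnv1a_hash{}(2));
```

A user on a CRC32-capable target could still plug an intrinsic-based hash into the Hash template 
parameter for run-time-only sets.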

Best regards, 

Robert Baumgartner


Am Mo., 13. Apr. 2026 um 23:09 Uhr schrieb Jens Maurer <jens.maurer@gmx.net>:


On 4/13/26 22:31, Robert Baumgartner via Std-Proposals wrote:
> I want to solve the following problem: I have an enum type, potentially a large one like a set of actions. A subset of that enum type satisfies certain conditions that I want to check at runtime. That would look like this 
>
> if(someSet.contains(x)) {
>   // do something 
> }
>
> std::unordered_set<T> does this job efficiently, but has to be initialized during run-time. This is suboptimal whenever the values are known at compile time, especially in ultra-low latency environments, where the run-time initialization adds overhead that can be avoided.

Well, if the values are known at compile-time, I'd expect that you initialize the
unordered_set during process startup (as global initialization), so this overhead
doesn't appear during execution.

The run-time overhead you do have with std::unordered_set is the pointer-based data
structure, which is likely to cause quite a bit of cache footprint.

> The currently best compile solution that I am aware of is using a std::array like
>
> inline static constexpr array mySet{a, b, c, …};
>
> , which is O(log(n)) run-time at best for a lookup operation, but requires manual sorting for optimal performance and lacks the semantics of a set. 

Feel free to define a constexpr or consteval helper function that
initializes the array appropriately, and ensures duplicates are
flagged or ignored.

> A combined std::static_unordered_set<T, N> merges both solutions in one, combining run-time efficiency with compile-time initialization.

What would the main data structure of this beast look like?
Similar to std::unordered_set?  Then you're already sub-optimal.

> I'm aware of third-party solutions like |frozen|, and of |std::flat_map| (C++23) and |constexpr std::map| (C++26),

What doesn't work, for your use-case, using each of these approaches?

>  which suggest the committee has appetite for this direction. I'd welcome pointers to any prior proposals I may have missed, and feedback on whether this is worth developing into a formal paper.

What's the size (number of enumerators) of your enum?
A std::bitset might work well for that case, if all bits
fit in a few cache lines.

If you use a hash table, you likely want power-of-2 sizing
(for efficient slot number computation).  Ideally, you want
to precompute the optimal size for the given (constant) values
such that no collisions appear, and maybe fiddle with the hash
function a bit to achieve that.  (And consider that no-collisions
might not be achievable in the general case.)

For simple "int"s (e.g. enum values), CRC32 works fine as a hash
function for me, and comes with a cost of very few clock cycles
if you use the compiler intrinsic.

Jens