C++ Logo

std-discussion

Advanced search

Re: std::hash

From: Nate Eldredge <nate_at_[hidden]>
Date: Sat, 25 Mar 2023 11:00:15 -0600 (MDT)
On Sat, 25 Mar 2023, Marcin Jaczewski via Std-Discussion wrote:

> pt., 24 mar 2023 o 23:35 Chris Ryan <chrisr98008_at_[hidden]> napisaƂ(a):
>>
>> `std::vector<bool>` is a specialized implementation that is done differently from the other vectors & types because of the need to do the underlying bit packing.
>> Other types come out in the wash on their own.
>>
>
> My point is that I need more hash for `std::vector<int>` than for
> `std::vector<bool>` as it is a bit a black sheep of containers.
> Writing hash for `int` it is not so hard as we can easily find
> hundreds of examples on StackOverflow or Boost.
> But if it is so easy why do standard not provide it? or at least some
> basic building blocks.

I think the idea is that in general, you are meant to provide your own
implementation to hash a vector, as there's not any obvious "sensible
default". It depends on what you know about your data. Yes, that may be
inconvenient, but "convenience" is not generally the main design
consideration of the C++ language.

In principle, it would likewise be best for you to also provide your own
hash for std::vector<bool>. But there is a competing *performance* issue.
If you wrote your own hash, you would have to iterate over the elements of
the vector, one bit at a time. But internally, std::vector<bool> is
likely implemented as a vector of machine words, and it should be possible
to iterate over it a word at a time, which would then be 32 or 64 times
faster. However, that cannot be done by you the programmer, at least not
portably, because you don't have access to the internal representation of
std::vector<bool>.

Since performance *is* a key design consideration for C++, then as a
compromise, in this particular case, the library provides a hash function.
It still may not be algorithmically ideal for your data, but you may be
willing to accept that tradeoff in exchange for a speedup by 1-2 orders of
magnitude.

Historically, std::vector<bool> seems to be a closer relative of
std::bitset than of other std::vector<T>, and should be thought of
similarly. (Arguably it should have been made its own type "std::bitvec"
instead of a specialiation of std::vector.) You'll note that std::bitset
also provides a hash specialization, for much the same reason as above.

-- 
Nate Eldredge
nate_at_[hidden]

Received on 2023-03-25 17:00:19