On Wed, Jan 27, 2021 at 5:22 AM BAILLY Yves <yves.bailly@hexagon.com> wrote:

> I think I get your point. After more thinking about it (which I should have done before, sorry for that), [...]

No worries. I'm not expressing my points in the most direct and clearest possible manner, either. I think we started at opposite vague extremes and are slowly "haggling" our way to meet somewhere in the middle. I started at "This is not possible"; you started at "This is possible." I'm still 100% certain that when we meet it'll be on the "This is not possible" side of the line... but we're still working our way there.

>> Is f<U> the same function as, or a different function from, f<T>? (I don't know but I think so.)

> If T and U are two different types, then f<U> is a different function from f<T>.

> However, because U and T refer to the same Platonic type and a U can be seen as a T, if f<U> is syntactically correct and well-formed with regard to the restrictions put on U (for example, inside f<> there’s no assignment of a T to a U without an explicit cast), then the actual instantiation of f<U> can be the same as the instantiation of f<T>.

I know what you mean, but I hope we'll agree to ignore the possibility for compiler optimizations and just focus on the formal language semantics.

For example, in C++20 if you write

    extern int i;
    int *f1(int *x) { return x+i; }
    float *f2(float *x) { return x+i; }

the compiler is perfectly well permitted to optimize those two functions into

    f1: nop
    f2: movslq i(%rip), %rax
        leaq (%rdi,%rax,4), %rax
        ret

No mainstream compiler actually bothers to do this, but there's nothing stopping a "sufficiently smart compiler" from doing it. (Similar optimizations at the linker/LTO level are more common.)

However, whatever happens behind the scenes in the compiler must still respect the "As-If Rule": it isn't allowed to break any conforming program. For example, if some program asks

assert((void*)f1 != (void*)f2);

the optimized program must still answer correctly. (That's what that extra `nop` is doing in the codegen — it's giving f1 and f2 distinct addresses at the machine level, even though f1's control flow just flows straight into f2.)

So yes, of course a sufficiently smart compiler could also make your `f<T>` and `f<U>` share the same code at the machine level. But it would still have to ensure that `&f<T> != &f<U>` in C++, right? That's what I mean by f<T> and f<U> being different functions.
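The address test can be tried in today's C++ by using two unrelated empty structs in place of T and U. This is only a sketch under stated assumptions: `V` is a hypothetical stand-in for "a type distinct from T" (the `new`-style strong typedef does not exist in the language), and a toolchain run with identical-code-folding options such as MSVC's /OPT:ICF or `--icf=all` in gold/lld may fold the two functions together and not honor the guarantee.

```cpp
struct T {};
struct V {};  // hypothetical stand-in for a type distinct from T

template<class A>
int f() { return 42; }  // identical body in every specialization

// Returns true if f<T> and f<V> are distinct functions, i.e. their
// addresses compare unequal even though their bodies are identical.
bool specializations_are_distinct() {
    int (*pt)() = &f<T>;
    int (*pv)() = &f<V>;
    return pt != pv;
}
```

Calling `specializations_are_distinct()` from `main` should yield `true` on a conforming implementation, whatever the optimizer does with the two bodies.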

For template instantiations in particular, there's another way to tell them apart. Different template specializations have different sets of function-local-static variables.

    #include <cstdio>

    template<class A>
    void f() {
        static int i = 0;
        printf(" %d", ++i);
    }

    struct T {};
    using U = new T;

    int main() {
        f<T>(); f<T>(); f<T>(); // prints 1 2 3
        f<U>(); f<U>(); f<U>(); // should print 1 2 3 (not 4 5 6, as it would with an ordinary type alias!)
    }

If you remove the keyword `new`, so that U and T are just two names for the same type, then C++'s formal semantics are that &f<T> == &f<U> and that the output of the program is "1 2 3 4 5 6" (because if T and U are the same type then we're just calling the same function 6 times, instead of two different functions each 3 times).
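Both behaviors can be observed in today's C++, as long as a separately declared struct is accepted as a rough stand-in for the hypothetical `new` type. In this sketch the names `next`, `T2`, `U2`, `with_alias`, and `with_distinct_types` are made up for illustration; the counters are returned as a string instead of printed so the result is easy to check.

```cpp
#include <string>

struct T {};
using U = T;      // ordinary alias: U and T name the same type

struct T2 {};
struct U2 {};     // hypothetical stand-in for `using U2 = new T2;`

template<class A>
int next() {
    static int i = 0;  // one counter per specialization
    return ++i;
}

// Calls next<A>() three times, then next<B>() three times.
template<class A, class B>
std::string run() {
    std::string out;
    for (int k = 0; k < 3; ++k) out += " " + std::to_string(next<A>());
    for (int k = 0; k < 3; ++k) out += " " + std::to_string(next<B>());
    return out;
}

// Each helper is meant to be called once, from a fresh program state.
std::string with_alias()          { return run<T, U>(); }    // " 1 2 3 4 5 6"
std::string with_distinct_types() { return run<T2, U2>(); }  // " 1 2 3 1 2 3"

// With an alias there is only one specialization, hence one address.
bool alias_shares_function() {
    int (*pt)() = &next<T>;
    int (*pu)() = &next<U>;
    return pt == pu;
}
```

The alias case shares one counter across all six calls; the distinct-type case has two counters, which is the behavior a strong typedef would presumably need.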

> This would apply to the std::hash<> specialization: when required to instantiate std::hash<U>, if it has not been explicitly specialized, then as you said the compiler may realize that std::hash<U> is the same (has the same contents although it doesn’t have the same identity) as std::hash<T> - again, if and only if the code for std::hash<U> is well-formed and there’s no explicit specialization provided by the user.

Okay, consider this snippet, then:

    using size_t = unsigned long;
    template<> struct hash<size_t> {
        size_t operator()(const size_t& x) const { return x; }
    };

    using Width = new size_t;

    std::hash<Width> hasher;
    Width width = 42;
    size_t bytecount = 42;
    auto x = hasher(width); // OK?? What is decltype(x)?
    x = bytecount; // OK if decltype(x) is size_t... but not OK if decltype(x) is Width, correct?

AIUI, you're postulating some yet-to-be-fleshed-out mechanism by which the compiler is going to look at the C++ source code of the `hash<size_t>` specialization and generate a copy substituting `Width` for `size_t` in some yet-to-be-fleshed-out manner.

(1) Ideally the compiler would know that the signature of hash<Width>::operator() should be `size_t operator()(const Width&) const`, not `Width operator()(const Width&) const`. But how can it possibly know that one of the size_ts should be substituted and not the other?

(2) What happens if the declaration of `hash<size_t>::operator()` is visible in this TU, but the definition is not visible? That seems like a problem similar to what would happen if you put a function template definition into a .cpp file. What implications does that have for usability?

(3) How does the compiler even know that I want `Width` to have a `hash` specialization? Will it apply the same logic to say, well, `std::rotl` exists for size_t so it should also exist for `Width`? That actually seems like the kind of thing that I want strong typedefs to protect me from.

(4) Also, wait a minute, how does `Width width = 42;` even compile at all? `42` is not a `Width`; it's an int. If implicitly converting size_t(42) to Width is forbidden, then surely it should be equally forbidden to implicitly convert int(42) to Width. So, how do you envision this facility interacting with integer literals?

#1 is by far the most important issue here, because it strikes directly at your vague "substitution" mechanism for creating new specializations of `hash`.

#3 and #4 are relatively unexplored territory and I foresee them having easy vague answers that would then have to be unpacked via further discussion. So please don't use #3 and #4 as an excuse to procrastinate on #1. #1 is important!
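To make #1 concrete, here is roughly what the desired result looks like when the strong type is written by hand in today's C++ (C++17). This is a sketch, not the proposed facility: `Width` is a hypothetical wrapper struct standing in for `using Width = new size_t;`, `hash_then_assign` is a made-up helper, and the `explicit` constructor is just one possible answer to #4. Note that whoever writes the specialization has to decide, mention by mention, which `size_t` becomes `Width`.

```cpp
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical hand-written stand-in for `using Width = new size_t;`.
struct Width {
    std::size_t value;
    explicit Width(std::size_t v) : value(v) {}
};

namespace std {
// Hand-made choice: the parameter type becomes Width,
// but the return type stays size_t.
template<>
struct hash<Width> {
    size_t operator()(const Width& w) const noexcept {
        return hash<size_t>{}(w.value);
    }
};
}  // namespace std

// The hash result is a size_t, so assigning a size_t to it is fine.
static_assert(std::is_same_v<
    decltype(std::hash<Width>{}(std::declval<Width>())), std::size_t>);

// With an explicit constructor, `Width width = 42;` does not compile,
// while `Width width{42};` does.
static_assert(!std::is_convertible_v<int, Width>);
static_assert(std::is_constructible_v<Width, int>);

std::size_t hash_then_assign() {
    std::hash<Width> hasher;
    Width width{42};
    std::size_t bytecount = 42;
    auto x = hasher(width);  // decltype(x) is std::size_t
    x = bytecount;           // OK only because x is not a Width
    return x;
}
```

A purely textual substitution of `Width` for `size_t` in the `hash<size_t>` source would have changed the return type as well, which is exactly the ambiguity raised in #1.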

HTH,

Arthur