On Wed, Jan 27, 2021 at 5:22 AM BAILLY Yves <yves.bailly@hexagon.com> wrote:

I think I get your point. After more thinking about it (which I should have done before, sorry for that), [...]


No worries. I'm not expressing my points in the most direct and clearest possible manner, either. I think we started at opposite vague extremes and are slowly "haggling" our way to meet somewhere in the middle. I started at "This is not possible"; you started at "This is possible." I'm still 100% certain that when we meet it'll be on the "This is not possible" side of the line... but we're still working our way there.
 

> Is f<U> the same function as, or a different function from, f<T>? (I don't know but I think so.)

If T and U are two different types, then f<U> is a different function from f<T>.

However, because U and T refer to the same Platonic type and a U can be seen as a T, if f<U> is syntactically correct and well-formed with regard to the restrictions put on U (for example, inside f<> there’s no assignment of a T to a U without an explicit cast), then the actual instantiation of f<U> can be the same as the instantiation of f<T>.


I know what you mean, but I hope we'll agree to ignore the possibility for compiler optimizations and just focus on the formal language semantics.

For example, in C++20 if you write
    extern int i;
    int *f1(int *x) { return x+i; }
    float *f2(float *x) { return x+i; }
the compiler is perfectly well permitted to optimize those two functions into 
    f1: nop
    f2: movslq i(%rip), %rax
        leaq (%rdi,%rax,4), %rax
        ret
No mainstream compiler actually bothers to do this, but there's nothing stopping a "sufficiently smart compiler" from doing it. (Similar optimizations at the linker/LTO level are more common.)
However, whatever happens behind the scenes in the compiler must still respect the "As-If Rule": it isn't allowed to break any conforming program. For example, if some program asks
    assert((void*)f1 != (void*)f2);
the optimized program must still answer correctly. (That's what that extra `nop` is doing in the codegen — it's giving f1 and f2 distinct addresses at the machine level, even though f1's control flow just flows straight into f2.)

So yes, of course a sufficiently smart compiler could also make your `f<T>` and `f<U>` share the same code at the machine level. But it would still have to ensure that `&f<T> != &f<U>` in C++, right?  That's what I mean by f<T> and f<U> being different functions.
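
To make that concrete in today's C++, here is a minimal sketch. There is no `new`-typedef syntax yet, so the struct `U` below is only a stand-in I'm assuming for "a distinct type"; the names `Alias`, `Fn`, and the two helper functions are likewise made up for illustration.

```cpp
#include <cassert>

struct T {};
struct U {};       // assumed stand-in for the hypothetical `using U = new T;`
using Alias = T;   // ordinary alias: a second name for the same type

template<class A>
int f() { return 0; }   // every specialization has an identical body

using Fn = int (*)();

// f<T> and f<U> are different functions, so their addresses must compare
// unequal, no matter how much machine code they share behind the scenes.
bool distinct_types_give_distinct_functions() {
    Fn pt = &f<T>;
    Fn pu = &f<U>;
    return pt != pu;
}

// f<Alias> is just another spelling of f<T>, so the addresses are equal.
bool alias_names_the_same_function() {
    Fn pt = &f<T>;
    Fn pa = &f<Alias>;
    return pt == pa;
}
```

Both helpers should return true on a conforming implementation; a strong-typedef facility would presumably have to put `U` in the first bucket, not the second.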


For template instantiations in particular, there's another way to tell them apart. Different template specializations have different sets of function-local-static variables.

    #include <stdio.h>
    template<class A>
    void f() {
        static int i = 0;
        printf(" %d", ++i);
    }
    struct T {};
    using U = new T;
    int main() {
        f<T>(); f<T>(); f<T>();  // prints 1 2 3
        f<U>(); f<U>(); f<U>();  // should print 1 2 3 — not 4 5 6, as it would with an ordinary type alias!
    }

If you remove the keyword `new`, so that U and T are just two names for the same type, then C++'s formal semantics are that &f<T> == &f<U> and that the output of the program is "1 2 3 4 5 6" (because if T and U are the same type then we're just calling the same function 6 times, instead of two different functions each 3 times).
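
Here is a sketch of the same experiment that compiles today, under the assumption that a separately declared struct `U` is a fair stand-in for the hypothetical `new` typedef. The names `V`, `next_count`, `call_three_times`, and `call_sequence` are mine, and I return the counters in a string instead of printing them so the result is easy to check.

```cpp
#include <cassert>
#include <string>

struct T {};
struct U {};     // assumed stand-in for the hypothetical `using U = new T;`
using V = T;     // ordinary alias: V and T are the same type

template<class A>
int next_count() {
    static int i = 0;   // one counter per specialization
    return ++i;
}

template<class A>
void call_three_times(std::string& out) {
    for (int k = 0; k < 3; ++k) {
        if (!out.empty()) out += ' ';
        out += std::to_string(next_count<A>());
    }
}

// On its first call this returns "1 2 3 1 2 3 4 5 6".
std::string call_sequence() {
    std::string out;
    call_three_times<T>(out);   // 1 2 3
    call_three_times<U>(out);   // 1 2 3: distinct type, so a fresh counter
    call_three_times<V>(out);   // 4 5 6: V is T, so the same counter as f<T>
    return out;
}
```

The point is only that the two behaviors are observably different, so the choice between "same function" and "different function" can't be swept under the optimizer's rug.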


This would apply to the std::hash<> specialization: when required to instantiate std::hash<U>, if it has not been explicitly specialized, then as you said the compiler may realize that std::hash<U> is the same (has the same contents although it doesn’t have the same identity) as std::hash<T> - again, if and only if the code for std::hash<U> is well-formed and there’s no explicit specialization defined provided by the user.


Okay, consider this snippet, then:

    using size_t = unsigned long;
    template<> struct hash<size_t> {
        size_t operator()(const size_t& x) const { return x; }
    };

    using Width = new size_t;
    std::hash<Width> hasher;
    Width width = 42;
    size_t bytecount = 42;
    auto x = hasher(width);  // OK?? What is decltype(x)?
    x = bytecount;  // OK if decltype(x) is size_t... but not OK if decltype(x) is Width, correct?

AIUI, you're postulating some yet-to-be-fleshed-out mechanism by which the compiler is going to look at the C++ source code of the `hash<size_t>` specialization and generate a copy substituting `Width` for `size_t` in some yet-to-be-fleshed-out manner.
(1) Ideally the compiler would know that the signature of hash<Width>::operator() should be `size_t operator()(const Width&) const`, not `Width operator()(const Width&) const`. But how can it possibly know that one of the size_ts should be substituted and not the other?
(2) What happens if the declaration of `hash<size_t>::operator()` is visible in this TU, but the definition is not visible? That seems like a problem similar to what would happen if you put a function template definition into a .cpp file. What implications does that have for usability?
(3) How does the compiler even know that I want `Width` to have a `hash` specialization?  Will it apply the same logic to say, well, `std::rotl` exists for size_t so it should also exist for `Width`? That actually seems like the kind of thing that I want strong typedefs to protect me from.
(4) Also, wait a minute, how does `Width width = 42;` even compile at all?  `42` is not a `Width`; it's an int. If implicitly converting size_t(42) to Width is forbidden, then surely it should be equally forbidden to implicitly convert int(42) to Width. So, how do you envision this facility interacting with integer literals?

#1 is by far the most important issue here, because it strikes directly at your vague "substitution" mechanism for creating new specializations of `hash`.
#3 and #4 are relatively unexplored territory and I foresee them having easy vague answers that would then have to be unpacked via further discussion. So please don't use #3 and #4 as an excuse to procrastinate on #1. #1 is important!
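
For comparison, here is roughly what I would write by hand today to get the behavior I'd want out of #1 and #4. This is only a sketch under my own assumptions: the wrapper struct `Width`, its `explicit` constructor, and the helper names `HashResult`, `result_is_size_t`, `int_converts_implicitly`, and `hash_width` are all illustrative, not part of any proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>

// Hand-written stand-in for the hypothetical `using Width = new size_t;`.
struct Width {
    std::size_t value;
    explicit Width(std::size_t v) : value(v) {}   // no implicit conversions in
};

namespace std {
    template<> struct hash<Width> {
        // The human author decides: the parameter becomes Width,
        // but the result type stays size_t.
        size_t operator()(const Width& w) const noexcept {
            return hash<size_t>{}(w.value);
        }
    };
}

using HashResult = decltype(std::hash<Width>{}(std::declval<const Width&>()));

// Issue (1): the result type is size_t, not Width.
constexpr bool result_is_size_t =
    std::is_same<HashResult, std::size_t>::value;

// Issue (4): with this wrapper, `Width width = 42;` does not compile.
constexpr bool int_converts_implicitly =
    std::is_convertible<int, Width>::value;

std::size_t hash_width(std::size_t raw) {
    return std::hash<Width>{}(Width(raw));
}
```

Every decision in that specialization (which `size_t` turns into `Width`, which one stays put) was made by a human reading the code. That is exactly the information I don't see how a mechanical substitution would recover.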

HTH,
Arthur