Date: Fri, 9 Apr 2021 12:04:10 -0400
First, apologies: it escaped my notice you folks were having a meeting yesterday; I should not have brought up this unrelated issue so shortly before the meeting, but thank you Andrew for addressing it in detail despite that. Hopefully it was at least useful to that meeting by demonstrating the extensibility of [::] and ^.
> On Apr 8, 2021, at 8:38 AM, Andrew Sutton <asutton_at_[hidden]> wrote:
>
> I wouldn’t give up so easily; the same issue Daveed raises (with auto-deduced return types) applies for references, but we have explicitly defined the semantics to allow the user to distinguish:
> ```
> int &f() {…}
> auto j = f(); //j is an int
> auto &jref = f(); //jref is an int&
> ```
>
> The same rules could conceivably be applied by simply substituting ^ for &:
> ```
> class A {};
> consteval class^ getA() { return [:^A:]; } //returns a "reference to a class"
> consteval auto^ getA2() { return getA(); } //okay
> consteval auto error() { return getA(); } //error
> class Foo : public getA2() {};
> ```
>
> The analogy between reflection/splicing and pointers/dereferencing is extremely strong, such that the possibility of a "reference"-analog which handles the dereferencing automatically is still possible (assuming expressions can be parsed in arbitrary contexts, which I agree would seem to be possible.)
>
> Sure, and in a language where types and values can be used in the same expression language, this is an attractive idea. I'd planned on doing almost exactly this as an extension of a toy compiler I've been (very slowly) writing that implements the C++-like language in Elements of Programming.
>
> That said, I have serious doubts that this feature can be easily added to C++. My company has some experience writing new languages that could also do this, but build on top of the Clang C++ AST. It is... not pretty, and that's just the semantic side of things. I think some ideas in this space are going to prove to be fundamentally incompatible with C++ syntax.
Agree. Forcing the user to use [::] to make sure the user and the compiler are on the same page seems a small price to pay, and might well be necessary in certain contexts.
>
>
> *But* let’s set the reference issue aside. This whole matter raises the much more important question: just why *aren’t* reflections strongly-typed? Why only meta::info? Why is the user not allowed to specify further type information? (Note this is *not* the same as whether to use object-oriented reflection — discussed in P1240 pp5-6, totally different issue, those arguments do not apply here.)
>
> I've been thinking about this for a long time, but haven't had the opportunity to finish writing the paper. Mostly, because I get stuck on how a strongly typed system should work. Sometimes, it's obvious.
>
> ^int // a type entity reflection
> ^std // a namespace reflection
> ^0 // an expression entity? int expression? value reflection?
>
> int x;
> ^x // expression or declaration? what kind of either?
Tentatively, I think it should be an int-typed expression, i.e. `int^` per your desired syntax, since IIUC we cannot declare another entity with that splice (e.g. `int [:^x:] = 3;`), but only use it in an expression (e.g. `int j = [:^x:];`).
>
> void f();
> void f(int);
>
> ^f() // an expression
> ^f // yields... what?
Much trickier.
Secondary problem for function reflection types, independent of the overload problem: does ^ type-declaration syntax conflict with block type syntax (https://clang.llvm.org/docs/BlockLanguageSpec.html), e.g.:
```
void g();
void (^grefl)() = ^g; //grefl: Reflected type, or block type?
```
But maybe the overload problem and the block-conflict problem can be resolved by introducing a built-in `meta::function<OverloadTypes…>` template, a la:
```
meta::function<void(), void(int)> freflA = ^f;
meta::function<void()> freflB = freflA;
meta::function<void(int)> freflC = ^f;
```
Or this could be done with symbols somehow, e.g.
```
<void(), void(int)>^ freflA = ^f;
<void()>^ freflB = freflA;
<void(int)>^ freflC = ^f;
```
This would definitely require further thought; I’m sure there are considerations I’m missing.
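To make the intended conversion rule a bit more concrete, here is a rough model in today's C++, with no reflection involved. The names `function_refl` and `contains_v` are made up for illustration, and the class only models the "narrow to a subset of the overloads" conversion used for `freflB` above; it says nothing about how `^f` itself would produce such a value.

```cpp
#include <type_traits>

// True when T is one of Us... (hypothetical helper).
template <typename T, typename... Us>
inline constexpr bool contains_v = (std::is_same_v<T, Us> || ...);

// Hypothetical stand-in for the proposed meta::function<OverloadTypes...>.
// It carries no data; only the conversion rule is modelled.
template <typename... Sigs>
struct function_refl {
    constexpr function_refl() = default;

    // Narrowing conversion: accepted only when every signature of the
    // target (Sigs...) also appears in the source overload set (Others...).
    template <typename... Others,
              std::enable_if_t<(contains_v<Sigs, Others...> && ...), int> = 0>
    constexpr function_refl(const function_refl<Others...>&) {}
};

using both_overloads = function_refl<void(), void(int)>;

// Dropping overloads is allowed...
static_assert(std::is_convertible_v<both_overloads, function_refl<void()>>);
static_assert(std::is_convertible_v<both_overloads, function_refl<void(int)>>);
// ...but inventing one that was never in the set is not.
static_assert(!std::is_convertible_v<function_refl<void()>, function_refl<void(int)>>);
```

Whether a real design would want this conversion to be implicit, or would want order within the overload list to matter, is exactly the kind of consideration that needs more thought.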
>
> And then there's also the issue that we also reflect things that are not directly nameable/addressable, like base specifiers and partial template specializations.
Bases would probably be easy, since there seems to be no real semantic analysis to be done; a type named `meta::basespec` would suffice, or some symbolic equivalent.
As for partial specializations, the easiest solution is to say a) they should also be of `template ^` type, and b) they should not be spliceable, precisely because they are not directly addressable. E.g.:
```
template<typename T> class A {};
template<typename U> class A<vector<U>> {};
template<> class A<int> {};
template ^Arefl = ^A;
typename ^Aintrefl = get_int_explicit_spec(^A);
template ^Avecrefl = get_vec_partial_spec(^A);
[:Arefl:]<float> afloat;
[:Arefl:]<vector<float>> afloatvec;
[:Avecrefl:]<float> afloatvecB; //ERROR: can’t splice a partial specialization.
[:Arefl:]<int> aint;
[:Aintrefl:] aintB;
```
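For comparison, this mirrors how the same templates behave in standard C++ today: the specializations can only be reached through the primary template's name plus arguments, never named on their own. A small sketch (the `which` member is added purely for illustration, and `struct` is used so it is public):

```cpp
#include <vector>

template <typename T> struct A { static constexpr int which = 0; };                  // primary
template <typename U> struct A<std::vector<U>> { static constexpr int which = 1; };  // partial
template <> struct A<int> { static constexpr int which = 2; };                       // explicit

// All three are selected through the single name A; there is no separate
// name by which the partial specialization could be referred to.
static_assert(A<float>::which == 0);
static_assert(A<std::vector<float>>::which == 1);
static_assert(A<int>::which == 2);
```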
> We can also differentiate between different kinds of expressions, declarations, etc. There's a tension between how reflections interact with the type/kind system and what kinds of information we can actually reflect. Exposing the extent of reflectable data to the type system seems like a bad idea.
I am not convinced it’s a bad idea — I think we should give the compiler the ability to perform as many checks on templates as possible, i.e. to minimize dependencies wherever there are not really dependencies (see the cppx.godbolt example linked below). Any information beyond these needs is a bad idea, though. E.g. with bases, the compiler doesn’t seem to need any specific information about the base class, so none should be provided via the reflection type system.
> Having a minimum set of types/kinds is viable, but will ultimately fail to satisfy some people's expectations for type safety.
Almost certainly concepts would also need to play a role as well, providing the user with whatever additional type safety they require — you obviously know much better than I how they might be incorporated.
> Is there going to be a Top reflection (basically meta::info)? We know there are concrete use cases for that.
Yes, but I’m thinking it should be called `meta::any` instead of `meta::info`, see below.
>
>
> By adopting a universal meta::info, we are knee-capping the compiler, so that it cannot perform basic type checking on templates prior to instantiation:
>
> That's just wrong. Every use of a dependent reflection is checked at parse time. It has to be---that's how C++ works. We require the same annotations for dependent reflections as we do for dependent member names because they are effectively the same problem.
>
> If we did have a more richly typed reflection system that included a Top reflection, you'd need those annotations anyway.
I think we’re on the same page about the problem, given that you’ve clearly been thinking about this already, but for clarity: "type checking" was not quite the right word; rather it is an even more basic form of semantic analysis. Here’s a clarified version of the old example, in which the compiler is not allowed to verify in the template that the thing following `typename ` is in fact a type, even though all the information needed to make that assessment is present, excepting only the strong typing needed to convey that information to where it is needed.
https://cppx.godbolt.org/z/5fWxhs33d
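As a point of comparison in standard C++ (this is not the cppx example linked above, just the dependent-member-name rule the quoted text refers to): when the container type is a template parameter, the compiler cannot know at parse time that `value_type` names a type, so the `typename` annotation carries that information; when the type is concrete, nothing is dependent and no annotation is needed. The function names are made up for illustration, and newer standards relax the annotation requirement in some positions.

```cpp
#include <type_traits>
#include <vector>

// Dependent case: Container is unknown when the template is parsed, so the
// user asserts via `typename` that Container::value_type is a type.
template <typename Container>
typename Container::value_type first_dependent(const Container& c) {
    return *c.begin();
}

// Non-dependent case: std::vector<int> is fully known, so the compiler can
// verify on its own that value_type is a type.
inline std::vector<int>::value_type first_concrete(const std::vector<int>& c) {
    return *c.begin();
}

static_assert(std::is_same_v<decltype(first_dependent(std::vector<int>{})), int>);
static_assert(std::is_same_v<decltype(first_concrete(std::vector<int>{})), int>);
```

My point is that with only `meta::info`, more code ends up in the first situation than strictly needs to be.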
>
>
> That seems like a big issue. Among other things, that type checking has to be done upon each instantiation probably affects efficiency, which is the whole reason for `meta::info` in the first place.
>
> Type checking of dependent expressions is already deferred to instantiation time. Your suggestion that "this is the whole reason for meta::info" is wrong. The real reasons are:
>
> 1. There is no computational overhead using scalars during constant expression evaluations
> 2. They can be passed as template arguments without synthesizing template parameter objects
>
Correct, efficiency is the whole rationale for meta::info (not type checking - ambiguous wording on my part). Re type checking of dependent expressions, see above example - the issue is that we are creating more dependencies than there really are, by not having stronger typing.
>
> I think adopting syntax analogous to pointers (and, arguably, references) should be considered. First step along these lines: change `meta::info` to `[:auto:]`, which is in turn analogous to `auto *` : we know it’s a reflection/pointer, but we won’t say any more about what kind of thing it reflects/points to. Same semantics as meta::info has currently.
>
> Just as a starter, spelling meta::info as [:auto:] is a non-starter for me. It looks too much like a splice.
Good point, and what is more, whatever suitability `[:auto:]` has as a return type, it would look weird as, say, the type of a parameter.
And the confusion with splicing would get even worse if we ever needed to put identifiers between those, e.g. `[:plain_struct:] my_plain_struct_refl;`.
So I agree using `[:T:]` to specify a type which is a reflection of `T` is not viable.
> The design I had in mind for my EoP language would allow this:
>
> typename^ t = ^int; // types
> auto^ e = ^0; // expressions
> int^ z = ^0; // integer expressions
> template^ x = std::vector;
> namespace^ n = std;
This is the optimal syntax. It is not perfect, because it breaks the analogy between ^ and & vs. [::] and *, but given
- that reference style (i.e. splicing without [::]) would almost certainly be problematic and therefore will probably not ever make it into the language,
- the problems you have identified with [::] type-declaration syntax, and
- that ^ is already underused relative to [::],
this is *definitely* the right path, syntactically.
But as mentioned above, the syntax might not generalize: the types of reflections of overloaded functions and bases might be better expressed as `meta::function<>` or `meta::base`, barring a more creative solution.
>
> I haven't decided what I want for Top/meta::info. void^ would be kinda nice, except that you should also be able to reflect void expressions. I also haven't decided what splicing looks like.
I agree we need a "Top" reflection type, e.g. for use in containers that may hold multiple kinds.
But maybe the best solution is just to rename `meta::info` to `meta::any`: that’s basically what it is under the hood, a union of the various kinds.
Users should be encouraged to use `auto` or `auto ^` instead wherever possible, but can always fall back on `meta::any` when needed, or when quickly prototyping.
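A rough model of what I mean, in standard C++ with `std::variant` standing in for the "union of the various kinds". All the names here (`type_refl`, `namespace_refl`, `template_refl`, `any_refl`, and the functions) are hypothetical, and the `int id` payload is just a placeholder for whatever handle the compiler would actually store:

```cpp
#include <cstddef>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical kind-specific reflection handles.
struct type_refl      { int id; };
struct namespace_refl { int id; };
struct template_refl  { int id; };

// Hypothetical "Top" type: one of the kinds above, decided at run time
// (or constant-evaluation time) rather than by the type system.
using any_refl = std::variant<type_refl, namespace_refl, template_refl>;

// Strongly typed use: passing the wrong kind is rejected at the call site.
constexpr int splice_as_type(type_refl r) { return r.id; }

// Weakly typed use: the kind must be inspected explicitly.
constexpr bool reflects_type(const any_refl& r) {
    return std::holds_alternative<type_refl>(r);
}

// The container use case: a mixed bag of reflections.
inline std::size_t count_types(const std::vector<any_refl>& rs) {
    std::size_t n = 0;
    for (const any_refl& r : rs) {
        if (reflects_type(r)) ++n;
    }
    return n;
}

static_assert(reflects_type(any_refl{type_refl{1}}));
static_assert(!reflects_type(any_refl{namespace_refl{2}}));
static_assert(!std::is_invocable_v<decltype(&splice_as_type), namespace_refl>);
```

The strongly typed handles convert to the Top type implicitly, so falling back to it stays cheap for the user, while code that names a specific kind gets checked without having to wait for instantiation.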
>
>
> I suppose this needs a paper. If anyone else has thoughts or has done work along these lines, please weigh in.
>
> I would welcome a paper considering these ideas, but I'd also welcome not writing it :)
>
Darn, I would also welcome not writing it :). You are probably further along than me on this, and can better conceive of how to incorporate concepts, so if the ideas crystallize and the motivation emerges, by all means write the paper. I might brood on it awhile as well — certainly no immediate paper forthcoming from me.
Dave
Received on 2021-04-09 11:04:16