On Oct 28, 2020, at 9:40 AM, David Vandevoorde <daveed@edg.com> wrote:

On Oct 27, 2020, at 6:46 PM, Herb Sutter via SG7 <sg7@lists.isocpp.org> wrote:

I thought string-based and token-based approaches were already proposed and considered, and SG7 did not favor them… is that right?

Yes.

Of course, we can always revisit decisions when new data comes about. I’d also claim that those decisions were made before any of the current proposals were published, and thus before the “shape” of our anticipated solution became a bit less hazy. That in itself could be considered “new data”, I suppose.

I have heard about experience with string-based approaches in D. What I heard from production users who were not on the language’s core design team was that the general ability to generate more code at compile time was extremely useful and game-changing (which is something we all want in C++, but is not specific to providing it using strings in particular). Using strings to do it, however, frequently led them to produce write-only code that couldn’t be easily maintained and frequently had to be rewritten or discarded (at least in the experience of the people I heard from).

Also please remember IDEs and other tools: I would think it’s easier to be able to syntax-highlight, debug / step-into, refactor, etc. a fragment that looks like actual C++ grammar (i.e., treat code as code, even if it contains placeholders, which we already have today with templates), than a series of string/stream concatenations/insertions (which would feel like the weaker parts of treating code as data).

Yes, I think so too. It’s also quite a bit harder to give solid diagnostics with string-based injection (because keeping track of the “origin” of the string content is tricky and/or expensive).

Consider further the analogy with templates: If we had a time machine to reinvent templates with all the experience we have today, would we ever consider expressing them as string concatenation? I doubt it. Doing that would be unarguably more flexible and allow more things, but I think it would also be clearly less integrated and outside the language – it would be closer to using compile-time I/O to create a file and then #include-ing it (really, closer to an expanded preprocessor) than writing actual first-class generic code in the language. There were good reasons we didn’t implement templates using a preprocessor approach. This feels like a subset of the same question, or at least a related question.

Finally, it’s not only about syntax – IIUC, in Andrew’s excellent paper, both | | and |# #| are done after parsing, not before.

I’m not sure what you mean by “are done after parsing” (what does “done” refer to)?
Also, I believe that as with templates, there is more than one possible strategy to implement such constructs (e.g., re-parse vs. AST substitution).

I think asking for string support the way it seems to be described below is asking to change that model to be before parsing?

Not per se, I think.

I used to favor that, but I’ve been convinced I was probably wrong about that.

Daveed

These are good points.

On further reflection, Andrew’s current syntax stands out, is easy to get used to after the initial shock, and — most importantly to me — leaves plenty of room for further expansion as is: you can just expand the contexts in which |# #| may be considered semantically valid, as in fact I originally thought it allowed for.

(Small point though: I have carefully not considered why the unquote operator %{ } is needed, and was a little confused by the various usages — why not always assume any generated string therein is unquoted when parsed? Something for Andrew to explain/you all to discuss in your meeting.)

Re future expansion, I may at some point fork Andrew’s implementation and bring over my old semantics to allow |# #| to be semantically valid in additional places, under a flag. (And yes, those semantics are done after parsing/during constant evaluation, not during preprocessing/before parsing; i.e., the same way |# #| is processed, so they are consistent.) If so, I will share it here in case others want to consider additional string injection points beyond identifiers, now or in the future.

There are other points in Andrew’s paper to discuss, and that should not be held up by this issue, so I will leave you all to it, thanks!

Dave