
> From: Marcin Jaczewski <marcinjaczewski86@gmail.com>

> Date: Friday, October 20, 2023 at 1:18 PM

> 

> > I'd argue that would be less confusing for programmers, faster for preprocessing stage, and less error-prone than trying to convert formats using macros.

 

> Is preprocessor stage a bottleneck in compilation aside from

> `#include`? Overall most cost will be similar to `_X` as needed to do

> the same work.

 

I was thinking more about all the little string literals, one per substring, that would be created for each f/x-string: first produced by the preprocessor, and then concatenated right afterwards.

 

But of course I have no data to back up any claim of performance differences, since this proposal is not implemented. I do know that some people complain about preprocessing performance in general in public forums, but I don't know the details of what they're doing. They could be very wrong. It just makes me cautious of preprocessing perf, is all. But maybe I'm overthinking it.

 

Anecdotally: in our own build times the preprocessing time is not trivial, but I do _not_ know which parts of its internals are consuming time vs. others. Regardless, it's still all small potatoes relative to everything else (especially template instantiation) - but at my work we just happen to use distcc in non-pump-mode, so preprocessing is more noticeable because it happens locally vs. everything else which happens in parallel on distcc server farms.

 

 

> And even if C would add new print family then not everybody could use

> this new C function like freestanding,

> or already compiled third party libraries that will not be updated but

> use `%` like formats.

 

I get it, but it just feels wrong to do it that way to me, personally.

 

Like even if we wanted to support printf's format (or WG14 did), I would think it would be better to just create a new builtin operator to do that, rather than use a combination of a builtin and macro tricks to solve it.

 

It would generate better error messages, be less brittle, and be more efficient (even if efficiency doesn't matter). No?

 

 

---

 

For example, suppose we created a `P""` p-string-literal type, which is like an X"" except it converts to a %-based format for use in printf() and such.

 

And we specify the preprocessor takes that p-string-literal and invokes a _Pstring() macro with it. (shown below)

 

And we create a _XtoPstring() operator that is variadic, and which converts its first argument from a format-string string-literal to %-based format.

 

And "_Xtract()" is the name of the builtin we were previously calling "_Xstring()", that takes the interpolation-format string and extracts it into separate tokens.
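As a rough illustration of what `_Xtract()` is meant to produce, here is a runtime sketch in ordinary C++. It is only an emulation: the real thing would be a compile-time builtin producing tokens, and the names `Extracted` and `xtract` are hypothetical, made up for this example. It also assumes there are no `{{`/`}}` escapes or nested braces, and that the first `:` inside a field starts the format spec.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical runtime stand-in for the proposed _Xtract builtin.
// It splits an interpolation string into a format string with the
// expression text removed, plus the list of extracted expression texts.
struct Extracted {
    std::string format;             // e.g. "name={:s} value={:d}"
    std::vector<std::string> args;  // e.g. {"name", "i"}
};

Extracted xtract(const std::string& in) {
    Extracted out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] != '{') {          // ordinary character: copy through
            out.format += in[i++];
            continue;
        }
        std::size_t close = in.find('}', i);
        if (close == std::string::npos) {  // unterminated field: copy the rest
            out.format.append(in, i, std::string::npos);
            break;
        }
        // Text between the braces, e.g. "name:s".
        std::string field = in.substr(i + 1, close - i - 1);
        std::size_t colon = field.find(':');
        // Everything before the first ':' is the expression text.
        out.args.push_back(field.substr(0, colon));
        // Keep only the format spec (if any) inside the braces.
        out.format += '{';
        if (colon != std::string::npos) out.format += field.substr(colon);
        out.format += '}';
        i = close + 1;
    }
    return out;
}
```

For the input `name={name:s} value={i:d}` this yields the format `name={:s} value={:d}` and the expression texts `name` and `i`.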

 

 

So we’d do this:

 

    #define _Pstring(...) _XtoPstring( _Xtract( _Concat( __VA_ARGS__ ) ) )

 

Usage:

 

    printf(P"name={name:s} value={i:d}");

 

 

And the processing steps:

 

1. => printf( _XtoPstring( _Xtract( _Concat( P"name={name:s} value={i:d}" ) ) ) )

2. => printf( _XtoPstring( _Xtract( P"name={name:s} value={i:d}" ) ) )

3. => printf( _XtoPstring( "name={:s} value={:d}", name, i ) )

4. => printf( "name=%s value=%d", name, i )

 

5. => printf("name=%s value=%d", name, i);
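The `{:s}` to `%s` rewrite that `_XtoPstring()` performs in the steps above can be emulated at runtime, just to show how mechanical it is. This is a sketch under assumptions, not the proposed operator: `to_percent_format` is a hypothetical name, each field is assumed to already have the form `{:spec}` with a printf-compatible spec, and escaping a literal `%` as `%%` is my own assumption about what the operator would need to do.

```cpp
#include <cstddef>
#include <string>

// Hypothetical runtime stand-in for the proposed _XtoPstring operator.
// It converts "{:spec}" replacement fields into printf-style "%spec".
std::string to_percent_format(const std::string& fmt) {
    std::string out;
    std::size_t i = 0;
    while (i < fmt.size()) {
        char c = fmt[i];
        if (c == '%') {              // assumption: literal '%' becomes "%%"
            out += "%%";
            ++i;
            continue;
        }
        if (c != '{') {              // ordinary character: copy through
            out += c;
            ++i;
            continue;
        }
        std::size_t close = fmt.find('}', i);
        if (close == std::string::npos) {  // unterminated field: treat '{' as text
            out += c;
            ++i;
            continue;
        }
        // Text between the braces, e.g. ":s"; drop the leading ':'.
        std::string spec = fmt.substr(i + 1, close - i - 1);
        if (!spec.empty() && spec[0] == ':') spec.erase(0, 1);
        out += '%';
        out += spec;
        i = close + 1;
    }
    return out;
}
```

Calling `to_percent_format("name={:s} value={:d}")` gives `name=%s value=%d`, which matches the string in step 4. A real builtin could also check each spec against the argument's type and diagnose mismatches, which is where the better error messages would come from.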

 

 

Meanwhile these would be the ones for x-string and f-string in C++:

 

    #define _Xstring(...) _Xtract( _Concat( __VA_ARGS__ ) )

    #define _Fstring(...) ::std::format( _Xstring( __VA_ARGS__ ) )

 

 

-hadriel