Re: [std-proposals] Supporting f-strings in C++

From: Hadriel Kaplan <hkaplan_at_[hidden]>
Date: Sun, 15 Oct 2023 03:51:41 +0000
BTW, there _is_ an alternative that might be a good compromise?:

1. Have the f-string convert the code to use a new C++ template type instead of a std::format() function call - let's call that type a "format_params".

2. The format_params would essentially be a `tuple<format_string<Args...>, Args...>`, holding the split-apart discrete format-string and arguments.

3. Give format_params an implicit conversion operator to `std::string` that internally invokes `std::format()` during that.

  - That way it will be a `std::format()` invocation by default in most use-cases.

  - But one could also write new functions or overloads to accept a `std::format_params<>` type and extract the components for whatever purpose.

4. Define a new `std::make_format_params(T&&...)` template function to create this new `std::format_params<>` type, and that `make_format_params()` is what an `F"{}"` f-string would actually be converted to.

In other words, this:

    std::cout << F"{prefix}_{name}: got {calculate()} for {bits:#06x}";

Would be converted to this:

    std::cout << ::std::make_format_params("{}_{}: got {} for {:#06x}", prefix, name, calculate(), bits);

The rest of it would be new standard library code.

And it would still work without needing to change what `operator<<(ostream&)` does, or any other function that already accepts a `std::string`.

However, the downside: doing all this would make f-strings no longer work for functions that *only* accept a `std::string_view`, since that would require double conversion.
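
The double-conversion limit can be checked with a small stand-in type. This is only a sketch: `params_like`, `takes_view`, and `demo_view` are made-up names, and the stand-in returns a fixed string rather than formatting anything:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical stand-in for the proposed type: its only conversion is a
// user-defined one to std::string.
struct params_like {
    operator std::string() const { return "formatted"; }
};

// One user-defined conversion is allowed implicitly.
static_assert(std::is_convertible_v<params_like, std::string>);

// Reaching std::string_view would need two user-defined conversions
// (params_like -> std::string -> std::string_view), and an implicit
// conversion sequence may contain at most one.
static_assert(!std::is_convertible_v<params_like, std::string_view>);

std::size_t takes_view(std::string_view sv) { return sv.size(); }

std::size_t demo_view() {
    params_like p;
    // takes_view(p);       // would not compile: needs two conversions
    std::string s = p;      // first conversion, spelled out by the caller
    return takes_view(s);   // second: std::string -> std::string_view
}
```

So callers of `string_view`-only functions would have to materialize a `std::string` themselves first.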

-hadriel



> From: Barry Revzin <barry.revzin@gmail.com>
> Date: Saturday, October 14, 2023 at 9:00 PM
> To: Hadriel Kaplan <hkaplan@juniper.net>
> Cc: "std-proposals@lists.isocpp.org" <std-proposals@lists.isocpp.org>, "vittorio.romeo@outlook.com" <vittorio.romeo@outlook.com>
> Subject: Re: [std-proposals] Supporting f-strings in C++
>
>
> On Sat, Oct 14, 2023 at 3:55 PM Hadriel Kaplan <mailto:hkaplan@juniper.net> wrote:
> > From: Barry Revzin <mailto:barry.revzin@gmail.com>
>
> > I think this approach is kind of a non-starter. We can't have f"x={x}" just mean std::format("x={}", x) for an important reason.
>
> > This means that std::print(f"x={x}") doesn't and can't work - which is the sort of thing that seems important to support - having to write std::print("{}", f"x={x}") is... less than ideal.
>
> > More generally, the issue is that there are a lot of uses of the format API that are not literally std::format(). In the standard we have std::print(), std::format_to(), etc. It would be really nice if those worked with f-strings as well.
>
> I explicitly covered that topic in section 5 of the draft...
>
> We only need to add another overload to std::print(), std::println(), and format_to() - one that accepts a std::string_view or std::string - to make these work:
>
> std::print(F"hello C++{20 + 6}");
> std::println(F"hello C++{13 * 2}");
> std::format_to(std::back_inserter(buffer), F"hello C++{26}");
>
> You can't do that though, since you can already call std::print, etc., with just a string literal - and the meaning of std::print with just a format string is distinct from the meaning of what you're suggesting calling std::print with a std::string would be. For instance, std::print("X{{}}") is a valid call today - which prints "X{}" - because we always interpret the first argument as a format string. If we add this overload, it would suddenly print "X{{}}", because we would interpret the argument as just a string.
>
> Incidentally, Rust tried to do this for a while where panic!("{}", 1) would panic with the message "1" but panic!("{}") would not be interpreted as a format string due to the lack of arguments - the latter would have panicked with the message "{}". But Rust 2021 fixed that by making panic!("{}") ill-formed (since it's a format string expecting one argument, which isn't provided). I'm not saying we should avoid making this change from one to the other simply because Rust just made the change in the opposite direction, but it is a data point.
>
>
> The first two functions might not be as efficient as using a separate format_string and args, because an intermediate std::string is created which might be avoidable otherwise.
>
> But they're already writing to stdout, so an extra temporary string seems rather meaningless? At least I think it's reasonable tradeoff for convenience.
>
> This isn't a good argument though, because while std::print does write to stdout, not all uses of the format API do. format_to() doesn't. Not all uses of the format API are even formatting synchronously - some simply do type-checking in the front end and serialize their arguments to be formatted later. For such uses, an extra temporary string is a complete non-starter.
>
>
> Likewise, this should work without spdlog being changed at all:
>
> spdlog::info(f"x={x}");
>
> ...because *anything* that accepts a std::string rvalue should work - because ultimately the f-string resolves to the std::string returned by std::format().
>
> TIL apparently spdlog makes the mistake that I describe above, where spdlog::info("{}") actually works and logs "{}" (instead of being ill-formed) while spdlog::info("{}", 1) logs "1". I consider that a design mistake, and definitely not one the standard library should adopt.
>
> It's not even documented behavior, spdlog just says it uses fmt. But fmt::print("{}") isn't valid.

Received on 2023-10-15 03:51:49