std-proposals

Re: [std-proposals] Low-level float parsing functions

From: Jan Schultke <janschultke_at_[hidden]>
Date: Mon, 28 Apr 2025 19:48:54 +0200

> OK, so what you want is to create a generic parsing framework that
> doesn't really do much on its own, but can be used to *build* number
> parsers that could handle arbitrary formats of numbers. That's
> interesting.
>

You're making this out to be way more complicated than it actually is.
std::from_chars works with "1.0", and we may as well pass "1" and "0"
separately so that we're not restricted to using a dot as a separator. The
use case can be as simple as getting numbers that use a comma instead,
perhaps because you're making software localized for Germany. It doesn't
take a whole framework to want a slightly different number format.
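
To make the comma use case concrete, here is a minimal sketch of the kind of
workaround that is available today. The helper name parse_with_separator is
made up for illustration and is not part of any proposal; the only standard
facility it relies on is the floating-point overload of std::from_chars.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper (illustration only): parses text such as "3,14" by
// copying it, replacing the first `separator` with '.', and handing the copy
// to std::from_chars.
std::optional<double> parse_with_separator(std::string_view text, char separator) {
    // Reject input that already contains '.', otherwise "1.5" would be
    // accepted even though the caller asked for a different separator.
    if (separator != '.' && text.find('.') != std::string_view::npos) {
        return std::nullopt;
    }

    std::string buffer(text);  // extra copy, needed only to swap one character
    if (const auto pos = buffer.find(separator); pos != std::string::npos) {
        buffer[pos] = '.';
    }

    double value = 0.0;
    const char* const first = buffer.data();
    const char* const last = first + buffer.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);

    // Require that the whole input was consumed, so trailing garbage fails.
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

With this sketch, parse_with_separator("1,5", ',') yields 1.5, but only at the
cost of copying and rewriting the input. Passing the digit sequences
separately would make that detour unnecessary.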

> But why does this need to be in the *standard*?


Because the job of the standard library is to provide low-level,
hard-to-implement functionality that others can build software on. Only
letting you parse a very specific floating-point format leaves room for a
lower-level function, and that lower-level function exists in all standard
libraries anyway so that std::from_chars can be implemented. Don't believe
me? The first thing that libstdc++ does is a bunch of parsing to split up
the string, obtain the mantissa, etc.:
https://github.com/gcc-mirror/gcc/blob/4c40e3d7b9152f40e5a3d35060b6822ddc743624/libstdc%2B%2B-v3/src/c%2B%2B17/fast_float/fast_float.h#L2974
This is the easy part that the user could have done themselves. The hard
part is the load of numerics to turn that into a floating-point number.
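
For illustration, the "easy part" could look roughly like the sketch below.
This is not the libstdc++ or fast_float code linked above; decimal_parts and
split_decimal are hypothetical names, and the sketch assumes at most 19 digits
in the mantissa and at most 4 digits in the exponent so that nothing overflows.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical result type. The represented value is
// (negative ? -1 : 1) * mantissa * 10^exponent10.
struct decimal_parts {
    bool negative = false;
    std::uint64_t mantissa = 0;
    int exponent10 = 0;
};

// Splits text such as "12.5e3" into sign, decimal mantissa and decimal
// exponent. Only the textual work is done here; no floating-point math.
std::optional<decimal_parts> split_decimal(std::string_view s, char separator = '.') {
    const auto is_digit = [](char c) { return c >= '0' && c <= '9'; };
    decimal_parts out;
    std::size_t i = 0;
    std::size_t digits = 0;

    if (i < s.size() && s[i] == '-') {
        out.negative = true;
        ++i;
    }

    // Appends a run of digits to the mantissa. Each fractional digit lowers
    // the decimal exponent by one. Fails if more than 19 digits are seen.
    const auto take_digits = [&](bool fractional) {
        for (; i < s.size() && is_digit(s[i]); ++i, ++digits) {
            if (digits == 19) return false;
            out.mantissa = out.mantissa * 10 + static_cast<std::uint64_t>(s[i] - '0');
            if (fractional) --out.exponent10;
        }
        return true;
    };

    if (!take_digits(false)) return std::nullopt;
    if (i < s.size() && s[i] == separator) {
        ++i;
        if (!take_digits(true)) return std::nullopt;
    }
    if (digits == 0) return std::nullopt;  // no digits at all

    if (i < s.size() && (s[i] == 'e' || s[i] == 'E')) {
        ++i;
        bool exp_negative = false;
        if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
            exp_negative = (s[i] == '-');
            ++i;
        }
        int exp = 0;
        std::size_t exp_digits = 0;
        for (; i < s.size() && is_digit(s[i]); ++i, ++exp_digits) {
            if (exp_digits == 4) return std::nullopt;  // keep the exponent small
            exp = exp * 10 + (s[i] - '0');
        }
        if (exp_digits == 0) return std::nullopt;  // "1e" is not a number
        out.exponent10 += exp_negative ? -exp : exp;
    }

    if (i != s.size()) return std::nullopt;  // trailing characters
    return out;
}
```

Under these assumptions, "12.5e3" splits into mantissa 125 and exponent10 2.
What the sketch deliberately leaves out is the step that matters: turning
that pair into a correctly rounded float or double.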

If your standard library leaves room for lower-level functions for no good
reason, the design is bad. Sometimes there are good reasons, like the
standard library needing to be platform-independent, and an OS-specific
function exposes more functionality than the standard. However, this is not
such a case.

Also remember that std::from_chars exists precisely because std::istream,
std::stof, etc. all left room for a lower-level function.
std::from_chars just settled for a half measure.

> The primary compelling motivation for `to/from_chars` is this: there
> is a lingua-franca number format, a de-facto standard for numeric
> interchange between programs. This standard is supported by most
> textual interchange formats: JSON, YAML, CSV (to the extent that this
> can be considered a format), etc. Almost all of these formats use the
> lingua-franca encoding for textual interchange (UTF-8), and they all
> basically agree on how numbers should be formatted. Lots of programs
> generate such numbers, and lots of programs have to read such numbers.
>
> Because of the substantial widespread use of textual numbers in such
> formats, there is an obvious benefit to having a tool that can rapidly
> handle numbers in those formats.
>
> Once you start talking about an omni-number parser, one which is
> basically a set of parsing tools that you compose to parse any format,
> the justification for standardization becomes a lot less clear cut.
> Such a tool can be very useful for particular users. But the question
> is why it needs to be in the C++ standard. Is it useful enough to be
> worth adding, or can people just use the library?


It seems like you're contradicting what you've previously said. I am not
proposing an "omni-parser"; I am proposing slightly lower-level access to
floating-point parsing than std::from_chars, and that functionality already
exists.

Arguing that "1.0E3" should be the only format parsable by the C++ standard
library is just as nonsensical as arguing that std::from_chars for integers
should only accept "0xff" as a format for hex integers, rather than taking
a digit sequence and an int base. I could go through a list of places where
the "0x" format is used, just like you've done with JSON and YAML, but it
wouldn't make the argument any better.
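
For reference, this is how the integer overload behaves today. The wrapper
name parse_int is made up for this sketch; the std::from_chars call with a
base argument is the existing standard interface.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical convenience wrapper around the integer overload of
// std::from_chars. The caller supplies a bare digit sequence and a base.
std::optional<int> parse_int(std::string_view digits, int base) {
    int value = 0;
    const char* const first = digits.data();
    const char* const last = first + digits.size();
    const auto [ptr, ec] = std::from_chars(first, last, value, base);

    // Require that every character was consumed.
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

Here parse_int("ff", 16) yields 255, while parse_int("0xff", 16) fails:
std::from_chars does not recognize the "0x" prefix, so it stops at the 'x'.
Handling the prefix is left to the caller, which is the same division of
labor I am asking for with floating-point numbers.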

Received on 2025-04-28 17:49:08