Date: Mon, 28 Apr 2025 15:41:35 +0200
Hey folks,
I am not a big fan of the interface of std::from_chars for floating-point
numbers we have currently. The issue is that it conflates the "text-based
parsing" of floating-point numbers with the numeric conversion. The latter
part is vastly more complicated, and it would be nice if we had more direct
access to it.
To clarify, it's possible that due to localization, we have digit
separators, radix points other than a period (such as a comma), etc. It's
possible that we use "p", "P", "d", "D", "+10^" or other exponent separators
for decimal numbers instead of the "e"/"E" that std::from_chars accepts.
There are also use cases where we know
that no exponent part exists, and it would be a waste of time to look for
it during parsing. Some of these problems are covered by the
std::chars_format options, but the world of floating-point formats is more
complex than what fits into four enum constants.
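To make the limitation concrete, here is a minimal sketch of what happens when std::from_chars meets a comma radix point or a "d" exponent: it converts the prefix it understands and stops. The helper name consumed_by_from_chars is made up for illustration, and the sketch assumes a standard library that implements the floating-point overloads of std::from_chars.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper: reports how many characters std::from_chars
// consumed from `text` (0 on error) and stores the converted value in `out`.
inline std::size_t consumed_by_from_chars(std::string_view text, double& out) {
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, out);
    return ec == std::errc{} ? static_cast<std::size_t>(ptr - first) : 0;
}
```

With this helper, "1,5" is consumed only up to the comma and "1.5d3" only up to the "d", so the caller has to rewrite the text before conversion or do the rest of the work by hand.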
A possible solution is to introduce a lower-level set of std::parse_float
functions, with an interface like:
template<floating_point F>
struct parse_float_result {
    F result;
    errc ec;
};
// decimal float, only integer part
template<floating_point F> parse_float_result<F>
parse_float_integer(string_view integer, int exponent = 0);
template<floating_point F> parse_float_result<F>
parse_float_integer(string_view integer, string_view exponent);
// decimal float, only fractional part
template<floating_point F> parse_float_result<F>
parse_float_fraction(string_view fraction, int exponent = 0);
template<floating_point F> parse_float_result<F>
parse_float_fraction(string_view fraction, string_view exponent);
// decimal float, integer and fractional part
template<floating_point F> parse_float_result<F>
parse_float(string_view integer, string_view fraction, int exponent = 0);
template<floating_point F> parse_float_result<F>
parse_float(string_view integer, string_view fraction, string_view exponent);
// hex float, only integer part
template<floating_point F> parse_float_result<F>
parse_float_hex_integer(string_view integer, int exponent = 0);
template<floating_point F> parse_float_result<F>
parse_float_hex_integer(string_view integer, string_view exponent);
// hex float, only fractional part
template<floating_point F> parse_float_result<F>
parse_float_hex_fraction(string_view fraction, int exponent = 0);
template<floating_point F> parse_float_result<F>
parse_float_hex_fraction(string_view fraction, string_view exponent);
// hex float, integer and fractional part
template<floating_point F> parse_float_result<F>
parse_float_hex(string_view integer, string_view fraction, int exponent = 0);
template<floating_point F> parse_float_result<F>
parse_float_hex(string_view integer, string_view fraction, string_view exponent);
The implementation effort is relatively low. The same underlying
floating-point parser can be used as for std::from_chars; we're just
bypassing the initial steps of looking for the radix point, exponent
separator, etc. and jumping straight into the computational part.
Note that this may have benefits even if we could have used
std::from_chars. Floating-point parsing is often preceded by
tokenization, possibly with regex matching, and we get the subdivision into
integer/fraction/exponent "for free" as a by-product.
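As a small illustration of getting the parts "for free", here is a sketch of a regex tokenizer whose capture groups are already the integer, fraction, and exponent. The format (comma radix point, "d"/"D" exponent separator) and the names float_parts and split_float are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <regex>
#include <string_view>

// Hypothetical result type: views into the original text, empty if a part is absent.
struct float_parts {
    std::string_view integer, fraction, exponent;
};

// Splits text such as "12,5d-3" into its three parts; nullopt if it does not match.
inline std::optional<float_parts> split_float(std::string_view text) {
    // Assumed format: comma as radix point, 'd' or 'D' as exponent separator.
    static const std::regex pattern(
        R"(([+-]?[0-9]+)(?:,([0-9]+))?(?:[dD]([+-]?[0-9]+))?)");
    std::cmatch m;
    if (!std::regex_match(text.data(), text.data() + text.size(), m, pattern))
        return std::nullopt;
    // Convert a capture group into a view; unmatched groups become empty views.
    auto view = [](const std::csub_match& s) {
        return s.matched
                   ? std::string_view(s.first, static_cast<std::size_t>(s.length()))
                   : std::string_view{};
    };
    return float_parts{view(m[1]), view(m[2]), view(m[3])};
}
```

The three views could be handed straight to a parse_float-style function, with no need to rebuild a string in the format std::from_chars expects.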
Received on 2025-04-28 13:41:51
