ISOCPP std-proposals List: Re: [std-proposals] Low-level float parsing functions

From: Tiago Freire <tmiguelf_at_[hidden]>
Date: Mon, 28 Apr 2025 21:14:02 +0000

The example I gave: https://github.com/tmiguelf/utilities/blob/345421be243b7b6a6e70988a77422b5a0b1fe0a1/CoreLib/include/CoreLib/string/core_fp_charconv.hpp#L155
That code already exists and I can do this today. It doesn't care how many leading digits you have; you just need to tell it where the boundaries are (it doesn't see the dots or the exponent marker).
This implementation, however, has the limitation that the text encoding must map to ASCII (if it doesn't, this doesn't work).

An alternative implementation that I'm currently working on, which uses the "digits" concept, would be able to bypass this limitation, but would require copying data to a temporary buffer for processing.
However, this is not a significant issue: you only need a maximum of 18 digits, not counting leading zeros, in order to completely decode a double (and a maximum of 3 exponent digits). Any digits after that are irrelevant and contribute nothing to the final result. You can fit that quite comfortably on the stack.
And it is relatively easy to write an algorithm that keeps only those 18 digits and discards the rest (only confirming that they are indeed valid digits, to deal with the possibility that the text doesn't actually encode a number).

-----Original Message-----
From: Std-Proposals <std-proposals-bounces_at_lists.isocpp.org> On Behalf Of Thiago Macieira via Std-Proposals
Sent: Monday, April 28, 2025 10:03 PM
To: std-proposals_at_lists.isocpp.org
Cc: Thiago Macieira <thiago_at_macieira.org>
Subject: Re: [std-proposals] Low-level float parsing functions

On Monday, 28 April 2025 12:41:17 Pacific Daylight Time Jan Schultke via Std-Proposals wrote:
> Also you don't need to reallocate in any case. There is a relatively
> small upper limit to how many characters a float could need to be
> represented with no loss of information, although it's not easy to
> determine that upper bound without pessimizing.

That's for std::to_chars.

For std::from_chars, there is no upper limit on the buffer size, due to the problem of leading zeroes, like 0.00000000000000000000001e20[*]. To shrink it to relevant data and fit a maximum-size buffer, you'll need to parse a great deal of the floating point number anyway, not just search for tokens like the sign, the 0x prefix, the decimal separator and the exponent marker.

[*] This example would fit a maximum-size buffer for DBL_MAX of 309 characters, but I could definitely write 0.<310 zeroes>1e310.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel DCAI Platform & System Engineering

--
Std-Proposals mailing list
Std-Proposals_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2025-04-28 21:14:10