Re: [std-proposals] Character classification functions should return bool

From: Jason McKesson <jmckesson_at_[hidden]>
Date: Mon, 17 Oct 2022 22:43:15 -0400
On Mon, Oct 17, 2022 at 9:37 PM JeanHeyd Meneide via Std-Proposals
<std-proposals_at_[hidden]> wrote:
>
> A proposal to change this was already sent to WG14, motivated by in-the-wild code that uses these functions incorrectly:
>
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2541.pdf
>
> The problem is that the wording was extremely intentional for C. People use the int-based return type to store additional information in the return value, such as classification bits for the given input character (packed as bit flags in the return value). It was meant to make mixed table-based, macro-based implementations fast, back in the day when compilers were terrible. (And for C, given the wide range in quality of implementations, some still are too terrible to optimize properly.) Even if you could overcome ABI arguments ("bool is returned in different registers from int"), you would need to tell all the implementations relying on the implementation-defined non-zero return value to stop doing that.
>
> While everyone in the room evaluating the proposal understood the motivation, the proposal was ultimately rejected for many of these stability reasons. You'd have better luck designing new character classification functions for C and C++ that are compatible with multi-code-unit encodings (e.g. UTF-8) to get the ball rolling, killing two birds with one stone (they can handle variable-width encoded data, and have better return values).

That last part is already in (slow) progress.
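
For readers unfamiliar with the table-based approach described in the quoted message, here is a minimal sketch of how such an implementation might look. Everything in it is an assumption made for illustration: the names `my_isalpha`, `my_isdigit`, `class_table`, and the flag layout are hypothetical, it covers ASCII only, and real C libraries choose their own tables and bit assignments.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical classification bits; not taken from any real C library.
enum : int {
    kUpper = 1 << 0,  // 1
    kLower = 1 << 1,  // 2
    kDigit = 1 << 2   // 4
};

// Build a 256-entry table of packed classification bits (ASCII ranges only).
inline std::array<int, 256> make_class_table() {
    std::array<int, 256> table{};  // every entry starts at zero
    for (std::size_t c = 0; c < table.size(); ++c) {
        if (c >= 'A' && c <= 'Z') table[c] |= kUpper;
        if (c >= 'a' && c <= 'z') table[c] |= kLower;
        if (c >= '0' && c <= '9') table[c] |= kDigit;
    }
    return table;
}

// The table is built once, on first use.
inline const std::array<int, 256>& class_table() {
    static const std::array<int, 256> table = make_class_table();
    return table;
}

// Illustrative stand-ins for isalpha/isdigit: one table lookup and one mask.
// The masked bits are returned unchanged, so "true" is any non-zero value,
// not necessarily 1.
inline int my_isalpha(int c) {
    return class_table()[static_cast<unsigned char>(c)] & (kUpper | kLower);
}

inline int my_isdigit(int c) {
    return class_table()[static_cast<unsigned char>(c)] & kDigit;
}
```

In this sketch, `my_isalpha('a')` yields 2 rather than 1, so a comparison such as `my_isalpha('a') == true` is false even though the character is alphabetic. Testing the result for non-zero, as in `if (my_isalpha(c))`, is the form that works regardless of which bits an implementation returns.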

Received on 2022-10-18 02:44:42