Re: [isocpp-sg16] Agenda for the 2024-05-22 SG16 meeting

From: Tiago Freire <tmiguelf_at_[hidden]>
Date: Sat, 18 May 2024 20:10:13 +0000
I have some quick notes:
> 1. Add support for encoding conversions.
Support 100%

> 2. Add support for charN_t in std::from_chars() and std::to_chars().
This is easily doable, but I think we can do better. I’m experimenting with something in this area right now, and I think we can go one step further with “encoding-agnostic numerical conversions”. Instead of to_chars and from_chars, we would have to_digits and from_digits, which do not convert directly to or from a character encoding but to or from an intermediate numerical representation of the individual digits. That representation can then easily be converted to any encoding, be it charN_t or otherwise. I expect a minimal performance penalty (<10%), or even parity, compared with converting directly to charN_t. I will be working on this and will provide a playground repo soon.
The “hard” part of the algorithm is figuring out which digit goes where; the encoding itself is trivial, and I think we gain by separating the two.
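
To make the split concrete, here is a minimal sketch of the two steps. The names to_digits and encode_digits, the fixed 20-digit buffer, and the restriction to unsigned base-10 values are all assumptions made for illustration. This is not the playground code, and it only covers the "to" direction.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical step 1: decide which digit goes where.
// Writes the base-10 digit values (0..9, not characters) of `value` into
// `out`, most significant first, and returns how many digits were written.
// 20 digits are enough for any 64-bit unsigned value.
inline std::size_t to_digits(std::uint64_t value,
                             std::array<std::uint8_t, 20>& out)
{
    std::array<std::uint8_t, 20> reversed{};
    std::size_t count = 0;
    do {
        reversed[count++] = static_cast<std::uint8_t>(value % 10);
        value /= 10;
    } while (value != 0);

    for (std::size_t i = 0; i < count; ++i)
        out[i] = reversed[count - 1 - i];
    return count;
}

// Hypothetical step 2: map digit values to code units.
// This assumes an encoding whose digits '0'..'9' are contiguous and share
// the ASCII code points, which holds for UTF-8, UTF-16 and UTF-32.
template <class CharT>
std::basic_string<CharT> encode_digits(const std::array<std::uint8_t, 20>& digits,
                                       std::size_t count)
{
    std::basic_string<CharT> text;
    text.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        text.push_back(static_cast<CharT>(U'0' + digits[i]));
    return text;
}
```

With this shape, encode_digits<char16_t>(digits, n) and encode_digits<char32_t>(digits, n) reuse the same digit buffer, so the digit extraction is written once and only the trivial mapping step depends on the target encoding.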

> 3. Add support for Unicode-aware case conversions and case-insensitive comparisons.

I would advise against doing this, and the main reason comes from considering why a developer would want it in the first place.
I personally don’t know of many applications where case conversion is something you want to do (it is mostly a weird quirk of how humans have abused a writing system, and many cultures don’t have this problem). The most common exceptions relate to case-insensitive operating system interfaces, such as file names on NTFS. Everyone wants to do case comparisons/conversions because of NTFS, but this would be wrong: not only is NTFS now “optionally” case-insensitive, but the set of characters that NTFS treats as having a case and as matching regardless of case is not the same as what Unicode defines (or will define in the future).

If we provide this to users, it is far more likely to do something the user did not intend while still looking right in testing. I wonder whether users would not be better off without it, being forced instead to implement a solution that actually works for their use case.

Received on 2024-05-18 20:10:19