Re: [isocpp-sg16] Agenda for the 2024-05-22 SG16 meeting

From: Tiago Freire <tmiguelf_at_[hidden]>
Date: Sun, 19 May 2024 09:19:19 +0000
>> 3. Add support for Unicode-aware case conversions and case-insensitive comparisons.
> [cut for brevity…]
> We've discussed case-insensitive filesystems in the past and we recognize that the only right way to work with them is to let the filesystem driver handle whatever case folding and normalization is required. Apple's HFS+ got locked to Unicode 3.2 and Apple's APFS is locked to Unicode 9.0.

I agree. But that doesn’t mean that developers aren’t using it this way, and primarily this way.



>> If we provide this to users, it will far more likely do something that the user did not intend but will look right in testing. I’m not sure whether users would be better off not being provided with this, and being forced to implement a solution that actually works for their use case.
> Correct handling of case is complicated and there are use cases that are generally applicable to many domains; searching documents for example. The possibility of a feature being misused isn't great motivation for not making the feature available for appropriate uses.

Yes… form validation also comes to mind (did the user enter his name in upper case?), or writing formal text in an editor (did you start that sentence with an upper-case letter? ...and here is a suggestion to correct it).

And although I agree in principle that “the possibility of misuse is not a motivation for not providing a feature”, what I was arguing is that there isn’t a use case (or at least not one offered as justification by those who want the feature) that isn’t a misuse of it.


This is one of the bullet points extracted from the summary. It is the SINGULAR point that mentions casing at all, and it was used as the basis to justify why casing is important:

[Proper Unicode support. In MS Windows development, virtually all user input is UTF-16LE in the form of wchar_t and variants. I convert that to UTF-8 via wrapper functions that use third-party Unicode libraries (uni-algo in my case) that (can) use std::string. Things that should be simple but aren't in Unicode, like case conversion and case-insensitive comparison, should be provided for. This would reduce the pain point of third-party libraries.]

Keep in mind that the Unicode standard is evolving, and that users who have a “legitimate use” probably don’t want to wait for a standard/library update to accommodate new rules, and are thus incentivized to maintain their own implementations anyway.

If those with legitimate use cases are incentivized to not use a standard feature and aren’t asking for one, and the only ones left using it are shooting themselves in the foot, are we providing a useful feature or just a gotcha foot gun?

If 90% of uses are wrong, was it a good thing? What percentage would that have to be to say “yeah, maybe we were better off without it”?
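
To make the “looks right in testing” failure mode concrete, here is a minimal sketch, assuming UTF-8 encoded byte strings. The helpers ascii_lower and ascii_iequals are hypothetical names made up for this illustration; they are not part of any existing or proposed standard API. The comparison lowercases ASCII letters only, which passes a typical English-only test but rejects pairs that Unicode case folding treats as equal (É/é, or ß against SS under full case folding).

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: lowercases ASCII letters only and leaves every
// other byte (including all UTF-8 lead and continuation bytes) unchanged.
constexpr unsigned char ascii_lower(unsigned char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c - 'A' + 'a') : c;
}

// Hypothetical helper: byte-wise comparison after ASCII lowercasing.
// It knows nothing about Unicode case folding or normalization.
bool ascii_iequals(std::string_view a, std::string_view b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (ascii_lower(a[i]) != ascii_lower(b[i])) return false;
    }
    return true;
}
```

With this sketch, ascii_iequals("Hello", "HELLO") is true, while comparing the UTF-8 bytes of "É" against "é", or "STRASSE" against "straße", yields false. A test suite containing only ASCII input would never notice.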



I agree with 1. and 2.; they should be high priority, and we should do them as soon as C++26. But Unicode casing… if it is never provided at all, I don’t think I will miss it.


Received on 2024-05-19 09:19:26