Date: Tue, 21 May 2024 17:15:12 -0400
On 5/21/24 3:32 PM, Tiago Freire wrote:
> On 21/05/2024 19.44, Jens Maurer via SG16 wrote:
>
>> For everybody else (and I posit that includes a large fraction of users that need case-insensitive stuff to begin with), having support in the standard library may outweigh the disadvantage of that support lagging a few Unicode versions behind.
> We have to keep in mind that the lag we are talking about is measured in years, not months.
Not by necessity, but I agree that is often the case in practice.
Note that we explicitly grant implementations permission to support
newer Unicode versions (or at least we will once CWG 2843
<https://cplusplus.github.io/CWG/issues/2843.html> is formally
accepted). Implementations have been doing a good job of keeping up with
new Unicode versions so far. I've worked at companies that upgraded
compiler versions much more frequently than they upgraded third party
libraries. Some projects would actually get Unicode updates more quickly
if Unicode features are included with their implementation.
> And what kind of user is doing case-insensitive stuff (in Unicode) where they don't care to update when a newer set of cased characters is available?
> If casing is important for what they are doing why wouldn't they want to update? What kind of application are they doing where casing is kind of important but not really?
There appear to be several assumptions here that I don't agree with. I
would expect that users do want to update, but since most have time
constraints, they have to prioritize what is most important. I wouldn't
assume that use of a third party library means that they will upgrade
more frequently.
>
>
>> What wrongness are we discussing here?
> As an example, using a Unicode compliant case-insensitive comparison to figure out if 2 paths correspond to the same node, would be wrong.
> (and this would be precisely the kind of user who wouldn't care to update)
We've already acknowledged that programmers should not do that. Not
providing useful features in the standard does not prevent them from
doing that through use of a third party library. This is something that
is not in our control.
>
>> I disagree with that characterization. Bold claims need lots of evidence and rationale.
> It may be that my vision of this is colored by my own personal experience of what people actually want when they ask for casing features, and by having seen this kind of thing play out.
> I admit to that, it is personal experience.
> But I think the question should still be asked as to "why someone wants case conversion or case insensitive comparisons",
> 1. is it because they are processing text as text?
> or
> 2. is it because of Windows OS behavior?
>
> I don't think it is an unreasonable thing to suspect, given that the quote from the lifted point (which wasn't my own) that supported this feature displays a very similar dynamic.
Nothing in that quote suggests that the person that authored the comment
had filesystem paths in mind.
>
> I'm not saying that the feature should be off the table permanently. But I'm definitely saying that it is not as important as transcoding is.
> And that transcoding should be completed and done before figuring out these other secondary features.
> And that we should be careful about rushing casing functions, which have the potential of becoming outdated and producing the wrong behavior as an external standard gets updated (causing either incompatible behavior or incorrect behavior depending on what gets updated).
> It would be very unfortunate to have to deal with this last problem, thinking that we are helping programmers do 1 if what they are trying to do is 2 (and this feature doesn't even help them).
>
> We should ask that question first.
>
> I don't think casing is important and I wanted to avoid discussing it before transcoding is a done deal... which is precisely what I am doing right now.
Within WG21, like in open source projects, prioritization is based on
who feels motivated to work on what. That isn't something that we decide
(as much as some would like to pretend it is). If someone feels
motivated and comes forward with a proposal for handling Unicode case
issues, we will consider it.
Tom.
Received on 2024-05-21 21:15:15