Date: Tue, 21 May 2024 23:49:30 +0200
By and large, casing only makes sense in the context of Latin- and
Greek-derived alphabets (both of which developed lowercase/cursive
typographies in the Middle Ages).
As such, I struggle to imagine a scenario in which a future Unicode version,
or the lack thereof, would have a significant impact on anyone's non-tailored
case folding.
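
To make "non-tailored case folding" concrete, here is a minimal sketch. It
uses Python only because its built-in str.casefold() applies the default
(locale-independent) full case folding; the helper name caseless_equal is
made up for illustration and is not part of any proposal discussed here.

```python
import unicodedata


def caseless_equal(a: str, b: str) -> bool:
    """Hypothetical helper: default, non-tailored caseless comparison.

    Both sides are case-folded with the default Unicode folding; no
    locale-specific tailoring (e.g. Turkish dotless i) is applied and
    no normalization is performed.
    """
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    # Version of the Unicode Character Database bundled with this interpreter.
    print("Unicode data version:", unicodedata.unidata_version)

    # Latin: full folding maps the sharp s to "ss".
    print(caseless_equal("Stra\u00dfe", "STRASSE"))

    # Greek: capital sigma, small sigma and final sigma fold together.
    print(caseless_equal("\u039f\u0394\u039f\u03a3", "\u03bf\u03b4\u03bf\u03c2"))

    # A script without case: folding leaves the text unchanged.
    print("\u65e5\u672c\u8a9e".casefold() == "\u65e5\u672c\u8a9e")
```

All three comparisons should print True. Which Unicode version the folding
data comes from depends on the interpreter build, and that is the kind of
lag being discussed in the quoted thread.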
On Tue, May 21, 2024 at 11:15 PM Tom Honermann via SG16 <
sg16_at_[hidden]> wrote:
> On 5/21/24 3:32 PM, Tiago Freire wrote:
>
> On 21/05/2024 19.44, Jens Maurer via SG16 wrote:
>
>
> For everybody else (and I posit that includes a large fraction of users that need case-insensitive stuff to begin with), having support in the standard library may outweigh the disadvantage of that support lagging a few Unicode versions behind.
>
> We have to keep in mind that the lag we are talking about is measured in years, not months.
>
> Not by necessity, but I agree that is often the case in practice.
>
> Note that we explicitly grant implementations permission to support newer
> Unicode versions (or at least we will once CWG 2843
> <https://cplusplus.github.io/CWG/issues/2843.html> is formally accepted).
> Implementations have been doing a good job of keeping up with new Unicode
> versions so far. I've worked at companies that upgraded compiler versions
> much more frequently than they upgraded third party libraries. Some
> projects would actually get Unicode updates more quickly if Unicode
> features are included with their implementation.
>
> And what kind of user is doing case-insensitive stuff (in Unicode) where they don't care to update when a newer set of cased characters is available?
> If casing is important for what they are doing why wouldn't they want to update? What kind of application are they doing where casing is kind of important but not really?
>
> There appear to be several assumptions here that I don't agree with. I
> would expect that users do want to update, but since most have time
> constraints, they have to prioritize what is most important. I wouldn't
> assume that use of a third party library means that they will upgrade more
> frequently.
>
> What wrongness are we discussing here?
>
> As an example, using a Unicode compliant case-insensitive comparison to figure out if 2 paths correspond to the same node, would be wrong.
> (and this would be precisely the kind of user who wouldn't care to update)
>
> We've already acknowledged that programmers should not do that. Not
> providing useful features in the standard does not prevent them from doing
> that through use of a third party library. This is something that is not in
> our control.
>
> I disagree with that characterization. Bold claims need lots of evidence and rationale.
>
> It may be that my vision of this is colored by my own personal experience of what people actually want when they ask for casing features and having seen this kind of thing play out.
> I admit to that, it is personal experience.
> But I think the question should still be asked as to "why someone wants case conversion or case insensitive comparisons",
> 1. is it because they are processing text as text?
> or
> 2. is it because of Windows OS behavior?
>
> I don't think it is an unreasonable thing to suspect, given that the quote from the lifted point (which wasn't my own) that supported this feature displays a very similar dynamic.
>
> Nothing in that quote suggests that the person that authored the comment
> had filesystem paths in mind.
>
> I'm not saying that the feature should be off the table permanently. But I'm definitely saying that it is not as important as transcoding is.
> And that transcoding should be completed and done before figuring out these other secondary features.
> And that we should be careful about rushing casing functions, which have the potential to become outdated and produce the wrong behavior as an external standard gets updated (causing either incompatible behavior or incorrect behavior depending on what gets updated).
> It would be very unfortunate to have to deal with this last problem, thinking that we are helping programmers do 1 if what they are trying to do is 2 (and this feature doesn't even help them).
>
> We should ask that question first.
>
> I don't think casing is important and I wanted to avoid discussing it before transcoding is a done deal... which is precisely what I am doing right now.
>
> Within WG21, like in open source projects, prioritization is based on who
> feels motivated to work on what. That isn't something that we decide (as
> much as some would like to pretend it is). If someone feels motivated and
> comes forward with a proposal for handling Unicode case issues, we will
> consider it.
>
> Tom.
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>
Received on 2024-05-21 21:49:50
