Date: Sat, 30 Mar 2019 19:13:25 -0400
I tend to agree that text algorithms on non-Unicode text are probably of
low value for standardizing. To the extent possible, we should provide the
Unicode Consortium's algorithms in order to minimize needless creativity.
On the other hand, I don't want to hide things without good reason. Take
Unicode database queries, for example. They are not generally useful, and
if you think they are the API you are looking for, you have probably missed
the one you should actually have used. Nonetheless, I think we should
probably provide access in a standard way, if only to allow better
implementations of algorithms.
It's my sense that there's a weak preference for meaningful names for
namespaces in std (std itself being the exception). Aliasing allows shorter
ones, like rg for ranges. My preference would be to use `unicode`, even for
the transcoding interfaces, as they will generally use scalar values
internally.
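
To make the aliasing point concrete, here is a minimal sketch. The names
`demo::unicode`, `is_scalar_value`, and the alias `uc` are all hypothetical
and come from no proposal; `demo` stands in for std only because user code
cannot add to std. The scalar value check itself follows the Unicode
definition (any code point up to U+10FFFF, excluding surrogates).

```cpp
// Hypothetical stand-in for a descriptively named standard namespace.
// `demo` plays the role of std; nothing here is a proposed interface.
namespace demo {
namespace unicode {

// A Unicode scalar value is any code point in [0, 0x10FFFF]
// except the surrogate range [0xD800, 0xDFFF].
constexpr bool is_scalar_value(char32_t cp) noexcept {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

} // namespace unicode
} // namespace demo

// The library keeps the long, meaningful name; each user picks a short alias.
namespace uc = demo::unicode;

static_assert(uc::is_scalar_value(U'A'), "ASCII letters are scalar values");
static_assert(!uc::is_scalar_value(char32_t{0xD800}), "surrogates are not");
static_assert(!uc::is_scalar_value(char32_t{0x110000}), "past U+10FFFF is not");
```

The alias is local to whoever writes it, so the short spelling costs the
library nothing and the descriptive name stays the canonical one.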
On Sat, Mar 30, 2019, 18:43 Corentin <corentin.jabot_at_[hidden]> wrote:
>
> On Sat, 30 Mar 2019 at 22:12 Lyberta <lyberta_at_[hidden]> wrote:
>
>> Ranges has set a precedent that we can provide better versions of old
>> functions by putting them into a separate namespace. It is general
>> consensus that almost all current text-related functions are obsolete. We
>> should consider a namespace for new ones.
>>
>> I think std::text fits this. This namespace would contain functions that
>> are modern and can properly support Unicode (and other encodings!).
>>
>
> I think we are trying to limit the support for non-Unicode encodings to
> transcoding.
> Unicode sandwich and all.
>
>
>>
>> There is also a precedent of my proposal and D1628 having a separate
>> namespace specifically for Unicode. Generally speaking, Unicode is a
>> subset of text processing, so in a mathematical sense it would be obvious
>> to put the unicode namespace as std::text::unicode, but here I agree that
>> it is too much typing.
>>
>
> You will find that LEWG will push strongly against that.
> I agree we need _one_ namespace - it will be a very hard sell.
> A nested namespace is very unlikely to reach consensus.
> So would 2 non-nested namespaces.
>
>
>
>>
>> So I propose the following:
>>
>> std::text for general purpose text algorithms (to be determined as we
>> haven't even nailed down Unicode yet, but consider std::text::to_upper,
>> std::text::is_alphanumeric).
>> std::unicode for Unicode classes and algorithms. Everything in std::text
>> should be able to work with classes from std::unicode.
>>
>> Then we can add more encodings under std or maybe right into std::text
>> if they are too simple.
>>
>> Theoretical examples:
>>
>> std::ascii
>> std::ebcdic
>> std::shift_jis
>>
>> _______________________________________________
>> SG16 Unicode mailing list
>> Unicode_at_[hidden]
>> http://www.open-std.org/mailman/listinfo/unicode
>>
Received on 2019-03-31 00:13:39