Date: Mon, 1 Sep 2025 21:13:38 +0100
On Mon, 1 Sept 2025 at 14:37, Jason McKesson via Std-Proposals <
std-proposals_at_[hidden]> wrote:
> On Mon, Sep 1, 2025 at 3:02 AM zxuiji <gb2985_at_[hidden]> wrote:
> >
> > Au contraire, such an API SHOULD be standardised. How do you think
> > iconv(), WideCharToMultiByte() and MultiByteToWideChar() exist? By ignoring
> > assumptions in the data types. They're explicitly stating (with the void*
> > pointer parts) that they give no s**ts about assumed encodings associated
> > with whatever type is being used for the strings passed in.
>
> That's because they're made in a language which has no function
> overloading. They *cannot* care about the type.
>
> C++ has overloading; taking advantage of that is the whole point of
> having UTF-based character types. Unless a function is being given a
> non-constant expression parameter that defines the encoding of the
> input or output, there is no reason to use a `void*`.
>
> > It is in such cases that the "API that ignores assumptions" is
> > standardised, at least within their own ecosystem. Frankly I would like a
> > standard variant that combines the best parts of both APIs. Something
> > like:
> >
> > /// @brief get codepage
> > long getstrcp( char const *name );
> > /** @brief try to convert from source codepage (scp) to destination
> > codepage (dcp)
> > @return If dst is NULL or cap < 1 then the size of the allocation needed
> > for dst to be completely converted (including final \0),
> > otherwise returns the number of characters converted, not including the
> > \0 character appended at the end
> > **/
> > long strconv( long dcp, void *dst, long cap, long scp, void const *src,
> > long end );
> >
> > along with some optional inlines like these to make it easier to use
> > when said types are used:
> >
> > inline long utfconv8to16( char16_t *dst, long max, char8_t const *src,
> > long len )
> > { return strconv( 1200, dst, max * sizeof(char16_t), 65001, src,
> > len * sizeof(char8_t) ); }
> > inline long utfconv8to32( char32_t *dst, ... )
> > { return strconv( 12000, dst, ... ); }
> > inline long utfconv16to8( char8_t *dst, ... )
> > ...
> >
> > Would make things much more convenient to work with and the API would be
> > conveying the assumptions it's applying directly to the developer.
>
> Again, `filesystem::path`'s constructors are illustrative here. There
> is no reason for any language with overloading to do things the way
> you describe. If we have a type system, `utfconv` should be enough;
> everything else can be taken from the types of the parameters.
>
> Similarly, such functions should be given `std::span<char*_t>`s, not
> bare pointers and sizes.
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>
Really? So you'd rather make overloads for 1000s of codepages? You go ahead
and do you; I'll continue laughing at you until the cows come home (and I
have none :D).
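
A rough sketch of how the two positions could coexist, not a proposal: the
character types select the codepage at compile time for the UTF encodings,
while legacy codepages stay a runtime value handed to a byte-oriented core.
The names codepage_of, utfconv and utfconv_from are made up for illustration,
the numeric IDs are the Windows codepage numbers from the quoted code, and
strconv here is only a stub that copies bytes when both codepages match and
returns -1 otherwise. It does no real transcoding and counts bytes, not
characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical trait: maps a UTF character type to a Windows-style codepage ID.
// The primary template is left undefined, so unknown types fail to compile.
template <class CharT> struct codepage_of;
template <> struct codepage_of<char16_t> { static constexpr long value = 1200; };
template <> struct codepage_of<char32_t> { static constexpr long value = 12000; };
#ifdef __cpp_char8_t  // char8_t only exists from C++20 onwards
template <> struct codepage_of<char8_t> { static constexpr long value = 65001; };
#endif

// Stub for the byte-oriented core (assumption: sizes are in bytes).
// Only the same-codepage case is handled; a real core would transcode.
inline long strconv(long dcp, void* dst, long cap,
                    long scp, const void* src, long len) {
    assert(src != nullptr && len >= 0);
    if (dcp != scp) return -1;                  // unsupported in this sketch
    if (dst == nullptr || cap < 1) return len;  // report the bytes required
    const long n = len < cap ? len : cap;
    std::memcpy(dst, src, static_cast<std::size_t>(n));
    return n;
}

// Typed front end: both codepages are deduced from the parameter types.
template <class D, class S>
long utfconv(D* dst, long max, const S* src, long len) {
    return strconv(codepage_of<D>::value, dst,
                   max * static_cast<long>(sizeof(D)),
                   codepage_of<S>::value, src,
                   len * static_cast<long>(sizeof(S)));
}

// Legacy source: the codepage is only known at run time, so it is a parameter.
template <class D>
long utfconv_from(long scp, D* dst, long max, const void* src, long len_bytes) {
    return strconv(codepage_of<D>::value, dst,
                   max * static_cast<long>(sizeof(D)), scp, src, len_bytes);
}
```

With this shape no overload per codepage is needed: a call such as
utfconv_from(1252, dst, max, src, len) passes the legacy codepage as data,
and only the handful of UTF character types get a compile-time mapping.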
Received on 2025-09-01 19:59:24