SG16

Subject: Re: iconv-style interface for transcoding functions
From: Peter Brett (pbrett_at_[hidden])
Date: 2021-01-28 05:27:26


Hi Jens,

Thank you for sending this suggestion through. It'll be good input for a discussion at the start of our next meeting.

One of the features in the current paper that would be lost is described in (3.5):

    (3.5) - If output_size is not NULL, then *output_size will be
            decremented the amount of code units that would have
            been written to *output (even if output was NULL). If
            the output is exhausted (*output_size will be
            decremented below zero), the function returns
            MCHAR_INSUFFICIENT_OUTPUT.

This allows the restartable transcoding functions to be used to measure the amount of space required to store the results of a transcoding operation without pre-allocating a buffer.
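To make the measuring use case concrete, here is a minimal sketch in C++. It is not the paper's API: sketch_latin1_to_utf8, sketch_status, and measured_to_utf8 are hypothetical names, and Latin-1 to UTF-8 was chosen only because the conversion fits in a few lines. The function follows the spirit of (3.5): the output size is decremented even when no output buffer is given, so a first pass can measure and a second pass can write. One deviation is assumed: since size_t cannot go below zero, the sketch returns before decrementing when the remaining budget is too small.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical status codes for this sketch; not taken from the paper.
enum sketch_status { SKETCH_OK, SKETCH_INSUFFICIENT_OUTPUT };

// Hypothetical Latin-1 -> UTF-8 transcoder with a pointer + length interface.
// If output is null, nothing is written, but *output_size is still decremented
// by the number of code units that would have been written.
// If output_size is null, no space check is performed at all.
inline sketch_status sketch_latin1_to_utf8(const unsigned char** input,
                                           std::size_t* input_size,
                                           char** output,
                                           std::size_t* output_size)
{
    while (*input_size > 0) {
        const unsigned char c = **input;
        const std::size_t needed = (c < 0x80) ? 1 : 2;
        if (output_size != nullptr) {
            if (*output_size < needed)
                return SKETCH_INSUFFICIENT_OUTPUT;  // stop before this character
            *output_size -= needed;
        }
        if (output != nullptr) {
            if (c < 0x80) {
                *(*output)++ = static_cast<char>(c);
            } else {
                *(*output)++ = static_cast<char>(0xC0 | (c >> 6));
                *(*output)++ = static_cast<char>(0x80 | (c & 0x3F));
            }
        }
        ++*input;
        --*input_size;
    }
    return SKETCH_OK;
}

// Measure first, then allocate exactly once and transcode for real.
inline std::string measured_to_utf8(const std::string& latin1)
{
    const std::size_t unlimited = std::numeric_limits<std::size_t>::max();

    // Pass 1: no output buffer, effectively unlimited budget.
    const unsigned char* in =
        reinterpret_cast<const unsigned char*>(latin1.data());
    std::size_t in_size = latin1.size();
    std::size_t budget = unlimited;
    sketch_latin1_to_utf8(&in, &in_size, nullptr, &budget);
    const std::size_t required = unlimited - budget;

    // Pass 2: write into a buffer of exactly the measured size.
    std::string result(required, '\0');
    in = reinterpret_cast<const unsigned char*>(latin1.data());
    in_size = latin1.size();
    char* out = &result[0];
    std::size_t out_size = result.size();
    const sketch_status status =
        sketch_latin1_to_utf8(&in, &in_size, &out, &out_size);
    assert(status == SKETCH_OK && out_size == 0);
    (void)status;
    return result;
}
```

The caller never guesses a buffer size; the cost is a second walk over the input.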

This is valuable. How would it be provided in a hypothetical [begin, end) pointer interface?
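Nothing in this thread answers that question yet. Purely as an illustration, one conceivable shape is sketched below; it is an assumption, not a proposal from the paper or from Jens. The names range_result and range_latin1_to_utf8 are hypothetical, and Latin-1 to UTF-8 again stands in for a real transcoding. In this sketch a null out_begin selects a measure-only mode, and the number of code units is reported through the return value rather than through a decremented length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: whether all input was handled, and how many
// output code units were written (or would have been, when measuring).
struct range_result {
    bool complete;
    std::size_t units;
};

// Hypothetical [begin, end) interface. *in_begin and *out_begin are advanced
// by the function. If out_begin is null, nothing is written, the input
// position is left untouched, and only the required count is computed.
inline range_result range_latin1_to_utf8(const unsigned char** in_begin,
                                         const unsigned char* in_end,
                                         char** out_begin,
                                         char* out_end)
{
    const bool measuring = (out_begin == nullptr);
    const unsigned char* in = *in_begin;
    std::size_t units = 0;
    for (; in != in_end; ++in) {
        const unsigned char c = *in;
        const std::size_t needed = (c < 0x80) ? 1 : 2;
        if (!measuring) {
            if (static_cast<std::size_t>(out_end - *out_begin) < needed) {
                *in_begin = in;  // resume from here once more space exists
                return {false, units};
            }
            if (c < 0x80) {
                *(*out_begin)++ = static_cast<char>(c);
            } else {
                *(*out_begin)++ = static_cast<char>(0xC0 | (c >> 6));
                *(*out_begin)++ = static_cast<char>(0x80 | (c & 0x3F));
            }
        }
        units += needed;
    }
    if (!measuring)
        *in_begin = in;
    return {true, units};
}
```

Whether a null begin pointer is an acceptable way to select a measuring mode is exactly the kind of design question the meeting would need to settle.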

Best regards,

                                     Peter

> -----Original Message-----
> From: SG16 <sg16-bounces_at_[hidden]> On Behalf Of Jens Maurer via SG16
> Sent: 27 January 2021 21:32
> To: SG16 <sg16_at_[hidden]>
> Cc: Jens Maurer <Jens.Maurer_at_[hidden]>
> Subject: [SG16] iconv-style interface for transcoding functions
>
> In today's teleconference, JeanHeyd suggested that the
> "pointer + length" interface would allow passing nullptr
> for the output length, letting the user assert
> "there is enough space", which, in turn, allows the
> implementation to forgo range checking during transcoding.
>
> At least the version of iconv on my current Ubuntu
> system does not offer precedent for these semantics;
> passing nullptr for the output length just crashes.
> Test case:
>
> #include <iconv.h>
> #include <stdlib.h>
>
> int main()
> {
>   iconv_t cd = iconv_open("utf-8", "utf-8");
>   char in[] = "abcd";
>   char *pin = in;
>   size_t nin = 4;
>   char *pout = (char*)malloc(100);
>   size_t nout = 100;  // unused: nullptr is passed below instead of &nout
>   size_t n = iconv(cd, &pin, &nin, &pout, nullptr);  // crashes here
> }
>
> This behavior is consistent with the description in
> the man page, where no mention of special handling
> of nullptr length arguments appears.
> Plus the POSIX specification agrees:
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html
>
> I'd also like to point out that interfaces that
> assume "there will be enough space" are prone to
> misuse, admitting buffer overflows. Moreover, the
> perceived run-time overhead of the extra length
> check is partially mitigated by
>
> - the necessity to check the length pointer for nullptr
> and branch to a special implementation
>
> - in a [begin, end) iterator range implementation,
> the ability to determine the available space and omit
> some or all length checks if ample space is provided.
>
> In short, I believe the core interface should deal in
> [begin, end) iterator ranges, where "begin" is updated
> by the function. If that doesn't materialize (for whatever
> reason), I expressly do not want thin decorators to
> be standardized, ballooning the number of functions even
> more.
>
> Jens

