Re: Performance requirements for Unicode views/types/algorithms

From: Thiago Macieira <thiago_at_[hidden]>
Date: Thu, 02 Mar 2023 13:37:44 -0800
On Thursday, 2 March 2023 13:10:35 PST JeanHeyd Meneide wrote:
> On Thu, Mar 2, 2023 at 4:08 PM Thiago Macieira via SG16
> <sg16_at_[hidden]> wrote:
> > On Thursday, 2 March 2023 07:28:02 PST Steve Downey via SG16 wrote:
> > > That ICU apparently has poor performance in this area indicates a little
> > > that many users are OK with it. But it does also sound like there are
> > > use cases for eager algorithms that could provide better performance.
> > > Probably something like the ones that JeanHeyd was proposing for C?
> >
> > I don't know the actual internals of those functions, but the wrapping API
> > with the needs of error code returning and the abstraction created by it
> > do make it slower than a pure, raw implementation would be. It's a price
> > one pays for having a consistent API for UTF-8 as well as Shift JIS or any
> > other weird codec that ICU may have available and you needed to decode
> > content.
> This is not true.
> If the API is designed with the concepts Alexander Stepanov had in mind when
> he first designed the C++ STL, there is no material performance difference
> between a raw C API and a fully-realized, deeply generic STL-style API. I have

To be clear: I was referring to the ICU API.

This is not STL-style:

        ucnv_toUnicode(icu_conv, &target, targetLimit, &source, sourceLimit,
                       nullptr, flush, &err);

In none of your graphs is ICU in the top half of performance. How far it falls
short of the ideal varies with the benchmark in question. I also made no
statement about it being meaningfully or measurably worse; I simply said that
it has a wrapper API that abstracts over codecs, and therefore the impact
can't be zero.
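
To make "a wrapper API that abstracts" concrete, here is a minimal sketch of the two shapes being compared. It is not ICU's implementation and it measures nothing. Every name in it (conv_error, converter, utf8_chunk, decode_wrapped, decode_direct) is hypothetical, and the UTF-8 decoder is deliberately simplified: it does not reject overlong forms or encoded surrogates. The wrapped path only mirrors the calling convention of the call above: in/out cursors, an error out-parameter, an indirect call per chunk and a staging buffer.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical error code, standing in for an error out-parameter.
enum class conv_error { ok, buffer_full, invalid_input };

// Simplified UTF-8 decode of one code point (assumes p < end); advances p
// on success. Does NOT reject overlong forms or encoded surrogates.
inline bool decode_one(const char*& p, const char* end, char32_t& cp) {
    const auto b0 = static_cast<unsigned char>(*p);
    int len = 0;
    if (b0 < 0x80)                { cp = b0;        len = 1; }
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
    else return false;
    if (end - p < len) return false;               // truncated sequence
    for (int i = 1; i < len; ++i) {
        const auto b = static_cast<unsigned char>(p[i]);
        if ((b & 0xC0) != 0x80) return false;      // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }
    p += len;
    return true;
}

// Encodes cp as UTF-16 into out[0..1]; returns the number of units written.
inline std::size_t encode_utf16(char32_t cp, char16_t* out) {
    if (cp <= 0xFFFF) { out[0] = static_cast<char16_t>(cp); return 1; }
    cp -= 0x10000;
    out[0] = static_cast<char16_t>(0xD800 + (cp >> 10));
    out[1] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
    return 2;
}

// Direct path: one codec, one loop, output grows as needed.
inline bool decode_direct(std::string_view in, std::u16string& out) {
    const char* p = in.data();
    const char* end = p + in.size();
    while (p < end) {
        char32_t cp;
        char16_t units[2];
        if (!decode_one(p, end, cp)) return false;
        out.append(units, encode_utf16(cp, units));
    }
    return true;
}

// Hypothetical codec-agnostic interface: any codec plugs in behind one
// function pointer that takes in/out cursors and an error out-parameter.
struct converter {
    void (*to_utf16)(char16_t** target, const char16_t* target_limit,
                     const char** source, const char* source_limit,
                     conv_error* err);
};

// UTF-8 implementation of that interface. Cursors are only advanced for
// code points that were fully converted.
inline void utf8_chunk(char16_t** target, const char16_t* target_limit,
                       const char** source, const char* source_limit,
                       conv_error* err) {
    *err = conv_error::ok;
    while (*source < source_limit) {
        const char* p = *source;
        char32_t cp;
        char16_t units[2];
        if (!decode_one(p, source_limit, cp)) { *err = conv_error::invalid_input; return; }
        const std::size_t n = encode_utf16(cp, units);
        if (static_cast<std::size_t>(target_limit - *target) < n) {
            *err = conv_error::buffer_full;        // caller drains and retries
            return;
        }
        for (std::size_t i = 0; i < n; ++i) *(*target)++ = units[i];
        *source = p;
    }
}

// Wrapped path: indirect call per chunk, staging buffer, error check, copy.
inline bool decode_wrapped(const converter& conv, std::string_view in,
                           std::u16string& out) {
    char16_t buffer[64];
    const char* src = in.data();
    const char* src_end = src + in.size();
    conv_error err = conv_error::ok;
    do {
        char16_t* dst = buffer;
        conv.to_utf16(&dst, buffer + 64, &src, src_end, &err);
        out.append(buffer, static_cast<std::size_t>(dst - buffer));
    } while (err == conv_error::buffer_full);
    return err == conv_error::ok;
}
```

Both paths produce the same UTF-16 for the same input. The wrapped one simply does extra bookkeeping per chunk that the direct one does not, and whether that bookkeeping is measurable is a separate question that this sketch does not answer.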

Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel DCAI Cloud Engineering

Received on 2023-03-02 21:37:46