On Thu, Mar 2, 2023 at 4:08 PM Thiago Macieira via SG16 <sg16@lists.isocpp.org> wrote:
> On Thursday, 2 March 2023 07:28:02 PST Steve Downey via SG16 wrote:
> > That ICU apparently has poor performance in this area indicates a little
> > that many users are OK with it. But it does also sound like there are use
> > cases for eager algorithms that could provide better performance. Probably
> > something like the ones that JeanHeyd was proposing for C?
>
> I don't know the actual internals of those functions, but the wrapping API
> with the needs of error code returning and the abstraction created by it do
> make it slower than a pure, raw implementation would be. It's a price one pays
> for having a consistent API for UTF-8 as well as Shift JIS or any other weird
> codec that ICU may have available and you needed to decode content.


This is not true.

If the API is designed with the concepts Alexander Stepanov had in mind when he first designed the C++ STL, there is no material performance difference between a raw C API and a fully-realized, deeply generic STL-style API. I have thoroughly benchmarked exactly these use cases, and documented the results here:

https://ztdtext.readthedocs.io/en/latest/benchmarks.html
https://ztdtext.readthedocs.io/en/latest/benchmarks/transcoding%20-%20UTF.html
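
To make the shape of the argument concrete, here is a minimal sketch of the two styles side by side. It is not ztd.text's actual API and not the benchmarked code: every name below (raw_decode_utf8, generic_decode_utf8, decoders_agree) is hypothetical, and the decoder is deliberately simplified (it checks lead and continuation bytes but does not reject overlong encodings or surrogates). The point it illustrates is that the generic version takes its iterators and its error policy as template parameters, so the compiler sees the same loop as the raw pointer version and is free to inline everything; whether it actually does so is something to measure rather than assume.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Raw C-style decoder (hypothetical): pointer + length in, pointer out.
// Returns 0 on success, -1 on the first malformed sequence.
inline int raw_decode_utf8(const unsigned char* in, std::size_t in_len,
                           char32_t* out, std::size_t* out_len) {
    std::size_t i = 0, n = 0;
    while (i < in_len) {
        unsigned char b = in[i];
        std::size_t extra;
        char32_t cp;
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return -1;                           // invalid lead byte
        if (in_len - i - 1 < extra) return -1;    // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = in[i + k];
            if ((c & 0xC0) != 0x80) return -1;    // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        out[n++] = cp;
        i += extra + 1;
    }
    *out_len = n;
    return 0;
}

// Generic STL-style decoder (hypothetical): any input range, any output
// iterator, and the error policy is a callable chosen at compile time.
template <typename InIt, typename Sentinel, typename OutIt, typename OnError>
OutIt generic_decode_utf8(InIt first, Sentinel last, OutIt out, OnError on_error) {
    while (first != last) {
        unsigned char b = static_cast<unsigned char>(*first);
        ++first;
        int extra;
        char32_t cp;
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { *out++ = on_error(); continue; }   // invalid lead byte
        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            if (first == last) { ok = false; break; }          // truncated
            unsigned char c = static_cast<unsigned char>(*first);
            if ((c & 0xC0) != 0x80) { ok = false; break; }     // bad continuation
            cp = (cp << 6) | (c & 0x3F);
            ++first;
        }
        *out++ = ok ? cp : on_error();
    }
    return out;
}

// Runs both decoders on the same well-formed input and compares the output.
inline bool decoders_agree(const std::string& utf8) {
    std::vector<char32_t> raw_out(utf8.size());
    std::size_t raw_len = 0;
    int rc = raw_decode_utf8(reinterpret_cast<const unsigned char*>(utf8.data()),
                             utf8.size(), raw_out.data(), &raw_len);
    if (rc != 0) return false;
    raw_out.resize(raw_len);

    std::vector<char32_t> gen_out;
    generic_decode_utf8(utf8.begin(), utf8.end(), std::back_inserter(gen_out),
                        [] { return char32_t(0xFFFD); });
    return raw_out == gen_out;
}
```

The two functions differ only in how they report malformed input: the raw one returns an error code, while the generic one delegates to whatever the caller passed as on_error (here, a lambda that substitutes U+FFFD). Because that choice is a template parameter rather than a runtime callback or a virtual call, it does not by itself force any indirection into the hot loop.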

Sincerely,
JeanHeyd