Date: Sun, 1 Sep 2019 19:00:23 -0400
On Sun, Sep 1, 2019 at 12:07 PM Steve Downey <sdowney_at_[hidden]> wrote:
>
> That was, if I recall correctly, about the C standard library interfaces in the Null-terminated multibyte strings section. Basically that the character at a time interfaces are not amenable to vectorization.
>
Yes. The C interfaces for converting between UTF-x and multibyte
strings (mbrtoc16, etc.) currently convert one character at a time,
through a function that is often hidden behind a DLL function call or
in object code. The former prevents anything from being done about
it; the latter is just a prayer that LTO can optimize _so well_ that
it takes your loop over the one-by-one code point conversion
functions and turns the whole thing into a really, really nice loop
that converts things very quickly.
I have not observed this to ever happen, and I'm working on a
benchmarking suite of various methods of conversion that will help
quantify these results in tangible ways.
With a ptr + length interface, an implementer can optimize the
resulting call as much as they like. With null-terminated versions of
the function, I am skeptical the same performance can be achieved
without first calling strlen(), but I have no experience or data to
back up that intuition.
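
To make the contrast concrete, here is a rough sketch (not taken from
any existing library or proposal). convert_one_by_one drives the
standard mbrtoc16 once per character, the way user code has to today.
convert_bulk_ascii is a hypothetical ptr + length entry point, a name
made up for illustration, and it only handles the ASCII subset; the
point is merely that its loop sees the whole buffer at once and has no
opaque call per character, so a compiler or library author is free to
vectorize it. Both assume the default "C" locale and ASCII input.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* One-by-one: a call into the C library for every character.
 * Returns the number of char16_t units written, or (size_t)-1 on error.
 * Sketch only: a low surrogate still pending at end of input is not flushed. */
size_t convert_one_by_one(const char *src, size_t len,
                          char16_t *dst, size_t cap)
{
    mbstate_t state;
    memset(&state, 0, sizeof state);   /* initial conversion state */
    size_t in = 0, out = 0;
    while (in < len && out < cap) {
        char16_t c;
        size_t r = mbrtoc16(&c, src + in, len - in, &state);
        if (r == (size_t)-1 || r == (size_t)-2)
            return (size_t)-1;         /* invalid or truncated sequence */
        dst[out++] = c;
        if (r == (size_t)-3)
            continue;                  /* second surrogate, no input consumed */
        in += (r == 0) ? 1 : r;        /* r == 0: a null character was converted */
    }
    return out;
}

/* Hypothetical ptr + length bulk interface (ASCII subset only).
 * The loop body has no calls and no early exit; validity is checked
 * once after the loop. On failure, dst contents are unspecified. */
size_t convert_bulk_ascii(const char *restrict src, size_t len,
                          char16_t *restrict dst, size_t cap)
{
    if (len > cap)
        return (size_t)-1;
    unsigned char seen = 0;
    for (size_t i = 0; i < len; ++i) {
        unsigned char b = (unsigned char)src[i];
        seen |= b;                     /* accumulate high bits */
        dst[i] = b;                    /* ASCII maps 1:1 to UTF-16 */
    }
    return (seen & 0x80) ? (size_t)-1 : len;
}
```

Calling both with the same buffer, e.g. ("hello, world", 12), should
give the same 12 code units. Whether the second form actually ends up
faster is exactly what the benchmarking suite is meant to measure; the
sketch makes no claim about that.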
Sincerely,
JeanHeyd Meneide
Received on 2019-09-02 01:00:36
