Re: [SG16-Unicode] Replacement for codecvt

From: Steve Downey <sdowney_at_[hidden]>
Date: Thu, 29 Aug 2019 16:57:49 -0400
It's not out of the question that span may acquire contiguous_range
constructors for C++20. That changes the picture in two ways. First, it
ties span to the machinery that Niall was trying to avoid. Second, though,
it might make it more attractive to specify some of the low-level decode
and encode routines that we need for byte-sized data in terms of span,
rather than as ranges over constrained types.

UTF-16 often arrives packaged in a char array, or as uint8_t, or possibly
as std::byte, as that becomes more prevalent. We can't, or shouldn't, alias
those to char16_t or wchar_t or such. I/O, at root, is octets these days.
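
To make the aliasing point concrete, here is a rough sketch of assembling
UTF-16 code units from octets arithmetically instead of casting the buffer.
The names load_utf16 and byte_order are made up for illustration and are not
from any proposal; the sketch assumes the caller already knows the byte order
and passes an even number of bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tag for the byte order of the incoming octets.
enum class byte_order { little, big };

// Build char16_t code units from raw octets without reinterpret_cast,
// so no object is ever accessed through the wrong type.
// Assumption: n is even; handling a dangling odd byte is left to the caller.
std::vector<char16_t> load_utf16(const std::byte* data, std::size_t n,
                                 byte_order order)
{
    assert(n % 2 == 0);
    std::vector<char16_t> units;
    units.reserve(n / 2);
    for (std::size_t i = 0; i + 1 < n; i += 2) {
        const auto b0 = std::to_integer<unsigned>(data[i]);
        const auto b1 = std::to_integer<unsigned>(data[i + 1]);
        const unsigned u = (order == byte_order::little)
                               ? (b0 | (b1 << 8))
                               : ((b0 << 8) | b1);
        units.push_back(static_cast<char16_t>(u));
    }
    return units;
}
```

This copies, which is the price of not aliasing; a decoder that consumes the
octets directly can avoid the intermediate buffer.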

The downside of span is, of course, the state that has to be carried across
calls when the input underruns (ends mid-sequence) or the output overruns
(fills up). Not insurmountable, but more difficult.


On Thu, Aug 29, 2019 at 3:58 PM Niall Douglas <s_sourceforge_at_[hidden]>
wrote:

> >> The reason why I really like span<T> for this is because reencoding
> >> benefits greatly from pipelining tricks only possible when input is
> >> guaranteed contiguous.
> >>
> >> I also agree that support for discontiguous input (and output) is also
> >> very important.
> >>
> >> But I'd tend to approach this as span-of-spans i.e. scatter-gather
> >> buffers with resumability. Or, precisely what LLFIO's read()/write()
> >> already implements.
> >
> > That approach would require reconstructing the outer span(s) whenever
> > the input is modified. That doesn't work well for some use cases (e.g.,
> > editors with undo buffers). Ranges offer the proper abstraction for
> > handling all of this transparently with optimum run-time performance.
>
> I'm actually advocating a low level API, and a high level API. Low level
> API works in terms of resumable span<CharT> (just one span, not span of
> spans). High level API's job is to generate as-long-as-possible
> span<CharT> calls to the low level API, in order to maximise performance.
>
> As I say, I'll return to path_view development after Belfast. I'll see
> what sort of low level API I can come up with intended for easier
> implementation of the high level API.
>
> Niall
> _______________________________________________
> SG16 Unicode mailing list
> Unicode_at_[hidden]
> http://www.open-std.org/mailman/listinfo/unicode
>

Received on 2019-08-29 22:58:04