Re: [SG16-Unicode] Replacement for codecvt

From: JeanHeyd Meneide <phdofthehouse_at_[hidden]>
Date: Thu, 29 Aug 2019 18:29:54 -0400
As a minor sidenote, for the sake of discussion (and because I am drinking
too much coffee), you can make the Encoding Object approach compile fast by
fixing every type involved to fit your needs, as in the implementation
below:

Pros: compiles fast, works only with contiguous ranges/span-constructibles,
non-constexpr, can be optimized
Cons: works only with contiguous ranges, no flexible error handler choice,
depends on LTO for inlining optimizations

More work could be put into it to make different tradeoffs, but this would
work under the current specification of the proposal and its implementation,
since it is based on the concept of what is supposed to be on an encoding
object and its associated structures. You could get away with
std::text::text<speedtf8, std::u8string> and std::text::text<speedtf8,
std::vector<char8_t>>, but instantiating
std::text::text<speedtf8, std::deque<char8_t>> would fail, because a
std::deque is not contiguous and cannot be viewed through a span:

-------------------------
fast/include/fast/speedtf8.hpp
-------------------------

#include <span>

// decode_result and encode_result are the proposal's result templates;
// their declarations are assumed to be visible here.

struct speedtf8;

using encoded_t = std::span<char8_t>;
using decoded_t = std::span<char32_t>;

struct empty_struct {};

using speedtf8_state = empty_struct;
using decode_result_t = decode_result<encoded_t, decoded_t, speedtf8_state>;
using encode_result_t = encode_result<encoded_t, decoded_t, speedtf8_state>;

// non-template handler: declared here, defined once in utf8.cpp
struct fast_replacement_error_handler {
     decode_result_t operator()( const speedtf8& encoding,
          decode_result_t res ) const noexcept;
     encode_result_t operator()( const speedtf8& encoding,
          encode_result_t res ) const noexcept;
};

using speedtf8_error_handler_t = fast_replacement_error_handler;

struct speedtf8 {
     using state = empty_struct;
     using code_unit = char8_t;
     using code_point = char32_t;

     // decode: code units in, code points out
     static decode_result_t decode(encoded_t input, decoded_t output,
          state&, speedtf8_error_handler_t) noexcept;
     // encode: code points in, code units out
     static encode_result_t encode(decoded_t input, encoded_t output,
          state&, speedtf8_error_handler_t) noexcept;
};

--------------------------
fast/source/utf8.cpp
--------------------------
#include <fast/speedtf8.hpp>

// include "bloated" std header
#include <text>

// use "bloated" implementation, but only ever compile it once

decode_result_t fast_replacement_error_handler::operator()(
     const speedtf8& encoding, decode_result_t res ) const noexcept {
     return std::text::replacement_error_handler{}(encoding, res);
}

encode_result_t fast_replacement_error_handler::operator()(
     const speedtf8& encoding, encode_result_t res ) const noexcept {
     return std::text::replacement_error_handler{}(encoding, res);
}

// s is unused: a fresh std::text::utf8 state is created on every call
encode_result_t speedtf8::encode( decoded_t input, encoded_t output,
     speedtf8_state& s, speedtf8_error_handler_t err_handler ) noexcept
{
     std::text::utf8::state real_s;
     return std::text::utf8{}.encode(input, output, real_s, err_handler);
}

// s is unused here as well
decode_result_t speedtf8::decode( encoded_t input, decoded_t output,
     speedtf8_state& s, speedtf8_error_handler_t err_handler ) noexcept
{
     std::text::utf8::state real_s;
     return std::text::utf8{}.decode(input, output, real_s, err_handler);
}

Received on 2019-08-30 00:30:07