Date: Thu, 9 May 2024 07:44:27 +0000
>> That is why I think this model:
>>
>> > input encoding -> (program uses intermediate UTF-8 throughout) -> output encoding
>>
>> is misguided, that middle step doesn’t actually solve anything, it
>> just introduces an extra middleman where more things can go wrong.
>It's not misguided at all.
>If we designed std::cout today, we'd make it do exactly that. In `std::cout << x;`, the inserter (operator<<) would serialize `x` to a sequence of characters in an encoding that can represent everything (i.e. UTF-8), the stream would then pass that UTF-8 to the stream buffer, the stream buffer would then transcode to the output encoding and write it out.
But your stream buffer is associated with one (singular) interface that expects a definite encoding (which is not always UTF-8). Why should the data in that stream buffer be anything other than the encoding that interface expects?
Why do we need that extra, unnecessary conversion to an intermediary encoding, one that the input is not written in and the output does not understand? Why not just do one transcoding that immediately achieves the goal?
Why do we need a process that tries to do more than is necessary, only to get it wrong?
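To make the comparison concrete, here is a minimal sketch of the two pipelines using POSIX iconv. The transcode helper and the encoding names (ISO-8859-1 standing in for "some input encoding", WINDOWS-1252 for "what the output interface expects") are illustrative assumptions, not anyone's actual implementation, and error handling is kept minimal:

#include <iconv.h>

#include <stdexcept>
#include <string>
#include <string_view>

// One transcoding step from encoding `from` to encoding `to` (POSIX iconv).
// Error handling is deliberately minimal for the sake of the sketch.
std::string transcode(std::string_view text, const char* from, const char* to) {
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        throw std::runtime_error("unsupported conversion");

    std::string out(text.size() * 4 + 4, '\0');  // generous worst-case growth
    char* in_p = const_cast<char*>(text.data()); // iconv's historical signature
    size_t in_left = text.size();
    char* out_p = out.data();
    size_t out_left = out.size();

    size_t rc = iconv(cd, &in_p, &in_left, &out_p, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        throw std::runtime_error("conversion failed");

    out.resize(out.size() - out_left);
    return out;
}

// The model being questioned: two conversions via a UTF-8 intermediary.
std::string via_utf8(std::string_view in) {
    std::string utf8 = transcode(in, "ISO-8859-1", "UTF-8"); // input -> intermediary
    return transcode(utf8, "UTF-8", "WINDOWS-1252");         // intermediary -> output
}

// The model argued for above: one conversion straight to what the
// output interface expects.
std::string direct(std::string_view in) {
    return transcode(in, "ISO-8859-1", "WINDOWS-1252");      // input -> output
}

Both functions end in the same place; the question is whether the extra hop through UTF-8 buys anything when the target interface already dictates the encoding.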
Received on 2024-05-09 07:44:34