Date: Thu, 9 May 2024 11:08:03 +0300
Tiago Freire wrote:
> >If we designed std::cout today, we'd make it do exactly that. In `std::cout <<
> x;`, the inserter (operator<<) would serialize `x` to a sequence of characters in
> an encoding that can represent everything (i.e. UTF-8), the stream would then
> pass that UTF-8 to the stream buffer, the stream buffer would then transcode
> to the output encoding and write it out.
>
> But your stream buffer is associated with 1 (singular) interface that expects a
> definite encoding (that is not always utf-8), why should the data in that stream
> buffer be anything other than the encoding expected by that interface?
> Why do we need to do that extra unnecessary conversion to an intermediary
> encoding (one that neither the input is written in nor the output understands)? Why
> not just do one transcoding that immediately achieves the goal?
> Why do we need a process that tries to do more than what is necessary, just to
> have it do the wrong thing?
There are a number of reasons. First, this is how wide streams have to operate:
they necessarily convert to wide in the inserter, and then convert back to narrow
in the stream buffer. Doing the same in the narrow streams makes stream
behavior consistent. (It is also how hypothetical charN_t-based streams would
work.)
Second, it allows everything in the program to be independent of the output
encoding, which is the only part that would vary at runtime. This makes it
easier to get the program correct without testing it on every possible runtime
encoding it's supposed to support. It's also a good separation of concerns
because the inserters don't need to know what the output encoding is going
to be.
Third,
> But your stream buffer is associated with 1 (singular) interface that expects a
> definite encoding (that is not always utf-8),
that's not even true, because I can easily have a "tee" stream buffer with two
outputs, each with its own encoding (one writing to a file, the other to the terminal).
Received on 2024-05-09 08:08:07