Date: Sat, 29 Jan 2022 00:31:17 +0100
I have a question about the proposal "Formatted output" with number
P2093. Why was stdout from C chosen to be the default output stream
instead of std::cout from C++? I find it unusual that a new proposal
depends on facilities from C instead of C++. It might be a problem if
one mixes std::cout and the default std::print (without C++ stream
given as argument) when syncing with stdio is off. IMO mixing std::cout
and std::print should be a feature that works out of the box without
any boilerplate like giving std::cout as argument.
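A minimal sketch of the concern, modelled with a regular file instead of the real stdout and std::cout so that the result is reproducible. The helper name mixed_output and the file name are made up for this illustration. It assumes that small writes stay in each stream's private buffer until they are flushed, which is the usual behaviour of common implementations but is not promised by the standard for std::ofstream.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes to one file through a C stream and a C++
// stream, each with its own private buffer, and returns the file content.
std::string mixed_output(const std::string& path)
{
    std::remove(path.c_str());                  // start from a clean file

    std::FILE* c_stream = std::fopen(path.c_str(), "a");
    assert(c_stream != nullptr);
    std::ofstream cpp_stream(path.c_str(), std::ios::app);
    assert(cpp_stream.is_open());

    std::fputs("first (C)\n", c_stream);        // waits in the FILE buffer
    cpp_stream << "second (C++)\n";             // waits in the filebuf
    cpp_stream.flush();                         // C++ data reaches the file first
    std::fclose(c_stream);                      // C data reaches the file last
    cpp_stream.close();

    std::ifstream in(path.c_str());
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Under the stated assumption, the text written second appears first in the file, because the order is decided by the flushes and not by the calls. The same can happen between std::cout and a stdout-based std::print once sync_with_stdio(false) has been called.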
The paper gives a couple of arguments in favour of C's stdout, but I'd
argue that further investigation is needed. I'll quote some parts of
the paper and reply to them.
The paper says:
> We propose adding a free function called print with overloads for
> writing to the standard output (the default) and an explicitly passed
> output stream object. The default output stream can be either stdout
> or std::cout. We propose using stdout for the following reasons:
> stdout is considerably faster on at least two major implementations
> (see § 12 Performance).
The benchmark provided in the paper is not complete. It measures some
standard functions like printf, ostream::operator<<(const char *) and 2
calls to fmt::print. The problem is that the benchmark measures the
formatted IO part of the C standard library (printf) and the C++
standard library (operator<<), but instead it should be measuring the
unformatted IO (fwrite, fputs, std::ostream::write). std::print
internally should not depend on the formatted IO functions, but only on
the unformatted ones, because its formatting features are different
from what printf and cout offer. Because it depends only on the
unformatted IO offered by the standard library, that IO should be
measured. I have my
own benchmarks where I show that C++ unformatted IO is even faster than
C's on Linux.
> Better compatibility with other formatted I/O facilities compared to
> std::cout and its associated std::streambuf that suffer from private
> buffering, localization and conversion services that must be
> synchronized at a lower level.
The author mentions private buffering, but C streams (FILE*) also have
private buffering. Here is how things work, AFAIK.
1. At the lowest level Linux offers file IO with file descriptors,
see open(), close(), read(), write(). This IO is unbuffered. Inside the
kernel probably there is some form of caching at various levels (file-
level, inode-level, level of device blocks etc.), but we don't control
that.
2. At C level we have the FILE data structure which internally uses
the file descriptor and adds buffering on top of it, see setvbuf().
3. At C++ level we have std::streambuf and std::filebuf which do the
same: they depend directly on the file descriptor API and add their
own buffering. Only when sync_with_stdio is true does libstdc++ disable
the C++ buffering inside std::cout and depend on the C buffering in
stdout.
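The private buffering in layers 2 and 3 can be observed with a small sketch like the one below. The names Sizes, size_on_disk, c_stream_sizes and cpp_stream_sizes are made up for this illustration, and the result assumes that an 18 character write fits into the stream's buffer, which holds for the explicit setvbuf() buffer and for the default std::filebuf buffer of the common implementations.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

struct Sizes { long long before_flush; long long after_flush; };

// Size of the file as seen through a second, independent handle.
long long size_on_disk(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary | std::ios::ate);
    return static_cast<long long>(in.tellg());
}

// Layer 2: FILE adds a user-space buffer on top of the file descriptor.
Sizes c_stream_sizes(const std::string& path)
{
    std::FILE* f = std::fopen(path.c_str(), "w");
    assert(f != nullptr);
    char buffer[4096];
    std::setvbuf(f, buffer, _IOFBF, sizeof buffer);  // fully buffered
    std::fputs("The answer is 42.\n", f);
    Sizes s;
    s.before_flush = size_on_disk(path);   // data is still in the FILE buffer
    std::fflush(f);
    s.after_flush = size_on_disk(path);    // data has been handed to the OS
    std::fclose(f);                        // closed before buffer goes away
    return s;
}

// Layer 3: std::filebuf keeps its own buffer, independent of any FILE.
Sizes cpp_stream_sizes(const std::string& path)
{
    std::ofstream f(path.c_str());
    assert(f.is_open());
    f.write("The answer is 42.\n", 18);
    Sizes s;
    s.before_flush = size_on_disk(path);   // data is still in the filebuf
    f.flush();
    s.after_flush = size_on_disk(path);    // data has been handed to the OS
    return s;
}
```

In both cases the file is expected to stay empty until the explicit flush, which is the sense in which both FILE and std::filebuf buffer privately.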
C++ streambuf is not localized by default. It holds a locale object
because it depends on codecvt, but the default codecvt<char, char,
mbstate_t> always returns noconv. wstreambuf is a different story. C
streams also have their own encoding conversions built in if we mix in
wchar_t, for example:
    printf("%ls\n", L"wide string");   // has encoding from wide to narrow
                                       // multibyte during string formatting
    wprintf(L"ABC\n");                 // has encoding from wide to narrow
                                       // before sending to stdout from the OS
    wprintf(L"%s\n", "narrow string"); // has encoding from narrow to wide
                                       // during string formatting and again
                                       // from wide to narrow before sending
                                       // to the OS
How std::print behaves when wchar_t is involved should maybe be a
different discussion. My focus here is the performance aspect.
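On the localization point, the claim about the default narrow codecvt facet can be checked directly. This is a small sketch; the function name default_filebuf_skips_conversion is made up for the illustration.

```cpp
#include <cassert>
#include <cwchar>
#include <fstream>
#include <locale>

// Asks the locale held by a default-constructed std::filebuf whether its
// narrow codecvt facet ever converts anything.
bool default_filebuf_skips_conversion()
{
    std::filebuf buf;  // picks up the global locale at construction
    typedef std::codecvt<char, char, std::mbstate_t> facet;
    return std::use_facet<facet>(buf.getloc()).always_noconv();
}
```

A call such as assert(default_filebuf_skips_conversion()); is expected to pass, since codecvt<char, char, mbstate_t> is specified as a degenerate conversion.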
> print won’t use any formatted output functionality of ostream.
It can still depend only on the unformatted IO and be fast.
Implementations of std::print can even grab the std::streambuf inside
ostream and work directly with it to remove the overhead of
std::ostream::write.
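A minimal sketch of that idea, assuming the text has already been formatted into a character buffer. The helper name write_via_streambuf is hypothetical and not taken from the paper or from any implementation. Note the trade-off: std::ostream::write constructs a sentry object, which flushes a tied stream and honours unitbuf, and this sketch skips all of that.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: hands already formatted text straight to the
// stream buffer, bypassing std::ostream::write and its sentry.
bool write_via_streambuf(std::ostream& os, const std::string& text)
{
    assert(os.rdbuf() != nullptr);            // precondition: stream has a buffer
    if (!os.good())
        return false;                         // do not write to a failed stream
    const std::streamsize n = static_cast<std::streamsize>(text.size());
    if (os.rdbuf()->sputn(text.data(), n) != n) {
        os.setstate(std::ios_base::badbit);   // report a short write
        return false;
    }
    return true;
}
```

For example, after std::ostringstream oss; write_via_streambuf(oss, "The answer is 42.\n"); the call oss.str() yields the same text, and passing std::cout instead sends it to the standard output through the buffer of std::cout.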
With all this said, maybe the real reason why stdout was chosen is
that on Windows one can do the trick with
GetConsoleMode(_get_osfhandle(_fileno(stream)), ...) only on C's
stdout. But that is not a problem for standard library implementations;
it is a problem only for the fmt library. I will now show the benchmark
with a few different invocations.
// Filename: cout.cpp
#include <cstdio>
#include <iostream>
#include <benchmark/benchmark.h>

// Formatted C IO.
void bm_printf_with_number(benchmark::State& s) {
    while (s.KeepRunning())
        std::printf("The answer is %d.\n", 42);
}
BENCHMARK(bm_printf_with_number);

void bm_printf_with_string(benchmark::State& s) {
    while (s.KeepRunning())
        std::printf("The answer is 42.\n");
}
BENCHMARK(bm_printf_with_string);

// Unformatted C IO.
void bm_fwrite(benchmark::State& s) {
    const char str[] = "The answer is 42.\n";
    while (s.KeepRunning())
        std::fwrite(str, sizeof(str)-1, 1, stdout);
}
BENCHMARK(bm_fwrite);

void bm_fwrite_2(benchmark::State& s) {
    const char str[] = "The answer is 42.\n";
    while (s.KeepRunning())
        std::fwrite(str, 1, sizeof(str)-1, stdout);
}
BENCHMARK(bm_fwrite_2);

void bm_fputs(benchmark::State& s) {
    const char str[] = "The answer is 42.\n";
    while (s.KeepRunning())
        std::fputs(str, stdout);
}
BENCHMARK(bm_fputs);

void bm_puts(benchmark::State& s) {
    const char str[] = "The answer is 42.";  // puts appends the newline
    while (s.KeepRunning())
        std::puts(str);
}
BENCHMARK(bm_puts);

// Formatted C++ IO.
void bm_ostream_with_number(benchmark::State& s) {
    std::ios::sync_with_stdio(false);
    while (s.KeepRunning())
        std::cout << "The answer is " << 42 << ".\n";
}
BENCHMARK(bm_ostream_with_number);

void bm_ostream_with_string(benchmark::State& s) {
    std::ios::sync_with_stdio(false);
    while (s.KeepRunning())
        std::cout << "The answer is 42.\n";
}
BENCHMARK(bm_ostream_with_string);

// Unformatted C++ IO.
void bm_ostream_write(benchmark::State& s) {
    std::ios::sync_with_stdio(false);
    const char str[] = "The answer is 42.\n";
    while (s.KeepRunning())
        std::cout.write(str, sizeof(str) - 1);
}
BENCHMARK(bm_ostream_write);

BENCHMARK_MAIN();
// End file.
Compile with:
g++ -O2 cout.cpp -lbenchmark
Run with:
./a.out --benchmark_out=result.txt --benchmark_out_format=console > /dev/null && cat result.txt
./a.out --benchmark_out=result.txt --benchmark_out_format=console && cat result.txt
./a.out --benchmark_out=result.txt --benchmark_out_format=console > temp.txt && cat result.txt && rm temp.txt
./a.out --benchmark_out=result.txt --benchmark_out_format=console | grep -v "^T" && cat result.txt
The results will vary, but they should show that C++ std::cout.write
with sync_with_stdio set to false is either the fastest or competitive
with fwrite. Please run this benchmark on the various platforms that
you work on and report the results.
Best,
Dimitrij.
Received on 2022-01-28 23:31:24