ISOCPP std-proposals List: Re: [std-proposals] constexpr support in std::filesystem API

From: Thiago Macieira <thiago_at_[hidden]>
Date: Tue, 12 Mar 2024 12:41:10 -0700

On Tuesday, 12 March 2024 12:19:03 PDT Andrei Grosu wrote:
> I don't agree about the atomic ptr having the same behavior at the user
> level. Clearly if you care enough to use atomic<T>, if it's implemented in
> the uArch or via an operating system synchronization primitive matters, but
> this is not the point.

Of course it matters. Anyone writing atomic code is attempting to write very
efficient code and thus would want to know those details. But that wasn't the
point: the point is that the observable behaviour is the same. In the abstract
machine, there is no performance (good or bad). If you step outside of the
abstract machine, then it matters in different ways: for example, you couldn't
use a looping or mutex-protected atomic on an MMIO region or in memory shared
between processes.
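
A minimal sketch of how code that steps outside the abstract machine might
guard against a lock-based fallback, assuming C++17. The helper name
`usable_in_shared_memory` and the type `Big` are hypothetical, introduced only
for illustration; `std::atomic<T>::is_always_lock_free` is the standard
compile-time query. Lock-freedom is necessary here, not sufficient: MMIO
carries further hardware constraints this check does not cover.

```cpp
#include <atomic>

// Hypothetical helper: true only when every std::atomic<T> object on this
// target is lock-free, i.e. no hidden mutex or lock table is involved.
template <typename T>
constexpr bool usable_in_shared_memory() {
    return std::atomic<T>::is_always_lock_free;
}

// Hypothetical 64-byte payload; mainstream targets have no single atomic
// instruction this wide, so std::atomic<Big> falls back to a lock.
struct Big {
    char bytes[64];
};
```

A `static_assert(usable_in_shared_memory<T>())` next to the declaration of a
shared-memory atomic would turn the performance detail into a build error
where it affects correctness.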

> So, given this code:
>
> std::filesystem::path root("/a/");
> std::filesystem::directory_iterator di(root);
> for (auto const & entry : di)
> {
> // do something with entry.path();
> }
>
> it should produce the same implementation-defined list of paths (as
> described above) in both runtime and constexpr contexts. Seems pretty
> obvious.

Directory listing is not necessarily deterministic. Two different users with
the same checkout could get the same list in two different orders. In fact,
nothing prevents the OS from reordering entries on the same system. The very
act of compilation could cause some background tool to run and modify the
directory in such a way that the order changes on the next build.

You would need to apply a sort on top, and the constexpr environment ought to
require it. This in turn implies specifying which sort order it is. And then
consider the sorting on case-insensitive filesystems, filesystems that do
Unicode normalisation, and those that don't store file names as bytes
(basically, think of macOS and Windows).
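
A minimal sketch of what "a sort on top" could look like with today's runtime
API, assuming C++17. The function names `sort_bytewise` and `sorted_listing`
are hypothetical, and the choice of ordering (plain comparison of the native
code units, with no case folding and no Unicode normalisation) is only one
possible rule, not a proposed one.

```cpp
#include <algorithm>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Order paths by their native code units (char on POSIX, wchar_t on Windows).
// No locale, case folding or Unicode normalisation is applied.
void sort_bytewise(std::vector<fs::path>& paths) {
    std::sort(paths.begin(), paths.end(),
              [](const fs::path& a, const fs::path& b) {
                  return a.native() < b.native();
              });
}

// Collect the entries of `dir` and return them in the order defined above,
// independent of the order in which the OS happened to report them.
std::vector<fs::path> sorted_listing(const fs::path& dir) {
    std::vector<fs::path> entries;
    for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
        entries.push_back(entry.path());
    }
    sort_bytewise(entries);
    return entries;
}
```

Even this fixes only the order. Two names that differ only in case or in
normalisation form may be one file on one system and two files on another, so
the resulting list can still differ between machines.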

Speaking of Windows, how about file sizes and "text"-mode reading, that is,
the CRLF to LF translation? Should text mode be allowed? What happens to
content that is loaded into the compilation with different CRLF settings
depending on the OS?
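
To make the size question concrete, here is a small sketch, assuming C++17, of
the translation a text-mode read performs on Windows. The function name
`crlf_to_lf` is hypothetical, and it models only the CRLF to LF step.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Replace every CRLF pair with a single LF; a lone CR is left untouched.
std::string crlf_to_lf(std::string_view raw) {
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r' && i + 1 < raw.size() && raw[i + 1] == '\n') {
            continue;  // drop the CR; the LF is copied on the next iteration
        }
        out.push_back(raw[i]);
    }
    return out;
}
```

A six-byte file containing "a\r\nb\r\n" yields four characters after
translation, so the size reported by the filesystem and the size of the
content seen by the program disagree.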

Then there's the question of timestamps. Should those be allowed to be used in
constexpr environments? What's the resolution of the timestamps? Modern Unix
systems can store nanoseconds; NTFS stores multiples of 100 nanoseconds, and
FAT can only store multiples of 2 seconds. Some filesystems can store birth
time, some cannot. Unix filesystems also have a ctime, but Windows ones don't.
Then there's atime...
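
A sketch, assuming C++17, of how the same instant comes back differently
depending on the granularity a filesystem can store. The names `ntfs_tick`,
`fat_tick` and `as_stored` are hypothetical. The 100-nanosecond unit for NTFS
is an assumption taken from the Windows FILETIME format, and real filesystems
may round rather than truncate.

```cpp
#include <chrono>
#include <ratio>

namespace chr = std::chrono;

// Assumed storage granularities, for illustration only.
using ntfs_tick = chr::duration<long long, std::ratio<1, 10'000'000>>;  // 100 ns
using fat_tick  = chr::duration<long long, std::ratio<2>>;              // 2 s

// Truncate a nanosecond-precision time offset to what a filesystem with
// granularity Tick could store, then express the result in nanoseconds again.
template <class Tick>
constexpr chr::nanoseconds as_stored(chr::nanoseconds t) {
    return chr::duration_cast<chr::nanoseconds>(chr::floor<Tick>(t));
}
```

With an input of 5'123'456'789 ns, `as_stored<ntfs_tick>` gives
5'123'456'700 ns and `as_stored<fat_tick>` gives 4'000'000'000 ns, so a
constant expression comparing two file times could evaluate differently
depending on where the sources were checked out.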

How about inode numbers or file permissions/attributes?

> This would be useful in 'low level' tooling, IDLs, build systems , dynamic
> test generation and the sort of automation stuff that goes on in CI/CD
> pipelines.

Maybe. But you need to explain why this needs to or should be done in C++ in
the first place. We have powerful build systems and I see no reason for C++ to
extend leftwards into them. I also don't buy the argument that the compiler
should be the only tool you ever need: code generators are a good thing in
many cases. You need to compare against the best current alternative and show
that it can be improved upon, reducing the burden on developers somehow.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Principal Engineer - Intel DCAI Cloud Engineering

Received on 2024-03-12 19:41:13