Date: Tue, 25 Apr 2023 19:28:11 -0600
On Apr 25, 2023, at 4:47 PM, Thiago Macieira via Std-Proposals <std-proposals_at_[hidden]> wrote:
> Again, from experience not all randomly-named files are temporary. Some of
> those get renamed to permanent names and remain for long periods.
Agreed (“having the ability to rename a temporary file to a specified name is very useful IMO”). Also, yes, I’m aware of the benefits of being able to rename within a filesystem. And I agree that being able to specify the location for temporary files is critical for supporting that.
Following our conversation, I have updated my proposed mechanism to add a new openmode flag instead of introducing a new overload of the open function (see my last email in full if you didn’t notice that).
> In my informed[*] opinion, tmpfile() is too limited. I've just given two
> reasonable use-cases where it wouldn't suffice. I believe it would be best to go
> beyond it.
>
> [*] informed because I am the maintainer of QFile, QTemporaryFile and
> QSaveFile.
FWIW, I’m sincerely honored to be getting your input/feedback. Thank you.
I think we’re both saying that the interface to std::fstream::open ideally needs to support the varieties of usage while creating as little impedance mismatch as possible with whatever underlying kernel facilities may be available.
I had proposed calling the new openmode flag “tmpfile”, and it would be defined similarly to flags like “in”, “out”, “trunc”, “noreplace” etc. On systems that don’t support O_TMPFILE directly, the libc++ implementation would use fallbacks like those mentioned, such as the create-exclusive strategy of trying randomized names until one succeeds.
This would mean we could use std::fstream::open to create a temporary file under a path of our choosing with syntax like:
std::fstream os;
os.open(path, in|out|trunc|noreplace|tmpfile); // <-- using the new openmode “tmpfile”
Here path might be std::filesystem::temp_directory_path(), and prefixes like std::ios_base:: are elided for exposition. This would request creating the new file exclusively and, basically, in an unlinked state. On systems supporting O_TMPFILE this would be exactly what’s documented for O_TMPFILE (see https://man7.org/linux/man-pages/man2/open.2.html). Otherwise, the libc++ implementation would fall back to providing behavior as similar as possible, for example by separately calling unlink on the file. Similarly, omitting the C++23 “noreplace” flag would mean not specifying O_EXCL in conjunction with O_TMPFILE, which allows the resulting file to be linked back into the filesystem, i.e. “materialized” like you’ve mentioned.
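To make the fallback concrete, here is a rough sketch at the POSIX level of what an implementation might do underneath such a flag. This is only my illustration, not libc++ code: the helper name open_unnamed_temp and the “tmpXXXXXX” template are made up, and it assumes a POSIX system where open(2) may or may not offer O_TMPFILE.

```cpp
#include <cassert>
#include <string>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper (not part of any library): returns a read/write
// descriptor for a nameless file living in `dir`, or -1 on failure.
int open_unnamed_temp(const std::string& dir) {
    assert(!dir.empty());
    int fd = -1;
#ifdef O_TMPFILE
    // Preferred path: the file is created without a name in a single
    // kernel operation. O_EXCL here means it can never be linked later.
    fd = ::open(dir.c_str(), O_TMPFILE | O_RDWR | O_EXCL, 0600);
    if (fd >= 0) return fd;
    // Otherwise fall through, e.g. the filesystem lacks O_TMPFILE support.
#endif
    // Fallback: exclusive creation under a randomized name, then unlink.
    std::string tmpl = dir + "/tmpXXXXXX";   // assumed name template
    fd = ::mkstemp(&tmpl[0]);                // fills in the XXXXXX part
    if (fd < 0) return -1;
    // Race window: a process killed before this call leaves the file behind.
    ::unlink(tmpl.c_str());
    return fd;
}
```

The two branches differ only in whether a name ever exists in the directory; with the fallback there is a short window in which it does.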
Supporting this re-materialization seems like it would require an additional change, but I believe that other change would be independent of a change that just added support for temporary-file semantics.
Should a change to additionally support “materialization” of a temporary file be in a separate proposal or the same proposal? I was of the opinion that because they’re independent, doing a separate follow-on proposal would be easier. I don’t want to get side-tracked on the pros and cons of different ways of possibly supporting “materialization”, but for the sake of completeness: I had mentioned adding a “rename_to” function to std::fstream for this, as well as for renaming files opened by an fstream object.
> Sure it does, on Unix systems, which is how tmpfile() was first implemented and
> designed for. The way you do it on Linux is that you delete the file after
> you've created-opened it.
Which is at least racy when the system call that creates the file doesn’t unlink it in the same kernel operation: a process killed in that window may leave behind a file that was never meant to be left behind. That’s why I want std::fstream to be able to select underlying facilities when available, like open(2)’s O_TMPFILE flag. I see, thankfully, that’s what QTemporaryFile <https://github.com/qt/qtbase/blob/dev/src/corelib/io/qtemporaryfile.cpp> does when available. As a user of Qt, my thanks to you for doing that!
Now the question is, can we come up with a proposed change to the C++ standard that relieves classes like QTemporaryFile of having to carry so much infrastructure to get their functionality? I mean, given all the racy, insecure implementations beyond tmpnam(3), tempnam(3), mktemp(3), and even tmpfile(3) too, it’d be nice to come up with a change to the standard that solved this. I believe that between the C++23 “noreplace” openmode and a new one to select temporary-file semantics, we’d make progress on this. Adding something like a “rename_to” function, or maybe giving the std::filesystem::rename function the option of taking a std::fstream to rename, then seems to address the materialization and renaming capabilities.
I suspect there’s a ton of implementations for temporary files in C++ that weren’t as well coded as QTemporaryFile. Another reason IMO to help users with something that libc++ already has much of the infrastructure ready for.
> The problem is that it doesn't work for the survives-after-close scenarios
> from above. If the temporary file is meant to do that, one still needs the
> ability to generate a random and unique name.
I don’t think I’m following this. What might you propose from a standards perspective to deal with this? Or put another way, how might adding a “tmpfile”-like flag to openmode worsen this?
> The XXXXXX template replacement is a well-known technique. For many uses where
> the name of the file is important, the prefix or the suffix of the name might need
> to be fixed so the file is interpreted correctly by whatever is consuming it.
> Think for example of creating a new executable on Windows: it MUST end in .exe
> (or one of a handful of other suffixes).
Wouldn’t the flow for creating a new executable be like:
1. Create a temporary file, opened for writing, in the same filesystem as the executable. The directory matters but the file component of the path doesn’t.
2. Write out the executable to the temporary file.
3. Rename the temporary file to the target name of the executable. Full pathname matters.
Here, extending the openmode flags to support temporary-file behavior doesn’t seem to make this any harder. It only begs for having a way to rename/materialize the temporary file to the target executable name. Seems like you’d be more in favor of making that part of a proposal for integrating support for creating temporary files. Is that correct?
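For what it’s worth, the three steps can be approximated today with C++17 plus the C11 exclusive “x” mode of fopen. The following is only a sketch under those assumptions: write_then_rename and the “.tmp-” name prefix are hypothetical names of mine, not proposed API, and it relies on the C library honoring “x”.

```cpp
#include <cassert>
#include <cstdio>
#include <filesystem>
#include <random>
#include <string>
#include <string_view>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: writes `contents` to a randomly named file next to
// `target`, then renames it into place. Returns true on success.
bool write_then_rename(const fs::path& target, std::string_view contents) {
    assert(target.has_filename());
    const fs::path dir =
        target.has_parent_path() ? target.parent_path() : fs::path(".");
    std::random_device rd;
    for (int attempt = 0; attempt < 16; ++attempt) {
        // Step 1: exclusive creation in the target's directory, so the
        // later rename stays within one filesystem.
        const fs::path tmp = dir / (".tmp-" + std::to_string(rd()));
        std::FILE* f = std::fopen(tmp.string().c_str(), "wbx");
        if (!f) continue;  // name already taken (or other error): retry
        // Step 2: write out the payload.
        const bool wrote =
            std::fwrite(contents.data(), 1, contents.size(), f) == contents.size();
        const bool closed = std::fclose(f) == 0;
        std::error_code ec;
        if (wrote && closed) {
            // Step 3: only now does the full target name matter.
            fs::rename(tmp, target, ec);
            if (!ec) return true;
        }
        fs::remove(tmp, ec);  // clean up the stray temporary on failure
        return false;
    }
    return false;
}
```

Note that the temporary here still has a visible name until the rename, which is exactly the part a “tmpfile” openmode plus some materialization mechanism would improve on.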
Lou
Received on 2023-04-26 01:28:23