sg16: Re: [SG16-Unicode] P1689: Encoding of filenames for interchange

From: Niall Douglas <s_sourceforge_at_[hidden]>
Date: Fri, 6 Sep 2019 19:03:42 +0100

>>> A forklift upgrade of the file system apis is not in the realm of
>>> possibility, even if C provided a string type that allows embedded nuls.
>>> Every program that processes paths is vulnerable to attack with
>>> unexpected nuls. Even if POSIX provided APIs it would be fantastically
>>> unlikely that vendors would allow their customers to be broken that way,
>>> because the old APIs can't be turned off.
>>
>> POSIX already allows NUL to appear in paths returned by the OS. This is
>> because POSIX code must be a *taker* when it comes to paths supplied by
>> others e.g. by other systems, or filing systems, where NUL in path
>> components is legal.
>
> Can you please provide a link to some documentation of a filesystem that
> allows NUL in path components?

I was thinking of struct dirent, where the record length is supplied to
you, and where the leafname is guaranteed to be null terminated but is
not guaranteed to be free of embedded null characters. In other words,
use the length told to you, not strlen(). That certainly is the case for
Linux and FreeBSD.

Having looked up POSIX's requirements for struct dirent at
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/dirent.h.html,
I believe I am in error wrt struct dirent. POSIX says strlen().
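
To illustrate the difference between the two readings, here is a minimal
Python sketch. It does not touch a real struct dirent: the buffer below
is a made-up stand-in for a name field, and both helper names are
hypothetical.

```python
def name_by_strlen(field: bytes) -> bytes:
    """Read the name as a C string: stop at the first NUL byte."""
    end = field.find(b"\0")
    return field if end < 0 else field[:end]


def name_by_length(field: bytes, length: int) -> bytes:
    """Read the name as a counted string: trust the supplied length."""
    return field[:length]


# Hypothetical name field: "ab", an embedded NUL, "cd", then the terminator.
field = b"ab\0cd\0"
print(name_by_strlen(field))     # the strlen() reading
print(name_by_length(field, 5))  # the counted reading
```

Under the strlen() reading, everything after the first NUL is invisible
to the caller, which is why the two readings only agree when the name
has no embedded NUL.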

For the NT kernel's OBJECT_ATTRIBUTES structure, to the best of my
knowledge, only the presence or lack of OBJECT_NAME_PATH_SEPARATOR
causes STATUS_OBJECT_NAME_INVALID in certain circumstances, as does
supplying a length which is not a multiple of two to a destination which
requires wchar_t input. Otherwise it accepts any combination of bytes.
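
As a rough illustration of what "any combination of bytes" of even
length permits, the following Python sketch builds a counted UTF-16
buffer containing an embedded NUL code unit and an unpaired surrogate.
It only manipulates bytes and makes no NT call; the function name and
the sample name are hypothetical.

```python
def describe_counted_name(raw: bytes) -> dict:
    """Report properties of a byte buffer treated as a counted UTF-16LE name."""
    even = len(raw) % 2 == 0
    # Split into 16-bit code units, ignoring a trailing odd byte if present.
    units = [raw[i:i + 2] for i in range(0, len(raw) - 1, 2)]
    has_nul = b"\0\0" in units
    try:
        raw.decode("utf-16-le")
        valid = True
    except UnicodeDecodeError:
        valid = False
    return {"even_length": even, "embedded_nul_unit": has_nul, "valid_utf16": valid}


# "a", NUL, "b", then a lone high surrogate: even length, yet not valid UTF-16.
sample = "a\0b\ud800".encode("utf-16-le", "surrogatepass")
print(describe_counted_name(sample))
```

Such a buffer cannot be carried in a NUL-terminated string, and it
cannot be converted to UTF-8 without loss.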

(I am unaware of anywhere this is documented, but equally, I know of
nowhere which documents any illegal characters or banned sequences for
NT kernel object names, because it depends on the bit of the NT kernel
namespace which receives your path fragment and what it considers valid
or not. For example, NTFS refuses OBJECT_NAME_PATH_SEPARATOR in
leafnames, but another filesystem driver might not.)

If you run through ntfs-3g's UTF conversion code at
https://github.com/vitalif/ntfs-3g/blob/04d4b37a9a6c992d89c93193d0abdd13ab1f2931/libntfs-3g/unistr.c,
you'll see that they hard assume that filenames match the Win32
restrictions.

There are open bugs filed against ntfs-3g about how NTFS filenames
containing '/' confuse the hell out of Linux.

If I get a chance over the weekend, I'll quickly write you an LLFIO
program testing which characters in filenames Windows accepts.
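
In the meantime, here is a rough Python sketch of the idea. It is not
the LLFIO program: it goes through the platform's ordinary file creation
API rather than the NT kernel API, so on Windows it would measure the
Win32 restrictions, not what the kernel accepts. The function name and
the "probe" leafname pattern are hypothetical.

```python
import os
import tempfile


def probe_filename_chars(candidates):
    """Map each candidate character to whether a leafname containing it
    can be created and then read back unchanged from the directory."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for ch in candidates:
            leaf = f"probe{ch}probe"
            try:
                # "x" mode fails rather than reusing an existing file.
                with open(os.path.join(tmp, leaf), "x"):
                    pass
            except (OSError, ValueError):
                # ValueError covers an embedded NUL rejected before the OS sees it.
                results[ch] = False
            else:
                # Creation succeeded; check the leafname was stored as given.
                results[ch] = leaf in os.listdir(tmp)
    return results


for ch, ok in probe_filename_chars(["a", "\0", "/", ":", "*"]).items():
    print(repr(ch), "accepted" if ok else "rejected")
```

The read-back check matters because some platforms accept a name but
store something different from what was asked for.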

>> Quite a few filing systems already implement this using proprietary
>> APIs, because it's very useful. NTFS and ZFS come immediately to mind.
>> The ability to support this from standard POSIX code is desirable, and I
>> and others are trying to get that over the line, and into standards.
>
> Ok, you mention NTFS and ZFS here, can you provide some links to their
> documentation that describes this?

For Windows, simply specify the volume, and the unique id in curly
brackets e.g.

C:\{xxxxxxxxx}

There is a convenience Win32 API to do this for you at
https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-openfilebyid

For ZFS, there is a userspace library for talking to ZFS which the zdb
utility uses. It has a function which lists information for a ZFS object
id, including current path etc. It runs in nearly constant time, because
of how ZFS works internally.

For compatibility, the inode number is kept equal to the object id for
files. So you'll see this work if you do:

zdb -dddd rpool/tank <inode number>
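
If you want to script that, a minimal Python sketch follows. It relies
on the inode number being equal to the object id as described above,
assumes the path really does live on the named dataset, and only builds
the command line without running zdb. The function name is hypothetical.

```python
import os


def zdb_command_for(path: str, dataset: str) -> list:
    """Build the zdb invocation for the object backing `path`."""
    inode = os.stat(path).st_ino
    return ["zdb", "-dddd", dataset, str(inode)]


# "rpool/tank" is the dataset name from the example above.
print(" ".join(zdb_command_for(os.curdir, "rpool/tank")))
```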

I can't remember the name of the zfs userspace library API, and it's the
end of the work day, but you can find it from the info above.

>> Perhaps what you are not considering is that future storage devices will
>> expose their internal key-value stores to the host? Some are already
>> available on the market. We'd like to efficiently support those from
>> standard code. They offer path-based lookup with performance orders of
>> magnitude faster than existing path lookup. Such storage would be
>> particularly suitable for build systems, which would create a "lookup
>> realm", and fire objects into that realm each with a binary identifier.
>> Entire realms can be efficiently cloned, or deleted. It's much faster
>> than text-path-based filesystem build artefact stores, because you can
>> avoid a kernel transition most of the time, userspace talks directly to
>> the storage device.
>
> I admit that this is something I have little exposure to.

Samsung are the most noisy mover here with their SNIA "standard" API for
key-value direct SSD access, but Seagate are quietly exposing to the
public ever more of their internal API endpoints.

Don't get me wrong, we're not there yet, and I have serious concerns
about the v1.0 SNIA spec, none of which Samsung took seriously because
they don't consider my feedback important. But standardisation here is
totally possible in the next few years. In the end, modern storage
devices tend to operate some form of key value store internally anyway
to do wear levelling, shingled writes etc, so exposing it publicly is
not hard.

>> Getting back to the OP's original question, I repeat once again, they
>> are best storing both the raw byte edition AND a UTF8-attempt at
>> conversion, try the raw byte array first, if unfound try the UTF8
>> edition converted to the local native filesystem encoding. It's the only
>> sensible approach.
>
> I think I agree here. The fallback to the UTF-8 encoded name may
> succeed in some cases when the raw code units are passed to an
> OS/library interface that mutates it before sending it down to the
> filesystem (e.g., MSVC's _fopen()). However, this fallback should
> probably be disabled if the producer knows that the UTF-8 representation
> is not accurate (e.g., contains a substitution character).

Unusual for us to agree, isn't it, Tom? :)
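
For the OP, a minimal Python sketch of the lookup order discussed above:
raw bytes first, then the UTF-8 edition converted to the native
filesystem encoding, with the fallback disabled when the conversion was
lossy. The record layout and function names are hypothetical, and
treating any U+FFFD as "lossy" is a conservative assumption.

```python
import os

REPLACEMENT = "\ufffd"


def make_record(raw: bytes, source_encoding: str) -> dict:
    """Producer side: keep the raw bytes plus a best-effort UTF-8 edition."""
    text = raw.decode(source_encoding, errors="replace")
    return {"raw": raw, "utf8": text.encode("utf-8"), "lossy": REPLACEMENT in text}


def _exists(path: bytes) -> bool:
    try:
        os.lstat(path)
        return True
    except (OSError, ValueError):  # ValueError: embedded NUL in the path
        return False


def resolve(record: dict, directory: bytes):
    """Consumer side: try the raw name, then the UTF-8 edition if trustworthy."""
    candidate = os.path.join(directory, record["raw"])
    if _exists(candidate):
        return candidate
    if record["lossy"]:
        return None  # the UTF-8 edition is known to be inaccurate
    native = os.fsencode(record["utf8"].decode("utf-8"))
    candidate = os.path.join(directory, native)
    return candidate if _exists(candidate) else None
```

For example, a record whose raw bytes are UTF-16 for "build.o" fails the
raw lookup on a POSIX system because of the embedded NULs, then resolves
through the UTF-8 edition if a file named "build.o" exists.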

Niall

Received on 2019-09-06 20:03:49