Date: Thu, 7 Mar 2019 10:30:55 -0500
On Thu, Mar 07, 2019 at 00:15:34 -0500, Tom Honermann wrote:
> I don't know of any that use 32-bit code units for file names.
>
> I find myself thinking (as I so often do these days much to the surprise
> of my past self), how does EBCDIC and z/OS fit in here? If we stick to
> JSON and require the dependency file to be UTF-8 encoded, would all file
> names in these files be raw8 encoded and effectively unreadable (by
> humans) on z/OS? Perhaps we could allow more flexibility, but doing so
> necessarily invites locales into the discussion (for those that are
> unaware, EBCDIC has code pages too). For example, we could require that
> the selected locale match between the producers and consumers of the
> file (UB if they don't) and permit use of the string representation by
> transcoding from the locale interpreted physical file name to UTF-8, but
> only if reverse-transcoding produces the same physical file name,
> otherwise the appropriate raw format must be used.
I first tried saying "treat these strings as if they were byte arrays"
with allowances for escaping `"` and `\`, but there was pushback on the
previous thread about it. That basically creates a new dialect of JSON,
one that existing implementations would (usually) reject as an error. It
would mean tools having to implement their own JSON parsers (or even
writers)…
Note that if you'd like to have a readable filename, adding it as a
`_readable` key holding a human-readable UTF-8 transcoding of the
filename would be supported (see my message with the JSON schema bits
from yesterday).
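
As a rough illustration of the round-trip rule Tom proposes above, here
is a minimal Python sketch. It is not the schema from my earlier
message: the `format` and `value` keys, the function name, and the
placement of `_readable` next to the raw bytes are all hypothetical, and
the locale is stood in for by a Python codec name such as `cp037` (one
EBCDIC code page).

```python
import json


def describe_filename(raw: bytes, locale_encoding: str) -> dict:
    """Pick a representation for one physical file name.

    Hypothetical layout: "format"/"value" are made-up keys for this
    sketch; only "raw8" and "_readable" are names from the thread.
    """
    try:
        # Interpret the physical name through the selected locale.
        text = raw.decode(locale_encoding)
        # Reverse-transcode and require the identical byte sequence.
        round_trips = text.encode(locale_encoding) == raw
    except UnicodeError:
        round_trips = False

    if round_trips:
        # Safe to store as an ordinary JSON string.
        return {"format": "string", "value": text}

    # Otherwise fall back to the raw bytes, plus a lossy hint for humans.
    return {
        "format": "raw8",
        "value": list(raw),
        "_readable": raw.decode(locale_encoding, errors="replace"),
    }


# An EBCDIC-encoded name that survives the round trip.
ebcdic_name = "hello.cpp".encode("cp037")
print(json.dumps(describe_filename(ebcdic_name, "cp037")))

# A name that is not valid in the selected encoding falls back to raw8.
print(json.dumps(describe_filename(b"bad\xff.cpp", "utf-8")))
```

Since `json.dumps` escapes non-ASCII characters by default, the output
is plain ASCII and therefore valid UTF-8 that any stock JSON parser can
read, which is the property the byte-array dialect would lose.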
--Ben
Received on 2019-03-07 16:31:09