Date: Tue, 26 Feb 2019 11:59:30 -0500
"Most".
But that's why I said it might be reasonable. It depends on how much IBM
cares about this for z/OS and, in addition, on whether anyone in Japan is
using non-Unicode encodings for file names that we will need to support.
JIS doesn't round-trip through Unicode cleanly, although there isn't
information loss. For historical reasons there are identical characters
with different encodings.
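
As a rough illustration of the transcoding concern, here is a minimal Python
sketch. The file name and the helper names are made up for the example, and
Latin-1 is only a stand-in for any non-Unicode file name encoding; none of
this is what a real compilation database writer does.

```python
import json

# Hypothetical file name: "é.cpp" stored on disk as Latin-1 bytes, not UTF-8.
raw_name = b"\xe9.cpp"


def to_json_string(name: bytes) -> str:
    """Strictly decode a file name as UTF-8 and serialise it as a JSON string."""
    return json.dumps(name.decode("utf-8"))


def transcode_round_trip(name: bytes, source_encoding: str) -> bytes:
    """Transcode a name to Unicode, then write it back out as UTF-8 bytes."""
    return name.decode(source_encoding).encode("utf-8")


# The raw bytes are not valid UTF-8, so they cannot go into JSON unchanged.
try:
    to_json_string(raw_name)
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Transcoding works, but the resulting bytes are a different file name,
# so an exact-match open on the original file would no longer find it.
transcoded = transcode_round_trip(raw_name, "latin-1")
names_match = transcoded == raw_name
```

With a pure ASCII name both steps succeed and the bytes are unchanged, which
is why you don't notice the problem until a non-ASCII name shows up.
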
On Tue, Feb 26, 2019 at 10:46 AM Manuel Klimek <klimek_at_[hidden]> wrote:
> On Tue, Feb 26, 2019 at 3:59 PM Steve Downey <sdowney_at_[hidden]> wrote:
>
>> The compilation db is at least nominally JSON, so the strings are
>> Unicode. The names of files and directories are not.
>>
>
> I was under the impression that POSIX doesn't specify an encoding, but
> says the bytes 0x00 and 0x2f (/) are special. Thus, if we want to do
> unicode file names, it's pretty much utf-8 that fits, which if I'm not
> mistaken most OS'es are now using?
>
>
>> If you stick to ascii, you won't notice. But if you transcode a filename
>> to unicode the filesystem might not recognize it anymore.
>>
>> This might be a reasonable tradeoff, that you need to have your file
>> names in a unicode encoding, but it's not the current state of the world.
>>
>> On Tue, Feb 26, 2019 at 8:52 AM Manuel Klimek <klimek_at_[hidden]> wrote:
>>
>>> On Tue, Feb 26, 2019 at 2:30 PM Steve Downey <sdowney_at_[hidden]> wrote:
>>>
>>>> I'm pretty sure compilation DB totally ignores this, and is easy to get
>>>> invalid json in. Makefile syntax would care somewhat less.
>>>
>>>
>>> What specifically do you mean? The encoding or the path?
>>>
>>>
>>>>
>>>> I don't think it would be the worst thing for these tools to require
>>>> Unicode without any normalization. You have to be able to fopen, and that
>>>> means an exact match. I don't know the state of the world for Windows for
>>>> utf-8 vs ucs2. Can we reliably get the file open?
>>>>
>>>> I think the TR can place more requirements than the IS can on file
>>>> names.
>>>>
>>>> On Tue, Feb 26, 2019, 04:50 Manuel Klimek <klimek_at_[hidden]> wrote:
>>>>
>>>>> On Tue, Feb 26, 2019 at 4:01 AM Ben Boeckel <ben.boeckel_at_[hidden]>
>>>>> wrote:
>>>>>
>>>>>> On Mon, Feb 25, 2019 at 09:52:34 +0100, Manuel Klimek wrote:
>>>>>> > In the compilation database (
>>>>>> > https://clang.llvm.org/docs/JSONCompilationDatabase.html) we
>>>>>> > specify the build dir for each file.
>>>>>>
>>>>>> But that is (generally) output from the build system, not the
>>>>>> compiler. The build system knows because…well, it does. The
>>>>>> compiler is just invoked in a working directory and given no
>>>>>> indication of where a "root" directory is (and I think it might be
>>>>>> silly to pass it on the command line just to have it in this file,
>>>>>> but maybe not).
>>>>>
>>>>>
>>>>> Can't the compiler put in the current work directory it's been called
>>>>> with? That's what I'd expect.
>>>>> _______________________________________________
>>>>> Tooling mailing list
>>>>> Tooling_at_[hidden]
>>>>> http://www.open-std.org/mailman/listinfo/tooling
>>>>>
Received on 2019-02-26 17:59:45