Date: Wed, 27 Feb 2019 09:45:11 +0100
On Tue, Feb 26, 2019 at 5:59 PM Steve Downey <sdowney_at_[hidden]> wrote:
> "Most".
> But that's why I said it might be reasonable. It depends on how much IBM
> cares about this for z/OS and, in addition, on whether anyone in Japan is
> using non-Unicode encodings for file names that we will need to support. JIS
> doesn't round-trip through Unicode cleanly, although there isn't
> information loss. For historical reasons there are identical characters
> with different encodings.
>
Oh wow, not being able to round-trip is... unexpected. Thanks for the info!
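
A minimal sketch of the duplicate-encoding issue, assuming Python's cp932
codec (Microsoft's Shift_JIS variant) as a stand-in for the JIS family. The
two byte sequences are an illustrative choice, not something taken from this
thread. Both are expected to decode to the same character (U+2235), so at
most one of them can come back unchanged after a trip through Unicode:

```python
# Two CP932 byte sequences for the "because" sign: one in the JIS X 0208
# area, one in the NEC row 13 extension area.
# (Illustrative choice; assumes Python's cp932 codec maps both of them.)
JIS_BYTES = b"\x81\xe6"
NEC_BYTES = b"\x87\x9a"


def round_trips(raw: bytes, codec: str = "cp932") -> bool:
    """Return True if decoding and then re-encoding reproduces the bytes."""
    return raw.decode(codec).encode(codec) == raw


# Both sequences name the same character once they are in Unicode ...
same_character = JIS_BYTES.decode("cp932") == NEC_BYTES.decode("cp932")
# ... so the encoder has to pick one, and the other cannot round-trip.
results = [round_trips(JIS_BYTES), round_trips(NEC_BYTES)]
print(same_character, results)
```

No character is lost here, but a file name stored with the sequence the
encoder does not pick would no longer match byte for byte.
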
> On Tue, Feb 26, 2019 at 10:46 AM Manuel Klimek <klimek_at_[hidden]> wrote:
>
>> On Tue, Feb 26, 2019 at 3:59 PM Steve Downey <sdowney_at_[hidden]> wrote:
>>
>>> The compilation db is at least nominally JSON, so the strings are
>>> Unicode. The names of files and directories are not.
>>>
>>
>> I was under the impression that POSIX doesn't specify an encoding, but
>> says the bytes 0x00 and 0x2f (/) are special. Thus, if we want to do
>> Unicode file names, it's pretty much UTF-8 that fits, which, if I'm not
>> mistaken, most OSes are now using?
>>
>>
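
A small sketch of why UTF-8 fits the POSIX rule quoted above while UTF-16
does not. The helper name special_bytes and the sample character are
hypothetical, chosen only for illustration:

```python
# Bytes that POSIX treats specially in path names: NUL and '/'.
SPECIAL = {0x00, 0x2F}


def special_bytes(text: str, encoding: str) -> set:
    """Return the special byte values that occur in the encoded text."""
    return SPECIAL & set(text.encode(encoding))


# U+012F (LATIN SMALL LETTER I WITH OGONEK) is not a slash as a character.
name = "\u012f"

# In UTF-8 every byte of a non-ASCII character is >= 0x80.
print(special_bytes(name, "utf-8"))
# In UTF-16-LE the code unit 0x012F is serialized as the bytes 2F 01.
print(special_bytes(name, "utf-16-le"))
# Plain ASCII text in UTF-16 contains NUL bytes.
print(special_bytes("a", "utf-16-le"))
```
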
>>> If you stick to ASCII, you won't notice. But if you transcode a filename
>>> to Unicode, the filesystem might not recognize it anymore.
>>>
>>> This might be a reasonable tradeoff, that you need to have your file
>>> names in a Unicode encoding, but it's not the current state of the world.
>>>
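
To make the transcoding problem concrete, here is a sketch that assumes a
hypothetical file name stored on disk in Latin-1. Nothing touches the
filesystem; it only compares the bytes a lookup would have to match:

```python
# Hypothetical file name as stored on disk in Latin-1 (ISO 8859-1).
on_disk = "caf\u00e9.cpp".encode("latin-1")

# Transcoding the name to UTF-8 produces a different byte sequence.
transcoded = on_disk.decode("latin-1").encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A byte-exact lookup with the transcoded name would miss the file.
print(on_disk == transcoded)
# The original bytes are not valid UTF-8, so they cannot be placed in a
# JSON string unchanged.
print(is_valid_utf8(on_disk), is_valid_utf8(transcoded))
```
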
>>> On Tue, Feb 26, 2019 at 8:52 AM Manuel Klimek <klimek_at_[hidden]> wrote:
>>>
>>>> On Tue, Feb 26, 2019 at 2:30 PM Steve Downey <sdowney_at_[hidden]> wrote:
>>>>
>>>>> I'm pretty sure the compilation DB totally ignores this, and it is easy
>>>>> to get invalid JSON into it. Makefile syntax would care somewhat less.
>>>>
>>>>
>>>> What specifically do you mean? The encoding or the path?
>>>>
>>>>
>>>>>
>>>>> I don't think it would be the worst thing for these tools to require
>>>>> Unicode without any normalization. You have to be able to fopen, and that
>>>>> means an exact match. I don't know the state of the world on Windows for
>>>>> UTF-8 vs. UCS-2. Can we reliably get the file open?
>>>>>
>>>>> I think the TR can place more requirements than the IS can on file
>>>>> names.
>>>>>
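
A short sketch of why normalization matters for an exact match, using a
hypothetical file name. The two strings render identically but are different
code point sequences, and therefore different bytes:

```python
import unicodedata

# The same visible name in two Unicode forms.
composed = "caf\u00e9.cpp"                            # e-acute as one code point
decomposed = unicodedata.normalize("NFD", composed)  # 'e' + combining acute

# The strings and their UTF-8 bytes differ, so a byte-exact open can fail
# if a tool normalizes the name it was given.
print(composed == decomposed)
print(composed.encode("utf-8"), decomposed.encode("utf-8"))
# Normalizing back to NFC recovers the composed form.
print(unicodedata.normalize("NFC", decomposed) == composed)
```
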
>>>>> On Tue, Feb 26, 2019, 04:50 Manuel Klimek <klimek_at_[hidden]> wrote:
>>>>>
>>>>>> On Tue, Feb 26, 2019 at 4:01 AM Ben Boeckel <ben.boeckel_at_[hidden]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Mon, Feb 25, 2019 at 09:52:34 +0100, Manuel Klimek wrote:
>>>>>>> > In the compilation database (
>>>>>>> > https://clang.llvm.org/docs/JSONCompilationDatabase.html) we specify the
>>>>>>> > build dir for each file.
>>>>>>>
>>>>>>> But that is (generally) output from the build system, not the compiler.
>>>>>>> The build system knows because…well, it does. The compiler is just
>>>>>>> invoked in a working directory and given no indication of where a "root"
>>>>>>> directory is (and I think it might be silly to pass it on the command
>>>>>>> line just to have it in this file, but maybe not).
>>>>>>
>>>>>>
>>>>>> Can't the compiler put in the current working directory it's been
>>>>>> called with? That's what I'd expect.
>>>>>> _______________________________________________
>>>>>> Tooling mailing list
>>>>>> Tooling_at_[hidden]
>>>>>> http://www.open-std.org/mailman/listinfo/tooling
>>>>>>
Received on 2019-02-27 09:45:28