SG16

Subject: Re: [SG16-Unicode] P1208R3 / source_location
From: Axel Naumann (Axel.Naumann_at_[hidden])
Date: 2019-02-19 18:25:03


Thanks everyone, this is what I'll take to Core.
Axel.

On 19.02.19 13:58, Corentin wrote:
> After talking with Tom, I'd like to modify function_name to be an
> NTMBS, as it is something we can actually guarantee and I don't think
> __func__ should constrain the design of source location. It would be
> consistent with the TS and satisfy the NB comment (whose resolution was
> adopted in that direction this morning).
>
> Tom convinced me that filename cannot and should not be an NTMBS.
>
>
> On Tue, 19 Feb 2019 at 13:22 Robert Douglas <rwdougla_at_[hidden]
> <mailto:rwdougla_at_[hidden]>> wrote:
>
> Agree.
>
> On Tue, Feb 19, 2019 at 5:17 PM Tom Honermann <tom_at_[hidden]
> <mailto:tom_at_[hidden]>> wrote:
>
> On 2/18/19 1:17 PM, Robert Douglas wrote:
>> Historical footnote, these are intended to be as drop-in as
>> possible for existing facilities. __FILE__ is a "character
>> string literal," which gets its null termination in phase 7.
>> Since we are accessing these at run-time, we should thus
>> expect these to be NTBS. Changes to this expectation would be
>> a deviation from these being a drop-in replacement to
>> __FILE__ and __func__. Note that [dcl.fct.def.general]
>> p8 defines __func__ as an implementation-defined string, as
>> if by static const char __func__[] = "function-name"; which
>> also implies an NTBS. This is the reasoning for NTBS. To do
>> otherwise would make this feature deviate from __FILE__ and
>> __func__, which it is designed to replace.
>
> Agreed.  Certainly guaranteeing that these have a null
> terminator is required given that file_name() returns const
> char*.  I don't agree with associating these with NTMBSs
> though since multi-byte has encoding implications.
>
> Tom.
>
>>
>>
>> On Mon, Feb 18, 2019 at 11:20 AM Corentin
>> <corentin.jabot_at_[hidden] <mailto:corentin.jabot_at_[hidden]>>
>> wrote:
>>
>> Quick reply: display only, with no expectation that the file
>> can be opened, exists, or is even a file. It's purely
>> informative. The only expectation is that it can be displayed,
>> the main use case being logging. Otherwise I agree with you.
>>
>> On Mon, Feb 18, 2019, 7:16 AM Tom Honermann
>> <tom_at_[hidden] <mailto:tom_at_[hidden]>> wrote:
>>
>>
>> On Feb 18, 2019, at 10:04 AM, Corentin
>> <corentin.jabot_at_[hidden]
>> <mailto:corentin.jabot_at_[hidden]>> wrote:
>>
>>>
>>> Very good points. 
>>> Wouldn't it be sufficient to specify that the
>>> strings are NTMBS encoded using the execution
>>> character set?
>>> source_location currently avoids making any
>>> assumption about how these strings are formed,
>>> including that they are derived from a source file.
>>> So since the value is implementation-defined, so
>>> should be the way it's constructed. 
>>> However, it is reasonable to assume that these
>>> things are valid text and therefore have a known
>>> encoding.
>>>
>>> Adding Tom, because this is borderline SG16 territory. 
>>
>> This isn’t borderline as we have (recently) requested
>> review of anything involving file names. 
>>
>>>
>>>
>>> @Tom: Do you want to see source_location this week
>>> knowing that I'd hope it would get through LWG
>>> before the end of the week?
>>> Or do you think having function_name / filename as
>>> multibyte strings encoded using the execution
>>> character set is reasonable?
>>> The alternatives I see are
>>>
>>> * Leave it unspecified
>>> * Force a specific character set... which the
>>> world is not ready for
>>>
>> I think there is a higher level question to answer.
>> Are the provided file names display only, or should
>> one expect to be able to open the file using the
>> provided name?
>>
>> If they are display only, then we can specify an
>> encoding for them similarly to what is done for
>> member functions of std::filesystem::path. In this
>> case, we must explicitly acknowledge that the names
>> do not roundtrip through the filesystem (though
>> typically will in practice). Note that, on Windows,
>> file names cannot be represented accurately using
>> char based strings, so unless we want to add wchar_t
>> support, these names will be technically display only. 
>>
>> If they are potentially not display only, then we
>> can’t associate an encoding and the names are
>> bags-of-bytes. This is a limitation of POSIX. But
>> then we need wchar_t support for Windows. 
>>
>> In San Diego, the guidance we gave for the stacktrace
>> proposal is that file names are implementation-defined
>> bags-of-bytes. If we advised otherwise for
>> source location, we would be giving inconsistent
>> guidance. 
>>
>> I think we should discuss this in SG16 this week. Not
>> necessarily to propose changes for the proposal, but
>> to solidify our collective thinking around file names. 
>>
>> Tom. 
>>>
>>> Thanks, 
>>> Corentin
>>>
>>>
>>>
>>> On Mon, 18 Feb 2019 at 03:56 Axel Naumann
>>> <Axel.Naumann_at_[hidden] <mailto:Axel.Naumann_at_[hidden]>>
>>> wrote:
>>>
>>> Hi Robert,
>>>
>>> Regarding your P1208R3:
>>>
>>> Nit: it's titled "D1208R3", and it doesn't mention
>>> email addresses.
>>>
>>> Not-so-nit: an NB comment on the reflection TS
>>> asks to use NTMBS rather than NTBS, noting: "Where
>>> NTBS is mentioned in the document under ballot, the
>>> encoding used for the string’s value is
>>> unspecified." Jens agrees that
>>> the proposed solution should be applied:
>>> "Specify that the strings are
>>> first formed using the basic source character
>>> set (with
>>> universal-character-names as necessary) then
>>> mapped in the manner
>>> applied to string literals with no encoding
>>> prefix in phases 5 and 6 of
>>> translation."
>>>
>>> I would very much hope that both changes are
>>> also applied to P1208R3. I
>>> call this out explicitly in our recommended NB
>>> comment response paper.
>>>
>>> Cheers, Axel.
>>>
>
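
Robert's point in the thread, that __FILE__ and __func__ already behave as null-terminated byte strings and that accessors returning const char* must keep that guarantee, can be sketched in C++. This is a minimal illustration and not text from P1208R3. The helper names (this_file, this_function, length_by_terminator) are hypothetical, and the exact value of __func__ is implementation-defined.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: __FILE__ expands to a character string literal, so the
// returned pointer refers to an array that ends with '\0'.
const char* this_file() { return __FILE__; }

// Hypothetical helper: __func__ behaves as if declared
//   static const char __func__[] = "function-name";
// so it has static storage duration and is also null-terminated.
// The exact text is implementation-defined.
const char* this_function() { return __func__; }

// A caller holding only a const char* has no length available, so it must
// rely on the terminator. This is why at least an NTBS guarantee is needed.
std::size_t length_by_terminator(const char* s) { return std::strlen(s); }

// The array type of a string literal includes the terminator.
static_assert(sizeof(__FILE__) >= 1,
              "a string literal always holds at least the terminator");
```

Calling length_by_terminator(this_file()) yields the length of whatever name the implementation chose. Nothing in the sketch says which encoding those bytes use, which is the NTBS versus NTMBS question discussed in the thread.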



SG16 list run by herb.sutter at gmail.com