std-proposals

Re: [std-proposals] Width Formatting using std::format with custom locale [user defined do_get_separator()]

From: Jonathan Wakely <cxx_at_[hidden]>
Date: Sun, 31 Mar 2024 18:24:26 +0100
On Sun, 31 Mar 2024, 13:21 David Armour, <dave.armour_at_[hidden]> wrote:

> I disagree. If you read the whole section, any reference to string is
> related to the "output".
>


Which section are you reading? In [format.string.std] none of them is
talking about the output. It's talking about formatting arguments of
"fundamental and string types".

The output isn't even guaranteed to be a "string"; it's a sequence of
characters written to an output iterator.




>
>
>
> On 31/3/2024 20:05, Jonathan Wakely wrote:
>
>
>
> On Sun, 31 Mar 2024, 12:54 David Armour, <dave.armour_at_[hidden]> wrote:
>
>> I read this wording in the C++20 standard in section "20.20.2.2 Standard
>> format specifiers" under point 10 :
>>
>> "
>>
>> *For the purposes of width computation, a string is assumed to be in a
>> locale-independent, implementation-defined encoding. Implementations
>> should use a Unicode encoding on platforms capable of displaying Unicode
>> text in a terminal.*"
>>
>> Seems pretty explicit. Width computation does not include locale and is
>> ultimately implementation defined.
>>
>
> That's talking about formatting a string. You're formatting a
> floating-point number.
>
> That paragraph is not talking about the output of format.
>
>
>>
>> I agree it would seem obvious to have std::format follow imbue() when it
>> comes to locales, but it appears not to be the case.
>>
>> I am asking that the standard become specific on this point.
>>
>> I have already asked Microsoft to modify their compiler behaviour but I
>> would prefer to have the standard clearly back it up.
>>
>>
>>
>> On 31/3/2024 17:58, Jonathan Wakely wrote:
>>
>>
>>
>> On Sun, 31 Mar 2024 at 10:55, Jonathan Wakely <cxx_at_[hidden]> wrote:
>>
>>>
>>>
>>> On Sun, 31 Mar 2024 at 10:42, David Armour via Std-Proposals <
>>> std-proposals_at_[hidden]> wrote:
>>>
>>>> I encountered an issue using std::format to format numbers which I
>>>> wanted to have a custom comma thousand separator in the display.
>>>>
>>>> When using the customised locale with MSVC I discovered the added
>>>> commas pushed the formatted number to the right, making the number
>>>> no longer conform to the width specifier.
>>>>
>>>> I tested the same code with GCC and found std::format() to conform
>>>> correctly.
>>>>
>>>> I also tested both MSVC and GCC using std::cout.imbue() to make the
>>>> stream format the numbers to my custom locale. In this instance it
>>>> worked correctly on both compilers.
>>>>
>>>> I tried to find some info about this problem and it seems that for
>>>> streams, the standard requires the use of std::setw() to include the
>>>> custom locale but for std::format it is left to the compiler
>>>> implementers.
>>>>
>>>
>>> Why did you conclude that?
>>>
>>> It seems obvious to me that any digit separators inserted for a
>>> locale-specific form should be included in the estimated field width.
>>>
>>> "The estimated field width is the number of field width units that are
>>> required for the formatted sequence of characters produced for a format
>>> argument independent of the effects of the width option."
>>>
>>>
>>>
>>>
>>>> I would suggest that the standard forces the same requirement to manage
>>>> the custom locale in the width specifier for std::format as well as for
>>>> imbue() on streams.
>>>>
>>>
>>> I think it already requires the behaviour you expect. If MSVC doesn't
>>> count digit separators in the estimated field width, that seems like a bug.
>>>
>>>
>>>
>>>>
>>>> A code sample that reproduces the problem is found below. As mentioned,
>>>> this needs to be compiled on MSVC to see the problem. I have not tested
>>>> any other compilers to see if they have the same behaviour as MSVC.
>>>>
>>>
>> For your example code, LLVM's libc++ produces the same output as GCC.
>>
>>
>>
>

Received on 2024-03-31 17:25:48