Re: [SG16] On whitespaces and new-line

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Thu, 25 Mar 2021 23:04:58 -0400
On Thu, Mar 25, 2021 at 5:29 PM Corentin via SG16 <sg16_at_[hidden]>
wrote:

> Clang doesn't support more than what the standard specifies, although they
> do have a table of Unicode whitespace characters
> for the purpose of filtering them out in UCNs
>
>
> https://github.com/llvm/llvm-project/blob/62ec4ac90738a5f2d209ed28c822223e58aaaeb7/clang/include/clang/Basic/CharInfo.h#L70
>
> https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/UnicodeCharSets.h#L401
>

Clang's behaviour is... interesting... but not compatible with "correct
UTF-8 handling during phase 1".

Given the non-UCN version of the following (that is, with \u00a0 replaced
by a literal U+00A0 NO-BREAK SPACE character in the source):
#define STR2( X ) # X
#define CONCAT2( X, Y ) X ## Y
#define CONCAT( X, Y ) CONCAT2( X, Y )
#define U32STR( X ) CONCAT( U, STR2( X ) )
constexpr char32_t s[2] = U32STR(\u00a0);
static_assert(s[0] == U'\u00a0', "");

Clang wants to pretend there's no "non-whitespace character" token. MSVC
compiles the source fine: https://godbolt.org/z/Pnnj3f3b3
So this thread has been most useful in bringing the whitespace issue back
into the space that needs clearing before P2295.
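
To illustrate the distinction at issue, here is a minimal sketch contrasting
the white-space characters named by the C++ lexical grammar with the code
points that carry the Unicode White_Space property. The function names are
hypothetical, and the second function is not Clang's actual table; it only
shows why U+00A0 falls into one set and not the other.

```cpp
// Sketch only: two notions of "whitespace" for a single code point.
// Function names are hypothetical and chosen for illustration.

// White-space characters named by the C++ lexical grammar:
// space, horizontal tab, new-line, vertical tab, and form-feed.
constexpr bool is_standard_whitespace(char32_t c) {
    return c == U' ' || c == U'\t' || c == U'\n' || c == U'\v' || c == U'\f';
}

// Code points with the Unicode White_Space property.
constexpr bool is_unicode_whitespace(char32_t c) {
    return (c >= 0x0009 && c <= 0x000D) || c == 0x0020 || c == 0x0085 ||
           c == 0x00A0 || c == 0x1680 || (c >= 0x2000 && c <= 0x200A) ||
           c == 0x2028 || c == 0x2029 || c == 0x202F || c == 0x205F ||
           c == 0x3000;
}

// U+00A0 is not white space as far as the lexical grammar is concerned,
// so in source text it would form a "non-whitespace character" token...
static_assert(!is_standard_whitespace(U'\u00a0'), "");
// ...even though Unicode classifies it as white space.
static_assert(is_unicode_whitespace(U'\u00a0'), "");
```

Both checks are evaluated at compile time, so the snippet compiles as-is with
a C++11 or later compiler.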

Received on 2021-03-25 22:05:27