On 12/5/18 8:31 PM, Markus Scherer wrote:

On Wed, Dec 5, 2018 at 3:34 PM Steve Downey <sdowney@gmail.com> wrote:

How many contain text that is not already UTF-8?

I am not sure what you are asking. Most of the u8"literals" I am seeing contain non-ASCII characters. Many appear as literal characters, a bunch as \uhhhh escapes, and a few as \U00hhhhhh.

I was likewise uncertain about this question.

Steve, I'm guessing the question you're trying to get at is: would there be any behavioral difference if the u8 prefix were simply dropped? I think this is equivalent to asking two things: are the source files for these examples encoded as UTF-8, and is the compiler invoked such that the source encoding and presumed execution encoding are both UTF-8? (That is always the case for Clang, the default for gcc unless -finput-charset or -fexec-charset is used, and not the case for MSVC unless /utf-8 is used.)
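To make the question concrete, here is a minimal sketch of how one could check it for a given compiler invocation. The names (kU8Size, kPlainSize, u8_literal_is_utf8, plain_literal_is_utf8, check_u8_guarantee) are made up for illustration and do not come from any of the code being discussed. The character is written as a \u escape so that the source file encoding does not enter into it; only the execution encoding does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// U+00E9 (e with acute accent), written as an escape so the result does not
// depend on how the source file itself is encoded.

// u8 literal: always UTF-8, whatever the compiler flags are.
constexpr std::size_t kU8Size = sizeof(u8"\u00e9") - 1;
static_assert(kU8Size == 2, "U+00E9 takes two code units in UTF-8");

// Ordinary literal: encoded in the execution character set, so its size and
// bytes depend on how the compiler was invoked.
constexpr std::size_t kPlainSize = sizeof("\u00e9") - 1;

// True if the u8 literal holds the UTF-8 bytes 0xC3 0xA9.
// The casts keep this working whether the element type is char or char8_t.
inline bool u8_literal_is_utf8() {
    const auto* p = u8"\u00e9";
    return static_cast<unsigned char>(p[0]) == 0xC3 &&
           static_cast<unsigned char>(p[1]) == 0xA9;
}

// True if dropping the u8 prefix would yield the same bytes, i.e. the
// execution encoding for this build is UTF-8 (at least for this character).
inline bool plain_literal_is_utf8() {
    return kPlainSize == kU8Size &&
           std::memcmp("\u00e9", u8"\u00e9", kU8Size) == 0;
}

// The u8 half is guaranteed; the ordinary-literal half is what varies.
inline void check_u8_guarantee() {
    assert(u8_literal_is_utf8());
}
```

With Clang, or gcc without -fexec-charset, I would expect plain_literal_is_utf8() to return true. With MSVC without /utf-8, the ordinary literal would typically be encoded in the active code page instead, so it should return false.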

Tom.