On Fri, Jun 10, 2022 at 11:16 AM William M. (Mike) Miller <william.m.miller@gmail.com> wrote:
On Fri, Jun 10, 2022 at 11:05 AM Hubert Tong via Core <core@lists.isocpp.org> wrote:
I've merged the suggestions (add "physical", use the parenthetical for the non-UTF-8 case, use plural form for designating, have wider-scope implementation-defined wording for non-UTF-8 case that encompasses the permission from the parenthetical):

I'm happy with this, with one exception noted below:

Change made; I've also made the parenthetical about end-of-line indicators into a note:

An implementation shall support physical source files that are a sequence of UTF-8 code units (UTF-8 source files). It may also support an implementation-defined set of other kinds of physical source files, and, if so, the kind of a physical source file is determined in an implementation-defined manner, which includes a means of designating physical source files as UTF-8 source files, independent of their content. [Note: In other words, recognizing the U+FEFF Byte Order Mark is not sufficient. --end note]

If a physical source file is determined to be a UTF-8 source file, then it shall be a well-formed UTF-8 code unit sequence and it is decoded to produce a sequence of UCS scalar values that constitutes the sequence of elements of the translation character set. For any other kind of physical source file supported by the implementation, characters are mapped, in an implementation-defined manner, to a sequence of translation character set elements. [Note: This can introduce new-line characters for end-of-line indicators. --end note]
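
For illustration only, here is a minimal sketch of what "well-formed UTF-8 code unit sequence ... decoded to produce a sequence of UCS scalar values" could look like in practice. The function name decode_utf8 and its interface are hypothetical and are not part of the proposed wording. The sketch assumes the file has already been determined to be a UTF-8 source file, and it does not special-case U+FEFF, since the wording above says nothing about removing it.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper: decode a UTF-8 code unit sequence into UCS scalar values.
// Returns std::nullopt if the input is not well-formed UTF-8 (invalid lead byte,
// missing or bad continuation byte, overlong encoding, surrogate code point,
// or a value above U+10FFFF).
std::optional<std::vector<char32_t>> decode_utf8(std::string_view bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const auto b0 = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;       // scalar value being assembled
        std::size_t len = 0;   // number of code units in this sequence
        char32_t min = 0;      // smallest value allowed for this length
        if (b0 < 0x80) {
            cp = b0; len = 1; min = 0;
        } else if ((b0 & 0xE0) == 0xC0) {
            cp = b0 & 0x1F; len = 2; min = 0x80;
        } else if ((b0 & 0xF0) == 0xE0) {
            cp = b0 & 0x0F; len = 3; min = 0x800;
        } else if ((b0 & 0xF8) == 0xF0) {
            cp = b0 & 0x07; len = 4; min = 0x10000;
        } else {
            return std::nullopt;  // stray continuation byte or invalid lead byte
        }
        if (bytes.size() - i < len) {
            return std::nullopt;  // sequence truncated at end of input
        }
        for (std::size_t k = 1; k < len; ++k) {
            const auto b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) {
                return std::nullopt;  // expected a continuation byte
            }
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min) return std::nullopt;                     // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return std::nullopt; // surrogate
        if (cp > 0x10FFFF) return std::nullopt;                // beyond UCS range
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

Under this sketch, an ill-formed sequence makes the whole file fail to decode rather than being replaced with U+FFFD, which seems to match the "shall be a well-formed UTF-8 code unit sequence" requirement. How an implementation diagnoses the failure is outside what the sketch shows.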
 
 
An implementation shall support physical source files that are a sequence of UTF-8 code units (UTF-8 source files). It may also support an implementation-defined set of other kinds of physical source files, and, if so, the kind of a physical source file is determined in an implementation-defined manner, which includes a means of designating physical source files as UTF-8 source files, independent of their content. [Note: In other words, recognizing the U+FEFF Byte Order Mark is not sufficient. --end note]

If a physical source file is designated or otherwise determined

Per the preceding paragraph, "determined" includes "designated" - "designating" is one mechanism for "determining" - so I'd be happier if this were shortened to just "...file is determined..."
 
to be a UTF-8 source file, then it shall be a well-formed UTF-8 code unit sequence and it is decoded to produce a sequence of UCS scalar values that constitutes the sequence of elements of the translation character set. For any other kind of physical source file supported by the implementation, characters are mapped, in an implementation-defined manner, to a sequence of translation character set elements (introducing new-line characters for end-of-line indicators).