Date: Thu, 18 Mar 2021 00:05:42 +0100
On 17/03/2021 18.29, Corentin via SG16 wrote:
> Hello,
>
> New revision of the paper which says "there should be some mechanism for the compiler to get fed UTF-8 data for portability's sake".
>
> I've removed the whitespace discussion from this paper; I think we can ignore it until there is a paper that only deals with that, if ever :)
>
> I'd love feedback before a future meeting.
The term "Unicode scalar value" doesn't exist in ISO 10646.
Please replace by "UCS scalar value".
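
For readers unfamiliar with the term, a minimal sketch of what a scalar value is: any code point in the range 0 to 0x10FFFF that is not a surrogate code point (0xD800 to 0xDFFF). The helper name `is_ucs_scalar_value` is hypothetical and chosen for illustration only; it does not come from the paper or from any standard library.

```cpp
#include <cstdint>

// Hypothetical helper (not from the paper): returns true if `cp` is a
// scalar value, i.e. a code point in [0, 0x10FFFF] that is not a
// high-surrogate or low-surrogate code point (0xD800..0xDFFF).
constexpr bool is_ucs_scalar_value(std::uint32_t cp) {
    const bool in_codespace = cp <= 0x10FFFF;
    const bool is_surrogate = cp >= 0xD800 && cp <= 0xDFFF;
    return in_codespace && !is_surrogate;
}

// Compile-time checks of the boundaries.
static_assert(is_ucs_scalar_value(0x0041));    // LATIN CAPITAL LETTER A
static_assert(is_ucs_scalar_value(0xD7FF));    // last value before surrogates
static_assert(!is_ucs_scalar_value(0xD800));   // first surrogate
static_assert(!is_ucs_scalar_value(0xDFFF));   // last surrogate
static_assert(is_ucs_scalar_value(0xE000));    // first value after surrogates
static_assert(is_ucs_scalar_value(0x10FFFF));  // end of the codespace
static_assert(!is_ucs_scalar_value(0x110000)); // beyond the codespace
```
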
The paper defines UTF-8 source file and then says
"How the character set of a source file is determined is implementation-defined."
I'm unsure whether UTF-8 is intended to be a "character set" in that sense.
Also, we have
"An implementation accepts UTF-8 source files; The set of additional source
file character sets accepted is implementation-defined."
UTF-8 is not a character set, so "additional" feels like a category
error here.
From a presentation standpoint, I think it would be best if
UTF-8 source files were handled first and completely in
the phase 1 description, and all the implementation-defined
"other" stuff were described after that. We could just say
"For any other source file, ..." and leave the existing text as-is.
Jens
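
The suggested ordering for phase 1 (UTF-8 source files handled first and completely, then "For any other source file, ...") can be sketched roughly as below. This is an illustration under stated assumptions, not how any compiler or the proposed wording actually works: the names `SourceKind`, `decode_utf8`, `decode_other`, and `phase1` are all hypothetical, how a file is classified as UTF-8 is left out entirely, and the Latin-1 mapping in `decode_other` merely stands in for whatever implementation-defined mapping a real implementation would apply.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical classification of a source file; how it is determined
// is outside the scope of this sketch.
enum class SourceKind { Utf8, Other };

// Decodes well-formed UTF-8 into scalar values. Returns nullopt for
// ill-formed input (bad lead or continuation byte, truncation, overlong
// encoding, surrogate, or a value above 0x10FFFF).
std::optional<std::vector<char32_t>> decode_utf8(std::string_view bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const auto b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        char32_t lowest = 0;  // smallest value allowed for this length
        if (b0 < 0x80)                { len = 1; cp = b0;        lowest = 0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; lowest = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; lowest = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; lowest = 0x10000; }
        else return std::nullopt;     // invalid lead byte
        if (bytes.size() - i < len) return std::nullopt;  // truncated
        for (std::size_t k = 1; k < len; ++k) {
            const auto b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return std::nullopt;  // not a continuation
            cp = (cp << 6) | (b & 0x3F);
        }
        const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (cp < lowest || cp > 0x10FFFF || surrogate) return std::nullopt;
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Stand-in for the implementation-defined path. ASSUMPTION: bytes are
// treated as ISO 8859-1, so each byte maps to the same-valued code point.
std::optional<std::vector<char32_t>> decode_other(std::string_view bytes) {
    std::vector<char32_t> out;
    for (const char c : bytes) {
        out.push_back(static_cast<unsigned char>(c));
    }
    return out;
}

// Phase 1 in the suggested order: the UTF-8 case first and completely,
// then everything else.
std::optional<std::vector<char32_t>> phase1(std::string_view bytes,
                                            SourceKind kind) {
    if (kind == SourceKind::Utf8) {
        return decode_utf8(bytes);
    }
    return decode_other(bytes);
}
```

The point of the structure is that the UTF-8 branch is fully specified on its own, so the remaining branch can keep whatever behaviour it already has.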
Received on 2021-03-17 18:05:47