On 10/06/2022 10.02, Corentin via SG16 wrote:
> It's also very repetitive but maybe we can massage that a bit.
I'm not seeing serious repetition if you take phrases such
as "UTF-8 code units" as words of power.
+1.
> Lastly, I really don't like the "There are no end-of-line indicators apart from the content of the UTF-8 code unit sequence" note, which is more confusing than enlightening.
I'm fine with removing the note, but I would like to see
the parenthetical
"(introducing new-line characters for end-of-line indicators)"
restored for the "any other kind" case.
(Omitting the parenthetical feels like a regression.)
+1.
> It's also unfortunate that the utf-8-ness is tied to a medium rather than the content,
I don't follow. We can't rely on "content" alone, because we want to diagnose
ill-formed UTF-8 code units. If we relied on "content" alone, an ill-formed
UTF-8 code unit would, by definition, make the source file "not UTF-8", and we'd
lose the diagnostic.
+1. I considered the possibility of defining "UTF-8 source file" as "a well-formed sequence of UTF-8 code units", and I rejected that because I want to be able to talk about an "ill-formed UTF-8 file".
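
To illustrate the point about keeping the diagnostic, here is a minimal sketch (C++17) of the check an implementation might run once it already knows, from something other than the content, that a file is meant to be UTF-8. The function name first_ill_formed_utf8 and its interface are hypothetical and are not taken from the proposed wording or from any implementation; the byte ranges follow the well-formed byte sequence table in the Unicode Standard.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, for illustration only.
// Returns the offset of the first ill-formed UTF-8 code unit sequence,
// or std::nullopt if the whole input is well-formed.
std::optional<std::size_t> first_ill_formed_utf8(std::string_view bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const auto b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 <= 0x7F) {  // single code unit (ASCII range)
            ++i;
            continue;
        }

        std::size_t len = 0;
        // Allowed range for the second code unit; narrowed below to reject
        // overlong forms, surrogates, and values above U+10FFFF.
        unsigned char lo = 0x80, hi = 0xBF;
        if (b0 >= 0xC2 && b0 <= 0xDF) {
            len = 2;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            len = 3;
            if (b0 == 0xE0) lo = 0xA0;  // excludes overlong encodings
            if (b0 == 0xED) hi = 0x9F;  // excludes surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            len = 4;
            if (b0 == 0xF0) lo = 0x90;  // excludes overlong encodings
            if (b0 == 0xF4) hi = 0x8F;  // excludes values above U+10FFFF
        } else {
            return i;  // 0x80..0xC1 or 0xF5..0xFF cannot start a sequence
        }

        if (bytes.size() - i < len) return i;  // truncated sequence

        for (std::size_t k = 1; k < len; ++k) {
            const auto b = static_cast<unsigned char>(bytes[i + k]);
            const unsigned char min = (k == 1) ? lo : 0x80;
            const unsigned char max = (k == 1) ? hi : 0xBF;
            if (b < min || b > max) return i;
        }
        i += len;
    }
    return std::nullopt;
}
```

With this shape, a file the implementation treats as UTF-8 that contains, say, the bytes 0xC0 0x80 is still a "UTF-8 source file", just an ill-formed one, and the returned offset gives the diagnostic something to point at. Had "UTF-8 source file" been defined by well-formedness of the content, the same input would simply fall into the "any other kind" case.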