Date: Sat, 2 Jan 2021 23:15:02 -0500
Happy New Year! And what better way to start off a new year than by
discussing the utility (or lack thereof) of BOMs in UTF-8 text!
Attached is a 2nd draft of a paper intended to clarify guidance in the
Unicode standard for when a BOM should or should not be used in UTF-8
text. Discussion of the prior draft can be found in the Unicode.org
mail archives
<https://corp.unicode.org/pipermail/unicode/2020-October/009070.html>.
This draft contains the following changes:
1. An abstract was added.
2. The Introduction section was modified as follows:
   1. A link to the email thread with initial draft feedback was added.
   2. The text was modified to highlight inconsistent interpretation
      of the existing guidance as opposed to the intent.
   3. A quote from section 2.13, "Special Characters" regarding
      Unicode signatures was added.
3. The Proposed Resolution section was modified as follows:
   1. The section was renamed from "Possible Resolutions".
   2. The previously discussed possible changes are now presented as
      two distinct options.
   3. Proposed wording was added for the first option.
   4. The proposed wording for the second option was directed to
      section 23.8.
   5. Option 2 was modified as follows:
      1. The guidance for protocol designers was updated to avoid
         adding a BOM to ASCII text thus rendering such text non-ASCII.
      2. The guidance for text authors regarding when to use a BOM
         was expanded to cover files that may be opened by
         applications with different encoding expectations.
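
To illustrate the point about ASCII text, here is a minimal Python sketch
(not taken from the paper) showing that prepending the UTF-8 BOM bytes
EF BB BF to otherwise pure ASCII data makes it non-ASCII. The helper names
is_ascii and strip_utf8_bom are hypothetical and used only for this example.

```python
import codecs

# Plain ASCII payload, e.g. what an ASCII-only protocol would carry.
ascii_text = b"Hello, world!\n"

# The same payload with a UTF-8 BOM (bytes EF BB BF) prepended.
with_bom = codecs.BOM_UTF8 + ascii_text


def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    return all(byte < 0x80 for byte in data)


def strip_utf8_bom(data: bytes) -> bytes:
    """Hypothetical helper: drop a leading UTF-8 BOM if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    print(is_ascii(ascii_text))   # the original payload is ASCII
    print(is_ascii(with_bom))     # the BOM bytes are all >= 0x80
    print(strip_utf8_bom(with_bom) == ascii_text)
```

A consumer that expects strict ASCII would reject the second payload, while
one that decodes with Python's "utf-8-sig" codec would silently drop the BOM.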
Thank you to everyone who shared their thoughts on the prior draft.
Assuming no substantially new feedback, I plan to submit this paper in a
week or so.
Tom.
Received on 2021-01-02 22:15:11