sg16

Re: Suggested wording change for non-Unicode cases in P2286R7: Formatting Ranges

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Thu, 5 May 2022 07:44:20 +0200
On 05/05/2022 04.08, Barry Revzin wrote:
> I think I have applied this. Here's the rendered version: https://brevzin.github.io/cpp_proposals/2286_fmt_ranges/p2286r8.html#pnum_12

> How does this look?

p2.2

For each code sequence X in S that either encodes a single character or encoding state transition or that is a sequence of ill-formed code units is processed in order as follows:

That feels like bad English grammar to me.

Why "encoding", yet there is an "encodes" before that?
Why "either" when there are three things, and they don't
exactly correspond grammatically?

Maybe make a bulleted sub-list with the three items
so that the structure is clear.

"If C is one of the UCS scalar values the table below,"

add "in"

better clarify: "the two characters shown as the
corresponding escape sequence are appended to E"


after p2.3.4, p2.5

"simple-hexadecimal-digit-sequence"

I would not re-use lexing grammar for a local placeholder,
just say \u{/hex-digit-sequence/} or so.
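
To illustrate the \u{/hex-digit-sequence/} form, here is a minimal sketch of appending such an escape to E. The helper name append_u_escape is hypothetical, and the lowercase digits without leading zeros are an assumption of this sketch, not something stated in this message.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <string>

// Hypothetical helper (not taken from the paper): appends the characters
// \u{hex-digit-sequence} for the scalar value c to the output string e.
// Assumption: lowercase hex digits, no leading zeros.
void append_u_escape(std::string& e, char32_t c) {
    assert(c <= 0x10FFFF);  // must lie within the Unicode code space
    char buf[8];            // 0x10FFFF needs at most 6 hex digits
    auto res = std::to_chars(buf, buf + sizeof buf,
                             static_cast<std::uint32_t>(c), 16);
    e += "\\u{";
    e.append(buf, res.ptr);  // the hex digits written by to_chars
    e += '}';
}
```

Under these assumptions, calling append_u_escape(e, 0x200B) on an empty string leaves e holding the eight characters \u{200b}.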


p2.5

"Otherwise, X is a sequence of ill-formed code units. Each"

-> "Otherwise (X is a sequence of ill-formed code units), each code unit ..."


"U+0027 APOSTROPHE is escaped as \' while U+0022 QUOTATION MARK is left unchanged."

Can we rephrase that to avoid "is escaped as"? We were on such a good
track to just append characters and avoid any judgment calls.

suggestion "
 - for each character U+0027 APOSTROPHE in S, the two characters \' are appended to E
 - U+0022 QUOTATION MARK is left unchanged"
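
The two bullets above can be read purely as "append" steps. A minimal sketch, assuming S is a sequence of single-byte code units and covering only the apostrophe and quotation mark rule; the function name escape_quotes_for_char is hypothetical:

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: builds E from S, applying only the quote rule
// suggested above. All other characters are copied through unchanged,
// which is a simplification of the full algorithm.
std::string escape_quotes_for_char(std::string_view s) {
    std::string e;
    for (char ch : s) {
        if (ch == '\'') {
            e += "\\'";  // U+0027 APOSTROPHE: the two characters \' are appended
        } else {
            e += ch;     // includes U+0022 QUOTATION MARK, left unchanged
        }
    }
    return e;
}
```

Nothing here decides whether a character "is escaped"; each branch only states which characters are appended to E.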


Jens

Received on 2022-05-05 05:44:26