On Sun, 7 Jan 2024 at 12:07, Robin Leroy <eggrobin@unicode.org> wrote:

On the specific point mentioned in the title of this email thread:

On Fri, Jan 5, 2024 at 5:54 PM Steve Downey <sdowney@gmail.com> wrote:
Did we actually specify something for C++23 that depends on or provides the breaking algorithms? 
On Fri, 5 Jan 2024 at 17:59, Corentin via SG16 <sg16@lists.isocpp.org> wrote:
std::format width estimation requires clustering

For a sequence of characters in UTF-8, UTF-16, or UTF-32, an implementation should use as its field width the sum of the field widths of the first code point of each extended grapheme cluster. Extended grapheme clusters are defined by UAX #29 of the Unicode Standard. […]

That is “should”, not “shall”, so is there really a conformance requirement here, regardless of the version meant by the phrase “the Unicode Standard”?
See also this discussion from the 2022-11-02 meeting of SG 16:
  • Hubert asked why the reference for extended grapheme cluster is non-normative.
  • Jens replied that he thinks UAX #29 is only referenced to satisfy normative encouragement for an implementation direction.
  • Charlie expressed agreement with Jens' recollection.

Yes, that's a good point. If I use the field width of the first code point in <some cluster that bears a resemblance to an extended grapheme cluster as described by Unicode> then that's still conforming.

So a best effort, or an older definition of extended grapheme cluster, is better than nothing.
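
To make that concrete, here is a rough sketch of what such a best-effort estimate could look like. It is not taken from any standard library implementation. The names decode_utf8, extends_cluster, code_point_width and estimate_width are made up for illustration, the clustering rule is a deliberately crude stand-in for UAX #29 (combining diacritics, variation selectors and ZWJ sequences only), the width ranges are a small illustrative subset rather than the table in the standard, and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Decode one code point starting at s[i] and advance i past it.
// Assumption: s is well-formed UTF-8; no error handling is attempted.
constexpr char32_t decode_utf8(std::string_view s, std::size_t& i) {
    const auto b0 = static_cast<unsigned char>(s[i++]);
    if (b0 < 0x80) return b0;                       // single-byte (ASCII)
    const int extra = b0 >= 0xF0 ? 3 : b0 >= 0xE0 ? 2 : 1;
    char32_t cp = b0 & (0x3F >> extra);             // payload bits of the lead byte
    for (int k = 0; k < extra && i < s.size(); ++k)
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    return cp;
}

// Hypothetical, incomplete approximation of "this code point extends the
// current cluster". Real UAX #29 rules cover far more than this.
constexpr bool extends_cluster(char32_t cp) {
    return (cp >= 0x0300 && cp <= 0x036F)   // combining diacritical marks
        || (cp >= 0xFE00 && cp <= 0xFE0F)   // variation selectors
        || cp == 0x200D;                    // zero width joiner
}

// Illustrative subset of ranges treated as width 2; everything else is 1.
constexpr std::size_t code_point_width(char32_t cp) {
    const bool wide = (cp >= 0x1100 && cp <= 0x115F)     // Hangul Jamo leading consonants
                   || (cp >= 0x4E00 && cp <= 0x9FFF)     // CJK Unified Ideographs
                   || (cp >= 0xAC00 && cp <= 0xD7A3)     // Hangul syllables
                   || (cp >= 0x1F300 && cp <= 0x1F64F);  // pictographs and emoticons
    return wide ? 2 : 1;
}

// Sum the widths of the first code point of each (approximate) cluster.
constexpr std::size_t estimate_width(std::string_view s) {
    std::size_t width = 0;
    bool at_start = true;    // the first code point always starts a cluster
    bool after_zwj = false;  // a code point right after ZWJ joins the cluster
    for (std::size_t i = 0; i < s.size();) {
        const char32_t cp = decode_utf8(s, i);
        const bool continues = !at_start && (after_zwj || extends_cluster(cp));
        if (!continues) width += code_point_width(cp);
        after_zwj = (cp == 0x200D);
        at_start = false;
    }
    return width;
}
```

With this sketch, estimate_width("abc") gives 3, "e" followed by U+0301 gives 1, and a ZWJ-joined emoji sequence counts only its first code point. Anything outside those few rules (Hangul jamo sequences, regional indicators, prepended marks, and so on) would be split differently than UAX #29 specifies, which is exactly the kind of approximation that the "should" wording appears to tolerate.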