Re: [isocpp-core] P2295 Support for UTF-8 as a portable source file encoding

From: Corentin <corentin.jabot_at_[hidden]>
Date: Sat, 11 Jun 2022 10:01:33 +0200
New draft using that wording, except that I'm not touching the end-of-line
indicators, so that we can handle those in P2348:
https://isocpp.org/files/papers/D2295R6.pdf

On Fri, Jun 10, 2022 at 5:32 PM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:

> On 10/06/2022 17.16, William M. (Mike) Miller via Core wrote:
> > On Fri, Jun 10, 2022 at 11:05 AM Hubert Tong via Core <core_at_[hidden]> wrote:
> >
> > I've merged the suggestions (add "physical", use the parenthetical for
> > the non-UTF-8 case, use plural form for designating, have wider-scope
> > implementation-defined wording for non-UTF-8 case that encompasses the
> > permission from the parenthetical):
> >
> >
> > I'm happy with this, with one exception noted below:
> >
> >
> > An implementation shall support physical source files that are a
> > sequence of UTF-8 code units (UTF-8 source files). It may also support
> > an implementation-defined set of other kinds of physical source files,
> > and, if so, the kind of a physical source file is determined in an
> > implementation-defined manner, which includes a means of designating
> > physical source files as UTF-8 source files, independent of their
> > content. [Note: In other words, recognizing the U+FEFF Byte Order Mark
> > is not sufficient. --end note]
> >
> > If a physical source file is designated or otherwise determined
> >
> >
> > Per the preceding paragraph, "determined" includes "designated" -
> > "designating" is one mechanism for "determining" - so I'd be happier
> > if this were shortened to just "...file is determined..."
>
> Agreed.
>
> Jens
>
>
> >
> > to be a UTF-8 source file, then it shall be a well-formed UTF-8 code
> > unit sequence and it is decoded to produce a sequence of UCS scalar
> > values that constitutes the sequence of elements of the translation
> > character set. For any other kind of physical source file supported by
> > the implementation, characters are mapped, in an implementation-defined
> > manner, to a sequence of translation character set elements
> > (introducing new-line characters for end-of-line indicators).
> >
> >
> >
>
>

Received on 2022-06-11 08:01:45
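
For readers following the wording above, here is a minimal sketch of what "a
well-formed UTF-8 code unit sequence ... decoded to produce a sequence of UCS
scalar values" could look like in practice. It is not taken from P2295 or from
any compiler; the function name decode_utf8 and its interface are hypothetical,
chosen only for illustration. It rejects stray or invalid lead bytes, truncated
sequences, overlong encodings, surrogate code points, and values above
U+10FFFF.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (an assumption, not part of the proposed wording):
// decode a byte sequence as UTF-8. Returns the scalar values, or
// std::nullopt if the input is not a well-formed UTF-8 sequence.
std::optional<std::u32string> decode_utf8(std::string_view bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const auto b0 = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;      // scalar value being assembled
        std::size_t len = 0;  // total length of this encoded sequence
        char32_t min = 0;     // smallest value allowed for this length
        if (b0 < 0x80)                { cp = b0;        len = 1; min = 0x0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; min = 0x10000; }
        else return std::nullopt;  // stray continuation or invalid lead byte

        if (bytes.size() - i < len) return std::nullopt;  // truncated sequence

        for (std::size_t k = 1; k < len; ++k) {
            const auto b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return std::nullopt;  // not a continuation
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < min) return std::nullopt;                      // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return std::nullopt;  // surrogate
        if (cp > 0x10FFFF) return std::nullopt;                 // out of range

        out.push_back(cp);
        i += len;
    }
    return out;
}
```

Note that this sketch treats a leading EF BB BF like any other sequence and
yields U+FEFF as the first scalar value; whether and where that is dropped is
outside what the sketch tries to show.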