I don't think we should entertain any notion of "same character" in C++,
beyond value comparisons in the execution encoding and "identity" as
needed for "same identifier".
We need to in or before phase 1, but I think we reached consensus that we otherwise
shouldn't and wouldn't.
To be clear, we need to make sure we are on the same page about what "character" means in "same character":
By "character", do we mean an "abstract character" or a "coded character"?
Abstract character in phase 1. (To get rid of "abstract character" in phase 1, we would have to assume that we already have encoded text; I think that would be a reasonable assumption.)
I think our definition of the members of the "basic source character set" would still be in terms of abstract characters. The input, I believe, needs to be considered encoded text in order to encapsulate all of the perceived relevant differences between characters.
I think that the relationships between terms represent an ideal that is not met in practice. "Abstract character" is a meaningful notion; however, the ideal that coded character sets are a bijective function between values in a codespace and abstract characters has not been clearly attained.
Coded character sets encode a set of abstract characters (Unicode has noncharacters).
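As a minimal sketch of the noncharacter point: Unicode reserves 66 code points (U+FDD0..U+FDEF, plus the last two code points of each of the 17 planes) that are never assigned to an abstract character, so the codespace-to-abstract-character mapping is not total. The helper name `is_noncharacter` is hypothetical and used only for illustration.

```cpp
// Hypothetical helper, for illustration only: true if `c` is one of the
// 66 Unicode noncharacters, i.e. a code point with no abstract character.
constexpr bool is_noncharacter(char32_t c) {
    // Contiguous block U+FDD0..U+FDEF (32 code points).
    if (c >= 0xFDD0 && c <= 0xFDEF) {
        return true;
    }
    // Last two code points of each plane: U+nFFFE and U+nFFFF (34 code points).
    return c <= 0x10FFFF && (c & 0xFFFE) == 0xFFFE;
}

static_assert(is_noncharacter(0xFFFE));
static_assert(is_noncharacter(0x10FFFF));
static_assert(!is_noncharacter(0x00C5));
```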
I believe the U+00C5/U+212B situation points out why we have a problem when trying to handle abstract characters. At the lower technical level, the coded character set has them as different abstract characters. At a higher level, they are considered the same. If we deal in coded characters, we would not need to handle the "philosophical questions".
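The two views can be sketched in code. U+00C5 (LATIN CAPITAL LETTER A WITH RING ABOVE) and U+212B (ANGSTROM SIGN) are distinct coded characters, yet U+212B has a singleton canonical decomposition to U+00C5, so a normalization-aware comparison treats them as the same. The function names below are hypothetical, and `fold_singleton` covers only this one mapping; it is not a normalization implementation.

```cpp
// The two coded characters from the discussion.
constexpr char32_t latin_capital_a_with_ring = 0x00C5; // U+00C5
constexpr char32_t angstrom_sign = 0x212B;             // U+212B

// "Coded character" view: identity is plain value comparison.
constexpr bool same_coded_character(char32_t a, char32_t b) {
    return a == b;
}

// Hypothetical stand-in for canonical normalization. It handles only the
// single mapping U+212B -> U+00C5; every other code point is left unchanged.
constexpr char32_t fold_singleton(char32_t c) {
    return c == angstrom_sign ? latin_capital_a_with_ring : c;
}

// "Higher level" view: compare after folding canonical equivalents.
constexpr bool same_after_folding(char32_t a, char32_t b) {
    return fold_singleton(a) == fold_singleton(b);
}

static_assert(!same_coded_character(latin_capital_a_with_ring, angstrom_sign));
static_assert(same_after_folding(latin_capital_a_with_ring, angstrom_sign));
```

Restricting the language to `same_coded_character` semantics keeps the comparison a pure value test; anything like `same_after_folding` requires deciding which equivalences count, which is the question the coded-character approach avoids.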