For information, since interest was expressed in today's meeting.
Wide characters are mostly a C/C++ invention. For EBCDIC encodings that do not have multibyte characters, the wide encoding of a character consists of the unsigned char value of the character in a wchar_t.
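As a minimal sketch of the single-byte case, the widening is just a conversion through unsigned char. The function name widen_sbcs is made up for illustration, and the example byte 0xC1 (EBCDIC 'A' in the common Latin code pages) is only there to show why the unsigned char step matters on platforms where plain char is signed.

```c
#include <wchar.h>

/* Widen one character of a single-byte EBCDIC encoding (hypothetical helper).
   Going through unsigned char keeps bytes >= 0x80 from being sign-extended
   on platforms where plain char is signed. */
wchar_t widen_sbcs(char c)
{
    return (wchar_t)(unsigned char)c;
}
```

For example, widen_sbcs((char)0xC1) yields the wide value 0xC1 whether or not plain char is signed.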
EBCDIC also has multibyte encodings. These are formed by pairing single-byte encodings and double-byte encodings. The unification of single-byte and double-byte encodings into a multibyte, stateful "narrow" encoding is achieved using shift-out/shift-in.
The wide encoding of a character from a multibyte EBCDIC encoding depends on which component encoding the character comes from. A character from the single-byte component encoding has the wide encoding described above. A character from the double-byte component encoding has a wide encoding formed by using the first byte of the double-byte character as the upper 8 bits of a 16-bit value and the second byte as the lower 8 bits.
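The rules above can be sketched as a small decoder. This is an illustration under stated assumptions, not how any particular C library implements it: the function name ebcdic_mbs_to_wide and its error convention are invented here, the initial shift state is taken to be single-byte, and shift-out/shift-in are taken to be the control bytes 0x0E and 0x0F, appearing only at character boundaries.

```c
#include <stddef.h>
#include <wchar.h>

enum { EBCDIC_SO = 0x0E, EBCDIC_SI = 0x0F }; /* shift-out, shift-in */

/* Hypothetical helper: convert a stateful multibyte EBCDIC string of n bytes
   into wide characters. Returns the number of wide characters written, or
   (size_t)-1 if dst is too small or a double-byte character is truncated. */
size_t ebcdic_mbs_to_wide(const unsigned char *src, size_t n,
                          wchar_t *dst, size_t cap)
{
    int dbcs = 0;       /* assumed initial shift state: single-byte */
    size_t out = 0;
    size_t i = 0;

    while (i < n) {
        unsigned char b = src[i];

        /* Shift bytes change state and produce no wide character. */
        if (b == EBCDIC_SO) { dbcs = 1; i++; continue; }
        if (b == EBCDIC_SI) { dbcs = 0; i++; continue; }

        if (out == cap)
            return (size_t)-1;

        if (dbcs) {
            if (i + 1 >= n)
                return (size_t)-1;          /* second byte is missing */
            /* First byte is the upper 8 bits, second byte the lower 8. */
            dst[out++] = (wchar_t)(((unsigned)b << 8) | src[i + 1]);
            i += 2;
        } else {
            /* Single-byte component: the unsigned char value itself. */
            dst[out++] = (wchar_t)b;
            i++;
        }
    }
    return out;
}
```

Given the bytes C1 0E 42 C1 0F C2, this produces the three wide values 0xC1, 0x42C1, and 0xC2. Note that the same byte 0xC1 contributes to different wide values depending on the shift state in effect.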
While the link still works (they just shuffled everything earlier this year), the following document describes the shift state usage:
To figure out which form of EBCDIC a CCSID refers to, see the following document. It describes the "encoding schemes" (in this usage, more the nature of the encoding, or "meta encoding schemes"), gives the scheme associated with various CCSIDs, and includes names for the CCSIDs:
The component single-byte and double-byte encodings for a multibyte EBCDIC encoding can be found here:
As a bonus, a table that maps between EBCDIC and Unix or "PC" encodings is here: