sg12: Re: [ub] [c++std-ext-14592] Re: Re: Sized integer types and char bits

From: Ion Gaztañaga <igaztanaga_at_[hidden]>
Date: Sun, 27 Oct 2013 23:46:35 +0100

On 27/10/2013 18:12, Jeffrey Yasskin wrote:
> And AFAICS they didn't bother to implement a C++ compiler at all,
> indicating to me that the niche for C is that of being easy to
> implement, not that of supporting more machines (since it _doesn't_
> support the efficient mode for this machine).
>
> If we make the C++ definition stricter, either unusual machines will
> keep implementing just C because it's still easier, or they'll
> implement a non-conforming mode for C++ as the default and a
> conforming mode as a switch, just like they do for C.

I think we have two separate issues here. We might have a different
answer to each question:

1) One's complement, sign-magnitude / padding bits

Machines that are not two's complement (with or without padding bits)
seem to be old architectures that have survived until now in some
sectors (government, financial, health) where backwards compatibility
with old mainframes is important.

Unisys was formed through a merger of mainframe corporations
Sperry-Univac and Burroughs, so Clearpath systems are available in two
variants: a UNISYS 2200-based system (Sperry, one's complement 36 bit
machines) or an MCP-based system (Burroughs, sign-magnitude 48/8 bit
machines). According to their website, new Intel-based mainframes are
being designed (so they'd need to emulate 1's complement /
sign-magnitude behaviour through the compiler). They have no C++
compiler, and Java is executed with additional 2's complement emulation
code. They are migrating from custom ASICs to Intel processors
(http://www.theregister.co.uk/2011/05/10/unisys_clearpath_mainframe/) so
2's complement will be faster in newer mainframes than 1's complement.

I think requiring 2's complement in the long term would be a good idea,
even in C, as no new architecture is using any other representation and
this simplifies teaching and programming in C/C++. We could start by
having an ISO C macro (for C compatibility) to detect 2's complement at
compile time, and by deprecating 1's complement and sign-magnitude
representations for C++. If no one objects, then only 2's complement
could be allowed for the next standard.
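
As a rough sketch of what such a compile-time detection could look like
today, the low two bits of -1 differ between the three representations
the standard allows (11, 10 and 01). The names below are made up for
illustration; no such macro or function exists in ISO C or C++.

```cpp
// Hypothetical names for illustration only; nothing here is a standard
// facility. Requires C++11.
enum class SignRepresentation {
    twos_complement,
    ones_complement,
    sign_magnitude
};

constexpr SignRepresentation sign_representation() {
    // -1 is ...1111 in two's complement, ...1110 in one's complement and
    // 1...0001 in sign-magnitude, so masking with 3 yields 3, 2 or 1.
    return (-1 & 3) == 3 ? SignRepresentation::twos_complement
         : (-1 & 3) == 2 ? SignRepresentation::ones_complement
                         : SignRepresentation::sign_magnitude;
}

constexpr bool is_twos_complement() {
    return sign_representation() == SignRepresentation::twos_complement;
}
```

Code that relies on 2's complement could then guard itself with
static_assert(is_twos_complement(), "...") and fail to compile elsewhere.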

It would be interesting to have more guarantees on 2's complement
systems (say no padding bits, other than in bool), but I don't know if
that would be possible as I think there are Cray machines with padding
bits in short/int pointers types:

http://docs.cray.com/books/004-2179-001/html-004-2179-001/rvc5mrwh.html#QEARLRWH

At least it would be interesting to have a simple way to detect types
with padding bits.
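
One possible way to approximate this with what we already have is to
compare the number of value bits reported by std::numeric_limits with
the size of the object representation. This is only a sketch:
has_padding_bits is a hypothetical helper, not a standard trait, and it
only covers integer types.

```cpp
#include <climits>      // CHAR_BIT
#include <cstddef>      // std::size_t
#include <limits>       // std::numeric_limits
#include <type_traits>  // std::is_integral, std::is_signed

// Hypothetical helper (not a standard trait): true when the integer type
// T occupies more bits than its value bits plus sign bit.
template <class T>
constexpr bool has_padding_bits() {
    static_assert(std::is_integral<T>::value,
                  "this sketch only handles integer types");
    // digits counts value bits only, so add one for the sign bit.
    return static_cast<std::size_t>(std::numeric_limits<T>::digits)
               + (std::is_signed<T>::value ? 1u : 0u)
           != sizeof(T) * CHAR_BIT;
}
```

On a typical desktop target has_padding_bits<int>() is false, while
has_padding_bits<bool>() is true, which matches the "other than in bool"
exception above.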

2) CHAR_BIT > 8

Architectures with CHAR_BIT > 8 are being designed these days and they
have a very good reason to support only word-based (16/24/32 bit) types:
performance. Word-multiple memory accesses and operands simplify the
design, speed it up, and allow bigger caches and arithmetic units. They
make it easier to fetch several instructions and operands in parallel,
and they let every transistor do what a DSP is supposed to do: very
high-speed data processing.

These DSPs have modern C++ compilers (VisualDSP++ 5.0 C/C++ Compiler
Manual for SHARC Processors,
http://www.analog.com/static/imported-files/software_manuals/50_21k_cc_man.rev1.1.pdf).

"Analog Devices does not support data sizes smaller than the addressable
unit size on the processor. For the ADSP-21xxx processors, this means
that both short and char have the same size as int. Although 32-bit
chars are unusual, they do conform to the standard"

"All the standard features of C++ are accepted in the default mode
except exception handling and run-time type identification because these
impose a run-time overhead that is not desirable for all embedded
programs. Support for these features can be enabled with the -eh and
-rtti switches."

In DSPs that can be configured in byte-addressing mode (instead of the
default word-addressing mode) stdint.h types are defined accordingly
(int8_t and friends only exist in byte-addressing mode). Example:
TigerSHARC DSPs (VisualDSP++ for TigerSHARC processors:
http://www.analog.com/static/imported-files/software_manuals/50_ts_cc_man.4.1.pdf).
Even pointer implementations are optimized for word addressing (taken
from the C compiler manual):

"Pointers

The pointer representation uses the low-order 30 bits to address the
word and the high-order two bits to address the byte within the word.
Due to the pointer implementation, the address range in byte-addressing
mode is 0x00000000 to 0x3FFFFFFF.

The main advantage of using the high-order bits to address the bytes
within the word as opposed to using the low-order bits is that all
pointers that address word boundaries are compatible with existing code.
This choice means there is no performance loss when accessing 32-bit items.

A minor disadvantage with this representation is that address arithmetic
is slower than using low-order bits to address the bytes within a word
when the computation might involve part-word offsets."
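
To make the trade-off in the quoted text concrete, here is a small model
of that layout. It is only a sketch under my own assumptions: a pointer
is modelled as a plain 32-bit integer, words hold four bytes, and all
names are hypothetical. It is not the compiler's actual implementation.

```cpp
#include <cstdint>

// Assumed layout, following the quoted manual text: low-order 30 bits
// hold the word address, high-order 2 bits select the byte in the word.
struct BytePointerParts {
    std::uint32_t word_address;
    std::uint32_t byte_in_word;
};

constexpr std::uint32_t word_mask = 0x3FFFFFFFu;

constexpr BytePointerParts split_pointer(std::uint32_t raw) {
    return BytePointerParts{raw & word_mask, raw >> 30};
}

constexpr std::uint32_t make_pointer(std::uint32_t word, std::uint32_t byte) {
    return (word & word_mask) | ((byte & 3u) << 30);
}

// Byte arithmetic has to move the byte index next to the word address,
// add, and split again: this is the slower path the manual mentions.
constexpr std::uint64_t to_linear(std::uint32_t raw) {
    return static_cast<std::uint64_t>(raw & word_mask) * 4u + (raw >> 30);
}

constexpr std::uint32_t from_linear(std::uint64_t linear) {
    return make_pointer(static_cast<std::uint32_t>(linear / 4u),
                        static_cast<std::uint32_t>(linear % 4u));
}

constexpr std::uint32_t advance_bytes(std::uint32_t raw, std::uint32_t n) {
    return from_linear(to_linear(raw) + n);
}
```

A pointer to a word boundary has both high bits clear, so it is
bit-identical to the plain word address, which is why existing
word-addressed code keeps working.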

I think banning or deprecating systems with CHAR_BIT != 8 would be a
very bad idea as C++ is a natural choice for high-performance
data/signal processors.
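
Code that wants to run in both addressing modes can usually avoid
assuming 8-bit chars. A minimal sketch, with hypothetical names
(small_int, has_exact_int8, chars_for_bits), relying only on the fact
that <cstdint> defines INT8_MAX when int8_t exists:

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t
#include <cstdint>  // int8_t (optional), int_least8_t, INT8_MAX

// int8_t is optional; fall back to the always-available least type when
// the target (e.g. a word-addressed DSP) does not provide it.
#ifdef INT8_MAX
typedef std::int8_t small_int;
constexpr bool has_exact_int8 = true;
#else
typedef std::int_least8_t small_int;
constexpr bool has_exact_int8 = false;
#endif

// Number of chars needed to store the given number of bits, without
// assuming CHAR_BIT == 8.
constexpr std::size_t chars_for_bits(std::size_t bits) {
    return (bits + CHAR_BIT - 1) / CHAR_BIT;
}
```

With 32-bit chars chars_for_bits(32) is 1, with 8-bit chars it is 4, and
small_int is at least 8 bits wide either way.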

Best,

Ion

Received on 2013-10-27 23:46:53