std-proposals

Re: P1839 and the object representation of subobjects

From: Thiago Macieira <thiago_at_[hidden]>
Date: Tue, 21 Jul 2020 08:55:45 -0700
On Tuesday, 21 July 2020 08:01:13 PDT Jason McKesson via Std-Proposals wrote:
> ```
> auto byte_ptr = reinterpret_cast<byte *>(qoptr);
> auto c1ptr = std::launder(reinterpret_cast<C1 *>(byte_ptr + off));
> ```
>
> If there is an object of type `C1` at the given address, then
> `std::launder` will return a pointer to it. This is the purpose of
> `launder`.

There is an object of the proper type at the address. The problem is we can't
get to the address without UB in the first place because the pointer
arithmetic is undefined.

TBH, I don't see how that arithmetic could fail on any theoretical machine, so
this seems like an unnecessary limitation of the standard. It seems like the standard
forbids assuming that there is an array of bytes as the storage for the
object, even though it allows us to use an array of bytes as storage for some
objects. But isn't memory defined as being an array of bytes? Is it possible,
even in theory, that there are regions of memory that are not byte-
addressable?

The definition of byte (char) is that it is the smallest addressable unit of
memory at least 8 bits wide and with no padding bits. I understand some
architectures could have some weird implementation of a pointer. For example,
one with 36-bit addressable words (allowing for 2^(36+2) total bytes[*] of
memory, a very reasonable 256 GB) may want to add two extra bits to a pointer
value to select which of the four 9-bit pieces inside that word was meant. On
a first attempt, I'd reserve the lower 64 GB of the addressable space for
byte-based arrays, so the pointer is still one word wide. For all other
objects and arrays, the pointers would be actual machine pointers, allowing
for the full 256 GB virtual address space. Can such an implementation exist?

[*] in this machine with a 9-bit byte, 256 GB is not 256 gigaoctets = 2048
gigabits, but 288 Go = 2304 Gbit.
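
To double-check the sizes quoted for this made-up machine, here is a small
compile-time sketch. All constant names are mine, and the layout (36-bit word
addresses, four 9-bit bytes per word, a one-word byte pointer that spends 2
bits on the byte selector) is only the assumption described above, not any
real platform.

```cpp
#include <cstdint>

// Hypothetical machine from the text: 36-bit words, four 9-bit bytes per word.
constexpr unsigned kBitsPerByte = 9;
constexpr unsigned kBytesPerWord = 4;
constexpr unsigned kWordAddressBits = 36;
constexpr std::uint64_t kGi = std::uint64_t{1} << 30;

// 2^36 addressable words of four bytes each: 2^(36+2) bytes in total.
constexpr std::uint64_t kTotalBytes =
    (std::uint64_t{1} << kWordAddressBits) * kBytesPerWord;

// A one-word byte pointer keeps 34 bits of word address next to the 2-bit
// selector, so it reaches 2^34 words, that is 2^36 bytes.
constexpr std::uint64_t kByteAddressableBytes =
    (std::uint64_t{1} << (kWordAddressBits - 2)) * kBytesPerWord;

// The same total capacity expressed in bits and in 8-bit octets.
constexpr std::uint64_t kTotalBits = kTotalBytes * kBitsPerByte;
constexpr std::uint64_t kTotalOctets = kTotalBits / 8;

static_assert(kTotalBytes == 256 * kGi, "256 G of 9-bit bytes in total");
static_assert(kByteAddressableBytes == 64 * kGi, "low 64 G is byte-addressable");
static_assert(kTotalBits == 2304 * kGi, "2304 Gbit");
static_assert(kTotalOctets == 288 * kGi, "288 G octets");
```

The byte-addressable region is exactly one quarter of the address space, which
is why malloc below ends up confined to the low quarter.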

Problem 1: malloc:

        char *ptr = static_cast<char *>(malloc(1234));

Since the buffer returned by malloc may be used as byte-addressed arrays,
malloc() is forced to restrict itself to the low 64 GB of VM space. I suppose
the platform may extend the C library with a _malloc_word() to allocate in the
upper three quarters of the address space.

Would this limit the usefulness of this platform?

Problem 2: memcpy:

        char buf[SuitableSize];
        memcpy(buf, &object, sizeof(object));

Because either argument to memcpy may point into a byte array that is not
aligned to a machine word, the pointers it receives must carry those 2-bit
discriminators. I suppose the compiler can replace such a memcpy call with a
call to the actual library function that takes two extra parameters.
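
To make the "two extra parameters" idea concrete, here is a rough simulation.
Nothing in it is a real library interface: memcpy_w, load_byte, store_byte and
demo_copy are hypothetical names, a 36-bit word is modelled in the low bits of
a uint64_t, and putting byte 0 in the least significant position is an
arbitrary choice.

```cpp
#include <cstddef>
#include <cstdint>

// Simulated 36-bit word: four 9-bit bytes in the low 36 bits of a uint64_t.
using Word = std::uint64_t;
constexpr unsigned kBitsPerByte = 9;
constexpr unsigned kBytesPerWord = 4;
constexpr Word kByteMask = (Word{1} << kBitsPerByte) - 1;

// Read byte `sel` (0..3) of the word at `w`.
constexpr Word load_byte(const Word* w, unsigned sel) {
    return (*w >> (sel * kBitsPerByte)) & kByteMask;
}

// Overwrite byte `sel` (0..3) of the word at `w`, leaving the others alone.
constexpr void store_byte(Word* w, unsigned sel, Word value) {
    const unsigned shift = sel * kBitsPerByte;
    *w = (*w & ~(kByteMask << shift)) | ((value & kByteMask) << shift);
}

// Hypothetical function the compiler could call in place of memcpy: each
// address is a word pointer plus a 2-bit byte selector. Like memcpy, it
// assumes the two ranges do not overlap.
constexpr void memcpy_w(Word* dst, unsigned dst_sel,
                        const Word* src, unsigned src_sel,
                        std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t d = dst_sel + i;
        const std::size_t s = src_sel + i;
        store_byte(dst + d / kBytesPerWord,
                   static_cast<unsigned>(d % kBytesPerWord),
                   load_byte(src + s / kBytesPerWord,
                             static_cast<unsigned>(s % kBytesPerWord)));
    }
}

// Copies six bytes between two ranges, neither of which is word-aligned.
constexpr bool demo_copy() {
    Word src[2] = {0, 0};
    Word dst[2] = {0, 0};
    for (unsigned i = 0; i < 6; ++i) {
        store_byte(src + (1 + i) / kBytesPerWord, (1 + i) % kBytesPerWord,
                   0x101 + i);
    }
    memcpy_w(dst, 2, src, 1, 6);
    for (unsigned i = 0; i < 6; ++i) {
        if (load_byte(dst + (2 + i) / kBytesPerWord,
                      (2 + i) % kBytesPerWord) != 0x101 + i) {
            return false;
        }
    }
    // The two bytes in front of the destination range must be untouched.
    return load_byte(dst, 0) == 0 && load_byte(dst, 1) == 0;
}

static_assert(demo_copy(), "unaligned copy round-trips");
```

A real implementation would presumably copy whole words whenever both
selectors agree and fall back to the byte loop only at the edges.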

Problem 3: memchr:

        char *ptr = static_cast<char *>(memchr(buf, 'a', bufsiz));

Because the returned pointer may also be word-misaligned, all the mem*
functions are forced to return the extra 2-bit discriminator. Effectively,
this means all void* are two words in size. Is it allowed that sizeof(void*) >
sizeof(char*) and sizeof(void*) > sizeof(int*) ?

Given that uintptr_t must be able to hold the value of any pointer, it too
must be 8 bytes in size (note how sizeof(uintptr_t) > sizeof(size_t)). Now, the
standard doesn't define arithmetic on the uintptr_t value. That is, there
is no guarantee that

        uintptr_t(ptr) + N * sizeof(*ptr) == uintptr_t(ptr + N)

But I don't see why this guarantee couldn't be there. If it isn't there, this
theoretical platform could store a broken-up word-pointer in one word and the 2-bit
discriminator in the other. This may be more efficient for objects bigger than
char, but inefficient for char pointers. But another possible implementation
is the opposite, allowing for efficient char pointers and slightly inefficient
bigger ones: store a 38-bit linear byte address. This would allow for the
arithmetic on uintptr_t, even if arithmetic on char* didn't work.
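
A sketch of the two layouts, to show which one keeps integer arithmetic
meaningful. BytePtr, advance, to_uintptr_split and to_uintptr_linear are
hypothetical names, and the split layout is modelled as a single wide integer
with the selector above the 36-bit word address, which is only one way such a
platform might pack it.

```cpp
#include <cstdint>

// A byte pointer on the hypothetical machine.
struct BytePtr {
    std::uint64_t word;  // 36-bit word address
    unsigned sel;        // which of the four 9-bit bytes, 0..3
};

// Pointer arithmetic in bytes: carry from the selector into the word address.
constexpr BytePtr advance(BytePtr p, std::uint64_t n) {
    const std::uint64_t linear = p.word * 4 + p.sel + n;
    return BytePtr{linear / 4, static_cast<unsigned>(linear % 4)};
}

// Assumed split layout: word address in the low 36 bits, selector above it.
constexpr std::uint64_t to_uintptr_split(BytePtr p) {
    return (static_cast<std::uint64_t>(p.sel) << 36) | p.word;
}

// Assumed linear layout: a 38-bit byte address.
constexpr std::uint64_t to_uintptr_linear(BytePtr p) {
    return p.word * 4 + p.sel;
}

// Last byte of word 10; stepping one byte forward crosses into word 11.
constexpr BytePtr kLastByteOfWord10{10, 3};

// Linear layout: integer + N matches the pointer advanced by N bytes.
static_assert(to_uintptr_linear(kLastByteOfWord10) + 1 ==
                  to_uintptr_linear(advance(kLastByteOfWord10, 1)),
              "linear layout supports arithmetic");

// Split layout: adding 1 bumps the word address, not the selector.
static_assert(to_uintptr_split(kLastByteOfWord10) + 1 !=
                  to_uintptr_split(advance(kLastByteOfWord10, 1)),
              "split layout does not");
```

With the linear layout, converting back to a word pointer costs a shift by
two, which is the "slightly inefficient" part for objects bigger than char.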

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel DPG Cloud Engineering

Received on 2020-07-21 10:59:07