Date: Fri, 31 Jan 2025 11:16:33 +0100
Hi Jonathan,
the value of the underlying bytes does matter.
Currently the C++ standard (AFAIK) does not specify how pointers are represented in memory.
There are only certain high-level guarantees, e.g.
&p[i] == p + i
or that increment/decrement can access the next/previous element.
Internally, the small string optimization in the implementation(s) uses the same pointer member variable either for a heap pointer or for pointing to the SSO buffer. When relocating the object, this has to be tested, and the pointer value should only be changed if it points to the internal buffer.
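To make the relocation rule concrete, here is a minimal sketch. ToySso and relocate are hypothetical names made up for illustration; this is not the layout or code of any real basic_string implementation, only the "fix the pointer only if it points into the internal buffer" test described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical toy type; real basic_string layouts differ.
struct ToySso {
    char* p;          // points either to heap storage or to buf
    std::size_t len;
    char buf[16];     // internal SSO buffer

    bool is_local() const { return p == buf; }
};

// Copy the state of `from` into `to`, which lives at a different address.
// The pointer is only rewritten when it referred to the internal buffer.
inline void relocate(ToySso& to, const ToySso& from) {
    assert(&to != &from);
    to.len = from.len;
    if (from.is_local()) {
        std::memcpy(to.buf, from.buf, sizeof to.buf);
        to.p = to.buf;    // re-point to the buffer at the new location
    } else {
        to.p = from.p;    // heap pointer stays valid as-is
    }
}
```

The test in relocate only compares pointers of the same type, so it needs no assumption about how the pointer is represented in memory.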
If we do not assume a certain underlying pointer representation, we cannot know whether a pointer to std::byte and a pointer to char32_t have the same representation, and whether we can just use the unaligned beginning of the buffer as a marker, as we did in the original class.
Probably the string class is a non-optimal example:
Whoever creates the unaligned relocation should be the implementer of the standard library or even of the basic_string class itself, as private class internals are accessed and modified. They can just always use a byte pointer anyway, or make platform-dependent assumptions about the underlying representation.
It is possible to keep the char32_t pointer and use std::align to generate an internal aligned pointer without any assumptions about representation.
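A rough sketch of what I mean with std::align follows. The helper name first_aligned_slot and its parameters are my own invention for illustration; only std::align itself is the standard facility.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: returns the first suitably aligned position for
// `count` char32_t elements inside a possibly unaligned byte buffer,
// or nullptr if they do not fit.
inline char32_t* first_aligned_slot(unsigned char* raw,
                                    std::size_t raw_size,
                                    std::size_t count) {
    assert(raw != nullptr);
    void* p = raw;
    std::size_t space = raw_size;
    // std::align adjusts p and space on success, returns nullptr on failure.
    void* aligned = std::align(alignof(char32_t),
                               count * sizeof(char32_t), p, space);
    return static_cast<char32_t*>(aligned);
}
```

The result is an ordinary char32_t pointer to an aligned address, so nothing here depends on byte pointers and char32_t pointers sharing a representation.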
So if one brings up this example in the proposal, it should not put additional unnecessary demands on the language.
Thomas/Frederick asked about making all pointer types the same size and representation (and probably fixing that representation) and making them 'intercompatible' in some sense.
This is all unnecessary, even for this example, if one wants to keep the char32_t pointer, relocate to unaligned memory, and store either absolute or relative pointers.
While defining the pointer representation in more depth and allowing intercompatibility between different pointer types would open up new ways of hacking on the C++ application side, it would restrict the options on the implementation side and make C++ lower-level than necessary. (It would perhaps also be less safe and less performant.)
Best,
Sebastian
-----Original Message-----
From: Jonathan Wakely <cxx_at_[hidden]>
Sent: Fri 31.01.2025 10:24
Subject: Re: [std-proposals] Make all data pointers intercompatible
To: std-proposals_at_[hidden];
CC: Sebastian Wittmeier <wittmeier_at_[hidden]>;
On Thu, 30 Jan 2025 at 17:04, Sebastian Wittmeier via Std-Proposals <std-proposals_at_[hidden]> wrote:
IMHO, as internals are accessed, that code can only be written by the implementers of basic_string or at least by the implementers of the standard library providing basic_string. Everything else is a hack.
Thomas (Frederick) stated that the class is not used/called in its unaligned state, so using the 4 bytes of the pointer creatively (either as an unaligned byte pointer representation even for char32_t or, as I suggested, letting it point to the first aligned byte within the SSO buffer) probably has to be done, even if it violates the class invariant, as long as the class invariant is correctly restored when unpacking to aligned memory.
If the object isn't accessed while in unaligned memory, why does the value of the underlying bytes of that pointer matter in the slightest? They're just bytes, not a pointer value, aren't they?
I understood the example more like a typical user class, which has to cope with unaligned storage.
Unless the type is trivially copyable, you cannot use memcpy to copy the underlying bytes to an array of bytes and then back in again. That certainly applies to a typical user class.
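For illustration, the rule can be checked at compile time. The following is only a sketch: Pod and round_trip are hypothetical names, and the static_assert is what restricts the byte copy to types for which it is permitted.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical example type that is trivially copyable.
struct Pod {
    int id;
    float weight;
};

static_assert(std::is_trivially_copyable<Pod>::value,
              "Pod may be copied via memcpy");
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string may not be copied via memcpy");

// Copies the object representation into unaligned bytes and back again.
// Refuses to compile for types that are not trivially copyable.
template <class T>
T round_trip(const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy round trip requires a trivially copyable type");
    unsigned char bytes[sizeof(T) + 1];
    std::memcpy(bytes + 1, &src, sizeof(T));   // deliberately unaligned
    T dst;
    std::memcpy(&dst, bytes + 1, sizeof(T));
    return dst;
}
```

Calling round_trip with a std::string would fail to compile, which is the situation a typical user class with non-trivial members is in.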
The motivation seems to be to allow such copies of the underlying bytes, but without actually touching the relevant rule (which is in [basic.types.general]). So the motivation is lacking.
But perhaps I understood the OP wrongly.
-----Original Message-----
From: Jonathan Wakely <cxx_at_[hidden]>
On Thu, 30 Jan 2025 at 10:10, Sebastian Wittmeier via Std-Proposals <std-proposals_at_[hidden]> wrote:
If p is pointing to buf,
set it to the first aligned byte of buf in the unaligned memory.
That would break invariants of the libstdc++ basic_string, which uses `p == buf` to detect when the string is not heap allocated. If you point p to the middle of buf then the destructor will do deallocate(p, capacity()+1) which will try to delete non-allocated memory.
The solution is: stop trying to mess with the internals of std::string. Come up with a better motivating example.
--
Std-Proposals mailing list
Std-Proposals_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
Received on 2025-01-31 10:20:14