std-proposals: Re: BytesReadable & BytesWritable Traits

From: connor horman <chorman64_at_[hidden]>
Date: Sun, 1 Dec 2019 21:34:17 -0500

On Sun, Dec 1, 2019 at 20:35 Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
wrote:

> You dropped the mailing list, by the way.
>
Apologies for that. It should be included here.

>
> On Sun, Dec 1, 2019 at 3:37 PM connor horman <chorman64_at_[hidden]> wrote:
>
>> On Sun, 1 Dec 2019 at 14:46, Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
>> wrote:
>> >
>> > Your description is strangely non-parallel-in-construction for pointer
>> types and reference types. Shouldn't those two kinds of types be treated
>> identically?
>> Reference types and pointer types are not the same (though internally,
>> it is possible that a reference is represented as a pointer).
>>
>
> You can (and should) treat "reference members" and "object-pointer
> members" equivalently. I think you're right that non-static data members of
> reference type are not "subobjects," but that just convinces me that you
> should be defining your notion in terms of the "bases and non-static data
> members" of the class type T, rather than in terms of the "subobjects" of
> an unspecified object of type T.
>
The reason I primarily use subobjects to describe it is so that arrays can
participate in this directly. Again, the wording can be changed if this is
not suitable.

>
> Additionally, the way I describe it involves objects, and references
>> are not objects so I can't describe references the same way. You can't
>> really describe at all what happens when you overwrite the bytes that
>> make up a reference, because technically, references don't have
>> storage, so bytes can't really make up references. That is, at least,
>> my interpretation of the fact that references aren't objects.
>>
>
> My naïve impression is that the Standard's wording around reference
> members is awfully confused and probably buggy if you look too close.
> Notice that `struct S { int& r; }` is trivially copyable and trivially
> copy-constructible (although not copy-assignable). I don't know exactly
> what wording causes that triviality.
>
Yeah, it's stupid. It's probably because it still has a non-user-provided copy
and move constructor, and copying or moving the reference member is technically
trivial.
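
To make the observation about `struct S { int& r; }` concrete, here is a small sketch using the standard type traits. The struct is the one from the quoted message; the helper name ref_member_struct_is_trivially_copyable is made up for illustration.

```cpp
#include <type_traits>

// The struct from the quoted message: a single reference member.
struct S { int& r; };

// The implicit copy constructor is trivial...
static_assert(std::is_trivially_copy_constructible<S>::value,
              "copy construction of S is trivial");
// ...the implicit copy assignment operator is deleted...
static_assert(!std::is_copy_assignable<S>::value,
              "S cannot be copy-assigned");
// ...and the type as a whole still counts as trivially copyable.
static_assert(std::is_trivially_copyable<S>::value,
              "S is trivially copyable");

// Hypothetical helper so the result can also be inspected at run time.
constexpr bool ref_member_struct_is_trivially_copyable() {
    return std::is_trivially_copyable<S>::value;
}
```

This only restates what the traits report; it says nothing about what overwriting the bytes of an S would mean for the reference.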

>
>
> > What's wrong with pointer-to-member types? (Be specific. If what you
>> describe is true of only one platform's ABI, consider whether the
>> definition of BytesReadable should maybe be different only on that
>> platform.)
>> To my knowledge, pointer-to-members have the same limitations as
>> pointer-to-objects/functions. As in, they are either a null pointer,
>> point to a member of a class, or point to something that isn't really
>> there.
>
>
> Pointers-to-members never "point to" anything. They have to be combined
> with an object pointer before they can be dereferenced. On the purely
> physical level (and without considering virtual inheritance), they work
> like this:
> struct S { int a; int b; };
> int S::*mp = &S::a;
> // At this point, the bitwise representation of `mp` contains "0"
> mp = &S::b;
> // At this point, the bitwise representation of `mp` contains "4"
> *mp; // syntax error: mp cannot be dereferenced
> S s;
> s.*mp; // OK: access the int at offset "4" in `s`, i.e., `s.b`
>
To be fair to my wording, they are called “pointers”. They may not point to
something in memory, but they point to something. That something may be
meaningless without a glvalue of a particular class type in conjunction
with the pointer-to-member, but it's still a pointer. Just as object pointers
know what they point to, I assume pointers-to-members do as well.
And certainly, pointers-to-member-functions are really fun in this regard,
because if they point to a non-virtual function, they could easily just be
a fancy function pointer.
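
A minimal sketch of both kinds of pointer-to-member, assuming a struct like the S in the quoted example with one extra member function added for illustration. The names twice_b, read_through and call_through are hypothetical.

```cpp
// S as in the quoted example, plus a hypothetical non-virtual member function.
struct S {
    int a;
    int b;
    int twice_b() const { return 2 * b; }
};

// A pointer-to-data-member only yields a value once it is combined
// with an object of the class type.
int read_through(const S& s, int S::*mp) {
    return s.*mp;
}

// A pointer-to-member-function likewise needs an object to be called on.
int call_through(const S& s, int (S::*mf)() const) {
    return (s.*mf)();
}
```

With `S s{1, 2};`, `read_through(s, &S::b)` reads `s.b`, and `call_through(s, &S::twice_b)` calls the member function on `s`.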

>
> An example of this would actually be on a theoretical implementation
>> of C and C++ which compiles to JVM Bytecode, where a Pointer-to-member
>> object is represented by a Java 9 VarHandle. The value representation
>> hasn't been determined for any particular VarHandle, so reading it,
>> would result in it being given a null object value.
>>
>
> I didn't follow that well enough to express an opinion on its feasibility.
> However, notice that according to http://eel.is/c++draft/basic.types#9 ,
> pointer-to-member types are scalar types and all scalar types must be
> trivially copyable. If the scheme you just described makes
> pointers-to-members non-trivially-copyable, then it isn't permitted as an
> implementation strategy.
>
Technically, how a trivial special member function is implemented is up
to the implementation (with the caveat that a trivial default constructor
is effectively a no-op), as long as memcpy is valid for Trivially Copyable
types. If I know how to “memcpy” a pointer-to-member somewhere (among other
things), then the requirement is still met. (Again, you can correct me if
this is wrong.)
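
As a sketch of what "memcpy is valid" means for a pointer-to-member, the following round-trips one through a byte buffer. The alias MemPtr and the function copy_bytes are names I made up for this example. It shows only the observable requirement, not how any particular implementation represents the value.

```cpp
#include <cstring>
#include <type_traits>

struct S { int a; int b; };

// Hypothetical alias for the pointer-to-member type under discussion.
using MemPtr = int S::*;

// Pointer-to-member types are scalar types, hence trivially copyable.
static_assert(std::is_trivially_copyable<MemPtr>::value,
              "pointer-to-member types are trivially copyable");

// Copy the object representation into a byte buffer and back again.
MemPtr copy_bytes(MemPtr src) {
    unsigned char buf[sizeof(MemPtr)];
    std::memcpy(buf, &src, sizeof src);
    MemPtr dst;
    std::memcpy(&dst, buf, sizeof dst);
    return dst;
}
```

The copy must compare equal to the original and designate the same member, whatever the implementation stores in those bytes.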

>
> Unrelated to the proposal, assuming proper steps are taken in reading
>> to avoid reading the wrong size for type, I'm not sure how reading a
>> wrong-size int would segfault (or even be undefined behavior). It
>> could only read part of the value, or (for a larger sizeof) leave an
>> indeterminate value (so it would be UB to read).
>>
>
> I was thinking about buffer overflows (by reading 4 bytes from a 2-byte
> buffer, for example).
>
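
For the buffer-overflow case mentioned above (reading 4 bytes from a 2-byte buffer), a size-checked read could look roughly like this. The function read_bytes is a hypothetical helper for illustration, not part of the proposal or of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copy sizeof(T) bytes from buf into out, but only
// if the buffer actually holds that many bytes.
template <class T>
bool read_bytes(const unsigned char* buf, std::size_t len, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "bytewise reads only make sense for trivially copyable types");
    if (len < sizeof(T)) {
        return false;  // too short: refuse rather than read past the end
    }
    std::memcpy(&out, buf, sizeof(T));
    return true;
}
```

With a 2-byte buffer and a `std::uint32_t` target, the call returns false and leaves the target untouched.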

> > Should `struct sockaddr_in` be BytesReadable and BytesWritable? Why or
>> why not?
>> Looking at struct sockaddr_in, it would be both, as
>> BytesWritable/Readable propagates for structure types and union types.
>>
>
> I didn't ask whether it *was* BytesReadable; I asked whether it *should*
> be.
> However, I probably tried too hard to make that example "clever." Let's
> use this example instead:
>
> struct FileDescriptor {
> int fd;
> void put(const char *s) { write(fd, s, strlen(s)); }
> };
>
> struct StringDescriptor {
> char *p;
> void put(const char *s) { while (*s) *p++ = *s++; }
> };
>
> Should we expect to be able to write out a StringDescriptor from one
> process, read it back in in a different process (with a different pattern
> of heap allocations), and go on happily?
>
StringDescriptor already is not (and, in fact, should not be).
The meaning of a FileDescriptor is not preserved by serialization. However,
arguably, a FileDescriptor is a resource, so it should follow an RAII
pattern and would, at the very least, have a non-trivial destructor, making
it fail to be Trivially Copyable; therefore it would satisfy neither
(though that is outside the scope of the example). Technically, you can
already write a FileDescriptor to a file, so this isn’t any less correct than
what the current standard already allows.
In the original example, I would add that a sockaddr_in derives its
meaning only from its representation, so it is not incorrectly classified.
It could be possible to “opt out” of writability/readability; however, I
would prefer to minimize the impact of the change on existing code. The
rule is: if bytewise reading/writing either would technically cause UB or
effectively leads to UB, make it UB up front.
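
A rough sketch of the RAII point, using only the existing trivially-copyable trait (the proposed BytesReadable/BytesWritable traits are not modelled here). PlainFd mirrors the quoted FileDescriptor; OwningFd is a hypothetical RAII variant whose destructor body is left as a comment.

```cpp
#include <type_traits>

// Mirrors the quoted FileDescriptor: just an int, no ownership.
struct PlainFd {
    int fd;
};

// Hypothetical RAII variant: the user-provided destructor makes the
// type non-trivially-copyable, so it would satisfy neither trait.
struct OwningFd {
    int fd;
    ~OwningFd() { /* a real type would release fd here */ }
};

// Mirrors the quoted StringDescriptor: trivially copyable, but it holds
// a pointer, which is why it is excluded by a separate rule.
struct StringDescriptor {
    char* p;
};

static_assert(std::is_trivially_copyable<PlainFd>::value,
              "plain wrapper is trivially copyable");
static_assert(!std::is_trivially_copyable<OwningFd>::value,
              "RAII wrapper is not trivially copyable");
static_assert(std::is_trivially_copyable<StringDescriptor>::value,
              "trivial copyability alone does not exclude pointer members");
```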

> What about if we write out a FileDescriptor from one process and read it
> back in in a different process (with a different pattern of open file
> descriptors)?
> Which of these types *should* be BytesReadable/BytesWritable?
>

> >> Aside from a few types, and types guaranteed by the standard to be
>> empty, no types in the standard library would satisfy this concept.
>> >
>> > So, "no types aside from a few types" would satisfy this concept? Just
>> say "a few types"! ;)
>> > Off the top of my head, std::byte would be, plus appropriate
>> specializations of std::complex, std::pair, std::tuple, and std::variant.
>>
>> std::variant gives me pause, but it could be.
>
>
> I believe std::variant is now required to be trivially copyable whenever
> all its alternatives are trivially copyable.
> See for example http://eel.is/c++draft/variant.ctor#8 and
> http://eel.is/c++draft/variant.variant#variant.dtor-2
>
> [...] The reason why I say "no
>> types in the standard library would satisfy this concept" is to
>> reinforce this.
>
>
> To reinforce what? the incorrect statement? :P It would be more correct to
> say "some types would satisfy this concept" and leave it at that. (Which
> means you could even get rid of that sentence because it wouldn't be saying
> anything meaningful anymore.)
>
I would want to note that only some class types in the standard library
would satisfy the concept, and that for the most part, regardless of how
the implementation implements a normal standard library class, it won’t
satisfy the concept unless the standard says it does (or, equivalently,
says it's both Trivially Copyable and empty).
I should actually add this, but valid user-provided specializations of
standard-library class templates are subject to the normal rules, despite them
being technically “standard library classes”.
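
To illustrate why I don't want this to depend on how a library class happens to be implemented, here is a sketch that merely queries the existing trait. The function name trivially_copyable_as_built is hypothetical. The only results I would rely on are the ones noted in the comments.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: reports what this particular implementation says.
template <class T>
constexpr bool trivially_copyable_as_built() {
    return std::is_trivially_copyable<T>::value;
}

// Scalar types are always trivially copyable.
static_assert(trivially_copyable_as_built<int>(), "int is a scalar type");

// std::string owns memory and has a non-trivial destructor.
static_assert(!trivially_copyable_as_built<std::string>(),
              "std::string is not trivially copyable");

// For a type such as std::pair<int, int>, the answer is an implementation
// detail unless the standard specifies it, so no assertion is made here.
constexpr bool pair_answer_on_this_implementation =
    trivially_copyable_as_built<std::pair<int, int>>();
```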

>
>
>> > I am extremely skeptical that you'll be able to build your notion up
>> into a complete and correct framework for what-I-call-"persistence."
>
> > However, if you do manage to do so, I'll be very interested!
>> Persistence in general is annoying in C++. The best I have is the
>> ShadeNBT file format, which is a pain to implement in C++, w/o a
>> wrapper type that has checked up casting and downcasting. The library
>> I have allows you to wrap raw input/output streams in
>> DataInput/OutputStreams, which actually do work instead of
>> writing/reading values memcpy wise.
>>
>
> ShadeNBT being some variation on https://wiki.vg/NBT ? I don't see any
> google hits for "shadenbt" or "shade nbt".
>
It is indeed. It started as basically a 6-byte header above an NBT file,
though it has

>
>
>> > If you have not already done so, please take a look at
>> https://www.youtube.com/watch?v=SGdfPextuAU&t=63m45s ("Trivially
>> Relocatable", C++Now 2019), specifically the part I linked there, which has
>> to do with persistence and why it's a hard problem for C++'s type system to
>> deal with.
>>
>
> –Arthur
>

Received on 2019-12-01 20:36:52