Date: Sat, 26 Aug 2023 10:27:28 -0400
On Sat, 26 Aug 2023 at 06:31, Frederick Virchanza Gotham via Std-Proposals <
std-proposals_at_[hidden]> wrote:
>
> I'm top-posting here to give an intro here before I reply to comments in
> series below.
>
> C++ is a programming language used in the real world by real programmers
> getting work done. Sometimes we find ourselves in some difficult situations
> because of a previous bug, or because of a deliberate improvisation to
> avoid an ABI break.
>
> I'll give you an example from code I wrote just yesterday. I program the
> microcontrollers inside microscopes for a multinational firm
> (microcontrollers made by Texas Instruments (CHAR_BIT==16) and Arduino).
>
> A microscope that was designed about 20 years ago was based on a
> communication system called 'ABC', and a new microscope designed three
> years ago is based on the 'XYZ' communication system.
>
> There is an extensive SDK for the desktop PC software to interact with the
> microscopes. For every class beginning with IABC, there is another
> beginning with IXYZ.
>
> In order to save a lot of time and effort, the class for the new
> microscope was made to inherit from the old microscope:
>
> class NewMicroscope : OldMicroscope {};
>
> This has been working fine for 3 years but yesterday I spent a few hours
> trying to figure out why the new microscope can't manually manipulate the
> COM port. After a few hours I realised that 'NewMicroscope' inherits from a
> class that inherits from a class that inherits from a class that inherits
> from 'IABCComHandler'. The problem here is that it should have instead
> inherited from 'IXYZComHandler'. So the following always yielded a nullptr:
>
> dynamic_cast<IXYZComHandler*>(&new_microscope)
>
> But if I were to change the class hierarchy to make this dynamic_cast
> possible, then NewMicroscope would no longer inherit from OldMicroscope,
> but more importantly it would be an ABI break, and I couldn't send out a
> new DLL file to every customer in a dozen countries just because one
> customer wants to manually manipulate the COM port (it's a rare request).
>
> So what did I do? In Visual Studio I wrote:
>
> NewMicroscope obj;
> constexpr void *p1 = &obj;
> constexpr void *p2 = static_cast<IABCComHandler*>(&obj);
>
> Then I just hovered my mouse over the third line and it came up with a
> tooltip that said '&obj + 16'. Then I found another function in the API
> that gave back a pointer to another base class whose offset was '&obj + 8'.
> So I knew that if I could get a pointer to the other class, then I just had
> to add 8 to it to get the COM port handler. So I sent the customer code
> that looks like:
>
> NewMicroscope obj;
>
> IXYZInterface &inter = obj.GetInterface();
>
> IXYZComHandler &com =
> *static_cast<IXYZComHandler*>(static_cast<void*>(static_cast<char*>(static_cast<void*>(&inter))
> + 8u));
>
> The code was tested and working before I sent it to them. Is it Ideal to
> be sending code like this out to customers? No it's not. But I live in the
> real world.
>
> If C++ is a real world language then it should have a few features in it
> that allow 'repair jobs' like this. Sure I could ask my compiler vendor to
> make a change, but isn't the C++ Standard all about making these features
> ubiquitous?
>
This code is relying on your vendor anyway. The fact that the vptrs are
where they are, and that the classes have matching layouts, is a detail of
your compiler. This code would not be portable to any other implementation.
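
To make the layout point concrete, here is a minimal sketch. The classes
`Interface`, `ComHandler` and `Microscope` are hypothetical stand-ins (the
real SDK classes are not shown in this thread), and the sketch assumes a
hierarchy in which the wanted base really is a base, which is not the case
in the original problem. It only illustrates that the byte offset of a base
subobject is whatever the implementation chooses, while a `static_cast`
through the derived type applies that offset for you.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the SDK classes (assumption, not the real API).
struct Interface  { virtual ~Interface() = default;  int a = 1; };
struct ComHandler { virtual ~ComHandler() = default; int port = 7; };
struct Microscope : Interface, ComHandler {};

// Byte distance from the start of the complete object to one of its base
// subobjects. The value is chosen by the implementation's ABI; nothing in
// the standard promises 8, 16, or any other particular number.
template <class Base>
std::ptrdiff_t base_offset(Microscope& m) {
    auto* whole = reinterpret_cast<unsigned char*>(&m);
    auto* base  = reinterpret_cast<unsigned char*>(static_cast<Base*>(&m));
    return base - whole;
}

// Sideways move without a hard-coded offset: cast down to the derived
// type, then let the implicit conversion to the other base adjust the
// address. Only valid if `i` really is the Interface part of a Microscope.
ComHandler& com_from_interface(Interface& i) {
    return static_cast<Microscope&>(i);
}
```

Printing `base_offset<ComHandler>(m)` on two different implementations is
a quick way to see whether a hard-coded `+ 8u` would survive a port.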
>
> I reply in series to people below.
>
>
> On Friday, August 25, 2023, Sebastian Wittmeier wrote:
>
>> - What about called functions? Are the rules also valid within those?
>> What about template functions? What about used operators?
>>
>
>
> When a function is marked as '__verbose', the verbosity is not extended
> into nested function calls.
>
>
>> - All objects are treated as volatile. Including parameters, return
>> values, parameters of function calls (if DoSomething would have
>> parameters)? Including member variables of used classes/structs?
>>
>
>
> Yes, everything is treated as volatile. Every time the compiler sees the
> name of a variable inside a function, it has to read its value from memory
> again -- no assumptions are made, i.e. no optimisation.
>
You didn't answer whether "everything is treated as volatile" means that
everything has a volatile-qualified type, or simply the optimization
implications.

    void foo(bar x) {
        x.baz(); // Does this line call `void baz();` or `void baz() volatile;`,
                 // given both are defined as non-static member functions of bar?
    }
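
A small sketch of why the distinction matters in today's C++. The class
below is hypothetical and mirrors the `bar` in the question, except that
the two overloads return `int` (my change, so that the chosen overload can
be observed). If "treated as volatile" meant a volatile-qualified type,
overload resolution inside a `__verbose` function would presumably behave
like `call_volatile_typed` rather than `call_plain`.

```cpp
#include <cassert>

// Hypothetical class with both overloads, mirroring `bar` from the question.
struct bar {
    int baz() { return 1; }           // selected for a non-volatile object expression
    int baz() volatile { return 2; }  // selected for a volatile object expression
};

// Ordinary function: `x` has type `bar`, so the non-volatile overload wins.
int call_plain(bar x) {
    return x.baz();
}

// What "has a volatile-qualified type" would look like if spelled by hand:
// the object expression is `volatile bar`, so only `baz() volatile` is viable.
int call_volatile_typed(bar x) {
    volatile bar& v = x;
    return v.baz();
}
```

The two functions return different values for the same argument, so the
answer changes observable behaviour and not just code generation.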
>
>
>> - All pointers are automatically std::launder'ed. When? After each line,
>> after each instruction, after each sub-expression? With launder, you mean p
>> = std::launder(p). Of which pointers is this valid? Any pointer used in the
>> C++ program (are they registered somewhere)? Pointers used in the
>> instructions within the function? What about called functions, operators,
>> ..., which internally use pointers?
>>
>
> Inside the function, every pointer has 'launder' applied to it before
> every dereference, and there is no caching of the vtable. Every single
> pointer.
>
Is "dereference" the `*` operator, or the lvalue-to-rvalue conversion?
Either interpretation can add UB to programs, and laundering before every
dereference can add a lot, as `std::launder` requires quite a few things
that aren't enforced by the mere act of dereferencing a pointer.

    int x = 0;
    float* f = reinterpret_cast<float*>(&x);
    float& g = *f;
    int& i = reinterpret_cast<int&>(g);

This code has defined behaviour because it never does anything that
requires `x` to have a type similar to `float`.
If we add `std::launder` at the static type of each pointer, we get

    int x = 0; // good
    float* f = reinterpret_cast<float*>(&x); // still good
    float& g = *std::launder(f); // Oops
    int& i = reinterpret_cast<int&>(g);

This is because `std::launder` requires that the object selected have a
type similar to `T`, and `int` and `float` are not similar. It's not just
strict aliasing either. I can write code that has defined behaviour in base
C++, such that replacing each pointer operand of the unary `*` operator
with a call to `std::launder` applied to that pointer violates any of the
preconditions in [ptr.launder], except for the first, "p represents the
address A of a byte of memory."
Also, does `&*p` turn into `&*std::launder(p)`? If so, `p=nullptr` and
kaboom.
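
For anyone who wants to try the first snippet, here is a sketch of it
wrapped in a function. The name `roundtrip_write` and the `static_assert`
are my additions; the alignment check is an assumption I'm making so that
the `reinterpret_cast` is guaranteed to preserve the pointer value. The
only access to `x` goes through an `int` lvalue, which is the basis of my
claim above that this is defined.

```cpp
#include <cassert>

// Assumption of this sketch: float needs no stricter alignment than int,
// so converting int* to float* and back preserves the pointer value.
static_assert(alignof(float) <= alignof(int),
              "sketch assumes float is not more strictly aligned than int");

// Hypothetical wrapper around the snippet from the email.
int roundtrip_write(int value) {
    int x = 0;
    float* f = reinterpret_cast<float*>(&x); // no access to x happens here
    float& g = *f;                           // names the storage, still no access
    // Writing `*std::launder(f)` on the line above would violate the
    // std::launder precondition that an object of a type similar to
    // float exists at that address.
    int& i = reinterpret_cast<int&>(g);      // back to the object's real type
    i = value;                               // the only access, via an int lvalue
    return x;
}
```

Calling `roundtrip_write(42)` writes through `i` and reads the result back
through `x`.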
>
>
>
>> - All objects are treated as volatile. Does that do, what you expect?
>> Variables can be put into processor registers. With the as-if-rule, the
>> optimizer can remove if conditions even for volatile variables. In this
>> case the pointers are provided by the caller, but perhaps this function is
>> inlined and the target of the pointers is known and local to the calling
>> functions.
>>
>
>
> __verbose functions are forbidden to be inlined.
>
RIP this code running on CUDA, I guess. All functions are inlined there.
>
>
>> The overall question is, why would you want to have such a feature?
>>
>> For making wrong code valid?
>>
>> For debugging purposes?
>>
>
>
> I gave one example at the top of this email.
>
>
>
>> Most of the effects you expect would be visible only at the assembler
>> level or in multithreaded code.
>>
>> For multithreaded code one should use correct synchronization to begin
>> with.
>>
>> Debugging assembler level is kind of outside of the scope of the C++
>> standard.
>>
>> Better each implementation provides a way to generate debug-friendly
>> code, e.g. with a switch like '-O0'.
>>
>
>
> I don't think that debugging should be beyond the Standard. We have
> 'assert' and 'NDEBUG' already. More so though here, I'm talking about
> improvisation rather than debugging.
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>
Received on 2023-08-26 14:27:42