Re: [std-proposals] Relocation in C++

From: Edward Catmur <ecatmur_at_[hidden]>
Date: Mon, 26 Sep 2022 11:31:34 +0100
On Mon, 26 Sept 2022 at 10:36, Sébastien Bini <sebastien.bini_at_[hidden]>
wrote:

> On Sat, Sep 24, 2022 at 5:40 PM Edward Catmur <ecatmur_at_[hidden]>
> wrote:
>> On Thu, 22 Sept 2022 at 18:48, Sébastien Bini <sebastien.bini_at_[hidden]>
>> wrote:
>>> I fail to see the real danger here. Sure we may extract
>>> private/protected subobjects, but the source object we extracted them from
>>> is destroyed. So it's not like we can mess with it. We simply get
>>> independent objects that have no relationship with one another, which is
>>> guaranteed by the fact that the object we extracted them from has no
>>> user-provided constructors and destructor.
>> A tuple class is highly likely to have user-provided constructors, even
>> if its destructor is defaulted; indeed, a class that needs to make use of
>> this facility will almost certainly not be aggregate constructible,
>> otherwise it would be able to use aggregate structured binding. For
>> example, std::tuple has user-provided constructors, so this facility would
>> have to accept class types with user-provided constructors. So there is a
>> definite danger that this could be abused to access private/protected bases
>> of a class type that uses those bases to maintain invariants, and by doing
>> so break those invariants.
> My bad, I meant no user-provided *relocation, move or copy* constructors.

Sure. I'd be a bit concerned that this is too restrictive, though it
shouldn't be a problem for tuple types as far as I can tell, but they might
want to user-provide special member functions for logging? This should be
called out, though.

> But I still don't see the danger. We no longer care about the class
> invariant as the instance of that class we are concerned about is destroyed
> by std::decompose (the invariants of a destroyed object no longer hold
> anyway). Each subobject returned by std::decompose may have its own
> invariants and those are not violated.

The base class may not be intended to be used outside the context of being
a private base class, so it may not maintain its own invariants (it relies
on the derived class for those).
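
One way to picture this concern is the minimal sketch below. All names
(`index_pair`, `checked_range`) are hypothetical and chosen only for
illustration; the sketch does not use `std::decompose` or any proposed
facility, it only shows a private base whose invariant is enforced
entirely by the derived class.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical implementation-detail base. width() assumes lo <= hi, but
// nothing in this class enforces that: the fields are freely writable by
// whoever holds an index_pair.
struct index_pair {
    std::size_t lo = 0;
    std::size_t hi = 0;
    std::size_t width() const { return hi - lo; }  // wraps around if lo > hi
};

// The derived class is the only guard of lo <= hi. Its copy and move
// constructors and its destructor are implicit, so a rule of "no
// user-provided copy, move or relocation constructor" would not exclude it.
class checked_range : private index_pair {
public:
    checked_range(std::size_t a, std::size_t b) {
        lo = a < b ? a : b;
        hi = a < b ? b : a;
    }
    using index_pair::width;  // safe here: lo <= hi always holds
};

// Today the private base cannot be reached from outside the class.
static_assert(!std::is_convertible<checked_range*, index_pair*>::value,
              "private base is inaccessible");
```

With this sketch, `checked_range r(7, 3); r.width()` yields 4. An
`index_pair` handed out on its own has no such protection, which is the
situation a decomposition of private bases would create.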

>>>> Every operator token is currently a keyword; there are four identifiers
>>>> with special meaning (final, override, import and module) but they are not
>>>> operators.
>>>> The problem with having it as an operator that is not a keyword is
>>>> parsing; is `reloc x;` a discarded-value relocation expression (destroying
>>>> x) or is it the declaration of a variable `x` with type `reloc`? If `reloc`
>>>> is not allowed as an identifier, then that's basically the same as making
>>>> it a keyword.
>>> Fair point. When I first thought of this reloc operator, I considered
>>> using a new symbol instead. Like `$x`, `@x`, or `>x`, etc... instead of
>>> `reloc x`. However I do find `reloc` to be better than a symbol: first, it
>>> clearly conveys the intent. Second, it is visually more eye-catching, which
>>> helps when reading code. I reckon it's important to quickly see where
>>> relocation happens as it ends the scope of variables.
>> Yes, but adding a keyword that could be already in use as an identifier
>> is fraught, and could become an obstacle. An alternative is to construct a
>> sequence of punctuation tokens that cannot currently appear in a program,
>> as was done for spaceship; with that in mind, you might propose something
>> like `<~< x`?
> I also thought of `&< x` lately, which looks a bit like shell redirection.
> I am not thrilled by this, `reloc x` conveys more meaning and stands out
> more IMO.
> C++11 managed to do the migration of `auto` quite well, with compilers
> giving warnings beforehand. The same could happen with `reloc`. Migrating
> the codebase to a new version of the language always involves some work.

`auto` was already a keyword, though.

> Migrating to C++11 caused some trouble I recall, mainly because we wanted
> to migrate from boost smart pointers to those of the STL.
> I quickly looked at:
> - my personal projects and those of Stormshield, `reloc` is never used
> at all.
> - llvm has less than a hundred hits where `reloc` is used outside
> comments and strings (counting uses, not variable declarations), and reloc
> is never used to name a type, only variables. They mostly come from the ELF
> module.
> - FreeBSD. It's mainly C but also has some contribs in C++. We could
> argue that if reloc is not seen in a C codebase, chances are it is not used
> that much in C++ either. It has a bit more than a hundred hits. They
> mostly come from llvm (FreeBSD has less than a dozen hits if we exclude
> llvm from the count, and they equally come from C and C++ contribs.)
> This is just a quick search, but it gives us a hint that reloc might not
> be used that often.
> Anyway, I think I am going to propose both alternatives, while leaning
> in favor of `reloc`.

Yes, that matches my own research; `reloc` is mostly used as an identifier
in ELF code. So it might be possible to use it, but it would still have
some cost.

It's still a problem even if `reloc` is used only as a variable identifier;
are `reloc (x)` and `reloc + a` ill-formed relocation expressions, or are
they function call and arithmetic expressions? Is `reloc x.y` an ill-formed
relocation expression or an expression missing an operator? I don't think
having a token that means a keyword only when followed by an identifier is
likely to be accepted.
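
To make the ambiguity concrete, here is a minimal sketch of code that is
valid today because `reloc` is an ordinary identifier. The namespaces and
the `demo` functions are hypothetical names used only to keep the three
cases apart; each statement would need a new meaning, or would become
ill-formed, if `reloc` were given operator status.

```cpp
#include <cassert>

// Case 1: `reloc` names a type, so `reloc x;` is a declaration.
namespace as_type {
struct reloc { int offset = 0; };
inline int demo() {
    reloc x;           // declares a variable x of type reloc
    return x.offset;
}
}  // namespace as_type

// Case 2: `reloc` names a variable, so `reloc + a` is arithmetic.
namespace as_variable {
inline int demo() {
    int reloc = 40;
    int a = 2;
    return reloc + a;  // addition on a variable named reloc
}
}  // namespace as_variable

// Case 3: `reloc` names a function, so `reloc (x)` is a call.
namespace as_function {
inline int reloc(int v) { return v + 1; }
inline int demo() {
    int x = 41;
    return reloc (x);  // function call; the space before '(' is irrelevant
}
}  // namespace as_function
```

In this sketch `as_type::demo()` returns 0, and both
`as_variable::demo()` and `as_function::demo()` return 42.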

Received on 2022-09-26 10:31:49