The C++11 memory model and its later revisions are defined in terms of "atomic objects". This restricts the set of operations that can be performed on atomically accessed locations. Notably, it was not possible to access a std::atomic<T> non-atomically, or to access only the first 16 bits of a std::atomic<int32_t> (whether atomically or non-atomically). The formalisations of the memory model were built on this basis:

The Problem of Programming Language Concurrency Semantics, Batty et al.

The C++11 standard prose refers to “atomic objects” as if they are quite different from non-atomic objects, and the mathematical model of Batty et al. [8] for the C++11 and C11 concurrency primitives followed suit by imposing a simple type discipline: a location kind map in each candidate execution partitioned locations into atomic, nonatomic, and mutex locations. The definition of consistent execution permitted atomic accesses only at atomic locations, and the only nonatomic accesses allowed at atomic locations were atomic initialisations.

Mixed-Size Concurrency: ARM, POWER, C/C++11, and SC, Flur et al.

In the ISO C standard mixed-size overlapping atomic accesses are forbidden by the effective type rules

Both mixed-size and mixed-atomicity accesses are now possible in C++20 with std::atomic_ref<T>. Note that while any atomic_ref to an object is alive, that object must not be accessed other than through an atomic_ref; this ensures that the two kinds of access can never race with one another. Nonetheless, the memory model was written without considering even these restricted scenarios:

https://godbolt.org/z/5s4Prvax6

#include <atomic>
#include <cstdint>
#include <utility>

std::pair<int16_t, int16_t> mixed_size() {
    int32_t x = 0;

    {
        auto x_atomic = std::atomic_ref<int32_t>(x);
        x_atomic.store(0xabbafafa, std::memory_order_relaxed);
        // atomic_ref lifetime ends so we can use x again
    }

    auto x_parts = reinterpret_cast<int16_t*>(&x);
    int16_t& x_left = x_parts[0];
    int16_t& x_right = x_parts[1];
    
    auto x_left_atomic = std::atomic_ref<int16_t>(x_left);
    auto x_right_atomic = std::atomic_ref<int16_t>(x_right);

    // Atomic loads have to read-from a store.
    // Both loads read-from the 32-bit store above, yet they read different values???
    int16_t left = x_left_atomic.load(std::memory_order_relaxed);
    int16_t right = x_right_atomic.load(std::memory_order_relaxed);
    return std::pair<int16_t, int16_t>(left, right);
}

std::int32_t mixed_atomicity() {
    int32_t x = 0;

    {
        auto x_atomic = std::atomic_ref<int32_t>(x);
        x_atomic.store(42, std::memory_order_relaxed);
        // atomic_ref lifetime ends so we can use x again
    }

    // Obviously reading 42 is sane here, but there is no codified semantics
    // governing non-atomically reading an atomic location in any circumstance
    int32_t read = x;
    return read;
}

If it is indeed fine to perform a mixed-size and/or mixed-atomicity access when it is related by happens-before to all other accesses, then we should update the standard to reflect this. I think the simple Read-Read and Write-Read coherence rules should be applicable.

Andy