sg12: Re: [ub] ub due to left operand of shift

From: Howard Hinnant <howard.hinnant_at_[hidden]>
Date: Sat, 26 Oct 2013 17:36:20 -0400

On Oct 26, 2013, at 5:16 PM, Jean-Marc Bourguet <jm_at_[hidden]> wrote:

> On 26/10/2013 20:09, Howard Hinnant wrote:
>> Might we reword [expr.shift]/p2 in terms of shifting 1 bits off of the most significant end, instead of in terms of E1*2^E2?
>>
>> The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. <del>If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; otherwise<del><ins>If E1 has signed type and the result would have fewer non-zero bits than E1</ins>, the behavior is undefined.

> If I'm not mistaken, the only case you seem to define is a left shift by
> 0 of a negative value. (I'd not mind converting E1 to the corresponding
> unsigned type, do the shift, then converting back to the signed type; I
> know of two machines which have an arithmetic shift left instruction
> with a sticky sign bit -- System/360 and PDP-10 --, and if I'm not
> mistaken C and C++ compilers for them are using the logical shift when
> shifting left -- someone with an access to a zSeries could check if it
> hasn't been done when processing DR1457, the PDP-10 has been irrelevant
> for about 3 decades).

Agreed. My intent was to clean up the wording by ceasing to describe the result in two ways:

1. The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled.
2. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value.

We only need to describe the resulting value in one way. And a side-effect of this clean up is to fix what appears to me to be an obvious bug (left shifting a negative value by 0 should not be undefined behavior).

I am also of the opinion that we should clean up p3 in the same way, and for the same reasons, though in this case there is no bug fix:

The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed
type and a non-negative value, <del>the value of the result is the integral part of the quotient of E1/2^E2</del>
<ins>vacated bits are zero-filled</ins>. If E1 has a signed type and a negative value, the <del>resulting</del>
value <ins>of the vacated bits</ins> is implementation-defined.

Are these changes consistent with the existing behavior of System/360 and PDP-10?

I can vouch that they are consistent with the behavior of clang on OS X, with the exception that I pointed out with respect to constexpr and left-shift-negative-by-zero (which again, just seems like a bug to me).

I've programmatically specified what I believe should be the undefined behavior of the current specification modulo the left-shift-by-zero bug in the code below (std::__clz(x) is a count-leading-zeros function, which if we had one, would be another way to specify the range of well-defined behavior). The purpose of this exercise is to help identify and clarify any ambiguity that might exist in my modified English specification above.

constexpr
unsigned
lshift(unsigned x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned>::digits);
    return x << y;
}

constexpr
unsigned long
lshift(unsigned long x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned long>::digits);
    return x << y;
}

constexpr
unsigned long long
lshift(unsigned long long x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned long long>::digits);
    return x << y;
}

constexpr
int
lshift(int x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned>::digits);
    assert(x == 0 || y <= std::__clz(static_cast<unsigned>(x)));
    return x << y;
}

constexpr
long
lshift(long x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned long>::digits);
    assert(x == 0 || y <= std::__clz(static_cast<unsigned long>(x)));
    return x << y;
}

constexpr
long long
lshift(long long x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned long long>::digits);
    assert(x == 0 || y <= std::__clz(static_cast<unsigned long long>(x)));
    return x << y;
}

constexpr
unsigned
rshift(unsigned x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned>::digits);
    return x >> y;
}

constexpr
unsigned long
rshift(unsigned long x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned long>::digits);
    return x >> y;
}

constexpr
unsigned long long
rshift(unsigned long long x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned long long>::digits);
    return x >> y;
}

constexpr
int
rshift(int x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned>::digits);
    return x >> y;
}

constexpr
long
rshift(long x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned long>::digits);
    return x >> y;
}

constexpr
long long
rshift(long long x, unsigned long long y)
{
    assert(y < std::numeric_limits<unsigned long long>::digits);
    return x >> y;
}

Howard

Received on 2013-10-26 23:36:24