ISOCPP std-proposals List: Re: [std-proposals] Signed sizes

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 10 Dec 2024 17:50:48 -0500

On 12/10/24 5:16 PM, Tiago Freire wrote:
>
> > While var[-1] is always wrong (for a container that doesn't support
> negative indexes), the integer promotion rules conspire to ensure that
> such code successfully compiles when the index type is an unsigned
> integer type. The result is either an out of bounds condition or an
> access of an element other than what the programmer intended; a bug in
> either case. Use of an unsigned type hides such bugs making them more
> difficult to discover.
>
> Which is easily caught when you do bounds checking since the compiler
> will catch the comparison between 2 incompatible types.
>
The conversion happens before the bounds checking can be performed
unless the index type is a template parameter of the operator[]
declaration. That is not the case for the standard library containers.
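
To illustrate the point above, here is a minimal sketch. The names checked_get and index_as_seen_by_callee are hypothetical and not part of any library; checked_get only imitates the shape of a standard container's operator[], where the index parameter is a fixed unsigned type. Under that assumption, the bounds check never observes a negative value; it only sees a very large one.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical checked accessor. As with std::vector::operator[], the index
// parameter is a fixed unsigned type rather than a template parameter.
std::optional<int> checked_get(const std::vector<int>& v, std::size_t i) {
    // When this check runs, a caller's -1 has already been converted to the
    // largest std::size_t value, so the sign is no longer observable here.
    if (i >= v.size()) return std::nullopt;
    return v[i];
}

// Hypothetical helper: yields the value a std::size_t parameter holds when
// the caller passes a (possibly negative) int.
std::size_t index_as_seen_by_callee(int i) {
    return static_cast<std::size_t>(i); // same result as the implicit conversion
}
```

A call such as checked_get(v, -1) compiles without an error, and the check can only report "too large", never "negative".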
>
> But this would also still be in the category of mistakes of “using the
> wrong type to index”. The code will also compile if you pass an
> unsigned value to a signed interface, it did nothing to help there.
>
Correct, but that is only problematic for (unsigned) values that exceed
the range of the signed integer type, which is an uncommon occurrence.
Bounds checking trivially catches these cases by rejecting negative values.
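
A rough sketch of what that looks like with a signed index type follows. The names Lookup and classify_index are hypothetical. The sketch assumes std::ptrdiff_t and std::size_t have the same width, and that converting an out-of-range unsigned value to a signed type wraps modulo 2^N (guaranteed since C++20, implementation-defined before that).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type for a bounds check on a signed index.
enum class Lookup { ok, negative_index, past_the_end };

// Hypothetical checker taking a signed index. A negative argument arrives
// intact, so it can be reported as its own kind of error.
Lookup classify_index(const std::vector<int>& v, std::ptrdiff_t i) {
    if (i < 0) return Lookup::negative_index;
    // i is known to be non-negative here, so this cast preserves its value.
    if (static_cast<std::size_t>(i) >= v.size()) return Lookup::past_the_end;
    return Lookup::ok;
}
```

An unsigned value above the signed range, for example the maximum std::size_t, shows up as a negative index under the stated assumptions and is rejected by the first check.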
>
> Then again, variable promotion rules have their own category of bugs,
> and I see no reason why you should make an interface represent the
> wrong thing because somewhere else is broken.
>
This isn't an issue of right or wrong; it is an issue of tradeoffs given
the problems caused by implicit integer promotions and conversions. Use
of signed types enables more bugs to be detected.

Tom.

> *From:* Tom Honermann <tom_at_[hidden]>
> *Sent:* Tuesday, December 10, 2024 10:22 PM
> *To:* std-proposals_at_[hidden]
> *Cc:* Tiago Freire <tmiguelf_at_[hidden]>
> *Subject:* Re: [std-proposals] Signed sizes
>
> On 12/10/24 12:50 PM, Tiago Freire via Std-Proposals wrote:
>
> I will take the opportunity to address the points raised by Thiago.
>
> > Except where it isn't and negative indices have a "count from the
> end" meaning or an API can return a negative value to indicate
> "item not found". Whether that is good API or not we can debate,
> but those API exist.
>
> I’ve never stated that there aren’t any libraries with an API that
> works like that. It is a matter of fact that they do exist, Qt
> being one of them, and it is also a matter of fact that there are
> many libraries that are ill-formed; that is not a productive
> discussion to have. Whether or not those APIs are good (and
> specifically how the C++ standard has handled it) is, on the other
> hand, the whole point.
>
> And while I would even agree that signed indexing is appropriate
> in situations where you can index from a middle point (including
> backwards or forwards), this is not the case of the indexable
> standard containers, nor is it what is being argued.
>
> > Out of bounds access is just wrong, no matter which side. I
> don’t think it makes sense to make a distinction here.
>
> I fundamentally disagree.
>
> var[3285446] may or may not be valid, conceptually it is a thing
> that you can do with a container that you can index from the
> beginning. It can be out of bounds, but it might also not be.
>
> var[-1] is always wrong: conceptually you are trying to access an
> element before the first element of the container. I don’t need to
> check if this index is a valid index in the container, it’s just
> wrong.
>
> While var[-1] is always wrong (for a container that doesn't support
> negative indexes), the integer promotion rules conspire to ensure that
> such code successfully compiles when the index type is an unsigned
> integer type. The result is either an out of bounds condition or an
> access of an element other than what the programmer intended; a bug in
> either case. Use of an unsigned type hides such bugs making them more
> difficult to discover.
>
> Tom.
>
> > Fortunately this can actually codegen the same as a traditional
> unsigned comparison so I don’t think it should be a concern:
> https://godbolt.org/z/P7xTbxWhW
>
> While you can make it codegen the same, look at the amount of
> tweaking you had to do. Just remove the “[[assume(n >= 0)]];” and
> your argument is over. It’s convoluted, nobody codes like that,
> and nobody codes constantly looking back at the assembly and
> checking if it generated the same as the correct solution (which
> is to just use unsigned).
>
> Write code and express what you mean, be correct by default.
>
> > Additionally Bjarne’s paper offers a thoughtful argument as to
> why indexes and sizes should just be signed:
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf
>
> Well, Bjarne is just wrong in this paper, plain and simple. It
> commits the categorical error of indexing containers with the
> wrong type. Teachability is the least of its problems; introducing
> bugs because you don’t know what you are doing and are used to
> doing the wrong thing is a far worse problem.
>
> Hopefully, I don’t need to explain why it is wrong, right?
>
> *From:* Jeremy Rifkin <rifkin.jer_at_[hidden]>
> *Sent:* Tuesday, December 10, 2024 6:02 PM
> *To:* Tiago Freire <tmiguelf_at_[hidden]>
> *Subject:* Re: [std-proposals] Signed sizes
>
> Hi Tiago,
>
> > While indexing very high values is dubiously wrong, indexing
> negative values is unquestionably wrong.
>
> Out of bounds access is just wrong, no matter which side. I don’t
> think it makes sense to make a distinction here.
>
> Indexes are often pointed to as an example where unsigned is
> natural since negatives don’t make sense but the problem is
> unsigned doesn’t really provide any safety. I think this cppcon
> lightning talk explained it better than I can:
>
> https://youtu.be/wvtFGa6XJDU?si=iv5F5-SI9xQn-x4X. Additionally
> Bjarne’s paper offers a thoughtful argument as to why indexes and
> sizes should just be signed:
>
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf
>
> > Plus, if you make indexing signed you would need to perform
> double side bounds checking of indexes, while with unsigned you
> just need to do a one side bounds check since unsigned values can
> not be smaller than 0.
>
> Fortunately this can actually codegen the same as a traditional
> unsigned comparison so I don’t think it should be a concern:
>
> https://godbolt.org/z/P7xTbxWhW
>
> Cheers,
>
> Jeremy
>
> On Tue, Dec 10, 2024 at 00:32 Tiago Freire <tmiguelf_at_[hidden]>
> wrote:
>
> I agree with making things uniform, but I completely disagree
> with making "signed" the default interface for indexing.
>
> While indexing very high values is dubiously wrong, indexing
> negative values is unquestionably wrong.
> There's no such thing as negatively indexing into an array,
> that is always wrong (even if it achieves the exact same effect
> as a too high number), there's also no such thing as a
> container with a negative amount of slots.
> Plus, if you make indexing signed you would need to perform
> double side bounds checking of indexes, while with unsigned
> you just need to do a one side bounds check since unsigned
> values can not be smaller than 0.
> Signed integers are weird.
> Unsigned integers should be the default, not the exception.
>
>
>

Received on 2024-12-10 22:50:54