[std-proposals] Error on out-of-bounds index, and syntax for conversion

From: Levo D <levoplusplus_at_[hidden]>
Date: Fri, 22 Aug 2025 20:04:44 +0000
I'd be happy to work on this proposal with others.

This document is about bounds checking on built-in arrays, not containers.
Container support could be added in the future, after a successful implementation.

The program below shows several problems; the comments explain each one.
The two biggest issues are
        1) there is no easy way to convert a large array into a smaller array
        2) compilers are not required to emit an error when the array size is known and a literal index is outside it

#include <cstddef>

void test16(char (&arr)[16]) { arr[15] = 0x12; }
void test32(char (&arr)[32]) { arr[31] = 0x34; }

// Sanitizers won't catch this if you change 257 to 255
void test256(char (&arr)[256]) { arr[257] = 0x56; }

template <size_t N> void testN(char (&arr)[N]) {
        //test16(arr); // error because size is not exact
        test32(arr); // exact, compiles
        // Typecasting is bad, as we know
        test16(reinterpret_cast<char (&)[16]>(arr)); // compiles
        test256(reinterpret_cast<char (&)[256]>(arr)); // compiles and will overwrite memory
}

int main() {
        char buf[32]{};
        testN(buf); // two problems: 32 is smaller than 256, and
        // test256 writes to index 257, which is clearly out of range
        // the line below doesn't cause a warning (or error) in some compilers
        buf[-1] = 0x78;
        // I would much rather the previous line be written as
        *(buf-1) = 0x78;
}
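
For comparison, here is a minimal sketch of how a literal index can be checked at compile time in today's C++, by passing it as a template argument. The helper `at` and the function `write_last` are hypothetical names made up for this illustration; they are not part of the proposal.

```cpp
#include <cstddef>

// Hypothetical helper: the index is a template argument, so an
// out-of-range value is rejected at compile time.
template <std::size_t I, typename T, std::size_t N>
constexpr T& at(T (&arr)[N]) {
    static_assert(I < N, "index is outside the array bounds");
    return arr[I];
}

// Writes the last element of a 256-element array and returns it.
inline char write_last(char (&arr)[256]) {
    at<255>(arr) = 0x56;    // ok: 255 < 256
    // at<257>(arr) = 0x56; // would not compile: the static_assert fails
    // at<-1>(arr) = 0x78;  // would not compile: -1 is not a valid std::size_t template argument
    return at<255>(arr);
}
```

This only works when the index is spelled as a template argument, which is the inconvenience that making plain 'arr[257]' an error would remove.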


If 'arr[257]' and 'buf[-1]' became errors, obvious mistakes would be caught immediately.
To allow testN to be implemented without casting, I suggest a syntax that accepts an index and a length.
Here are a few options:
        arr[I .. LEN] // may be confusing, but I think it is very readable
        arr[I length LEN] // length would be a contextual keyword
        arr[I, LEN] // may interfere with existing code that uses the comma operator
        arr[I : LEN] // may be read as a start and an end rather than a start and a length
        
Then we may implement testN as something like

void testN(char (&arr)[128]) {
        test16(arr[0 .. 16]); // ok now
        test16(arr[..]); // ok as long as array is big enough
        test16(arr[10 ..]); // starts at index 10; the length is implicit
        // 128-10 allows converting to an array of any size up to 118

        test32(arr[.. 32]); // no longer an exact match, since we changed testN's signature
        test256(arr[..]); // errors, 256>128
}

The next problem is using a variable as an index. Since checking this would break a lot of existing code, it should be opt-in.
I have implemented the cases below in a compiler, so I understand the complexity of each, although it wasn't a C++ compiler.

// Extremely simple
void myfunc(char (&arr)[256]) {
        // loop variable, positive and less than array size
        for(size_t i=0; i<256; i++) {
                doSomething(arr[i]);
        }
}

// Fairly simple
void myfunc(char (&arr)[256], int n) {
        if (n >= 0 && n < 128) { // Note 128 is smaller than array length
                doSomething(arr[n]);
        }
}

// Nearly as simple
void myfunc(char (&arr)[256], int n) {
        if (n < 0 || n >= 128)
                return;
        // At this point n must be non-negative and smaller than the array length
        doSomething(arr[n]);
}

// Medium to hard; I'm not sure this is common enough to be worth supporting
// The issue here is that n is mutated after the check
void myfunc(char (&arr)[256], int n) {
        if (n < 0 || n >= 254)
                return;
        
        doSomething(arr[++n]);
        doSomething(arr[++n]);
}
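
Without compiler support, the "checked once, then known to be in range" idea can be approximated by carrying the result of the check in a type. The sketch below is only an illustration: `CheckedIndex`, `at_checked`, and `read_checked` are hypothetical names, and the range check still happens at run time. Only the comparison between the limit and the array length is done at compile time.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical type: an index already proven to be smaller than Limit.
template <std::size_t Limit>
class CheckedIndex {
public:
    // The only way to obtain a CheckedIndex is to pass the range check.
    static constexpr std::optional<CheckedIndex> make(long long n) {
        if (n < 0 || static_cast<unsigned long long>(n) >= Limit)
            return std::nullopt;
        return CheckedIndex(static_cast<std::size_t>(n));
    }
    constexpr std::size_t value() const { return value_; }

private:
    constexpr explicit CheckedIndex(std::size_t v) : value_(v) {}
    std::size_t value_;
};

// No range check is needed here: Limit <= N is verified at compile time.
template <typename T, std::size_t N, std::size_t Limit>
constexpr T& at_checked(T (&arr)[N], CheckedIndex<Limit> i) {
    static_assert(Limit <= N, "index limit exceeds the array length");
    return arr[i.value()];
}

// Counterpart of the "fairly simple" case: n is checked against 128,
// which is smaller than the array length of 256.
inline int read_checked(char (&arr)[256], int n) {
    if (auto i = CheckedIndex<128>::make(n))
        return at_checked(arr, *i);
    return -1;  // n was out of range
}
```

This does not cover the case where n is mutated after the check; that would need the kind of flow analysis described above.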

Having the above makes handling binary data simpler.
Let's highlight the difference by using span. There will be assertion failures at run time, but no compile-time errors.

#include <span>
int test1(std::span<int> file, int n) {
        if (file[0] != 0x12 && file[2] != 0x34)
                return -1;
        return file[n] + file[100];
}
int test2(std::span<int, 32> file, int n) {
        if (file[0] != 0x12 && file[2] != 0x34)
                return -1;
        return file[n] + file[100];
}
int main() {
        int tooSmall[]={1};
        int fileBadHeader[64]={1,2,3,4,5,6,7,8,9};
        int fileOkHeader[64]={0x12,0x34,3,4,5,6,7,8,9};

        //test1({tooSmall}, 100); // will assert
        test1({fileBadHeader}, 100); // will not assert
        //test1({fileOkHeader}, 100); // will assert
        test2(std::span{fileBadHeader}.subspan<10, 32>(), 100); // will not assert
        //test2(std::span{fileOkHeader}.subspan<0, 32>(), 100); // will assert
}

By using arrays, we can catch both the literal index 100 and the problematic index n at compile time, if the user opts into the additional checks.

With additional support for parameters, we would be able to catch the error below.

void myFunc(std::vector<char>&v) {
        #define LIB_BUF_LEN 16
        auto buf = v.make_push(LIB_BUF_LEN);
        // imagine the make_push signature being: T(&)[count] make_push(constexpr size_t count);
        for(int i=0; i<32; i++) { // LIB_BUF_LEN is smaller, thus the array reference is smaller
                buf[i] = i; // oops, out of range found at compile time
        }
}

Received on 2025-08-22 20:04:46