[std-proposals] Draft 2: Error on out-of-bounds index

From: Levo D <levoplusplus_at_[hidden]>
Date: Thu, 28 Aug 2025 22:43:55 +0000
I won't be able to attend meetings. Is there a usual way to ask a member to work on a proposal with you? Below is my second draft.

Introduction

Arrays with a known length are easy to understand, and the compiler can easily check their bounds when the index is a literal. However, it isn't easy to mix arrays of different sizes, and compilers generally don't offer a way to ensure that a variable index is in bounds. This proposal offers syntax that makes arrays of different sizes easier to use together, and simple rules that ensure variable indices are in bounds for users who opt into that level of strictness.

Motivation and Scope

When working with sizes from different sources (libraries, header files, etc.), it is easy to change a number (manually or through an upgrade) and accidentally cause an out-of-bounds access that may or may not be caught at runtime. This proposal would make the following an error at compile time. Containers will be covered by a separate proposal.

void myfunc(int largeAmount) {
        #define HEADER20 20
        #define HEADER32 32
        int buf[HEADER20];
        buf[HEADER32-1] = 0; // error: index 31 is out of bounds for int[20]

        // The following is an opt-in error
        for (int i=0; i<largeAmount; i++) {
                buf[i] = i;
        }
}
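For comparison, a literal index can already be checked at compile time today if the array is a std::array and the index is passed as a template argument to std::get. The sketch below only illustrates the kind of diagnostic this proposal wants for a plain `buf[...]`; the constexpr constants stand in for the HEADER20/HEADER32 macros, and the function name lastOfTwenty is made up for this example.

```cpp
#include <array>
#include <cstddef>

// Stand-ins for the HEADER20 / HEADER32 macros above.
constexpr std::size_t HEADER20 = 20;
constexpr std::size_t HEADER32 = 32;

// Hypothetical example function, not part of the proposal.
constexpr int lastOfTwenty() {
    std::array<int, HEADER20> buf{};
    std::get<HEADER20 - 1>(buf) = 7;     // index 19: in range, compiles
    // std::get<HEADER32 - 1>(buf) = 0;  // index 31: rejected at compile time
    return std::get<HEADER20 - 1>(buf);
}
```

The check only works because the index is a template argument; `buf[HEADER32 - 1]` on the same std::array would compile without any diagnostic being required.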

Impact On the Standard

This will introduce slicing syntax on arrays.

---
If we look at the following program, we'll notice several problems. The comments explain some of the problems.
The two core issues are:
	1) There is no way to convert a large array into a smaller array
	2) Compilers don't offer enough warnings/errors to catch these mistakes
#include <cstddef>
void test16(char (&arr)[16]) { arr[15] = 0x12; }
void test32(char (&arr)[32]) { arr[31] = 0x34; }
// Sanitizers won't catch this if you change 257 to 255
void test256(char (&arr)[256]) { arr[257] = 0x56; }
template <size_t N> void testN(char (&arr)[N]) {
	//test16(arr); // error because size is not exact
	test32(arr); // exact, compiles
	// Typecasting is bad, as we know
	test16(reinterpret_cast<char (&)[16]>(arr)); // requires a cast
	test256(reinterpret_cast<char (&)[256]>(arr)); // the cast lets this compile, and it overwrites memory past the end of the array
}
int main() {
	char buf[32]{};
	testN(buf); // 32 is too small for test256, no errors
	// the below doesn't cause a warning (or error) in some compilers (my copy of gcc 15 doesn't)
	buf[-1] = 0x78;
	// I would much rather the previous line be written as the next one if it is intentional
	*(buf-1) = 0x78;
}
By having 'arr[257]' and 'buf[-1]' become an error, obvious mistakes will be caught immediately.
To allow testN to be implemented without casting, I suggest having a syntax to transform an array to a smaller array.
The syntax should have an index and a length. Here are some suggestions
	arr[I .. LEN] // may be confusing, but I think it is very readable
	arr[I length LEN] // length would be a contextual keyword
	arr[I, LEN] // may interfere with existing code that uses the comma operator
	arr[I : LEN]
Then we may implement testN as something like
void testN(char (&arr)[128]) {
	test16(arr[0 .. 16]); // ok now
	test16(arr[..]); // ok as long as array is big enough
	test16(arr[10 ..]); // starts at index 10; the length is implicit
	// 128-10 elements remain, so this would allow converting to an array of any size up to 118
	test16(arr[.. 32]); // I have no problem allowing this, but I don't mind if this is an error since 32 != 16
	test256(arr[..]); // errors, 256>128
}
Non-Literal Bounds:
The next problem is using a variable as an index. Since a lot of code would break, this should be opt-in.
I have implemented the below in a compiler, so I understand the complexity; however, the compiler wasn't a C++ compiler.
// Extremely simple, i is a loop variable with a known max size
void myfunc(char (&arr)[256]) {
	// loop variable, positive and less than array size
	for(size_t i=0; i<256; i++) {
		doSomething(arr[i]);
	}
}
// Fairly simple, n is checked to be non-negative and less than a literal
void myfunc(char (&arr)[256], int n) {
	if (n >= 0 && n < 128) { // Note 128 is smaller than array length
		doSomething(arr[n]);
	}
}
// Nearly as simple: same idea as above, except the check must be inverted because the if branch returns and never rejoins the code below
void myfunc(char (&arr)[256], int n) {
	if (n < 0 || n >= 128)
		return;
	// At this point n must be non-negative and smaller than the array length
	doSomething(arr[n]);
}
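// Medium to hard. The issue is that `n` is mutated after the check
// (n is at most 253 after the check, so the two increments reach at most 255, which is still in bounds)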
void myfunc(char (&arr)[256], int n) {
	if (n < 0 || n >= 254)
		return;
	doSomething(arr[++n]);
	doSomething(arr[++n]);
}
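The "checked against a literal, then used as an index" cases can be imitated in a library by carrying the proven range in the type. This is only a sketch of the idea, assuming C++17; the proposal itself describes flow analysis in the compiler, not a wrapper type. BoundedIndex, checkedAt, and boundedDemo are hypothetical names made up for this example.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical type: holds a value already proven to lie in [0, Limit).
template <std::size_t Limit>
class BoundedIndex {
    std::size_t value_;
    constexpr explicit BoundedIndex(std::size_t v) : value_(v) {}
public:
    // The range check is the only way to obtain a BoundedIndex.
    static constexpr std::optional<BoundedIndex> check(long long n) {
        if (n < 0 || static_cast<unsigned long long>(n) >= Limit)
            return std::nullopt;
        return BoundedIndex(static_cast<std::size_t>(n));
    }
    constexpr std::size_t get() const { return value_; }
};

// Compiles only when the proven limit fits inside the array length.
template <std::size_t Limit, typename T, std::size_t N>
constexpr T& checkedAt(T (&arr)[N], BoundedIndex<Limit> i) {
    static_assert(Limit <= N, "index range exceeds array length");
    return arr[i.get()];
}

constexpr int boundedDemo(int n) {
    char arr[256]{};
    arr[100] = 42;
    if (auto i = BoundedIndex<128>::check(n))  // like: n >= 0 && n < 128
        return checkedAt(arr, *i);
    return -1;                                 // n was rejected by the check
}
```

The mutation case (`++n` after the check) has no counterpart here: the wrapper would need arithmetic that widens Limit, which hints at why that case is rated medium to hard.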
Why Not Span
Span doesn't have special rules for bounds checking. The following would assert at runtime (when the standard library is built with hardening or debug assertions enabled), but we're looking for a compile-time error, and none is issued.
test1 and test2 differ only in the function signature.
#include <span>
int test1(std::span<int> file, int n) {     // span<int>
	if (file[0] != 0x12 && file[2] != 0x34)
		return -1;
	return file[n] + file[100];
}
int test2(std::span<int, 32> file, int n) { // span<int, 32>
	if (file[0] != 0x12 && file[2] != 0x34)
		return -1;
	return file[n] + file[100];
}
int main() {
	int tooSmall[]={1};
	int fileBadHeader[64]={1,2,3,4,5,6,7,8,9};
	int fileOkHeader[64]={0x12,0x34,3,4,5,6,7,8,9};
	//test1({tooSmall}, 100); // will assert
	test1({fileBadHeader}, 100); // will not assert since header check fails
	//test1({fileOkHeader}, 100); // will assert
	test2(std::span{fileBadHeader}.subspan<10, 32>(), 100); // will not assert
	//test2(std::span{fileOkHeader}.subspan<0, 32>(), 100); // will assert
}
By using arrays, we can catch `file[100]`, along with the problematic index `n` if the user opts into the non-literal bounds check.

Received on 2025-08-28 22:43:56