Date: Sun, 27 Aug 2023 22:20:38 +0100
On Sun, Aug 27, 2023 at 5:03 PM Thiago Macieira wrote:
>
> I hear what you're saying, but what you're saying is just not how things work.
> You're making an assumption of how compilers work and that's a flawed
> assumption.
>
> I mean, under some reading, it's true. The problem is that they are not coded
> the way your sentence, as phrased, supposed they are. It's not that they
> detected the UB and therefore made an optimisation. If that were the case,
> then they could emit warnings for every single possible UB and let the
> programmer know. That is clearly not possible, therefore that's clearly not
> how they are written.
>
> Instead of being written to detect undefined behaviour, they're written to
> operate on defined behaviour. What you're effectively asking for is that they
> begin detecting the UB possibilities (all of them) and make them specified.
>
> Moreover, you're not asking that they all have the same behaviour: you're
> assuming there's an underlying true behaviour that everyone agrees upon, and
> that can be demonstrated not to be true either. This is especially true when
> compilers have a choice of which instructions to use but whose behaviour is
> only the same within the defined behaviour parameters of the language.
I've never written a compiler nor been on a team of people writing a
compiler, nor even worked in a company that has a compiler-writing
department, nor do I frequent the same yoga studios as Jacob Navia,
but still. . . really. . . even in all my ignorance. . . I find that
hard to believe.
If I were to write a compiler to parse the following code:
for ( int i = 0; i >= 0; ++i ) DoSomething();
Then I would split the text between parentheses into three parts:
(1) int i = 0
(2) i >= 0
(3) ++i
I would then write the assembler for each of the three parts:
(1) sub rsp, 8; mov dword ptr [rsp], 0
(2) cmp dword ptr [rsp], 0; jl Out_Of_Loop
(3) add dword ptr [rsp], 1
And then I'd put them together something like:
    sub rsp, 8
    mov dword ptr [rsp], 0
Start_Loop:
    cmp dword ptr [rsp], 0
    jl Out_Of_Loop
    call DoSomething
    add dword ptr [rsp], 1
    jmp Start_Loop
Out_Of_Loop:
    add rsp, 8
If I were then somehow to make the decision to omit the third and
fourth instructions (the compare and the conditional jump), I could
only have come to such a decision by deeply analysing what's happening
inside the loop and then also applying a piece of pre-programmed
knowledge, i.e. "incrementing a signed integer can never make it
negative, and so the check for negativity is redundant". This is a lot
more processing and more elaborate compiler writing than simply doing
what the programmer wrote.
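For reference, the wrap-around that the loop above depends on is signed
overflow, which the language leaves undefined. Below is a rough sketch
of one way to express "run once for every non-negative value, then
stop" without ever incrementing past the maximum. CountUpToMax is a
made-up name for illustration, and making the loop template on the
integer type is only my assumption so that it can be tried with a small
type.

```cpp
#include <limits>

// Hypothetical helper (not from any library): calls `body` once for each
// value i = 0 .. max(T), and leaves the loop before ++i could overflow.
template <typename T, typename Body>
unsigned long long CountUpToMax(Body body)
{
    unsigned long long calls = 0;
    for (T i = 0; ; ++i) {
        body();
        ++calls;
        // Explicit exit test: the increment is never evaluated at max(T).
        if (i == std::numeric_limits<T>::max()) break;
    }
    return calls;
}
```

With T = signed char this makes 128 calls; with T = int it would make
the roughly two billion calls that the original loop appears to intend.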
Applying the marker '__verbose' to a function would also tell the
compiler, "Don't deeply analyse what's going on here and don't try to
get clever with your analysis -- just convert the C++ code expression
by expression, line by line into assembler please".
I think you quite rightly mentioned in a previous post, Thiago, that
any code written inside a '__verbose' function would be nonportable,
and so why try to standardise something that's nonportable? What we can
do is standardise a portable way of adding nonportable code to your C++
program. So if we need to compensate for a bug, or to avoid breaking
ABI, then we can at least do it as cleanly as possible. The
alternative would be to isolate the dodgy function into its own
translation unit all by itself, compile it with "-O0", and then link
it later with the rest of the program which was compiled with "-O2".
Of course different compilers have different command line switches for
optimisation. If we had a program that is compiled by a few different
compilers then we could do:
void Func(void) __verbose
{
#if defined(__BORLANDC__)
    // Put hack here for Borland ABI
#elif defined(__GNUC__)
    // Put hack here for GNU ABI
#elif defined(_MSC_VER)
    // Put hack here for Microsoft ABI
#endif
}
This would be a lot handier than having to edit the Makefile and to
write three different ways of isolating this function in its own
translation unit with all optimisations disabled.
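For comparison, here is a rough sketch of what can be done today
without touching the Makefile, assuming GCC's 'optimize' function
attribute, Clang's 'optnone' function attribute and MSVC's
'#pragma optimize'. NO_OPT is a made-up macro name, and Func's body is
only a placeholder. None of this is the proposed '__verbose', and
turning optimisation off does not give undefined behaviour a defined
meaning.

```cpp
// NO_OPT is a hypothetical macro, not part of any compiler or standard.
#if defined(__clang__)
#  define NO_OPT __attribute__((optnone))         // Clang: no optimisation for this function
#elif defined(__GNUC__)
#  define NO_OPT __attribute__((optimize("O0")))  // GCC: per-function optimisation level
#else
#  define NO_OPT                                  // MSVC is handled by the pragma below
#endif

#if defined(_MSC_VER) && !defined(__clang__)
#  pragma optimize("", off)  // MSVC: disable optimisation for the functions that follow
#endif

NO_OPT int Func(int x)
{
    // Placeholder body; a compiler-specific hack would go here.
    return x + 1;
}

#if defined(_MSC_VER) && !defined(__clang__)
#  pragma optimize("", on)   // MSVC: restore the command-line optimisation settings
#endif
```

Each compiler still needs its own spelling, which is the gap a single
standard marker would be meant to close.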
Received on 2023-08-27 21:20:46