Date: Thu, 07 Aug 2025 13:04:50 -0700
On Thursday, 7 August 2025 06:43:30 Pacific Daylight Time Hans Åberg via Std-
Proposals wrote:
> Intel Coffee Lake 2.4–4.1 GHz
>             div      divq
> clang O3    28 ns    25 ns
That's a 10-year-old architecture. The latency for this instruction is up to
73 cycles on CFL: https://uops.info/html-instr/DIV_R64.html#CFL
25 ns is 73 cycles at 2.92 GHz, 60 cycles at 2.4 GHz, or anything in between,
depending on the clock the core was actually running at.
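
To make that arithmetic explicit, here is a minimal sketch of the conversion.
The function names are made up for illustration; the only fact used is that
1 GHz is one cycle per nanosecond.

```cpp
// Hypothetical helpers, not from any library.
// 1 GHz == 1 cycle per nanosecond, so cycles = time_in_ns * frequency_in_GHz.
constexpr double cycles_for(double nanoseconds, double ghz)
{
    return nanoseconds * ghz;
}

// Inverse: the clock frequency at which `cycles` cycles take `nanoseconds`.
constexpr double ghz_for(double nanoseconds, double cycles)
{
    return cycles / nanoseconds;
}
```

cycles_for(25.0, 2.4) comes out at about 60 and ghz_for(25.0, 73.0) at about
2.92, which are the two endpoints quoted above.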
If you're going to micro-benchmark, I suggest measuring something that doesn't
change with frequency, like the cycle count or the number of instructions
retired (or both).
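
As a rough sketch of what I mean, assuming Linux and that the kernel lets an
unprivileged process open hardware counters, something like the following
reads both numbers through perf_event_open(2). The names Counts, open_counter,
count_events and divide_chain are invented for this example, and it returns an
empty optional when the counters are unavailable (for instance in a VM or with
a restrictive perf_event_paranoid setting).

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>
#include <optional>

struct Counts {
    std::uint64_t cycles;
    std::uint64_t instructions;
};

// Open one user-space-only hardware counter for the calling thread.
// Returns a file descriptor, or -1 if the kernel refuses.
inline int open_counter(std::uint64_t config)
{
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = config;
    attr.disabled = 1;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    // pid = 0 (this thread), cpu = -1 (any), group_fd = -1, flags = 0
    return static_cast<int>(syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL));
}

// Run work() and report core cycles and instructions retired in user space.
template <typename Work>
std::optional<Counts> count_events(Work &&work)
{
    const int cyc = open_counter(PERF_COUNT_HW_CPU_CYCLES);
    const int ins = open_counter(PERF_COUNT_HW_INSTRUCTIONS);
    std::optional<Counts> result;
    if (cyc >= 0 && ins >= 0) {
        ioctl(cyc, PERF_EVENT_IOC_RESET, 0);
        ioctl(ins, PERF_EVENT_IOC_RESET, 0);
        ioctl(cyc, PERF_EVENT_IOC_ENABLE, 0);
        ioctl(ins, PERF_EVENT_IOC_ENABLE, 0);
        work();
        ioctl(ins, PERF_EVENT_IOC_DISABLE, 0);
        ioctl(cyc, PERF_EVENT_IOC_DISABLE, 0);

        Counts c{};
        const auto want = static_cast<ssize_t>(sizeof(std::uint64_t));
        if (read(cyc, &c.cycles, sizeof c.cycles) == want &&
            read(ins, &c.instructions, sizeof c.instructions) == want)
            result = c;
    }
    if (cyc >= 0) close(cyc);
    if (ins >= 0) close(ins);
    return result;
}

// Dependent chain of n 64-bit divisions. The volatile divisor keeps the
// compiler from replacing the division with a multiply by a constant.
inline std::uint64_t divide_chain(std::uint64_t x, int n)
{
    volatile std::uint64_t divisor = 3;
    for (int i = 0; i < n; ++i)
        x = x / divisor + 0x9E3779B97F4A7C15ULL;
    return x;
}
```

Calling count_events with a lambda that runs divide_chain for a large n and
dividing the cycle count by n gives a per-iteration figure that does not move
with turbo or power-saving states. The loop overhead and the volatile load are
included in that figure, so treat it as an upper bound on the division itself.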
Sunny Cove improves that to 18 cycles:
https://uops.info/html-instr/DIV_R64.html#ICL
Similar on AMD Zen 3 and 4:
https://uops.info/html-instr/DIV_R64.html#ZEN4
-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel Platform & System Engineering
Received on 2025-08-07 20:04:52