Date: Wed, 3 Dec 2025 20:06:54 +0800 (GMT+08:00)
We are discussing my proposal for improving the C++ STL linked lists. Please join that discussion as soon as possible. Thanks.
方晶晶
20090187_at_[hidden]
Original:
From: std-proposals <std-proposals_at_[hidden]>
Date: 2025-12-03 19:57:16 (China, GMT+08:00)
To: std-proposals <std-proposals_at_[hidden]>
Cc: Kamalesh Lakkampally <founder_at_[hidden]>
Subject: [std-proposals] Core-Language Extension for Fetch-Only Instruction Semantics
Hello all,
My name is Kamalesh Lakkampally, founder of an EDA startup. I am submitting a draft proposal (PnnnnR0) for consideration by the Evolution Working Group (EWG).
The goal is to support workloads where the execution order of micro-tasks changes dynamically and unpredictably every cycle, such as event-driven HDL/SystemVerilog simulation.
In such environments, conventional C++ mechanisms (threads, coroutines, futures, indirect calls, executors) incur significant pipeline redirection penalties. Fetch-only instructions aim to address this problem in a structured, language-visible way.
II. Introduction
This paper proposes a core-language extension introducing fetch-only instructions: metadata-carrying constructs handled entirely within the instruction-fetch stage, using a dedicated path that never enters the normal execution pipeline.
The design targets workloads with highly dynamic, micro-task–driven execution patterns, such as SystemVerilog simulation, where pipeline redirection penalties dominate performance.
III. Scenario Description
In event-driven engines such as HDL simulators, the execution order of micro-tasks changes dynamically based on events.
Example Queue State 1:
Arr[0] = T1
Arr[1] = T2
Arr[2] = T3
Execution order: T1 → T2 → T3
Queue State 2 (after events):
Arr[0] = T3
Arr[1] = T4
Arr[2] = T1
Arr[3] = T2
Execution order: T3 → T4 → T1 → T2
These reorderings may occur very frequently. Existing hardware and language mechanisms must execute branches, indirect jumps, or scheduler calls—each causing pipeline flushes, mispredictions, and stalls.
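For comparison, the conventional software form of this scenario can be sketched as a table of function pointers walked by a dispatch loop. This is only an illustrative baseline, not code from the proposal: the `Task` alias, the `T1` to `T4` bodies, and `run_queue` are hypothetical names chosen to mirror the queue states above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical micro-task signature: each task appends its name to a trace.
using Task = void (*)(std::string& trace);

inline void T1(std::string& trace) { trace += "T1 "; }
inline void T2(std::string& trace) { trace += "T2 "; }
inline void T3(std::string& trace) { trace += "T3 "; }
inline void T4(std::string& trace) { trace += "T4 "; }

// Conventional dispatch: one indirect call per queue entry, in array order.
inline std::string run_queue(const std::vector<Task>& queue) {
    std::string trace;
    for (Task task : queue) {
        assert(task != nullptr);  // every slot must hold a valid task
        task(trace);              // indirect call; the target depends on the current queue contents
    }
    return trace;
}
```

Calling `run_queue({T1, T2, T3})` and then `run_queue({T3, T4, T1, T2})` walks the two queue states in the orders listed above; each `task(trace)` is one of the indirect calls that this section describes as costly.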
IV. Limitations of Existing C++ Features
Modern C++ provides threads, executors, coroutines, futures, task tables, and function-pointer invocation. All share a limitation: control-flow changes occur in the execution pipeline.
Impacts:
- Indirect calls cause branch mispredictions.
- Dynamic reordering causes repeated pipeline flushes.
- Coroutines incur state-machine execution overhead.
- std::thread/std::async rely on OS scheduling, unsuitable for micro-task switching.
- Executors/work-stealing operate in software and still rely on pipeline-based jumps.
For workloads with thousands of per-cycle task transitions, these penalties accumulate into a dominant bottleneck.
V. Proposed Solution: Fetch-Only Instructions
We propose a C++ core-language abstraction mapping to architectural fetch-only instructions.
Characteristics:
- Processed entirely in the fetch stage.
- Never decoded or executed.
- Carry: instruction_address, thread_context, execution_context.
- Redirect the fetch PC without pipeline involvement.
- Avoid many branch-misprediction and call/jump penalties.
Provisional C++ syntax:
fad q[i] = address_of(task);
fcd q[i] = thread_context;
fed q[i] = exec_context;
The fetch subsystem uses these metadata entries to follow the dynamic queue order with minimal redirection cost.
For example, updating q[] enables hardware to execute:
T3 → T4 → T1 → T2
without pipeline flushes.
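Because `fad`, `fcd`, and `fed` are not valid C++ today, the intended bookkeeping can only be modelled in ordinary code. The following is a rough software model under stated assumptions, not the proposed hardware mechanism: `FetchEntry`, `set_entry`, and `fetch_order` are hypothetical names, and the 8-bit context width is taken from the Fetch-Only Region description in section VI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software model of one queue slot q[i].
struct FetchEntry {
    std::uintptr_t instruction_address = 0;  // what `fad q[i] = ...` would set
    std::uint8_t thread_context = 0;         // what `fcd q[i] = ...` would set
    std::uint8_t exec_context = 0;           // what `fed q[i] = ...` would set
};

// Models the three provisional statements for slot i as one ordinary function.
inline void set_entry(std::vector<FetchEntry>& q, std::size_t i,
                      std::uintptr_t address,
                      std::uint8_t thread_ctx, std::uint8_t exec_ctx) {
    assert(i < q.size());  // the slot must already exist
    q[i].instruction_address = address;
    q[i].thread_context = thread_ctx;
    q[i].exec_context = exec_ctx;
}

// Returns the addresses in the order a unit following q front to back would visit them.
inline std::vector<std::uintptr_t> fetch_order(const std::vector<FetchEntry>& q) {
    std::vector<std::uintptr_t> order;
    order.reserve(q.size());
    for (const FetchEntry& entry : q) {
        order.push_back(entry.instruction_address);
    }
    return order;
}
```

This model captures only the data carried per entry and the order in which entries are followed; it says nothing about pipeline behavior, which is the part the proposal assigns to hardware.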
VI. Required Changes in C++ and OS
A. C++ Core Language
Introduce a new semantic category of instructions (fetch-only) with defined behavior:
- Visible to the abstract machine as fetch-control constructs.
- No execution-stage participation.
- No side effects beyond fetch redirection.
B. Abstract Machine Model
Extend program order to include fetch-stage–only redirections.
Clarify that these operations do not constitute observable side effects.
C. Fetch-Only Memory Region (OS-Level Support)
A new region in the virtual address space, analogous to:
- Stack (for call frames)
- Heap (for dynamic objects)
Fetch-Only Region properties:
- Holds fetch-only metadata (instruction addresses + 8-bit thread/execution context).
- MMU-enforced write validation.
- Stricter rules than normal memory.
- Prevents unauthorized metadata forging.
D. MMU / Validation
MMU enforces:
- Whether context bits may be updated.
- Whether address updates are permitted.
- Masking/rejecting invalid writes.
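One possible reading of these three rules, expressed as a software sketch: a write to an entry is filtered by permission bits, permitted fields are updated, and forbidden fields are left unchanged. The proposal does not define an encoding, so `FetchOnlyEntry`, `kMayUpdateAddress`, `kMayUpdateContext`, and `validated_write` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical entry layout: an address plus two 8-bit context fields.
struct FetchOnlyEntry {
    std::uintptr_t instruction_address;
    std::uint8_t thread_context;
    std::uint8_t exec_context;
};

// Hypothetical permission bits; the proposal does not specify an encoding.
constexpr unsigned kMayUpdateAddress = 1u << 0;
constexpr unsigned kMayUpdateContext = 1u << 1;

// Applies only the permitted parts of `requested` to `entry` (masking).
// Returns true if every requested change was applied, false if any part was rejected.
inline bool validated_write(FetchOnlyEntry& entry, const FetchOnlyEntry& requested,
                            unsigned permissions) {
    // Only the two known permission bits may be set.
    assert((permissions & ~(kMayUpdateAddress | kMayUpdateContext)) == 0);
    bool fully_applied = true;
    if (requested.instruction_address != entry.instruction_address) {
        if (permissions & kMayUpdateAddress) {
            entry.instruction_address = requested.instruction_address;
        } else {
            fully_applied = false;  // address update not permitted
        }
    }
    const bool context_changed = requested.thread_context != entry.thread_context ||
                                 requested.exec_context != entry.exec_context;
    if (context_changed) {
        if (permissions & kMayUpdateContext) {
            entry.thread_context = requested.thread_context;
            entry.exec_context = requested.exec_context;
        } else {
            fully_applied = false;  // context update not permitted
        }
    }
    return fully_applied;
}
```

Whether a partially permitted write should be masked, as here, or rejected as a whole is a design choice the text leaves open.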
E. Optional Function Prologue Context Check
Functions may embed an 8-bit comparison to ensure valid entry conditions.
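As a hedged illustration of such a check, a function could compare the current 8-bit context against the value it expects before doing any work. The names `kExpectedContext`, `context_matches`, and `guarded_task`, the value 0x2A, and the return codes are hypothetical; the proposal does not say how the current context would be obtained or what happens on a mismatch.

```cpp
#include <cstdint>

// Hypothetical context value this function expects on entry.
constexpr std::uint8_t kExpectedContext = 0x2A;

// The 8-bit comparison itself.
constexpr bool context_matches(std::uint8_t current, std::uint8_t expected) {
    return current == expected;
}

// Models a function whose prologue refuses entry under the wrong context.
// Returns 0 when entry is allowed and -1 when the check fails (assumed convention).
constexpr int guarded_task(std::uint8_t current_context) {
    return context_matches(current_context, kExpectedContext) ? 0 : -1;
}
```

Here the current context is passed in as a parameter purely so the sketch can be exercised in standard C++.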
These changes enable fetch-only instructions as a safe and efficient core-language construct.
I would greatly appreciate feedback, criticism, and suggestions from the community.
I am also open to collaboration.
This is an early-stage concept, and I welcome any guidance on improving the design, refining the semantics, or adapting the idea to better align with the C++ abstract machine and WG21 process.
Thank you for your time, and I look forward to your comments.
Best Regards,
Kamalesh Lakkampally,
Founder & CEO
www.chipnadi.com
Received on 2025-12-03 12:07:00
