ISOCPP std-proposals List: Re: [std-proposals] Strategic Direction for AI in C++: Governance, and Ecosystem

From: Adrian Johnston <ajohnston4536_at_[hidden]>
Date: Tue, 2 Jun 2026 14:33:09 -0700

Elon Musk has made public statements about the infeasibility of using AI to
train AI.

All I know is Anthropic is about to IPO at a trillion dollars while
shipping a product that I have to spend half my time teaching to be a
senior engineer.

On Tue, Jun 2, 2026 at 2:22 PM Sebastian Wittmeier via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> What is missing is the approach of preprocessing training data with AI,
> either to filter out old idioms or to convert them into modern code.
>
> I think that is stronger than fine-tuning or custom instructions after
> training on old code.
>
> It depends on the quality (and feasibility) of this automatic
> preprocessing.
>
> (Some may find this worse: an AI trained not on human-written code, but on
> AI-generated or at least AI-refactored code.)
>
> -----Original Message-----
> *From:* Adrian Johnston via Std-Proposals <std-proposals_at_[hidden]>
> *Sent:* Tue 02.06.2026 22:44
> *Subject:* [std-proposals] Strategic Direction for AI in C++: Governance,
> and Ecosystem
> *To:* C++ Proposals <std-proposals_at_[hidden]>;
> *CC:* Adrian Johnston <ajohnston4536_at_[hidden]>;
>
> Recently (2026-02-23) the ISO C++ Directions Group (DG) / WG21 published a
> document:
>
> Strategic Direction for AI in C++: Governance, and Ecosystem
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2026/p4023r0.pdf
>
> As one of its findings it identified a problem with "Garbage In, Garbage
> Out".
>
>
> *The DG sees or recognizes a critical "Garbage In, Garbage Out" problem
> facing C++ developers using AI. Current models are trained on legacy C++
> (C++98/03), vendor-specific dialects, and unsafe patterns found online.*
>
>
> I'd say this is an understatement.
>
> What I am observing is that high-quality websites like
> https://en.cppreference.com/ are blocking AI search tools because those
> tools don't generate advertising revenue. And so my AI (Claude) routinely
> ends up searching for online posts made by people who are confused and
> asking for help and getting terse responses that may be incomplete at best.
>
> Next, if I ask Claude what data it was given about the C++ standard, it
> says it was trained on "commentary, documentation, and discussion during
> training — not verbatim text." It can identify final drafts like N4950 as
> being available, but for some reason it needs to be explicitly encouraged
> to consult that document.
>
> In general, the AI companies are being very careful to avoid being seen to
> use copyrighted data like the C++ standard.
>
> If we want AI-generated responses and AI-generated code to be as modern
> and correct as possible, I think it would make sense to release the
> copyright to the AI companies for use in training. And then insist they use
> that information as purveyors of programming tools.
>
> If it is well known that there is no barrier to training an AI correctly
> on the most recent C++ standard, and that users should expect verbatim
> information and standards-aware code from their AI, then I would hope for
> some improvement on the current situation. It is very easy to add RLHF
> training data if the AI company is allowed to use the standard to create it.
>
> Oddly enough, Claude is capable of providing more modern code when
> requested. In general, I find AI has a serious issue where (for no reason)
> it assumes your software may be 10 years out of date, unless told otherwise.
>
> Regards,
> Adrian Johnston
>
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2026-06-02 21:33:22