> I think it would make sense to release the copyright to the AI companies to use in training. And then insist they used that information as purveyors of programming tools.

1. What does that mean? ISO owns the copyright, period. ISO grants me permission to consult the standard document and use its information to create something (under the terms of the license it grants), but I don’t own “a copyright”.

2. Copyright is not what is stopping AI companies. It has never stopped them before. I have asked commercial versions of such chatbots, and even though they completely made up the version of the reference document, they were able to correctly cite new C++26 features as well as the sections in the latest working draft where those features are described. Meaning, they have already been trained on them.
We are talking past tense here. It already happened.

> The DG sees or recognizes a critical "Garbage In, Garbage Out" problem facing C++ developers using AI. Current models are trained on legacy C++ (C++98/03), vendor-specific dialects, and unsafe patterns found online.[…]

> In general, I find AI has a serious issue where (for no reason) it assumes your software may be 10 years out of date

What else would you expect to have happened?

The AI can’t just read a standard document and spit out perfect modern code, hoping it understands what is modern and why. The only way it can learn is by looking at real code written by real people, and by having them explain why we do things the new way and why we don’t do things the old way. And when something is new, that code and those explanations don’t exist yet.

And I have no interest in helping streamline something so that a multibillion-dollar company, which couldn’t possibly operate without the work that I do, can just take my work for free, then use it to convince my boss to pay them instead of me, while leaving poor people in rural areas homeless and sick.

Why are we doing this?

From: Std-Proposals <std-proposals-bounces@lists.isocpp.org> On Behalf Of Adrian Johnston via Std-Proposals
Sent: Tuesday, June 2, 2026 22:44
To: C++ Proposals <std-proposals@lists.isocpp.org>
Cc: Adrian Johnston <ajohnston4536@gmail.com>
Subject: [std-proposals] Strategic Direction for AI in C++: Governance, and Ecosystem

Recently (2026-02-23) the ISO C++ Directions Group (DG) / WG21 published a document:

Strategic Direction for AI in C++: Governance, and Ecosystem

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2026/p4023r0.pdf

As one of its findings it identified a problem with "Garbage In, Garbage Out".

The DG sees or recognizes a critical "Garbage In, Garbage Out" problem facing C++ developers using AI. Current models are trained on legacy C++ (C++98/03), vendor-specific dialects, and unsafe patterns found online.

I'd say this is an understatement.

What I am observing is that high quality websites like https://en.cppreference.com/ are blocking AI search tools because they don't generate advertising revenue. And so my AI (Claude) routinely ends up searching for online posts made by people who are confused and asking for help and getting terse responses that may be incomplete at best.

Next, if I ask Claude what data it was given about the C++ standard, it says it was trained on "commentary, documentation, and discussion during training — not verbatim text." It can identify final drafts like N4950 as being available, but for some reason it needs to be explicitly encouraged to consult that document.

In general, the AI companies are being very careful to avoid being seen to use copyrighted data like the C++ standard.

If we want AI generated responses and AI generated code to be as modern and correct as possible, I think it would make sense to release the copyright to the AI companies to use in training. And then insist they used that information as purveyors of programming tools.

If it is well known that there is no barrier to training an AI correctly on the most recent C++ standard and that users should expect verbatim information, and standards aware code from their AI, then I would hope for some improvement on the current situation. It is very easy to add RLHF training data if the AI company is allowed to use the standard to create it.

Oddly enough, Claude is capable of providing more modern code when requested. In general, I find AI has a serious issue where (for no reason) it assumes your software may be 10 years out of date, unless told otherwise.

Regards,

Adrian Johnston