std-proposals

Re: [std-proposals] Dedicated website with AI that has processed all papers

From: Jonathan Wakely <cxx_at_[hidden]>
Date: Wed, 28 May 2025 13:38:51 +0100
On Wed, 28 May 2025 at 13:32, Jonathan Wakely <cxx_at_[hidden]> wrote:
>
> On Wed, 28 May 2025 at 10:18, Frederick Virchanza Gotham via
> Std-Proposals <std-proposals_at_[hidden]> wrote:
> >
> > On Wed, May 28, 2025 at 12:33 AM Oliver Hunt wrote:
> > >
> > > Well now you have one person saying that they do not
> > > give you permission to use their work.
> >
> >
> > That's three now: Jonathan, Oliver, René. I'm keeping a list here.
> >
> >
> > > Even if we were to try to assume goodwill on your part, when someone
> > > said you did not have their consent to steal their work, you said “I don’t
> > > care, I’m going to do it anyway”. That you think that is a reasonable, or
> > > even remotely ethical, behavior is absurd. Step 1 of being in any community
> > > is demonstrating some amount of respect for others in the community, and
> > > you haven’t demonstrated even the slightest semblance of that.
> >
> >
> > This is clearly an emotive subject for some.
> >
> > If I told you that I was using 'grep' at the command line to go
> > through papers, I don't think that you'd complain (I hope not). Now
> > this is where my mindset diverges from the mindsets of a few people
> > here: I don't think that using 'grep' at the command line is much
> > different from using an offline (i.e. no internet connection) large
> > language model to do retrieval-augmented generation to search for
> > papers. The contents of the paper, and also the data derived from it,
> > are erased from the AI's memory as soon as the search ends -- nothing
> > is stored persistently anywhere.
>
> If you're not training the LLM on the papers and producing derivative
> works from those papers, then copyright isn't relevant.

Although I think you can probably see why people are concerned when
the thread topic says "AI that has processed all papers".

That certainly sounds like an LLM that has been trained on all the
papers, not one that retains no knowledge of them between queries. If
that's not what you're talking about now, then it's a very misleading
title.

Received on 2025-05-28 12:39:17