The Tony Blair Institute's full report is out, detailing how they think copyright should be "rebooted" in the AI age.
It is - and I don't say this lightly - terrible.
It reads like a cross between a big tech lobbying document and a funding proposal for a new academic centre no one wants (it literally includes such a proposal).
I've pulled out some quotes from the report and added my thoughts below. I wish I had time to write a more in-depth rebuttal. But this should give you an idea of its contents. I encourage you to read the whole thing yourself.
⬇️📄⬇️
“The UK government has proposed a text and data mining exception with the possibility for rights holders to opt-out. This … [would give] rights holders increased control of how their data are used.”
This is not true. Existing UK copyright law gives rights holders full control over how their works are used by UK AI companies. Moving to an opt-out model would inevitably mean that many rights holders would miss the chance to opt out or fail to opt out all their works. It could *only* reduce the amount of control rights holders have.
“A lack of clarity harms all stakeholders. This includes creators, who are not properly remunerated for their labour”.
There is no lack of clarity over the current law, and creators go unremunerated for their labour only if British AI companies break the law.
“Today, the application of UK copyright law to the training of AI models remains contested.”
This is not true. I have seen zero arguments that commercial gen AI training on copyrighted work without a licence is legal in the UK. Even AI companies understand it is currently illegal.
“Currently, United Kingdom copyright law provides insufficient clarity for creators, rights holders, developers and consumer groups”.
Same point, still not true.
“The free flow of information has been a key principle of the open web since its inception.”
Putting your content online does not entitle anyone to use it for whatever they like, for free.
“To argue that commercial AI models cannot learn from open content on the web would be close to arguing that knowledge workers cannot profit from insights they get when reading the same content.”
This is a classic big tech talking point, but it is incredibly misleading. That humans will learn from your work is assumed when you create; we do *not* assume commercial gen AI models will be trained on it. Moreover, commercial gen AI models scale in ways no individual human can. Commercial gen AI creates hyper-scalable competitors to creators by exploiting their work.
“Generative AI is here to stay. Already, music generator Beatoven.ai has built a fully licensed generative music model, and KL3M has produced a “fairly trained” large language model (LLM).”
Yes - both these companies are certified under the Fairly Trained certification scheme I run. Both vehemently object to unlicensed training; both prove that it is possible to build gen AI models without stealing people’s work. These are terrible examples if the authors are trying to argue that we must legalise IP theft. They show the opposite - that it is possible to build gen AI without the theft.
“Photography and sampling are key examples of technologies that sparked debates about creative ownership but ultimately led to artistic renewal rather than extinction.”
Yes, but cameras are not built by exploiting the work of the world’s creators, and music samples have to be licensed. These are not in any way comparable to building gen AI based on IP theft.
“But, in general, copyright controls copying; it does not control other ways in which those engaging with the material might use its intellectual content.”
Training AI models involves copying. It is hugely misleading to suggest otherwise.
“Developers are not set to make long-term profits from publicly trained data [so AI companies shouldn’t have to pay for training data].”
This is an incredibly surprising argument. Why would VCs be investing so many billions of dollars into this space if there were not long-term profits to be made?
“It is hotly debated whether model weights should be thought of as a copy of the training data.”
This is intentionally misleading. Many of the lawsuits simply claim that training involves copying, which is not disputed in the slightest.
(As an aside, it is very odd that they quote the robots.txt page from my personal site, when I have publicly pointed out that this is Squarespace’s default robots.txt page which cannot be changed. I would hope the people writing this report understood robots.txt well enough to be aware of that.)
“Generative AI may never be good enough to be a substitute for all human activities for which people get paid.”
This is a straw man. The competitive effects of gen AI on the works it’s trained on, and the people behind those works, can be huge, without AI being a substitute for “all human activities for which people get paid”.
“Generative AI will continue to improve but its greatest value is likely to be in augmenting existing workflows, as demonstrated by platforms such as Invoke AI.”
This *totally* ignores the mounting evidence that generative AI competes with its training data. Every study so far reinforces this common-sense hypothesis; one, for example, found that the introduction of ChatGPT decreased demand for freelance writing tasks by 30%.
THEIR PROPOSED SOLUTIONS
“The UK government should establish the Centre for AI and the Creative Industries.”
I have no particular issue with the idea of a new academic centre. But it is totally nonsensical to propose this as some kind of solution to the AI / copyright problem. It is orthogonal to the question at hand. Why would creators give up strong copyright protections, which they desperately want, for a new academic institute, which no one is asking for? This is an absolutely insane proposition, and I am genuinely shocked it made it into the report. (It’s also worth noting that an early draft of the report I saw suggested that an eye-popping £50M a year would be a good price tag for this centre.)
They suggest “a targeted levy on ISPs” to remunerate creators for their work being used.
To be clear, this is a tax on consumers (their words) - “pennies per month for a regular household”. Charging consumers for training data, rather than the AI companies that exploit it commercially, is perhaps the most brazen big-tech talking point of the entire report. And, even more astonishingly, they argue that “the priority of this revenue would be funding the Centre for AI and the Creative Industries.” They want to let AI companies pay nothing for training data; instead charge the general public; and not even give the new tax revenue to creators, but instead give it to an unwanted academic centre.
--
This report reads like it was written by big tech, capped off by a truly astonishing set of proposals that would be punitive to both creators and the general public - everyone, that is, except the AI companies who want to exploit creators' life's work for free, to build AI models that compete with them.
It's perhaps not surprising that the Tony Blair Institute would align so closely with big tech interests. We can only hope the government doesn't give this report the same weight as the protests of the country's creators.
Full report in the next tweet.