Will an American lawsuit save Canadian journalism?
A link tax could be nothing compared to LLM training theft, but Canada’s Liberal government is asleep at the wheel.
Late yesterday, the New York Times threw down the gauntlet to artificial intelligence company OpenAI and its investment partner Microsoft in a comprehensive lawsuit alleging extensive copyright infringement.
The core argument in this lawsuit could also have vast consequences and financial implications for Canadian journalism and print media, even though Canadian industry and its federal government are asleep at the wheel. Here’s why.
The upshot of the Times’ argument is that OpenAI infringed copyright because it allegedly used New York Times-owned content extensively, and without authorization, to train its large language model artificial intelligence (LLM AI), ChatGPT. And now, because of this, ChatGPT - which is being used by millions of people - is producing responses for its users which plagiarize New York Times content without any reference (unlike a search engine which links a user back to the source content).
The lawsuit also argues that this plagiarized output generates a significant amount of income for OpenAI without any link back to the New York Times or compensation for its work, and that the Times should therefore be entitled to significant damages.
(An excellent overview of the lawsuit and how egregious the alleged copyright infringement is can be found here, and the full text of the complaint can be found here.)
Given that OpenAI is presently valued somewhere in the $100B neighbourhood, that Microsoft is laughing all the way to the bank due to its partnership with OpenAI, that OpenAI appears to have heavily relied upon New York Times-owned content to build and operate ChatGPT, and that OpenAI may have pooh-poohed demands from the New York Times to address the issue, the lawsuit could result in either a massive financial settlement or a massive landmark ruling with considerable damages being awarded.
But despite all this, and even though data theft in LLM training has become a mainstream political issue in the United States and Europe, it’s been crickets from the Canadian government and Canadian print media.
Which raises the question: why?
The answer is pretty simple. Both groups seem to have been so myopically focused on squeezing social media platforms and search engines like Facebook and Google, which already drove traffic to news media sites by linking back to their source content, via the tremendously shitty and flawed bill C-18 that they seem to have ignored the fact that A MASSIVE AMERICAN CORPORATION MAY BE FLAT OUT STEALING THEIR CONTENT AND MAKING BILLIONS WHILE DOING IT.
This posture is a big mistake for several reasons.
First, taking on a well-funded company like OpenAI and its deep-pocketed investors is a mountain to climb for any company, even the New York Times, one of the few examples of a print media company making a financially successful transition to digital media. It becomes even more challenging in a technologically nascent area like LLM AI, where little legal precedent exists regarding how existing laws apply to this new paradigm. That means it would be virtually impossible for small, nearly bankrupt Canadian print media outlets to legally challenge these players without a coordinated effort. And now that Canadian print media is becoming almost solely funded by the government, its ability to independently challenge issues like the one presented by ChatGPT is even more muted.
Second, how Canadian copyright laws compare to those in the United States and how they interact via trade agreements may create even more legal uncertainty and loopholes for Canadian creators who have unwittingly had their works used to train and operate LLM AIs. If there ever were a legitimate reason for the federal government to step in and provide clarity to preempt endless messy and expensive court cases while newspapers go bankrupt, this would be it.
However, because the government has done virtually no analysis of how provisions in Canada’s most recent trade agreement with the United States may affect the ownership of Canadian content in light of LLM AIs like ChatGPT, Canada may end up being a rule-taker, not a rule-maker, if laws or legal rulings are made in the United States before Canada can sort out its legal position.
Instead, Canadian print media spent the last critical 13 months since ChatGPT was released filing questionable Competition Bureau complaints against organizations that already provide free traffic to their sites, and the Canadian Liberal government wasted time and money on ramming through bill C-18 (which they ended up walking back anyway). Moreover, there has been little public analysis done in Canada on the extent to which companies like OpenAI have used Canadian content - and not just news content - to train and operate their LLMs, which means it’s difficult for legislators to assess the scope of the problem.
It takes considerable effort and resources to produce quality journalistic content. No one is denying that. But some of the owners of Canadian print media and their comrades in the Canadian federal Liberal party have been so focused on saving an obsolete business model for journalism that they appear to have missed the threat that LLM AIs pose to the future of journalism in the years ahead.
In that, Canada owes a debt of gratitude to the New York Times for pressing forward with yesterday’s lawsuit and taking on OpenAI.
Godspeed.