Google AI Overviews = Theft? Court Ruling Sets Precedent

By Chris Barnhart. Last updated May 16, 2024

Google’s bold new vision for the future of online search, powered by AI technology, is fueling an industry-wide backlash over fears it could damage the internet’s open ecosystem.

At the center of the controversy are Google’s newly launched “AI Overviews,” which are generated summaries that aim to directly answer search queries by pulling in information from across the web.

AI Overviews appear prominently at the top of results pages, potentially reducing users’ need to click through to publishers’ websites.

The move sparked legal action in France, where publishers filed cases accusing Google of violating intellectual property rights by ingesting their content to train AI models without permission.

A group of French publishers won an early court battle in April 2024. A judge ordered Google to negotiate fair compensation for repurposing snippets of their content.

Publishers in the US are raising similar objections as Google’s new AI search overviews threaten to siphon traffic away from sources. They argue that Google unfairly profits from others’ content.

The debate highlights the need for updated frameworks governing the use of online data in the age of AI.

Concerns From Publishers

According to industry watchers, AI Overviews could affect millions of independent creators who depend on referral traffic from Google Search.

Frank Pine, executive editor at MediaNews Group, tells The Washington Post:

“If journalists did that to each other, we’d call that plagiarism.”

Pine’s company, which publishes the Denver Post and Boston Herald, is among those suing OpenAI for allegedly scraping copyrighted articles to train its language models.

Google’s revenue model has long been predicated on driving traffic to other websites and monetizing that flow through paid advertising channels.

AI Overviews threaten to upend that model by answering queries on the results page itself.

Kimber Matherne, who runs a food blog, is quoted in the Post article as saying:

“[Google’s] goal is to make it as easy as possible for people to find the information they want. But if you cut out the people who are the lifeblood of creating that information, then that’s a disservice to the world.”

According to the Post’s report, Raptive, an ad services firm, estimates the changes could result in $2 billion in lost revenue for online creators.

Raptive also estimates that some websites could lose two-thirds of their search traffic.

Raptive CEO Michael Sanchez tells The Post:

“What was already not a level playing field could tip its way to where the open internet starts to become in danger of surviving.”

Concerns From Industry Professionals

Google’s AI Overviews are also raising concerns among industry professionals, many of whom have criticized the move in tweets.

Matt Gibbs questioned how Google developed the knowledge base for its AI, bluntly stating, “They ripped it off publishers who did the actual work to create the knowledge. Google are a bunch of thieves.”

From the top of Google’s “Generative AI in Search” article today.

How did they develop that knowledge base?

They ripped it off publishers who did the actual work to create the knowledge.

Google are a bunch of thieves. pic.twitter.com/SIkPqtWZwa

— Matt Gibbs (@ematt) May 14, 2024

In her tweet, Kristine Schachinger echoed similar sentiments, referring to Google’s AI answers as “a complete digital theft engine which will prevent sites getting clicks at all.”

.@sundarpichai and @Google launch AI answers at #GoogleIO2024 otherwise known as a complete digital theft engine which will prevent sites getting clicks at all.

We need government to step in now and press to bring the sunshine.

This is ONE AI ANSWER.
Click into it. pic.twitter.com/5NNtKAURxC

— Kristine (@schachin on Threads) 🇺🇦 (@schachin) May 14, 2024

Gareth Boyd retweeted a quote from the Washington Post article highlighting the struggles of blogger Jake Boly, whose site recently saw a 96% drop in Google traffic.

Boyd said, “The precedent being set by OpenAI and Google is scary…” and that “more people should be equally angry” at both companies for the “open theft of content.”

The precedent being set by OpenAI and Google is scary… more people should be equally angry with OpenAI as well as Google for the open theft of content.

To be clear, I HATE regulation, but by the time AI is quite rightly regulated, it will be too late. https://t.co/KsbNUKopeV

— Gareth Boyd (@garethaboyd) May 15, 2024

In his tweet, Avram Piltch directly accused Google of theft, stating, “the data used to train their AI came from the very publishers that allowed Google to crawl them and are now going to be harmed. This is theft, plain and simple. And it’s a threat to the future of the web.”

You can say that Google doesn’t “owe” publishers anything, but the data used to train their AI came from the very publishers that allowed Google to crawl them and are now going to be harmed. This is theft, plain and simple. And it’s a threat to the future of the web. https://t.co/buDZgRaSuL

— Avram Piltch (@geekinchief) May 15, 2024

Lily Ray made a similar claim about Google: “Using all the content they took from the sites that made Google. With little to no attribution or traffic.”

Using all the content they took from the sites that made Google. With little to no attribution or traffic. https://t.co/0sNwk2ASmT

— Lily Ray 😏 (@lilyraynyc) May 14, 2024

Legal Gray Area

The controversy taps into broader debates around intellectual property and fair use, as AI systems are trained on data scraped from across the internet at an unprecedented scale.

Google argues its models only ingest publicly available web data and that publishers previously benefited from search traffic.

Under this view, publishers implicitly consent to their content being indexed by search engines unless they opt out.

However, laws weren’t conceived with training AI models in mind.
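For readers wondering what opting out looks like in practice, the usual mechanism is a site's robots.txt file. The sketch below is illustrative only: the robots.txt contents, the example URL, and the helper name `is_allowed` are hypothetical and do not come from the article. `Google-Extended` and `GPTBot` are crawler tokens published by Google and OpenAI respectively. As far as we understand Google's documentation, blocking `Google-Extended` concerns AI model training and does not by itself remove a page from Search features, so treat this as a demonstration of how rules are evaluated rather than a recipe for avoiding AI Overviews.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: block the AI-training token,
# allow everything else (including ordinary search indexing).
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/recipes/gumbo"  # placeholder URL
    for agent in ("Googlebot", "Google-Extended", "GPTBot"):
        print(f"{agent}: {is_allowed(ROBOTS_TXT, agent, url)}")
```

With these rules, only `Google-Extended` is refused. `GPTBot` falls through to the wildcard entry and stays allowed, which shows that an opt-out has to name each crawler a publisher wants to exclude.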

What’s The Path Forward?

This debate highlights the need for new rules around how AI uses online data.

The way forward is unclear, but the stakes are high.

Some suggest revenue sharing or licensing fees when publisher content is used to train AI models. Others propose an opt-in system that gives website owners more control over how their content is used for AI training.

The French rulings suggest that courts may step in where explicit guidelines and good-faith negotiations are absent.

The web has always relied on a balance between search engines and content creators. If that balance is disrupted without new safeguards, it could undermine the exchange of information that makes the internet so valuable.

Featured Image: Veroniksha/Shutterstock
