Google introduces Google-Extended to let you block Bard, Vertex AI via robots.txt


Google today announced Google-Extended, a new robots.txt control that lets you decide whether Bard and Vertex AI can use the content on your site.

This appears to be the outcome of a “public discussion” Google initiated in July, when the company promised to gather “voices from across web publishers, civil society, academia and more fields” to talk about choice and control over web content.

The announcement. In a blog post, Google said:

“Today we’re announcing Google-Extended, a new control that web publishers can use to manage whether their sites help improve Bard and Vertex AI generative APIs, including future generations of models that power those products. By using Google-Extended to control access to content on a site, a website administrator can choose whether to help these AI models become more accurate and capable over time.”

– Google’s Danielle Romain, VP, Trust / An update on web publisher controls

Google-Extended. The new user agent token has been added to the Google Search Central documentation on web crawlers.

What Google is saying. The company said Google-Extended gives publishers “choice and control”:

  • “Making simple and scalable controls, like Google-Extended, available through robots.txt is an important step in providing transparency and control that we believe all providers of AI models should make available. However, as AI applications expand, web publishers will face the increasing complexity of managing different uses at scale.”

Robots.txt. You can use robots.txt to block Google-Extended from accessing your content, or parts of it. To fully block Google-Extended, add the following to your site’s robots.txt:

User-agent: Google-Extended
Disallow: /
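
Checking the rules. If you want to sanity-check a robots.txt change before publishing it, a small script can help. The sketch below uses Python's standard urllib.robotparser module to test the full-block rules above, plus a partial block. The /premium/ path, the example.com URLs and the is_allowed helper are illustrative assumptions, not anything from Google's announcement, and Python's parser only approximates how Google itself reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The full-block rules quoted in the article.
FULL_BLOCK = """\
User-agent: Google-Extended
Disallow: /
"""

# Hypothetical partial block: /premium/ is an illustrative path,
# not something taken from Google's announcement.
PARTIAL_BLOCK = """\
User-agent: Google-Extended
Disallow: /premium/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url (per Python's parser)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/premium/report.html"
    for name, rules in (("full", FULL_BLOCK), ("partial", PARTIAL_BLOCK)):
        for agent in ("Google-Extended", "Googlebot"):
            print(name, agent, is_allowed(rules, agent, page))
```

Neither rule set names Googlebot, so in this sketch it remains allowed everywhere: the rules target only the Google-Extended token.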

Why we care. Of the 1,000 most popular websites, 242 have already decided to block GPTBot, OpenAI’s web crawler, since it launched in August. Now you can decide whether your website should opt out of helping Google improve its AI products.


