It is currently possible to use robots.txt to disallow Large Language Model crawlers via user-agent strings:
User-agent: GPTBot
Disallow: /
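(For reference, there seem to be several such crawlers now, for example OpenAI's GPTBot, Common Crawl's CCBot, and Google's Google-Extended token. The list keeps changing, so take this as an illustrative sketch rather than a complete block list:)

```
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```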
But this approach is very broad: it works for site administrators, but it wouldn't let users of a CMS, for example, opt out on a per-account basis.
I'm trying to understand whether the robots meta tag can be used for a more granular, per-page permission, for example:
<meta name="robots" content="noindex">
Also, do LLM crawlers even honor noindex as an opt-out, or is there a newer content keyword to use instead, for example noteach or nolearn?
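To make the per-account idea concrete, here's a rough sketch of what I imagine the CMS would have to do (Python, with made-up names; it assumes noindex is actually honored by these crawlers, which is exactly the part I'm unsure about):

```python
# Hypothetical sketch only: "User" and "robots_meta_for" are made-up names,
# not a real CMS or crawler API.
from dataclasses import dataclass

@dataclass
class User:
    name: str
    allow_ai_training: bool  # per-account preference stored by the CMS

def robots_meta_for(user: User) -> str:
    """Return the robots meta tag to emit on this user's pages.

    Uses plain "noindex" because that's the only directive I know crawlers
    might respect; swap in "noteach"/"nolearn" if such a keyword ever exists.
    """
    if user.allow_ai_training:
        return ""  # emit nothing, default crawling rules apply
    return '<meta name="robots" content="noindex">'

if __name__ == "__main__":
    print(robots_meta_for(User("alice", allow_ai_training=False)))
    print(robots_meta_for(User("bob", allow_ai_training=True)))
```

The obvious downside is that noindex also removes the page from regular search results, which is a much bigger hammer than just opting out of LLM training, hence the question about a dedicated keyword.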