SEO for Being Chosen by AI: The Role of llms.txt and How to Set It Up
Until now, SEO has meant being rated well by Google. But AI is now becoming a gateway to information.
Generative AI tools such as ChatGPT, Perplexity, and Gemini have begun to read, summarize, and cite content on the web. Against this backdrop, **'llms.txt'** is attracting attention.
What is llms.txt?
'llms.txt' is a proposed rule file for AI crawlers. Just as 'robots.txt' controls crawling by search engines, 'llms.txt' is meant to state whether AI models such as ChatGPT, Claude, and Perplexity may browse and learn from your site. It is not yet an established standard, so how far each AI service honors it may vary.
The installation location is directly under the root directory of the site:
https://example.com/llms.txt
By publishing this file, you can signal your intentions to the companies that develop AI.
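As a small illustration of "directly under the root", the following Python sketch derives the expected 'llms.txt' location from any page URL and optionally checks that something is served there. The function names `llms_txt_url` and `is_published` are hypothetical helpers made up for this example, and a successful HTTP response only shows that a file exists, not that any AI crawler reads it.

```python
from urllib.error import URLError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt URL for the site that hosts `page_url`."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Drop the path, query, and fragment; keep only scheme and host.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def is_published(page_url: str, timeout: float = 5.0) -> bool:
    """Return True if the site answers the llms.txt URL with HTTP 200."""
    try:
        with urlopen(llms_txt_url(page_url), timeout=timeout) as response:
            return response.status == 200
    except URLError:
        # Covers HTTP errors such as 404 as well as connection failures.
        return False


if __name__ == "__main__":
    print(llms_txt_url("https://example.com/blog/my-post?ref=home"))
```

Calling `is_published` performs a real network request, so it is best run against your own site after deployment.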
Why is llms.txt important?
SEO will no longer be only about search. AI will quote, summarize, and learn from articles, so the question becomes whether your blog is part of what the AI tells its users.
For example:
- ChatGPT cites your blog
- Perplexity displays articles with sources
- Gemini generates summaries in Discover
In each of these cases, it matters that you state whether AI may reference your content. 'llms.txt' is the mechanism for doing so.
Basic Syntax
Here is the simplest permission setting:
User-Agent: *
Allow: /
This means "allow all AI crawlers to access all pages."
If you want to exclude only certain paths:
User-Agent: *
Disallow: /private/
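It is also possible to reject a specific AI. The User-Agent value has to match the name the crawler actually announces. 'gpt-4' below is only an illustration, so check each provider's documentation for its real crawler name: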
User-Agent: gpt-4
Disallow: /
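One way to sanity-check rules like these before publishing is to feed them to Python's standard `urllib.robotparser`, which parses robots.txt-style directives. This is only a sketch. It assumes a crawler would read 'llms.txt' with the same semantics as 'robots.txt', which is not guaranteed, and the names `SomeBot` and `is_allowed` are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# The path-exclusion and per-agent examples from this section, combined.
# Assumption: crawlers apply robots.txt semantics to these lines.
RULES = """\
User-Agent: *
Disallow: /private/

User-Agent: gpt-4
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("SomeBot", "https://example.com/blog/post"),
        ("SomeBot", "https://example.com/private/draft"),
        ("gpt-4", "https://example.com/blog/post"),
    ]
    for agent, url in checks:
        print(f"{agent:8} {url:40} allowed={is_allowed(RULES, agent, url)}")
```

Under these rules a generic crawler can read the blog post but not anything under /private/, while the agent named 'gpt-4' is refused everywhere.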
An AI Policy Page Adds Peace of Mind
In addition to 'llms.txt', publishing a page such as '/ai-policy' makes your stance toward AI explicit and can make your site easier to trust.
Examples of what to include:
- This site allows AI summarization and citation
- Model developers must abide by copyright and citation rules
- Contact: 〇〇
Tips for Performing Better in Discover and AI Search
To be "chosen" by AI, 'llms.txt' alone is not enough. The following also matter:
- OGP Settings (Title, Description, Thumbnail Image)
- Structured data (JSON-LD, such as 'BlogPosting')
- Clear author and publisher information (e.g., an /about page)
- Simple and accurate page structure
These improvements make your content more likely to be surfaced in AI search and summary services.
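To make the first three items concrete, here is a minimal Python sketch that generates OGP meta tags and a 'BlogPosting' JSON-LD block for a page's head. The function name `build_head_tags` and all sample values are hypothetical, and the selection of properties is one reasonable starting set rather than a complete or required list.

```python
import html
import json


def build_head_tags(title, description, image_url,
                    author_name, publisher_name, date_published):
    """Return OGP <meta> tags plus a BlogPosting JSON-LD <script> as one string."""
    ogp = {
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
    }
    meta_lines = [
        f'<meta property="{prop}" content="{html.escape(value, quote=True)}">'
        for prop, value in ogp.items()
    ]
    json_ld = {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": title,
        "description": description,
        "image": image_url,
        "datePublished": date_published,
        "author": {"@type": "Person", "name": author_name},
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    # Escape "</" so the JSON text cannot close the <script> element early.
    payload = json.dumps(json_ld, ensure_ascii=False, indent=2).replace("</", "<\\/")
    script = f'<script type="application/ld+json">\n{payload}\n</script>'
    return "\n".join(meta_lines + [script])


if __name__ == "__main__":
    print(build_head_tags(
        title="What is llms.txt?",
        description="A short guide to llms.txt for bloggers.",
        image_url="https://example.com/images/llms-txt.png",
        author_name="Jane Doe",            # placeholder author
        publisher_name="Example Blog",     # placeholder publisher
        date_published="2025-01-01",       # placeholder date
    ))
```

Generating both from the same values keeps the title and description consistent between the social preview and the structured data.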
Conclusion
SEO in the AI era must consider not only the "human eye" but also the "AI eye". 'llms.txt' is one of the "entrances" through which your information will be distributed from here on.
Start preparing now so that your blog becomes one that ChatGPT and Gemini choose naturally.
May your information reach both people and AI.