By SearchScore Team · Updated March 2026 · 14 min read

Technical GEO: How to Optimise Your Website for AI Search

Most GEO improvements are not about rewriting your content. They are technical changes - how your site is configured, what signals it sends to AI crawlers, and how well machines can understand its structure. This guide covers every technical change that materially improves AI search visibility.

What to fix first

Not all technical GEO changes are equal. Our analysis of 12,000 websites found that three issues account for the majority of AI search invisibility - and two of them take under an hour to fix.

- Highest impact: Unblock AI crawlers (robots.txt fix, takes 10 minutes)
- Highest impact: Create llms.txt (new file, takes 30 minutes)
- High impact: Add schema markup (JSON-LD injection, 1 to 4 hours)
- Medium impact: Improve page structure (semantic HTML cleanup)

AI crawler permissions in robots.txt

Your robots.txt file tells web crawlers which parts of your site they can access. The problem is that most robots.txt files were written before AI search engines existed - and many contain blanket rules that accidentally block AI crawlers alongside spam bots.

The major AI crawlers and their user-agent names:

- GPTBot - OpenAI (ChatGPT)
- PerplexityBot - Perplexity
- ClaudeBot and anthropic-ai - Anthropic (Claude)
- cohere-ai - Cohere

Common mistake: A blanket User-agent: * with Disallow: / rule blocks every bot, AI crawlers included. This is the single most damaging GEO error, and we see it on 73% of the websites we audit.
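You can verify how a robots.txt file treats AI crawlers before deploying it. This sketch uses Python's standard-library robots.txt parser; the URL and user-agent list are placeholders:

```python
import urllib.robotparser

def ai_crawler_access(robots_txt: str, user_agents, url="https://yoursite.com/"):
    """Report which user-agents a robots.txt allows to fetch a given URL."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {ua: parser.can_fetch(ua, url) for ua in user_agents}

# The blanket block: every bot, AI crawlers included, is denied.
blanket = "User-agent: *\nDisallow: /\n"
print(ai_crawler_access(blanket, ["GPTBot", "PerplexityBot", "ClaudeBot"]))

# An explicit per-bot group overrides the blanket rule for that bot only.
fixed = blanket + "\nUser-agent: GPTBot\nAllow: /\n"
print(ai_crawler_access(fixed, ["GPTBot", "PerplexityBot"]))
```

Running this against your live robots.txt (fetched with any HTTP client) is a quick pre-deployment sanity check.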

To allow all major AI crawlers, add these lines to your robots.txt:

# Allow major AI crawlers
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: cohere-ai
Allow: /

If you need to block AI training data collection while allowing search, use more specific directives. OpenAI, Anthropic and others honour different bot names for training versus live retrieval.
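For example, OpenAI currently documents GPTBot for training data collection and OAI-SearchBot for search, with ChatGPT-User for user-initiated fetches. A robots.txt that opts out of training while staying visible in AI search might look like the sketch below; bot names change, so confirm against each vendor's current crawler documentation:

```txt
# Block model training
User-agent: GPTBot
Disallow: /

# Allow AI search and live retrieval
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /
```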

Creating your llms.txt file

llms.txt is a plain text file placed at the root of your website (e.g. yoursite.com/llms.txt) that gives AI language models structured guidance about your site. Think of it as a sitemap for AI - not just where pages are, but what your site is, what your most important content covers, and how an AI should understand your brand.

The format is simple Markdown. A basic llms.txt looks like this:

# YourBrand

> One-line description of what your website/business does.

## About
[Brief description of who you are, what you do, and who you serve]

## Key pages
- [Home](https://yoursite.com/): Main landing page
- [About](https://yoursite.com/about/): Company background and team
- [Blog](https://yoursite.com/blog/): Articles and guides

## Key topics
This site covers [your main topic areas]. 
Our content is written by [credentials].

## Contact
[contact@yoursite.com]

Beyond the basics, you can also include a detailed llms-full.txt that contains the complete text of your most important pages - making it trivial for AI models to ingest your content without crawling your full site.
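There is no formal standard for llms-full.txt, but one simple approach is to concatenate the plain text of your key pages under Markdown headings. A minimal sketch, with an illustrative page list:

```python
def build_llms_full(site_name: str, pages) -> str:
    """Concatenate (title, url, body_text) tuples into one Markdown document."""
    parts = [f"# {site_name}\n"]
    for title, url, body in pages:
        parts.append(f"## {title}\n{url}\n\n{body.strip()}\n")
    return "\n".join(parts)

# Placeholder content - in practice, pull this from your CMS or rendered pages.
pages = [
    ("About", "https://yoursite.com/about/", "Who we are and who we serve."),
    ("Blog", "https://yoursite.com/blog/", "Articles and guides."),
]
print(build_llms_full("YourBrand", pages))
```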

Quick win: Our data shows that 92% of websites have no llms.txt file at all. Simply creating one puts you ahead of almost all of your competitors from an AI search perspective.

Schema markup for AI citation

Schema.org markup is structured data embedded in your HTML that tells machines what your content means. Google has required it for rich results for years - but for GEO, it is even more important. AI engines use schema to verify facts, understand entities, attribute authorship and decide whether to cite your content.

Organisation schema

Organisation schema establishes your brand as a known entity. It should include your official name, URL, logo, contact details, social media profiles and, where applicable, your Wikipedia or Wikidata URL. This is the foundation of brand authority for AI citation.

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Company Name",
  "url": "https://yoursite.com",
  "logo": "https://yoursite.com/logo.png",
  "sameAs": [
    "https://twitter.com/yourhandle",
    "https://linkedin.com/company/yourcompany",
    "https://en.wikipedia.org/wiki/YourCompany"
  ]
}

Article schema

Every blog post and article should have Article schema with a named author, a datePublished, and a publisher reference. This is how AI engines attribute content to real, verified people - a critical E-E-A-T signal.
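A minimal Article example carrying those three fields (all values are placeholders):

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Your Article Title",
  "author": {
    "@type": "Person",
    "name": "Author Name",
    "url": "https://yoursite.com/about/"
  },
  "datePublished": "2026-03-01",
  "publisher": {
    "@type": "Organization",
    "name": "Your Company Name",
    "logo": {
      "@type": "ImageObject",
      "url": "https://yoursite.com/logo.png"
    }
  }
}
```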

FAQPage schema

FAQPage schema is one of the most powerful GEO signals available. AI engines that synthesise answers frequently pull from structured Q&A content - and FAQ schema makes your Q&A pairs directly machine-readable. Add it to any page with questions and answers.
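Each question-and-answer pair becomes a Question entity with an acceptedAnswer, as in this illustrative example:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is llms.txt?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A plain text file at your site root that gives AI language models structured guidance about your site."
      }
    }
  ]
}
```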

Structured data: the bigger picture

Beyond the core three schema types, consider adding structured data relevant to your business type: for example, Product and Offer for ecommerce, LocalBusiness for physical locations, SoftwareApplication for SaaS tools, HowTo for tutorials, and Person for author profiles.

Implement schema as JSON-LD in the <head> of your pages. It is easier to maintain than inline microdata and is the format preferred by both Google and AI crawlers.
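In practice, JSON-LD sits inside a script tag in the head, as in this minimal sketch (the Organisation values are placeholders):

```html
<head>
  <title>Your Page Title</title>
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Your Company Name",
    "url": "https://yoursite.com"
  }
  </script>
</head>
```

Because the block is plain JSON in one place, it can be templated by your CMS without touching the visible markup.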

Platform and performance signals

AI crawlers face the same technical barriers as other bots. Slow load times, JavaScript-heavy rendering, broken pagination and inconsistent canonical URLs all reduce how effectively AI engines can parse your content.
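A quick way to see what a non-JavaScript crawler gets is to inspect the raw HTML your server returns, before any scripts run. This illustrative check (stdlib only; the signals and the sample page are assumptions, not a standard audit) flags pages whose content only appears after client-side rendering:

```python
import re

def raw_html_signals(html: str) -> dict:
    """Crude checks on server-returned HTML, before any JavaScript executes."""
    text = re.sub(r"<script.*?</script>", "", html, flags=re.S)  # drop script payloads
    text = re.sub(r"<[^>]+>", " ", text)                         # strip remaining tags
    return {
        "has_title": "<title>" in html.lower(),
        "has_h1": "<h1" in html.lower(),
        "has_json_ld": "application/ld+json" in html,
        "visible_words": len(text.split()),
    }

# A typical single-page-app shell: almost nothing is visible without JavaScript.
spa_shell = ('<html><head><title>App</title></head>'
             '<body><div id="root"></div><script src="/app.js"></script></body></html>')
print(raw_html_signals(spa_shell))
```

If a page scores like the shell above, AI crawlers that do not execute JavaScript will see essentially none of your content; server-side rendering or prerendering fixes this.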

Technical GEO checklist

- robots.txt allows GPTBot, PerplexityBot, ClaudeBot and the other AI crawlers
- llms.txt exists at your site root (optionally alongside llms-full.txt)
- Organisation, Article and FAQPage schema implemented as JSON-LD
- Pages use semantic HTML with a clear heading hierarchy
- Core content renders server-side, without requiring JavaScript
- Canonical URLs are consistent and pagination is not broken

Run a free technical GEO audit

Find out which technical issues are hurting your AI search visibility - in seconds, for free.

Get My GEO Score →
