Meta Robots Tags Explained: Every Directive and When to Use It
- Directives covered: index, noindex, follow, nofollow, noarchive, nosnippet, noimageindex, max-snippet, and max-image-preview.
- Every page without a robots tag is already crawlable and indexable, so you only tag the exceptions.
- Output: a copy-paste <meta name="robots"> tag plus the equivalent X-Robots-Tag HTTP header.
The meta robots tag is one of the smallest snippets of code on a page, yet it has outsized control over how that page behaves in search. Get it right and you keep thin, duplicate, or private pages out of Google while letting your money pages rank freely. Get it wrong and you can accidentally hide an entire site from search overnight.
This guide explains what the meta robots tag does, every directive worth knowing, how the X-Robots-Tag HTTP header differs, and the practical mistakes that cost real organic traffic.
What Is the Meta Robots Tag?
The meta robots tag is an HTML element placed in the <head> of a page that tells search-engine crawlers how to treat that specific URL. It looks like this:
<meta name="robots" content="noindex, nofollow">
The name attribute targets crawlers — use robots for all of them, or a specific bot such as googlebot to scope the rule. The content attribute holds a comma-separated list of directives. Unlike robots.txt, which controls whether a crawler may access a URL, the meta robots tag controls what a crawler does once it has already read the page — primarily whether to index it and whether to follow its links.
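To make the name/content structure concrete, here is a minimal Python sketch that reads robots meta tags out of a page and groups the directives by crawler name. It uses only the standard library; the `KNOWN_BOTS` set and the `read_robots_meta` helper are illustrative names chosen for this example, not part of any official tooling.

```python
from html.parser import HTMLParser

# Crawler names this sketch looks for (an assumption; extend as needed).
KNOWN_BOTS = {"robots", "googlebot", "bingbot"}


class RobotsMetaParser(HTMLParser):
    """Collects robots directives from <meta> tags, keyed by crawler name."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {key: (value or "") for key, value in attrs}
        name = attr_map.get("name", "").strip().lower()
        if name not in KNOWN_BOTS:
            return
        # The content attribute is a comma-separated list of directives.
        parts = [p.strip().lower() for p in attr_map.get("content", "").split(",")]
        self.directives.setdefault(name, []).extend(p for p in parts if p)


def read_robots_meta(html):
    """Return {crawler_name: [directive, ...]} for the given HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<head><meta name="robots" content="noindex, nofollow"></head>'
print(read_robots_meta(page))  # {'robots': ['noindex', 'nofollow']}
```

A real crawler applies its own parsing rules, so treat this as a way to inspect your own templates rather than a model of how any search engine reads the tag.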
The Core Directives Explained
Most pages only need two decisions: should this URL appear in search, and should its links pass authority?
- index / noindex — index lets the page appear in search results; noindex keeps it out. noindex is the right tool for thank-you pages, internal search results, faceted filters, staging URLs, and thin tag archives.
- follow / nofollow — follow lets crawlers follow and pass equity through the links on the page; nofollow tells them not to. Most pages should use follow even when noindexed, so link equity keeps flowing through your site.
The default behaviour of every page is index, follow, so you only need a tag when you want to change something. Adding "index, follow" everywhere is harmless but unnecessary.
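The default can be expressed as a small lookup: start from index, follow and let each directive override one half of it. The sketch below is a simplified illustration that only handles the four core directives discussed here; the function names are made up for this example.

```python
DEFAULT_POLICY = {"index": True, "follow": True}


def effective_policy(directives):
    """Return the index/follow outcome for a list of directive strings."""
    policy = dict(DEFAULT_POLICY)  # what applies when nothing is specified
    for directive in directives:
        d = directive.strip().lower()
        if d == "noindex":
            policy["index"] = False
        elif d == "nofollow":
            policy["follow"] = False
        # "index" and "follow" simply restate the default, so nothing changes.
    return policy


def changes_core_default(directives):
    """True only when the directives alter the index/follow default."""
    return effective_policy(directives) != DEFAULT_POLICY


print(effective_policy(["noindex", "follow"]))     # {'index': False, 'follow': True}
print(changes_core_default(["index", "follow"]))   # False
```

The second call shows why an "index, follow" tag is redundant: it produces exactly the policy a page has with no tag at all.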
Snippet, Archive, and Preview Controls
Beyond indexing, several directives shape how your result is displayed:
- noarchive — stops search engines from showing a cached copy of the page.
- nosnippet — removes the text and video snippet from your result entirely.
- noimageindex — prevents images hosted on the page from being indexed in image search.
- max-snippet:[number] — caps the snippet length in characters. Use -1 for no limit, 0 to suppress the snippet, or a positive number for a hard cap.
- max-image-preview:[setting] — sets the largest image preview Google may show: none, standard, or large.
These directives only matter on pages that are indexed. If a page is set to noindex it will never appear in results, so layering snippet or preview rules on top of it has no effect.
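The value rules for max-snippet and max-image-preview are easy to get wrong by hand, so a small validator can help. This is a sketch under the rules described above (-1, 0, or a positive number for max-snippet; none, standard, or large for max-image-preview); the helper names and the wording of the returned descriptions are assumptions for illustration.

```python
ALLOWED_IMAGE_PREVIEWS = {"none", "standard", "large"}


def split_directive(directive):
    """Split 'max-snippet:50' into ('max-snippet', '50'); plain directives get None."""
    name, sep, value = directive.strip().lower().partition(":")
    return name.strip(), (value.strip() if sep else None)


def describe_directive(directive):
    """Return a plain-English reading of one display directive."""
    name, value = split_directive(directive)
    if name == "max-snippet":
        if value is None:
            raise ValueError("max-snippet needs a number")
        limit = int(value)  # raises ValueError if the value is not an integer
        if limit == -1:
            return "snippet: no limit"
        if limit == 0:
            return "snippet: suppressed"
        if limit > 0:
            return f"snippet: at most {limit} characters"
        raise ValueError("max-snippet must be -1, 0, or a positive number")
    if name == "max-image-preview":
        if value not in ALLOWED_IMAGE_PREVIEWS:
            raise ValueError("max-image-preview must be none, standard, or large")
        return f"image preview: {value}"
    if name == "nosnippet":
        return "snippet: suppressed"
    return name  # directives without a value are passed through unchanged


print(describe_directive("max-snippet:50"))           # snippet: at most 50 characters
print(describe_directive("max-image-preview:large"))  # image preview: large
```

Note that max-snippet:0 and nosnippet produce the same description, which mirrors the overlap between the two directives.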
Meta Robots Tag vs X-Robots-Tag Header
The X-Robots-Tag is the same set of directives delivered as an HTTP response header instead of an HTML tag. It is more flexible because it can control file types that have no HTML head at all.
| Aspect | Meta robots tag | X-Robots-Tag header |
|---|---|---|
| Where it lives | HTML <head> | HTTP response header |
| Works on non-HTML files | No | Yes (PDFs, images, videos) |
| Set at scale | Per page template | Server / CDN config for whole paths |
| Bot scoping | name="googlebot" | X-Robots-Tag: googlebot: noindex |
Use the meta tag for ordinary web pages and the header when you need to noindex a PDF, an image directory, or an entire URL pattern from your web server. The two are interchangeable for HTML pages — pick whichever your stack makes easier to manage.
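In practice the header is usually set in web server or CDN configuration, but the idea is the same in any stack: match a URL pattern and append the header to the response. As one hedged illustration, the sketch below wraps a Python WSGI application so that every path ending in .pdf is served with X-Robots-Tag: noindex. The `add_x_robots_tag` wrapper and the `demo_app` application are hypothetical names for this example.

```python
def add_x_robots_tag(app, suffixes=(".pdf",), value="noindex"):
    """Wrap a WSGI app so matching paths get an X-Robots-Tag response header."""
    suffixes = tuple(suffixes)  # str.endswith needs a tuple, not a list

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")

        def patched_start_response(status, headers, exc_info=None):
            if path.lower().endswith(suffixes):
                # Append rather than replace, so existing headers are kept.
                headers = list(headers) + [("X-Robots-Tag", value)]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def demo_app(environ, start_response):
    """Stand-in application that serves every request as a PDF."""
    start_response("200 OK", [("Content-Type", "application/pdf")])
    return [b"%PDF-"]


application = add_x_robots_tag(demo_app)
```

Matching on the file suffix is a deliberate simplification; a production rule would more likely key off the response Content-Type or a path prefix.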
Common Mistakes to Avoid
- Blocking and noindexing the same URL. If robots.txt blocks a URL, crawlers can never read its noindex tag — so the page can still get indexed from external links. To deindex, allow crawling and let the noindex be seen.
- Site-wide noindex left on after launch. A staging-environment "noindex, nofollow" copied to production is the classic traffic-killer. Audit it before and after every deploy.
- Using nofollow to "save crawl budget." nofollow on internal links mostly just leaks link equity; reserve it for untrusted user-generated links.
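- Conflicting directives. Combining nosnippet with max-snippet, or noimageindex with max-image-preview, sends mixed signals. Google resolves a conflict by applying the more restrictive rule (nosnippet wins over max-snippet:50, for example), which may not be the outcome you intended, so keep your directive set internally consistent.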
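The first mistake in this list, a URL that is both disallowed and noindexed, can be checked offline. The sketch below uses Python's standard `urllib.robotparser` to test whether a robots.txt body blocks a URL that also carries noindex. The function name and the example URLs are assumptions, and the standard-library parser may not match every pattern rule a given search engine supports, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser


def noindex_is_hidden(robots_txt, url, directives, user_agent="Googlebot"):
    """True when a page carries noindex but robots.txt stops crawlers reading it."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    blocked = not rules.can_fetch(user_agent, url)
    has_noindex = "noindex" in {d.strip().lower() for d in directives}
    return blocked and has_noindex


robots_txt = "User-agent: *\nDisallow: /private/\n"

# Blocked and noindexed: the noindex can never be seen.
print(noindex_is_hidden(robots_txt, "https://example.com/private/page", ["noindex"]))  # True

# Crawlable and noindexed: the noindex works as intended.
print(noindex_is_hidden(robots_txt, "https://example.com/thanks", ["noindex", "follow"]))  # False
```

When the check returns True, the fix is the one described above: remove the disallow rule so the noindex can be read.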
How to Use This Generator
Toggle index/noindex and follow/nofollow to set the core behaviour, then enable any snippet or preview restrictions you need. The tool instantly produces a clean <meta name="robots"> tag and the matching X-Robots-Tag header, and flags conflicting or redundant directives so you ship a tag that actually does what you intend.
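For readers who want to script the same output, here is a rough Python sketch of what such a generator might do. It is not the tool's actual implementation: the function name, its parameters, and the warning messages are all assumptions made for illustration.

```python
def build_robots_outputs(index=True, follow=True, extras=(), bot="robots"):
    """Return (meta_tag, header_line, warnings) for the chosen options."""
    extras = [e.strip().lower() for e in extras if e.strip()]
    directives = ["index" if index else "noindex",
                  "follow" if follow else "nofollow"] + extras
    content = ", ".join(directives)

    meta_tag = f'<meta name="{bot}" content="{content}">'
    # Bot-scoped headers put the crawler name before the directives.
    scope = "" if bot == "robots" else f"{bot}: "
    header_line = f"X-Robots-Tag: {scope}{content}"

    warnings = []
    if index and follow and not extras:
        warnings.append("index, follow is the default, so this tag is redundant")
    if not index and extras:
        warnings.append("display directives have no effect on a noindexed page")
    if "nosnippet" in extras and any(e.startswith("max-snippet:") for e in extras):
        warnings.append("nosnippet conflicts with max-snippet")
    return meta_tag, header_line, warnings


meta_tag, header_line, warnings = build_robots_outputs(index=False)
print(meta_tag)     # <meta name="robots" content="noindex, follow">
print(header_line)  # X-Robots-Tag: noindex, follow
print(warnings)     # []
```

Returning warnings alongside the output, rather than raising errors, keeps the decision with the person shipping the tag.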
Expert Tips
Never disallow a page you want deindexed
If robots.txt blocks a URL, crawlers can never read its noindex tag — so it can still be indexed from external links. Allow crawling and let the noindex be seen.
Audit for site-wide noindex after every deploy
A staging "noindex, nofollow" accidentally shipped to production is the classic traffic-killer. Check your live robots tags before and after each release.
Frequently Asked Questions
What is the difference between noindex and disallow in robots.txt?
Disallow in robots.txt stops a crawler from fetching a URL, while noindex tells it not to keep the page in the index. They are not interchangeable: a disallowed page can still be indexed from external links because the crawler never sees the noindex. To remove a page from search, allow crawling and add a noindex directive.
Should I use noindex with follow or nofollow?
In most cases use noindex, follow. This keeps the page out of search results while still letting crawlers follow its internal links so link equity flows to the rest of your site. Reserve nofollow for pages whose links you genuinely do not want to endorse.
Do I need a meta robots tag on every page?
No. The default behaviour is index, follow, so a page with no robots tag is already crawlable and indexable. You only need a tag when you want to change that default — for example to noindex a thank-you page or cap snippet length on a sensitive page.
When should I use the X-Robots-Tag header instead of the meta tag?
Use the X-Robots-Tag HTTP header for non-HTML resources like PDFs, images, and videos that have no HTML head to hold a meta tag, or when you want to apply directives to an entire URL pattern at the server or CDN level. For ordinary HTML pages the meta tag and the header are equivalent.