Noindex in robots.txt: Why It Doesn't Work
Google no longer supports the noindex directive in robots.txt. What happened, what to use instead, and how to properly deindex pages.
If you've seen advice to add Noindex to your robots.txt file, ignore it. Google dropped support for this directive in September 2019. Any Noindex line in your robots.txt is silently ignored by every major search engine.
Here's what happened, why it matters, and what you should be doing instead.
What the Noindex Directive Looked Like
For years, some SEOs used an unofficial Noindex directive in robots.txt:
```
User-agent: Googlebot
Noindex: /old-content/
Noindex: /duplicate-page/
Disallow: /private/
```
This was never part of the official robots.txt specification. Google's crawler happened to support it as an unofficial extension. Many SEOs relied on it because it was convenient -- you could control indexing from a single file without touching page-level HTML.
Why Google Removed It
In July 2019, Google announced they would formally drop support for the Noindex directive in robots.txt, effective September 1, 2019. Their reasoning was straightforward:
The directive was never standardized. It wasn't part of the robots.txt specification (later formalized as RFC 9309), and other search engines never consistently supported it. Google wanted to align their crawler behavior with the published standard rather than maintain non-standard extensions.
Around the same time, Google submitted the Robots Exclusion Protocol (REP) to the IETF for standardization -- the draft that eventually became RFC 9309. Keeping unofficial directives would have contradicted their own standardization effort.
No search engine supports noindex in robots.txt
This isn't just a Google thing. Bing, Yandex, and other major search engines have never officially supported the noindex directive in robots.txt. If it ever worked for you, it was through Google's unofficial support alone.
The Irony of Disallow + Noindex
Here's the part that catches people off guard.
Some site owners, after learning that robots.txt Noindex doesn't work, try this approach: block the page with Disallow in robots.txt and add a <meta name="robots" content="noindex"> tag on the page itself.
This doesn't work either. It's self-defeating.
```
# robots.txt
User-agent: *
Disallow: /old-page/
```

```html
<!-- /old-page/ -- the crawler is blocked and never sees this -->
<head>
  <meta name="robots" content="noindex">
</head>
```
When you Disallow a URL, the crawler never fetches the page. If it never fetches the page, it never sees the noindex meta tag. The tag is invisible.
And the page can still appear in Google's index. If other sites link to it, Google knows the URL exists and may show it in search results -- just without a snippet, since it can't read the page content.
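The conflict is easy to demonstrate with Python's standard-library robots.txt parser: once a URL is disallowed, a compliant crawler will not request it, so any noindex tag on that page goes unread. A minimal sketch (the domain and paths are hypothetical):

```python
from urllib.robotparser import RobotFileParser

# Parse the robots.txt rules from the example above
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /old-page/",
])

# A compliant crawler checks robots.txt before every fetch.
# False means the crawler never requests /old-page/, so the
# noindex meta tag in its <head> is never seen.
print(rp.can_fetch("Googlebot", "https://example.com/old-page/"))    # False
print(rp.can_fetch("Googlebot", "https://example.com/other-page/"))  # True
```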
What to Use Instead
You have three proper alternatives for keeping pages out of search results.
Option 1: Meta Robots Tag (HTML Pages)
The most common approach. Add this tag to the <head> section of any page you want deindexed:
```html
<meta name="robots" content="noindex">
```
The page must be crawlable for this to work. Do not block it in robots.txt. Google needs to fetch the page, read the meta tag, and process the noindex directive.
You can also target specific bots:
```html
<meta name="googlebot" content="noindex">
<meta name="bingbot" content="noindex">
```
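An audit script can confirm the tag is actually present by parsing the page's HTML. This sketch uses Python's standard-library HTML parser; the sample markup is illustrative:

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the content values of <meta name="robots"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            # The content attribute may hold several comma-separated directives
            self.directives += [d.strip().lower()
                                for d in a.get("content", "").split(",")]

html = '<head><meta name="robots" content="noindex, nofollow"></head>'
parser = RobotsMetaParser()
parser.feed(html)
print("noindex" in parser.directives)  # True
```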
Option 2: X-Robots-Tag HTTP Header (Any File Type)
For non-HTML content like PDFs, images, or JSON files, use the X-Robots-Tag HTTP header in your server response:
```http
HTTP/1.1 200 OK
X-Robots-Tag: noindex
Content-Type: application/pdf
```
This works identically to the meta tag but applies to any file type. Configure it in your web server or application:
```nginx
# Nginx -- noindex all PDFs
location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex";
}
```

```apache
# Apache -- noindex all PDFs
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
```
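The header is also easy to check programmatically when auditing. The sketch below inspects a list of response headers for a noindex directive; note that X-Robots-Tag may appear more than once, may carry comma-separated values, and may be scoped to a specific bot (e.g. "googlebot: noindex"), which this simplified version treats as applying generally:

```python
def has_noindex(headers):
    """Return True if any X-Robots-Tag header carries a noindex directive.

    `headers` is a list of (name, value) pairs, as returned by e.g.
    http.client responses.
    """
    for name, value in headers:
        if name.lower() != "x-robots-tag":
            continue
        # Strip an optional "botname:" prefix, then split the directives
        value = value.split(":", 1)[-1] if ":" in value else value
        if "noindex" in [d.strip().lower() for d in value.split(",")]:
            return True
    return False

print(has_noindex([("Content-Type", "application/pdf"),
                   ("X-Robots-Tag", "noindex")]))   # True
print(has_noindex([("X-Robots-Tag", "nofollow")]))  # False
```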
Option 3: HTTP 410 Gone Status Code
If a page is permanently removed and should never appear in search results again, return a 410 Gone status code. Google will drop it from the index faster than a noindex tag.
```http
HTTP/1.1 410 Gone
```
This tells crawlers the page is intentionally and permanently gone. Google typically processes a 410 faster than a 404 for deindexing purposes.
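In application code, serving a 410 often comes down to a routing check: URLs on a deliberate removal list get 410, everything else proceeds normally. A sketch with a hypothetical removal list (a real handler would return this status through your framework or server config):

```python
# Paths that have been permanently removed (hypothetical examples)
REMOVED = {"/old-page/", "/retired-campaign/"}

def status_for(path):
    """Return 410 Gone for permanently removed paths, 200 otherwise."""
    return 410 if path in REMOVED else 200

print(status_for("/old-page/"))  # 410
print(status_for("/about/"))     # 200
```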
Step-by-Step: Properly Deindexing Pages
Follow this process to remove pages from Google's search results.
1. Remove the Disallow rule (if present)
If the page is currently blocked in robots.txt, remove the Disallow line. Google needs to crawl the page to see your noindex directive.
2. Add the noindex directive
Add <meta name="robots" content="noindex"> to the page's <head> section. For non-HTML files, use the X-Robots-Tag HTTP header instead.
3. Request removal in Search Console (optional, for speed)
Use Google Search Console's URL Removal tool to request a temporary removal. This hides the page from results within hours while Google processes the permanent noindex.
4. Wait for Google to recrawl
Google needs to fetch the page and process the noindex tag. This can take days to weeks depending on how frequently Google crawls your site.
5. Verify removal
Use Google's "site:" operator to check if the page is still indexed: site:example.com/old-page/. Once it's gone, the noindex has taken effect.
6. Optionally re-add Disallow
After Google has processed the noindex and dropped the page, you can add the Disallow back to save crawl budget. But this is optional and only matters on large sites.
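The first two steps can be verified together: the page must be crawlable per robots.txt and must carry a noindex signal. A sketch using only the standard library (the rules, URL, and markup are illustrative, and the meta-tag check is deliberately naive):

```python
from urllib.robotparser import RobotFileParser

def noindex_is_visible(robots_lines, url, page_html):
    """True only if crawlers may fetch `url` AND the page carries noindex.

    A noindex directive that sits behind a Disallow rule is invisible,
    so both conditions must hold for deindexing to work.
    """
    rp = RobotFileParser()
    rp.parse(robots_lines)
    crawlable = rp.can_fetch("*", url)
    has_tag = 'content="noindex"' in page_html  # naive check for the sketch
    return crawlable and has_tag

robots = ["User-agent: *", "Disallow: /private/"]
page = '<head><meta name="robots" content="noindex"></head>'

print(noindex_is_visible(robots, "https://example.com/old-page/", page))  # True
print(noindex_is_visible(robots, "https://example.com/private/x", page))  # False
```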
The URL Removal Tool in Search Console
Google Search Console offers a URL Removal tool that provides temporary removal (about 6 months) from search results. This is useful as an immediate measure while your noindex tag is being processed.
Key points about the removal tool:
- It's temporary. The URL will reappear after roughly 6 months unless you have a permanent solution in place (noindex tag, 410 status, or actual page deletion).
- It only affects Google Search. Other search engines won't be impacted.
- You can remove individual URLs or entire URL prefixes.
- It works within hours, much faster than waiting for a recrawl.
Use it alongside the noindex tag for fast results: the removal tool gives you immediate relief, and the noindex tag provides the permanent solution.
What About the Disallow-Only Approach?
Some people ask: "Can't I just Disallow the page and call it done?"
You can, but understand what that actually does. Disallow prevents crawling, not indexing. The page might still appear in search results -- just without a title or snippet. Google will show something like:
example.com/blocked-page/
A description for this result is not available because of this site's robots.txt.
If that's acceptable for your use case, Disallow alone is fine. But if you want the URL completely gone from search results, you need noindex.
Quick Reference
| Method | Prevents Crawling | Prevents Indexing | Works on Non-HTML |
|---|---|---|---|
| robots.txt Disallow | Yes | No | Yes |
| robots.txt Noindex | No | No (unsupported since 2019) | N/A |
| Meta robots noindex | No | Yes | No |
| X-Robots-Tag: noindex | No | Yes | Yes |
| 410 Gone status | No | Yes (removes) | Yes |
| Search Console removal | No | Temporarily | Yes |
The noindex directive in robots.txt is dead -- use meta tags and HTTP headers to control indexing properly.