Noindex in robots.txt: Why It Doesn't Work
Google no longer supports the noindex directive in robots.txt. What happened, what to use instead, and how to properly deindex pages.
If you've seen advice to add Noindex to your robots.txt file, ignore it. Google dropped support for this directive in September 2019. Any Noindex line in your robots.txt is silently ignored by every major search engine.
Here's what happened, why it matters, and what you should be doing instead.
What the Noindex Directive Looked Like
For years, some SEOs used an unofficial Noindex directive in robots.txt:
```
User-agent: Googlebot
Noindex: /old-content/
Noindex: /duplicate-page/
Disallow: /private/
```
This was never part of the official robots.txt specification. Google's crawler happened to support it as an unofficial extension. Many SEOs relied on it because it was convenient -- you could control indexing from a single file without touching page-level HTML.
Why Google Removed It
In July 2019, Google announced they would formally drop support for the Noindex directive in robots.txt, effective September 1, 2019. Their reasoning was straightforward:
The directive was never standardized. It wasn't part of the robots.txt specification (later formalized as RFC 9309), and other search engines never consistently supported it. Google wanted to align their crawler behavior with the published standard rather than maintain non-standard extensions.
Around the same time, Google submitted the Robots Exclusion Protocol (REP) to the IETF for standardization -- the draft that eventually became RFC 9309. Keeping unofficial directives would have contradicted their own standardization effort.
No search engine supports noindex in robots.txt
This isn't just a Google thing. Bing, Yandex, and other major search engines have never officially supported the noindex directive in robots.txt. If it ever worked for you, it was through Google's unofficial support alone.
The Irony of Disallow + Noindex
Here's the part that catches people off guard.
Some site owners, after learning that robots.txt Noindex doesn't work, try this approach: block the page with Disallow in robots.txt and add a <meta name="robots" content="noindex"> tag on the page itself.
This doesn't work either. It's self-defeating.
```
# robots.txt
User-agent: *
Disallow: /old-page/
```

```html
<!-- /old-page/ -- the crawler is blocked and never sees this -->
<head>
  <meta name="robots" content="noindex">
</head>
```
When you Disallow a URL, the crawler never fetches the page. If it never fetches the page, it never sees the noindex meta tag. The tag is invisible.
And the page can still appear in Google's index. If other sites link to it, Google knows the URL exists and may show it in search results -- just without a snippet, since it can't read the page content.
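The conflict is easy to demonstrate with Python's standard-library robots.txt parser: once a URL is disallowed, a compliant crawler will not request it, so any noindex tag on that page goes unread. A minimal sketch (the domain and paths are hypothetical):

```python
from urllib.robotparser import RobotFileParser

# Parse the robots.txt rules from the example above
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /old-page/",
])

# A compliant crawler checks robots.txt before every fetch.
# False means the crawler never requests /old-page/, so the
# noindex meta tag in its <head> is never seen.
print(rp.can_fetch("Googlebot", "https://example.com/old-page/"))    # False
print(rp.can_fetch("Googlebot", "https://example.com/other-page/"))  # True
```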
What to Use Instead
You have three proper alternatives for keeping pages out of search results.
Option 1: Meta Robots Tag (HTML Pages)
The most common approach. Add this tag to the <head> section of any page you want deindexed:
```html
<meta name="robots" content="noindex">
```
The page must be crawlable for this to work. Do not block it in robots.txt. Google needs to fetch the page, read the meta tag, and process the noindex directive.
You can also target specific bots:
```html
<meta name="googlebot" content="noindex">
<meta name="bingbot" content="noindex">
```
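An audit script can confirm the tag is actually present by parsing the page's HTML. This sketch uses Python's standard-library HTML parser; the sample markup is illustrative:

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the content values of <meta name="robots"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            # The content attribute may hold several comma-separated directives
            self.directives += [d.strip().lower()
                                for d in a.get("content", "").split(",")]

html = '<head><meta name="robots" content="noindex, nofollow"></head>'
parser = RobotsMetaParser()
parser.feed(html)
print("noindex" in parser.directives)  # True
```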
Option 2: X-Robots-Tag HTTP Header (Any File Type)
For non-HTML content like PDFs, images, or JSON files, use the X-Robots-Tag HTTP header in your server response:
```http
HTTP/1.1 200 OK
X-Robots-Tag: noindex
Content-Type: application/pdf
```
This works identically to the meta tag but applies to any file type. Configure it in your web server or application:
```nginx
# Nginx -- noindex all PDFs
location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex";
}
```

```apache
# Apache -- noindex all PDFs
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
```
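The header is also easy to check programmatically when auditing. The sketch below inspects a list of response headers for a noindex directive; note that X-Robots-Tag may appear more than once, may carry comma-separated values, and may be scoped to a specific bot (e.g. "googlebot: noindex"), which this simplified version treats as applying generally:

```python
def has_noindex(headers):
    """Return True if any X-Robots-Tag header carries a noindex directive.

    `headers` is a list of (name, value) pairs, as returned by e.g.
    http.client responses.
    """
    for name, value in headers:
        if name.lower() != "x-robots-tag":
            continue
        # Strip an optional "botname:" prefix, then split the directives
        value = value.split(":", 1)[-1] if ":" in value else value
        if "noindex" in [d.strip().lower() for d in value.split(",")]:
            return True
    return False

print(has_noindex([("Content-Type", "application/pdf"),
                   ("X-Robots-Tag", "noindex")]))   # True
print(has_noindex([("X-Robots-Tag", "nofollow")]))  # False
```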
Option 3: HTTP 410 Gone Status Code
If a page is permanently removed and should never appear in search results again, return a 410 Gone status code. Google will drop it from the index faster than a noindex tag.
```http
HTTP/1.1 410 Gone
```
This tells crawlers the page is intentionally and permanently gone. Google typically processes a 410 faster than a 404 for deindexing purposes.
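In application code, serving a 410 often comes down to a routing check: URLs on a deliberate removal list get 410, everything else proceeds normally. A sketch with a hypothetical removal list (a real handler would return this status through your framework or server config):

```python
# Paths that have been permanently removed (hypothetical examples)
REMOVED = {"/old-page/", "/retired-campaign/"}

def status_for(path):
    """Return 410 Gone for permanently removed paths, 200 otherwise."""
    return 410 if path in REMOVED else 200

print(status_for("/old-page/"))  # 410
print(status_for("/about/"))     # 200
```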
Step-by-Step: Properly Deindexing Pages
Follow this process to remove pages from Google's search results.
1. Remove the Disallow rule (if present)
If the page is currently blocked in robots.txt, remove the Disallow line. Google needs to crawl the page to see your noindex directive.
2. Add the noindex directive
Add <meta name="robots" content="noindex"> to the page's <head> section. For non-HTML files, use the X-Robots-Tag HTTP header instead.
3. Request removal in Search Console (optional, for speed)
Use Google Search Console's URL Removal tool to request a temporary removal. This hides the page from results within hours while Google processes the permanent noindex.
4. Wait for Google to recrawl
Google needs to fetch the page and process the noindex tag. This can take days to weeks depending on how frequently Google crawls your site.
5. Verify removal
Use Google's "site:" operator to check if the page is still indexed: site:example.com/old-page/. Once it's gone, the noindex has taken effect.
6. Optionally re-add Disallow
After Google has processed the noindex and dropped the page, you can add the Disallow back to save crawl budget. But this is optional and only matters on large sites.
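The first two steps can be verified together: the page must be crawlable per robots.txt and must carry a noindex signal. A sketch using only the standard library (the rules, URL, and markup are illustrative, and the meta-tag check is deliberately naive):

```python
from urllib.robotparser import RobotFileParser

def noindex_is_visible(robots_lines, url, page_html):
    """True only if crawlers may fetch `url` AND the page carries noindex.

    A noindex directive that sits behind a Disallow rule is invisible,
    so both conditions must hold for deindexing to work.
    """
    rp = RobotFileParser()
    rp.parse(robots_lines)
    crawlable = rp.can_fetch("*", url)
    has_tag = 'content="noindex"' in page_html  # naive check for the sketch
    return crawlable and has_tag

robots = ["User-agent: *", "Disallow: /private/"]
page = '<head><meta name="robots" content="noindex"></head>'

print(noindex_is_visible(robots, "https://example.com/old-page/", page))  # True
print(noindex_is_visible(robots, "https://example.com/private/x", page))  # False
```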
The URL Removal Tool in Search Console
Google Search Console offers a URL Removal tool that provides temporary removal (about 6 months) from search results. This is useful as an immediate measure while your noindex tag is being processed.
Key points about the removal tool:
- It's temporary. The URL will reappear after roughly 6 months unless you have a permanent solution in place (noindex tag, 410 status, or actual page deletion).
- It only affects Google Search. Other search engines won't be impacted.
- You can remove individual URLs or entire URL prefixes.
- It works within hours, much faster than waiting for a recrawl.
Use it alongside the noindex tag for fast results: the removal tool gives you immediate relief, and the noindex tag provides the permanent solution.
What About the Disallow-Only Approach?
Some people ask: "Can't I just Disallow the page and call it done?"
You can, but understand what that actually does. Disallow prevents crawling, not indexing. The page might still appear in search results -- just without a title or snippet. Google will show something like:
example.com/blocked-page/
A description for this result is not available because of this site's robots.txt.
If that's acceptable for your use case, Disallow alone is fine. But if you want the URL completely gone from search results, you need noindex.
Quick Reference
| Method | Prevents Crawling | Prevents Indexing | Works on Non-HTML |
|---|---|---|---|
| robots.txt Disallow | Yes | No | Yes |
| robots.txt Noindex | No | No (unsupported since 2019) | N/A |
| Meta robots noindex | No | Yes | No |
| X-Robots-Tag: noindex | No | Yes | Yes |
| 410 Gone status | No | Yes (removes) | Yes |
| Search Console removal | No | Temporarily | Yes |
The noindex directive in robots.txt is dead -- use meta tags and HTTP headers to control indexing properly.