How to Fix "Blocked by robots.txt" Errors

Fix 'blocked by robots.txt' errors in Google Search Console. Diagnose Disallow rules, update your robots.txt, and get your pages indexed.

You open Google Search Console and see pages marked as "Blocked by robots.txt." Your content is invisible to search engines, and you are losing organic traffic. The fix is usually straightforward once you understand what is happening.

This guide walks you through diagnosing the problem, identifying the offending rule, fixing it, and getting your pages re-crawled.

What "Blocked by robots.txt" Actually Means

When Google reports a page as "Blocked by robots.txt," it means Googlebot tried to crawl that URL but your robots.txt file told it not to. Googlebot respected the directive and did not fetch the page content.

There are two related but different statuses you might see in Google Search Console:

"Blocked by robots.txt" -- Google found the URL (through a sitemap, internal links, or external links) but cannot crawl it because of a Disallow rule. The page is not indexed.

"Indexed, though blocked by robots.txt" -- Google found the URL through links, and indexed the URL itself (showing it in search results), but could not crawl the page content. The search result will show the URL with no snippet or a snippet pulled from anchor text. This is the worst of both worlds: you are in search results but with no useful content shown.

robots.txt blocks crawling, not indexing

A Disallow rule prevents Googlebot from fetching your page, but it does not prevent the URL from appearing in search results. If you need to prevent indexing entirely, use a noindex meta tag or X-Robots-Tag header. But note: for Google to see a noindex tag, it needs to be able to crawl the page first -- so you must remove the Disallow rule.
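As a concrete illustration of the header-based approach, here is a minimal standard-library sketch of a handler that serves every page with an `X-Robots-Tag: noindex` header. The handler class and page body are hypothetical; real sites would set this header in their web server or framework configuration.

```python
from http.server import BaseHTTPRequestHandler

class NoIndexHandler(BaseHTTPRequestHandler):
    """Respond to every GET with an X-Robots-Tag: noindex header.

    Crawlers only see this header if robots.txt does NOT
    Disallow the URL -- the page must remain crawlable.
    """

    def do_GET(self):
        body = b"<html><body>Not for search results</body></html>"
        self.send_response(200)
        self.send_header("X-Robots-Tag", "noindex")  # block indexing, allow crawling
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The equivalent in-page mechanism is a `<meta name="robots" content="noindex">` tag in the document head; both work only when the page itself can be crawled.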

Finding Blocked Pages in Google Search Console

Step 1: Open Google Search Console

Navigate to Google Search Console and select your property.

Step 2: Go to the Pages report

In the left sidebar, click "Pages" (under "Indexing"). This shows the indexing status of all discovered URLs.

Step 3: Filter for robots.txt issues

Look for rows labeled "Blocked by robots.txt" or "Indexed, though blocked by robots.txt." Click either status to see the affected URLs.

Step 4: Review the affected URLs

Examine the list. Are these pages that should be blocked (admin pages, duplicate content) or pages that should be indexed (blog posts, product pages, landing pages)?

If the blocked pages are ones you intentionally want hidden, no fix is needed. If they are pages you want in search results, you have a problem to solve.

Common Causes

Here are the most frequent reasons pages get blocked unintentionally.

Overly Broad Disallow Rules

This is the most common cause. A rule like Disallow: / blocks your entire site, and a rule like Disallow: /blog blocks /blog, /blog/, /blog-post, and /bloggers -- anything whose path starts with /blog.

# This blocks MORE than you probably intended
User-agent: *
Disallow: /blog

# This blocks only the /blog/ directory
User-agent: *
Disallow: /blog/
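You can reproduce this prefix-matching behavior with Python's standard-library robots.txt parser. A quick sketch (the example.com URLs are placeholders):

```python
import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
# Feed rules directly instead of fetching a live file.
parser.parse([
    "User-agent: *",
    "Disallow: /blog",  # prefix match: anything starting with /blog
])

for path in ["/blog", "/blog/", "/blog-post", "/bloggers", "/about"]:
    allowed = parser.can_fetch("Googlebot", "https://example.com" + path)
    print(f"{path}: {'allowed' if allowed else 'blocked'}")
```

Only /about comes back allowed; every path that merely starts with /blog is blocked.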

Wildcard Rules Gone Wrong

Wildcards are powerful but dangerous. A few common mistakes:

# Blocks every URL with a query string -- including paginated pages
Disallow: /*?

# Blocks every URL containing "page" anywhere in the path
Disallow: /*page*

# Blocks all .html files, even your key landing pages
Disallow: /*.html$
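The standard library's urllib.robotparser does not implement these Google-specific wildcard extensions, so here is a small sketch of Google-style pattern matching ('*' matches any run of characters, '$' anchors the end of the path) that you can use to check what a wildcard rule really catches. The example paths are hypothetical:

```python
import re

def wildcard_matches(pattern: str, path: str) -> bool:
    """Approximate Google-style robots.txt matching:
    '*' matches any sequence of characters, and '$'
    anchors the match to the end of the URL path."""
    regex = "".join(
        ".*" if ch == "*" else "$" if ch == "$" else re.escape(ch)
        for ch in pattern
    )
    return re.match(regex, path) is not None

# The three rules above, tested against URLs you might not expect them to hit:
print(wildcard_matches("/*?", "/products?page=2"))         # any query string
print(wildcard_matches("/*page*", "/landing-pages/home"))  # "page" anywhere
print(wildcard_matches("/*.html$", "/pricing.html"))       # key landing page
```

All three print True, which is exactly the kind of unintended match these rules cause.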

Blocking CSS, JavaScript, and Images

If you block resource directories, Googlebot cannot render your pages the way visitors see them. Google explicitly recommends against blocking CSS and JavaScript, because rendering affects how it evaluates your pages.

# Problematic rules that break rendering
Disallow: /wp-content/
Disallow: /static/
Disallow: /assets/css/
Disallow: /js/

CMS or Plugin Changes

WordPress plugins like Yoast SEO, Rank Math, or All in One SEO can modify your robots.txt, so a plugin update or settings change can introduce rules you did not expect. Similarly, Shopify ships a default robots.txt that blocks certain paths, and you might not realize your pages are affected until you check Search Console.

Find the rule that is blocking your pages

Paste your robots.txt and test any URL to see exactly which directive is causing the block.

Diagnosing Which Rule Is Blocking

You know a page is blocked. Now you need to find which rule is causing it.

Step 1: Get your robots.txt. Open https://yourdomain.com/robots.txt in your browser.

Step 2: Identify the relevant User-agent block. For Google indexing issues, look for User-agent: Googlebot first. If there is no Googlebot-specific block, the User-agent: * block applies.

Step 3: Test the URL against each rule. For each Disallow and Allow rule in the block, check whether the blocked URL path matches the pattern.

For example, if the blocked URL is /blog/2024/my-great-post and your rules are:

User-agent: *
Disallow: /blog/2024/drafts/
Disallow: /blog/tag/
Disallow: /blog/*/comments

None of the three rules match: /blog/2024/drafts/ and /blog/tag/ are different prefixes, and /blog/*/comments requires "comments" in the path. So the block must be coming from somewhere else -- maybe a different User-agent block, or a rule you overlooked.

This manual process is tedious with complex files. A robots.txt testing tool does this instantly. Paste the file, enter the URL and user agent, and it tells you exactly which rule is matching.
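The manual check can also be scripted. A sketch using Google-style wildcard translation, with the rule list mirroring the example above:

```python
import re

RULES = [
    ("Disallow", "/blog/2024/drafts/"),
    ("Disallow", "/blog/tag/"),
    ("Disallow", "/blog/*/comments"),
]

def to_regex(pattern: str) -> str:
    # '*' -> any characters, '$' -> end-of-path anchor
    return "".join(
        ".*" if ch == "*" else "$" if ch == "$" else re.escape(ch)
        for ch in pattern
    )

def matching_rules(path: str):
    """Return every rule whose pattern matches the given path."""
    return [
        (verb, pattern)
        for verb, pattern in RULES
        if re.match(to_regex(pattern), path)
    ]

print(matching_rules("/blog/2024/my-great-post"))  # no rule matches: []
print(matching_rules("/blog/2024/drafts/secret"))  # the drafts rule matches
```

If the list comes back empty for a URL that Search Console reports as blocked, look for another User-agent block or fetch the live robots.txt again -- the deployed file may differ from the one you are reading.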

How to Fix It

Once you know which rule is blocking your pages, you have three options.

Option 1: Remove the Rule

If the Disallow rule is not needed at all, remove it:

# Before
User-agent: *
Disallow: /blog/
Disallow: /admin/

# After -- removed the /blog/ rule
User-agent: *
Disallow: /admin/

Option 2: Make the Rule More Specific

If the rule is too broad, narrow it down:

# Before -- blocks all of /blog/
User-agent: *
Disallow: /blog/

# After -- only blocks drafts and tag pages
User-agent: *
Disallow: /blog/drafts/
Disallow: /blog/tag/

Option 3: Add an Allow Override

If you need the broad rule but want to exempt specific URLs:

User-agent: *
Disallow: /blog/
Allow: /blog/2024/
Allow: /blog/2025/

This blocks /blog/ in general but explicitly allows posts from 2024 and 2025.
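Google resolves conflicts between Allow and Disallow by picking the most specific (longest) matching pattern, with Allow winning ties. Here is a sketch of that precedence for plain prefix rules like the ones above; note that Python's urllib.robotparser uses first-match order instead, so it is not a reliable guide for Allow overrides:

```python
def is_allowed(rules, path):
    """Google-style precedence for plain prefix rules:
    the longest matching pattern wins; ties favor Allow."""
    best = None  # (pattern length, rule is an Allow)
    for verb, pattern in rules:
        if path.startswith(pattern):
            candidate = (len(pattern), verb == "Allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is allowed by default.
    return best is None or best[1]

rules = [
    ("Disallow", "/blog/"),
    ("Allow", "/blog/2024/"),
    ("Allow", "/blog/2025/"),
]

print(is_allowed(rules, "/blog/2024/launch-post"))  # True: Allow is more specific
print(is_allowed(rules, "/blog/old-post"))          # False: only Disallow matches
```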

Test before deploying

After editing your robots.txt, test it with a validator before uploading. Verify that the previously blocked URLs now show as "Allowed" and that you have not accidentally unblocked pages that should stay hidden.
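One way to automate that check, sketched with the standard-library parser. The URL lists are hypothetical, and urllib.robotparser implements only plain prefix rules (not Google's wildcard or longest-match extensions), so complex files still need a dedicated tester:

```python
import urllib.robotparser

# Pages that must stay reachable / hidden after the edit (hypothetical).
MUST_BE_ALLOWED = ["/blog/2024/launch-post", "/pricing"]
MUST_BE_BLOCKED = ["/admin/login", "/blog/drafts/wip"]

def check_robots(robots_lines):
    """Return a list of problems found in the edited robots.txt."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_lines)
    problems = []
    for path in MUST_BE_ALLOWED:
        if not parser.can_fetch("Googlebot", "https://example.com" + path):
            problems.append(f"still blocked: {path}")
    for path in MUST_BE_BLOCKED:
        if parser.can_fetch("Googlebot", "https://example.com" + path):
            problems.append(f"accidentally unblocked: {path}")
    return problems

# The corrected file from Option 2, checked before uploading.
new_rules = [
    "User-agent: *",
    "Disallow: /admin/",
    "Disallow: /blog/drafts/",
]
print(check_robots(new_rules) or "robots.txt looks good")
```

An empty problem list means both conditions hold: the previously blocked pages are crawlable again, and the intentionally hidden ones stayed hidden.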

Getting Re-Indexed After Fixing

After you update your robots.txt, Google will not immediately re-crawl the affected pages. Here is how to speed up the process.

Step 1: Deploy the updated robots.txt

Upload your corrected file and verify it is live at https://yourdomain.com/robots.txt.

Step 2: Request indexing in Search Console

Use the URL Inspection tool in Google Search Console. Enter the blocked URL, then click "Request Indexing." Do this for your most important affected pages.

Step 3: Wait for re-crawling

Google typically refreshes its cached copy of robots.txt within about a day. Individual pages can take days or weeks to be re-crawled and indexed; frequently crawled pages are revisited sooner.

Step 4: Monitor the Pages report

Check the Pages report in Search Console over the following weeks. The "Blocked by robots.txt" count should decrease as Google re-crawls and indexes the previously blocked URLs.


Preventing Future Issues

A few practices to keep robots.txt problems from recurring:

  • Version control your robots.txt. Keep it in your repository so changes are tracked and reviewable.
  • Add robots.txt checks to your deployment pipeline. Validate the file automatically before every deploy.
  • Monitor Google Search Console regularly. Check the Pages report at least monthly for new "Blocked by robots.txt" entries.
  • Be cautious with wildcards. When adding wildcard rules, test them against a list of your important URLs to catch unintended matches.
  • Document your rules. Add comments to your robots.txt explaining why each rule exists. Future you (or your teammates) will thank you.
# Block admin area
User-agent: *
Disallow: /admin/

# Block search results pages (thin content)
Disallow: /search

# Block API endpoints (not for public consumption)
Disallow: /api/

A blocked page cannot rank. Find the rule, fix it, and move on.
