How to Add Your Sitemap to robots.txt


Adding your sitemap URL to your robots.txt file is one of the simplest and most effective things you can do for SEO. It gives every search engine crawler a direct pointer to your sitemap the moment it visits your site. No registration, no tools, no waiting -- just a single line in a text file.

This guide covers the exact syntax, where to place it, how to handle multiple sitemaps, and the mistakes that trip people up.

The Sitemap Directive Syntax

The syntax is one line:

Sitemap: https://example.com/sitemap.xml

That is it. The directive name (Sitemap), a colon, a space, and the full absolute URL to your XML sitemap.

A few syntax rules to know:

  • The directive name Sitemap is case-insensitive (sitemap, SITEMAP, and Sitemap all work), but convention is to capitalize the first letter.
  • The URL must be absolute. It must include the protocol (https://) and the full domain.
  • There is no closing tag or semicolon.
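To see how forgiving the matching is, here is a minimal sketch of the kind of case-insensitive extraction a parser performs. The regex and the `extract_sitemaps` function name are illustrative, not taken from any real crawler:

```python
import re

def extract_sitemaps(robots_txt: str) -> list[str]:
    """Return every sitemap URL declared in a robots.txt body.

    The directive name is matched case-insensitively (sitemap,
    SITEMAP, Sitemap all count); the URL is the token after the colon.
    """
    pattern = re.compile(r"^\s*sitemap\s*:\s*(\S+)", re.IGNORECASE | re.MULTILINE)
    return pattern.findall(robots_txt)

robots = "User-agent: *\nDisallow: /admin/\n\nSITEMAP: https://example.com/sitemap.xml\n"
print(extract_sitemaps(robots))  # ['https://example.com/sitemap.xml']
```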

Where to Place It

The Sitemap directive is unique among robots.txt directives because it is not tied to any User-agent block. It is a standalone directive that applies globally. Place it anywhere in the file, but by convention it goes at the top or bottom.

Placing at the bottom (most common):

User-agent: *
Disallow: /admin/
Disallow: /private/

User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml

Placing at the top:

Sitemap: https://example.com/sitemap.xml

User-agent: *
Disallow: /admin/
Disallow: /private/

Both are equally valid. Most site owners put it at the bottom because the User-agent blocks are the "main content" of the file and the Sitemap is supplementary information. But there is no functional difference.

Location does not affect scope

No matter where you place the Sitemap directive in your file -- even inside a User-agent block -- it applies to all crawlers. The Sitemap directive has no relationship to User-agent blocks.

Adding Multiple Sitemaps

If your site has multiple sitemaps, list each one on its own line:

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-posts.xml
Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-products.xml

There is no limit on the number of Sitemap directives you can include.
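You can confirm how multiple directives are read using Python's standard library: `urllib.robotparser` exposes a `site_maps()` method (Python 3.8+) that returns every Sitemap line it found. A quick sketch:

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-posts.xml
Sitemap: https://example.com/sitemap-pages.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())  # parse from text instead of fetching

# site_maps() returns all listed URLs in order, or None if no Sitemap lines exist
print(parser.site_maps())
```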

Using a Sitemap Index

If you have many sitemaps, a better approach is to use a sitemap index file. A sitemap index is an XML file that lists all your individual sitemaps. Then you only need one line in your robots.txt:

Sitemap: https://example.com/sitemap_index.xml

The sitemap index file itself looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
  </sitemap>
</sitemapindex>

Search engines follow sitemap index files automatically. This approach is cleaner and easier to maintain, especially if your sitemaps change frequently.
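To check what a crawler sees when it follows the index, you can extract the child sitemap URLs with the standard library's XML parser. The namespace URI is the one from the example above; everything else is an illustrative sketch:

```python
import xml.etree.ElementTree as ET

# Bytes input, as it would arrive from an HTTP fetch of the index file
index_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>
</sitemapindex>"""

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(index_xml)

# Each <sitemap> entry holds one <loc> with a child sitemap URL
child_sitemaps = [loc.text for loc in root.findall("sm:sitemap/sm:loc", ns)]
print(child_sitemaps)
```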


The Absolute URL Requirement

The most common mistake people make with the Sitemap directive is using a relative URL. This does not work.

# WRONG -- relative URL
Sitemap: /sitemap.xml

# WRONG -- missing protocol
Sitemap: example.com/sitemap.xml

# WRONG -- missing domain
Sitemap: sitemap.xml

# CORRECT -- absolute URL with protocol and domain
Sitemap: https://example.com/sitemap.xml

The URL must include:

  • The protocol (https:// or http://)
  • The full domain name
  • The complete path to the sitemap file

Relative URLs are silently ignored

If you use a relative URL, crawlers will not throw an error. They will simply ignore the directive. Your sitemap will not be discovered through robots.txt, and you may not realize it until you notice indexing issues.
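Because the failure is silent, it is worth catching before a crawler does. A simple check is that the URL parses with both a scheme and a host; a sketch using `urllib.parse` (the function name is illustrative):

```python
from urllib.parse import urlparse

def is_valid_sitemap_url(url: str) -> bool:
    """True only for absolute http(s) URLs that include a domain."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

print(is_valid_sitemap_url("https://example.com/sitemap.xml"))  # True
print(is_valid_sitemap_url("/sitemap.xml"))                     # False -- relative
print(is_valid_sitemap_url("example.com/sitemap.xml"))          # False -- no protocol
```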

Cross-Domain Sitemaps

The Sitemap directive in robots.txt can point to a sitemap hosted on a different domain. This is valid and sometimes useful:

# In robots.txt on example.com
Sitemap: https://cdn.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap.xml

However, the sitemap itself must only contain URLs from the domain where the robots.txt is hosted (or that you have verified in Google Search Console). A sitemap on cdn.example.com cannot list URLs for differentsite.com.
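One way to sanity-check this rule is to compare the host of every URL a sitemap lists against the host whose robots.txt referenced it. The helper below is hypothetical, and it ignores hosts you may have separately verified in Search Console:

```python
from urllib.parse import urlparse

def urls_on_site(robots_host: str, listed_urls: list[str]) -> bool:
    """True if every listed URL belongs to the site whose robots.txt
    referenced the sitemap. Separately verified hosts would need to
    be allow-listed in a real check."""
    return all(urlparse(u).netloc == robots_host for u in listed_urls)

# Sitemap hosted on cdn.example.com but referenced from example.com's
# robots.txt: its URLs must be on example.com
print(urls_on_site("example.com", [
    "https://example.com/page-1",
    "https://differentsite.com/page-2",
]))  # False -- the second URL is on another site
```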

Common CMS Configurations

Different platforms handle the sitemap-robots.txt relationship in different ways.

WordPress

WordPress includes its sitemap URL in the virtual robots.txt automatically:

Sitemap: https://yoursite.com/wp-sitemap.xml

If you use a plugin like Yoast SEO or Rank Math that generates its own sitemap, the plugin typically updates the robots.txt to reference its sitemap instead:

Sitemap: https://yoursite.com/sitemap_index.xml

If you create a physical robots.txt file, you need to add the Sitemap directive yourself -- it will not be auto-generated.

Shopify

Shopify automatically includes the sitemap reference in its generated robots.txt:

Sitemap: https://yourstore.com/sitemap.xml

Next.js and Static Sites

If you build your site with a framework like Next.js, Gatsby, or Hugo, you manage the robots.txt file yourself. Make sure to include the Sitemap directive:

User-agent: *
Allow: /

Sitemap: https://yoursite.com/sitemap.xml


Do You Still Need to Submit via Search Console?

Yes, you should do both. Adding the sitemap to robots.txt and submitting it through Google Search Console serve different purposes.

Method | Who Discovers It | Feedback Provided | When to Use
robots.txt Sitemap directive | All crawlers (Google, Bing, Yandex, etc.) | None | Always -- it is free and universal
Google Search Console | Google only | Crawl stats, errors, index coverage | Always -- provides monitoring data
Bing Webmaster Tools | Bing only | Crawl stats, errors | If you want Bing-specific data

The robots.txt Sitemap directive is a passive discovery mechanism. It tells crawlers where your sitemap is, but you get no feedback on whether they actually found it or processed it successfully.

Google Search Console submission gives you detailed data: how many URLs were submitted, how many were indexed, and what errors were found. You cannot get this data from robots.txt alone.

Use both. The robots.txt directive handles broad discovery (including bots that do not have their own submission tools), and Search Console gives you the visibility you need to debug issues.

Verifying Your Setup

After adding the Sitemap directive to your robots.txt, verify everything works.

1. Check the robots.txt file

Open https://yourdomain.com/robots.txt in your browser. Confirm the Sitemap directive is present with the correct absolute URL.

2. Verify the sitemap URL

Click or navigate to the sitemap URL from the directive. You should see valid XML content, not a 404 error or an HTML page.

3. Test with a robots.txt validator

Use a testing tool to parse your robots.txt and verify the Sitemap directive is correctly formatted.

4. Check Search Console

If you use Google Search Console, go to the Sitemaps section and verify the sitemap status. Look for any errors reported.

Mistakes to Avoid

Using relative URLs

Always use the full absolute URL including https:// and the domain. Relative paths are silently ignored by crawlers.

Pointing to a broken URL

If your sitemap URL returns a 404 or 500 error, the directive is useless. Verify the sitemap actually exists at the specified URL.

Forgetting to update after migration

If you move your site to a new domain or change your sitemap URL, update the robots.txt Sitemap directive. Old URLs will return 404s.

Listing sitemaps for blocked content

If your robots.txt blocks crawling of certain paths, do not include those URLs in your sitemap. The conflict confuses crawlers and wastes crawl budget.
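This conflict is easy to detect programmatically: feed your robots.txt rules to `urllib.robotparser` and ask whether each sitemap URL is fetchable. A sketch with illustrative rules and URLs:

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /private/
"""

sitemap_urls = [
    "https://example.com/page",
    "https://example.com/private/report",  # blocked -- should not be in the sitemap
]

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# can_fetch() applies the Disallow rules for the given user agent
blocked = [u for u in sitemap_urls if not parser.can_fetch("*", u)]
print(blocked)  # ['https://example.com/private/report']
```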

Using HTTP when your site is HTTPS

If your site is on HTTPS, the sitemap URL in robots.txt should also use HTTPS. Mismatched protocols can cause issues with some crawlers.


One line in robots.txt. Every crawler finds your sitemap. That is a good trade.
