How to Add Your Sitemap to robots.txt
Adding your sitemap URL to your robots.txt file is one of the simplest and most effective things you can do for SEO. It gives every search engine crawler a direct pointer to your sitemap the moment it visits your site. No registration, no tools, no waiting -- just a single line in a text file.
This guide covers the exact syntax, where to place it, how to handle multiple sitemaps, and the mistakes that trip people up.
The Sitemap Directive Syntax
The syntax is one line:
Sitemap: https://example.com/sitemap.xml
That is it. The directive name (Sitemap), a colon, a space, and the full absolute URL to your XML sitemap.
A few syntax rules to know:
- The directive name `Sitemap` is case-insensitive (`sitemap`, `SITEMAP`, and `Sitemap` all work), but convention is to capitalize the first letter.
- The URL must be absolute: it must include the protocol (`https://`) and the full domain.
- There is no closing tag or semicolon.
Where to Place It
The Sitemap directive is unique among robots.txt directives because it is not tied to any User-agent block. It is a standalone directive that applies globally. Place it anywhere in the file, but by convention it goes at the top or bottom.
Placing at the bottom (most common):
User-agent: *
Disallow: /admin/
Disallow: /private/
User-agent: GPTBot
Disallow: /
Sitemap: https://example.com/sitemap.xml
Placing at the top:
Sitemap: https://example.com/sitemap.xml
User-agent: *
Disallow: /admin/
Disallow: /private/
Both are equally valid. Most site owners put it at the bottom because the User-agent blocks are the "main content" of the file and the Sitemap is supplementary information. But there is no functional difference.
Location does not affect scope
No matter where you place the Sitemap directive in your file -- even inside a User-agent block -- it applies to all crawlers. The Sitemap directive has no relationship to User-agent blocks.
Adding Multiple Sitemaps
If your site has multiple sitemaps, list each one on its own line:
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-posts.xml
Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-products.xml
There is no limit on the number of Sitemap directives you can include.
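If you generate your robots.txt programmatically, emitting one directive per sitemap is trivial. A minimal sketch in Python (the URL list is a hypothetical input):

```python
def sitemap_block(urls):
    """Render one Sitemap directive per URL, each on its own line."""
    return "\n".join(f"Sitemap: {url}" for url in urls)

print(sitemap_block([
    "https://example.com/sitemap.xml",
    "https://example.com/sitemap-posts.xml",
]))
```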
Using a Sitemap Index
If you have many sitemaps, a better approach is to use a sitemap index file. A sitemap index is an XML file that lists all your individual sitemaps. Then you only need one line in your robots.txt:
Sitemap: https://example.com/sitemap_index.xml
The sitemap index file itself looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://example.com/sitemap-posts.xml</loc>
</sitemap>
<sitemap>
<loc>https://example.com/sitemap-pages.xml</loc>
</sitemap>
<sitemap>
<loc>https://example.com/sitemap-products.xml</loc>
</sitemap>
</sitemapindex>
Search engines follow sitemap index files automatically. This approach is cleaner and easier to maintain, especially if your sitemaps change frequently.
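If you want to see what a crawler sees, you can walk the index yourself. A minimal sketch using Python's standard library, assuming the index follows the sitemaps.org schema shown above:

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def child_sitemaps(index_xml):
    """Return every <loc> URL listed in a sitemap index document."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.iter(NS + "loc") if loc.text]
```

Fetching each child URL the same way would give you the full set of page URLs the index exposes.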
The Absolute URL Requirement
The most common mistake people make with the Sitemap directive is using a relative URL. This does not work.
# WRONG -- relative URL
Sitemap: /sitemap.xml
# WRONG -- missing protocol
Sitemap: example.com/sitemap.xml
# WRONG -- missing domain
Sitemap: sitemap.xml
# CORRECT -- absolute URL with protocol and domain
Sitemap: https://example.com/sitemap.xml
The URL must include:
- The protocol (`https://` or `http://`)
- The full domain name
- The complete path to the sitemap file
Relative URLs are silently ignored
If you use a relative URL, crawlers will not throw an error. They will simply ignore the directive. Your sitemap will not be discovered through robots.txt, and you may not realize it until you notice indexing issues.
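This failure mode is easy to lint for. A small sketch using only the Python standard library that flags any Sitemap value that is not an absolute http(s) URL:

```python
from urllib.parse import urlparse

def invalid_sitemap_urls(robots_txt):
    """Return Sitemap directive values that crawlers would silently ignore."""
    bad = []
    for line in robots_txt.splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() != "sitemap":
            continue
        url = value.strip()
        parsed = urlparse(url)
        # A usable value needs a scheme and a host, e.g. https://example.com/...
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            bad.append(url)
    return bad
```

Running this against your robots.txt before deploying catches the relative-URL mistake before any crawler sees it.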
Cross-Domain Sitemaps
The Sitemap directive in robots.txt can point to a sitemap hosted on a different domain. This is valid and sometimes useful:
# In robots.txt on example.com
Sitemap: https://cdn.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap.xml
However, the sitemap itself must only contain URLs from the domain where the robots.txt is hosted (or that you have verified in Google Search Console). A sitemap on cdn.example.com cannot list URLs for differentsite.com.
Common CMS Configurations
Different platforms handle the sitemap-robots.txt relationship in different ways.
WordPress
WordPress includes its sitemap URL in the virtual robots.txt automatically:
Sitemap: https://yoursite.com/wp-sitemap.xml
If you use a plugin like Yoast SEO or Rank Math that generates its own sitemap, the plugin typically updates the robots.txt to reference its sitemap instead:
Sitemap: https://yoursite.com/sitemap_index.xml
If you create a physical robots.txt file, you need to add the Sitemap directive yourself -- it will not be auto-generated.
Shopify
Shopify automatically includes the sitemap reference in its generated robots.txt:
Sitemap: https://yourstore.com/sitemap.xml
Next.js and Static Sites
If you build your site with a framework like Next.js, Gatsby, or Hugo, you manage the robots.txt file yourself. Make sure to include the Sitemap directive:
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
Do You Still Need to Submit via Search Console?
Yes, you should do both. Adding the sitemap to robots.txt and submitting it through Google Search Console serve different purposes.
| Method | Who Discovers It | Feedback Provided | When to Use |
|---|---|---|---|
| robots.txt Sitemap directive | All crawlers (Google, Bing, Yandex, etc.) | None | Always -- it is free and universal |
| Google Search Console | Google only | Crawl stats, errors, index coverage | Always -- provides monitoring data |
| Bing Webmaster Tools | Bing only | Crawl stats, errors | If you want Bing-specific data |
The robots.txt Sitemap directive is a passive discovery mechanism. It tells crawlers where your sitemap is, but you get no feedback on whether they actually found it or processed it successfully.
Google Search Console submission gives you detailed data: how many URLs were submitted, how many were indexed, and what errors were found. You cannot get this data from robots.txt alone.
Use both. The robots.txt directive handles broad discovery (including bots that do not have their own submission tools), and Search Console gives you the visibility you need to debug issues.
Verifying Your Setup
After adding the Sitemap directive to your robots.txt, verify everything works.
Check the robots.txt file
Open https://yourdomain.com/robots.txt in your browser. Confirm the Sitemap directive is present with the correct absolute URL.
Verify the sitemap URL
Click or navigate to the sitemap URL from the directive. You should see valid XML content, not a 404 error or HTML page.
Test with a robots.txt validator
Use a testing tool to parse your robots.txt and verify the Sitemap directive is correctly formatted.
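These checks can also be scripted. A sketch with Python's standard library (the fetch requires network access, and `https://yourdomain.com` is a placeholder):

```python
import urllib.request

def looks_like_xml(content_type):
    """True if a Content-Type header plausibly describes an XML sitemap."""
    return "xml" in content_type.lower()

def fetch_sitemap_headers(url):
    """Fetch a sitemap URL and return (HTTP status, Content-Type)."""
    req = urllib.request.Request(url, headers={"User-Agent": "sitemap-check/1.0"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status, resp.headers.get("Content-Type", "")

# Usage (requires network):
# status, ctype = fetch_sitemap_headers("https://yourdomain.com/sitemap.xml")
# assert status == 200 and looks_like_xml(ctype)
```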
Check Search Console
If you use Google Search Console, go to the Sitemaps section and verify the sitemap status. Look for any errors reported.
Mistakes to Avoid
Using relative URLs
Always use the full absolute URL including https:// and the domain. Relative paths are silently ignored by crawlers.
Pointing to a broken URL
If your sitemap URL returns a 404 or 500 error, the directive is useless. Verify the sitemap actually exists at the specified URL.
Forgetting to update after migration
If you move your site to a new domain or change your sitemap URL, update the robots.txt Sitemap directive; otherwise crawlers keep requesting the old location and getting 404s.
Listing sitemaps for blocked content
If your robots.txt blocks crawling of certain paths, do not include those URLs in your sitemap. The conflict confuses crawlers and wastes crawl budget.
Using HTTP when your site is HTTPS
If your site is on HTTPS, the sitemap URL in robots.txt should also use HTTPS. Mismatched protocols can cause issues with some crawlers.
One line in robots.txt. Every crawler finds your sitemap. That is a good trade.