robots.txt for Squarespace Sites

How robots.txt works on Squarespace, what the default file contains, what you can and cannot customize, and how to handle common Squarespace crawling issues.

Squarespace generates a robots.txt file automatically for every site. You cannot directly edit this file, which makes Squarespace different from most other platforms. This limitation matters because robots.txt tells search engine crawlers which parts of your site they may request.

This guide covers what Squarespace's default robots.txt contains, what control you do have, and how to work around the limitations. For a general introduction to robots.txt, see our robots.txt guide.

Squarespace's Default robots.txt

Every Squarespace site has a robots.txt file at https://yourdomain.com/robots.txt. You can view yours right now by adding /robots.txt to your domain.

The default Squarespace robots.txt typically looks like this:

User-agent: *
Disallow: /config
Disallow: /search
Disallow: /account
Disallow: /api
Disallow: /commerce/checkout
Disallow: /commerce/digital-download/

Sitemap: https://yourdomain.com/sitemap.xml

What these rules do

Disallow: /config -- Blocks crawlers from Squarespace's internal configuration pages. These are admin-only pages that should not be indexed.

Disallow: /search -- Blocks the built-in search results page. This prevents Google from indexing pages of search results, which would be low-value duplicate content.

Disallow: /account -- Blocks member account pages. These are behind authentication and should not be in search results.

Disallow: /api -- Blocks API endpoints used internally by Squarespace.

Disallow: /commerce/checkout -- Blocks checkout pages. These are transactional pages that should not be indexed.

Disallow: /commerce/digital-download/ -- Blocks digital download delivery pages.

Sitemap: -- Points crawlers to your sitemap.

These defaults are sensible. Squarespace blocks the pages that should be blocked and allows everything else.
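
To see how a crawler might interpret these rules, one option is Python's standard urllib.robotparser module. The sketch below is only an illustration: it parses the default file quoted above from a string rather than fetching a live site, yourdomain.com is a placeholder, and load_rules and is_blocked are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser

# The default rules quoted in this guide; a live Squarespace file may differ.
DEFAULT_ROBOTS_TXT = """\
User-agent: *
Disallow: /config
Disallow: /search
Disallow: /account
Disallow: /api
Disallow: /commerce/checkout
Disallow: /commerce/digital-download/

Sitemap: https://yourdomain.com/sitemap.xml
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_blocked(parser: RobotFileParser, path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given path is disallowed for the user agent."""
    # The host is a placeholder; only the path is matched against the rules.
    return not parser.can_fetch(user_agent, "https://yourdomain.com" + path)


rules = load_rules(DEFAULT_ROBOTS_TXT)
for path in ("/config", "/search?q=shoes", "/commerce/checkout", "/blog/my-post", "/"):
    status = "blocked" if is_blocked(rules, path) else "allowed"
    print(f"{path}: {status}")
```

Rules are matched by path prefix, so a page whose slug happens to begin with one of these paths (for example /accounting-tips, which starts with /account) would also be reported as blocked.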

Can You Edit robots.txt on Squarespace?

No. Squarespace does not provide a way to directly edit the robots.txt file. There is no text editor, no file upload option, and no setting in the admin panel that lets you modify the file's contents.

This is a deliberate design choice. Squarespace manages the infrastructure and does not give users access to server-level files. The restriction prevents users from accidentally blocking their entire site with a misconfigured robots.txt, but it also limits advanced SEO control.

What you can control

While you cannot edit robots.txt, Squarespace offers other ways to control crawler behavior:

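Page-level SEO settings. You can hide an individual page from search engines in its page settings (SEO tab, "Hide this page from search results"), which adds a noindex meta tag. Toggling the "Enable Page" switch off also keeps the page out of search, but it makes the page unavailable to visitors as well.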

Password protection. Password-protected pages are automatically blocked from crawlers. Squarespace handles this at the server level.

URL redirects. You can set up 301 redirects in Settings > Advanced > URL Mappings. This helps manage crawler behavior when pages move or are deleted.

Sitemap inclusion. Squarespace automatically generates a sitemap that includes your published, visible pages. Pages marked as "hidden" in the navigation are still included in the sitemap unless you specifically disable SEO indexing for them.
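
For the URL redirects mentioned above, the following is a minimal sketch of generating redirect entries in bulk. It assumes the URL Mappings panel accepts one rule per line in the form "/old-path -> /new-path 301"; check Squarespace's current documentation before pasting the output. The function name and the example paths are hypothetical.

```python
def format_mapping(old_path: str, new_path: str, status: int = 301) -> str:
    """Build one redirect line in the assumed '/old -> /new 301' format."""
    # This sketch only handles site-relative paths.
    for path in (old_path, new_path):
        if not path.startswith("/"):
            raise ValueError(f"path must start with '/': {path!r}")
    if status not in (301, 302):
        raise ValueError("status must be 301 (permanent) or 302 (temporary)")
    return f"{old_path} -> {new_path} {status}"


# Hypothetical pages that moved during a site restructure.
moved_pages = {
    "/old-about": "/about",
    "/shop/old-product": "/shop/new-product",
}

for old, new in moved_pages.items():
    print(format_mapping(old, new))
```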

Squarespace's Automatic Sitemap

Squarespace generates a sitemap at /sitemap.xml that is referenced in the robots.txt. This sitemap includes:

  • All published pages
  • All published blog posts
  • All published products (for commerce sites)
  • All published events
  • All published gallery items

The sitemap updates automatically when you publish or unpublish content. You do not need to maintain it manually.

Sitemap limitations

Squarespace's sitemap does not include:

  • lastmod dates (the sitemap does not indicate when content was last modified)
  • Image sitemap entries
  • Video sitemap entries
  • Hreflang annotations (relevant for multilingual sites)
  • Priority or change frequency values (though these are ignored by Google anyway)

The missing lastmod dates are the most significant limitation. Without them, Google cannot use the sitemap to identify recently updated content. Google relies on its own crawl schedule instead.
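
Sitemap output can differ between sites and change over time, so it may be worth checking what your own file contains. The sketch below counts how many entries carry a lastmod element, using Python's standard XML parser. The SAMPLE document is invented for illustration and is not real Squarespace output; to inspect your own site, you would pass in the text downloaded from /sitemap.xml. summarize_sitemap is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def summarize_sitemap(xml_text: str) -> dict:
    """Count URL entries and how many of them include a <lastmod> element."""
    root = ET.fromstring(xml_text)
    urls = root.findall("sm:url", SITEMAP_NS)
    with_lastmod = [u for u in urls if u.find("sm:lastmod", SITEMAP_NS) is not None]
    return {"urls": len(urls), "with_lastmod": len(with_lastmod)}


# Invented sample: one entry without lastmod, one with it.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yourdomain.com/</loc></url>
  <url>
    <loc>https://yourdomain.com/blog/my-post</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
</urlset>
"""

print(summarize_sitemap(SAMPLE))
```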

Common Squarespace Crawling Issues

Pages not being indexed

If specific pages are not appearing in Google search results, check:

  1. Is the page published? Draft and disabled pages are not crawlable.
  2. Is SEO indexing enabled for the page? Go to the page settings > SEO tab and make sure "Hide this page from search results" is not checked.
  3. Is the page password-protected? Protected pages are blocked from crawlers.
  4. Does the page have enough content? Very thin pages may be crawled but not indexed.

Duplicate content from tags and categories

Squarespace blog category and tag pages can create duplicate content issues. If you have many categories and tags, Google may crawl and attempt to index these listing pages, which often contain the same content snippets as other listing pages.

You cannot block these with robots.txt on Squarespace. Instead, make sure your blog posts have enough unique content that Google prefers the posts themselves over the listing pages.

Commerce pages being indexed

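If you see checkout or cart pages appearing in Google results, verify that the default robots.txt rules are in place. Check yourdomain.com/robots.txt to confirm the commerce paths are blocked. Keep in mind that robots.txt prevents crawling, not indexing, so a blocked URL that is linked from elsewhere can still show up as a bare listing without a description.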

Staging site being indexed

If you set up your Squarespace site on a .squarespace.com subdomain before connecting a custom domain, the staging URL may get indexed. After connecting your custom domain, Squarespace should redirect the old URL. But if both versions appear in Google, add the .squarespace.com URL to Search Console and request removal.

Working Around robots.txt Limitations

Since you cannot edit robots.txt, here are alternative approaches for common needs:

Blocking specific pages from indexing

Use the page-level noindex setting:

  1. Open the page in the Squarespace editor
  2. Click the gear icon (page settings)
  3. Go to the SEO tab
  4. Check "Hide this page from search results"

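This adds a <meta name="robots" content="noindex"> tag to the page. Google can still crawl the page, but it will not index it. This differs from a robots.txt block, which prevents crawling: a URL blocked only by robots.txt can still appear in results without a description if other sites link to it, whereas a noindex tag keeps the page out of results once Google has recrawled it. For most purposes, noindex is the more reliable way to keep a page out of search results. See robots.txt vs. meta robots for the nuanced difference.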
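
After changing the setting, you may want to confirm that the tag is actually present in the page's HTML. The sketch below uses Python's standard html.parser to look for a robots meta tag containing noindex. It works on an HTML string, so fetching the page is left to you; RobotsMetaParser and has_noindex are hypothetical names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
print(has_noindex(page))
```

This only inspects the HTML as served. It would not detect a tag that is added later by JavaScript.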

Blocking specific bots (like AI crawlers)

You cannot block specific bots (like GPTBot or ClaudeBot) via robots.txt on Squarespace because you cannot edit the file. This is a real limitation for site owners who want to prevent AI companies from using their content for training.

Workaround options:

  • Contact Squarespace support to request robot blocking features
  • Use a third-party service or CDN that sits in front of Squarespace and can filter bot requests
  • Add <meta name="robots" content="noai, noimageai"> tags via code injection (if your plan supports it), though support for these tags varies by bot

This is an area where Squarespace's locked-down approach creates friction. Other platforms give site owners direct control over which bots can access their content.
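
If you do place a third-party service in front of Squarespace, the filtering usually comes down to matching the request's User-Agent header. The sketch below shows only that decision logic, under stated assumptions: it is not the configuration syntax of any particular CDN, should_block is a hypothetical function, and the user-agent strings in the example are simplified illustrations rather than the exact strings these bots send.

```python
# Bot name tokens taken from the examples in this guide; extend as needed.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot")


def should_block(user_agent: str, blocked_tokens=AI_CRAWLER_TOKENS) -> bool:
    """Return True if the User-Agent header contains a blocked bot token."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in blocked_tokens)


# Simplified, illustrative User-Agent values.
requests_seen = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; Googlebot/2.1)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
]

for ua in requests_seen:
    action = "block" if should_block(ua) else "allow"
    print(f"{action}: {ua}")
```

User-Agent headers can be spoofed, so a filter like this only stops bots that identify themselves honestly.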

Controlling crawl rate

Squarespace's robots.txt does not set a crawl-delay directive, and you cannot add one. In practice, this is rarely an issue: Googlebot ignores crawl-delay anyway, and Squarespace's infrastructure is designed to handle crawler traffic without performance problems.

Code injection for meta robots

Squarespace Business and Commerce plans support code injection (Settings > Advanced > Code Injection). You can add meta robots tags site-wide through the header injection field. This gives you some additional control over indexing behavior beyond what the page settings offer. For example, you could use conditional JavaScript to add a noindex tag only to pages that match certain conditions. This is an advanced approach, and a tag added by JavaScript is only seen by crawlers that render scripts.

Squarespace SEO Settings That Affect Crawling

Beyond robots.txt, several Squarespace settings influence how crawlers interact with your site:

SSL / HTTPS

Squarespace provides free SSL certificates and serves all sites over HTTPS. Your robots.txt and sitemap use HTTPS URLs automatically.

Custom 404 page

Squarespace lets you customize the 404 page design, and the page is still served with an HTTP 404 status code. Crawlers receive the proper error response.

URL slugs

You can customize URL slugs for pages, blog posts, and products. Clean, descriptive URLs help crawlers understand your content structure.

Canonical URLs

Squarespace adds canonical tags to pages automatically. This helps prevent duplicate content issues when the same content is accessible at multiple URLs (for example, through tags or categories).
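
To check which canonical URL a given page declares, you can read the link rel="canonical" element from its HTML. The sketch below does this with Python's standard html.parser on an HTML string. The sample markup is invented for illustration, and CanonicalParser and find_canonical are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalParser(HTMLParser):
    """Record the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


# Invented sample: a tag-filtered blog URL pointing back to the post itself.
sample = (
    '<head><link rel="stylesheet" href="/site.css">'
    '<link rel="canonical" href="https://yourdomain.com/blog/my-post"></head>'
)
print(find_canonical(sample))
```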

Page titles and meta descriptions

These do not affect crawling directly, but they influence what appears in search results after crawling and indexing. Set them for every important page.

Verifying Your Squarespace Site with Search Engines

Google Search Console

Squarespace integrates with Google Search Console. You can verify your site by adding your Search Console verification code in Settings > SEO > Google Search Console.

After verification, submit your sitemap URL and monitor crawl and indexing reports.

Bing Webmaster Tools

Add your Squarespace site to Bing Webmaster Tools using DNS verification or the meta tag method (add the verification tag via code injection).

Summary

Squarespace generates a reasonable default robots.txt that blocks admin, search, and checkout pages while allowing everything else. You cannot edit this file directly. For page-level control, use the noindex setting in page options. For bot-specific blocking (especially AI crawlers), Squarespace's limitations may require workarounds like CDN-level filtering. The built-in sitemap and SSL work well out of the box. Verify your site with Google Search Console and monitor crawl reports for issues.
