Medium impact
Super easy

Optimize Robots.txt

Robots.txt is a plain text file at the root of your domain that tells search engine crawlers which pages they can and can't access. It's one of the most powerful technical SEO controls on your site — and one of the easiest to get wrong.

A misconfigured robots.txt file can block Google from crawling your entire site without any warning. It won't throw an error you'll see in your browser: your site will look completely normal while Google sits outside the gate, unable to read any of it. This has happened to large companies on the eve of major launches. It's a genuinely dangerous mistake.

Webflow generates a default robots.txt automatically at https://yourdomain.com/robots.txt. The default typically allows all crawlers and references your sitemap URL. For most Webflow sites, the default is fine. You only need to change it if you have staging pages, admin areas, or thank-you pages that shouldn't be crawled; add Disallow rules for those paths.

What a basic robots.txt looks like:

User-agent: *
Allow: /
Sitemap: https://yourdomain.com/sitemap.xml

The User-agent: * line applies the rules that follow to all crawlers. Allow: / permits access to everything. Add Disallow: /thank-you to block a specific path. Never use robots.txt to hide pages you want to keep private: the file is publicly readable, so a Disallow line is effectively a signpost to those URLs. Use a noindex meta tag to keep a page out of search results instead, and remember that crawlers can only see a noindex tag on pages robots.txt lets them crawl, so don't disallow a page you're trying to noindex.
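You can sanity-check how these rules behave with Python's standard-library urllib.robotparser. One sketch, with yourdomain.com and the /thank-you path as placeholders; note that Python's parser applies the first matching rule in file order (Google instead picks the most specific match), so the Disallow line is listed before the broad Allow here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow everything except a /thank-you page.
# Python's parser uses first-match-wins, so Disallow comes before Allow: /.
rules = """\
User-agent: *
Disallow: /thank-you
Allow: /
Sitemap: https://yourdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "*" falls through to the default User-agent: * group.
print(parser.can_fetch("*", "https://yourdomain.com/"))           # True
print(parser.can_fetch("*", "https://yourdomain.com/thank-you"))  # False
```

The same check works for any crawler name, since the only group here is User-agent: *.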

To customize robots.txt in Webflow: go to Site Settings → SEO → Robots.txt. Edit the content there. Webflow will serve whatever you put in that field at your domain's /robots.txt path.

After launch, open https://yourdomain.com/robots.txt and read it. Then check the robots.txt report in Google Search Console (under Settings → robots.txt — the old standalone tester has been retired) to confirm Google fetched the file without errors, and use the URL Inspection tool on key pages to confirm Google can crawl them. Repeat this check whenever you modify the robots.txt file.
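The post-launch check can also be scripted. A small sketch, again using Python's stdlib robotparser: the helper name blocked_urls, the domain, and the page list are all placeholders, and the fetch step is left as a comment so the example runs offline against a sample file.

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return the subset of urls that robots_txt blocks for the given crawler."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]

# In practice, fetch the live file first, e.g.:
#   import urllib.request
#   robots_txt = urllib.request.urlopen("https://yourdomain.com/robots.txt").read().decode()
# Here, a sample with a leftover staging rule shows what the check catches.
sample = """\
User-agent: *
Disallow: /staging
"""

must_crawl = [
    "https://yourdomain.com/",
    "https://yourdomain.com/pricing",
    "https://yourdomain.com/staging/old-home",
]
print(blocked_urls(sample, must_crawl))
# ['https://yourdomain.com/staging/old-home']
```

Running this against your real robots.txt and a list of pages that must rank gives you a regression check you can repeat after every edit.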