Robots.txt Generator
Define crawler access by user agent, add allow/disallow paths, and include your sitemap URL in a standards-friendly robots.txt file. Use our free robots.txt generator to create, validate, and download your file in seconds.
How this tool helps
What Is a Robots.txt File?
A robots.txt file is a plain text file that lives at the root of your website. It uses the Robots Exclusion Protocol to tell search engine crawlers like Googlebot, Bingbot, and others which parts of your site they can and cannot access.
Every major search engine checks for robots.txt before crawling a site. Without one, crawlers will attempt to access every URL they discover. With a properly configured robots.txt, you control how your crawl budget is spent and keep crawlers from wasting requests on low-value pages. Keeping a page out of the index is a separate job that robots.txt alone does not do.
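Because the file has to sit at the root of the host, a crawler can derive its location from any page URL. The sketch below illustrates that step with Python's standard library; the function name `robots_txt_url` is a hypothetical helper, not part of any crawler's API.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL.

    Hypothetical helper for illustration: it keeps the scheme and host
    and replaces the path, query, and fragment with /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post?id=1"))
print(robots_txt_url("https://shop.example.com/cart"))
```

Note that the second call resolves to a different file: each host, including each subdomain, is governed by its own robots.txt.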
Robots.txt Syntax Guide
A robots.txt file uses four main directives:
- User-agent: Specifies which crawler the rules apply to. Use `*` for all crawlers, or target specific bots like `Googlebot`, `Bingbot`, or `GPTBot`.
- Disallow: Tells the specified crawler not to access a path. Example: `Disallow: /admin/` blocks the entire admin directory.
- Allow: Overrides a Disallow directive for a more specific path. Example: `Allow: /admin/public/` makes that subdirectory accessible even if `/admin/` is blocked.
- Sitemap: Points crawlers to your XML sitemap for faster URL discovery. Example: `Sitemap: https://example.com/sitemap.xml`.
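The way Allow overrides Disallow can be sketched as a longest-match rule: among all rules whose path is a prefix of the URL path, the longest one wins, and Allow wins a tie. The code below is a simplified illustration under that assumption. It ignores `*` and `$` wildcards, and `is_path_allowed` and `RULES` are hypothetical names.

```python
def is_path_allowed(path, rules):
    """Decide whether `path` may be crawled under a list of rules.

    `rules` is a list of (directive, prefix) tuples, for example
    ("Disallow", "/admin/"). The longest matching prefix wins; when an
    Allow and a Disallow prefix are equally long, Allow wins.
    Paths that match no rule are allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        # An empty prefix matches nothing (an empty Disallow blocks nothing).
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


RULES = [
    ("Disallow", "/admin/"),
    ("Allow", "/admin/public/"),
]

print(is_path_allowed("/admin/public/logo.png", RULES))  # longer Allow rule wins
print(is_path_allowed("/admin/users", RULES))            # only Disallow matches
print(is_path_allowed("/blog/", RULES))                  # no rule matches
```

Not every parser resolves conflicts this way, so it is worth checking the documentation of the crawlers you care about.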
Common Robots.txt Mistakes to Avoid
- Blocking CSS and JavaScript: Google needs access to these files to render your pages. Blocking them can stop Google from seeing the page as visitors do, which can hurt your rankings.
- Using Disallow instead of Noindex: Disallow prevents crawling but does not prevent indexing. If other sites link to a disallowed URL, Google may still show it in search results without a snippet.
- Forgetting the trailing slash: `Disallow: /blog` blocks URLs starting with /blog (including /blog-archive). `Disallow: /blog/` only blocks the /blog/ directory and its children.
- Wrong file location: Robots.txt must be at the domain root (example.com/robots.txt). Placing it in a subdirectory has no effect.
- Blocking your entire site: `Disallow: /` blocks all crawling. This is useful during development but catastrophic in production.
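The trailing-slash mistake is easy to reproduce with Python's built-in `urllib.robotparser`, which matches rule paths as plain prefixes. This is a minimal sketch: `is_allowed` is a hypothetical wrapper, and the standard-library parser does not implement wildcards, so it only approximates how a search engine evaluates the file.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


WITHOUT_SLASH = "User-agent: *\nDisallow: /blog\n"
WITH_SLASH = "User-agent: *\nDisallow: /blog/\n"

# Without the trailing slash, /blog-archive is caught by the prefix match.
print(is_allowed(WITHOUT_SLASH, "https://example.com/blog-archive"))  # False
# With the trailing slash, only the /blog/ directory is blocked.
print(is_allowed(WITH_SLASH, "https://example.com/blog-archive"))     # True
print(is_allowed(WITH_SLASH, "https://example.com/blog/post"))        # False
```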
Robots.txt Examples by Platform
WordPress robots.txt:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
# /wp-includes/ is left crawlable: it serves CSS and JavaScript that pages load
Disallow: /trackback/
Disallow: /xmlrpc.php
Sitemap: https://example.com/sitemap_index.xml
Shopify robots.txt:
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkouts/
Disallow: /checkout
Disallow: /*/checkouts
Disallow: /carts
Disallow: /account
Sitemap: https://yourstore.myshopify.com/sitemap.xml
Next.js / React robots.txt:
User-agent: *
Allow: /
Disallow: /api/
# /_next/ is left crawlable: it serves the JavaScript and CSS needed to render pages
Sitemap: https://example.com/sitemap.xml
Robots.txt vs Noindex vs Nofollow
| Method | Prevents Crawling | Prevents Indexing | Best For |
|---|---|---|---|
| Robots.txt Disallow | ✅ Yes | ❌ No | Saving crawl budget, blocking admin paths |
| Meta Noindex | ❌ No | ✅ Yes | Removing pages from search results |
| Meta Nofollow | ❌ No | ❌ No | Preventing PageRank from passing through links |
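One consequence of the table: a noindex directive only works if the crawler is allowed to fetch the page and read it, so a page that is both disallowed and marked noindex can still end up in results. The sketch below shows one way to read the robots meta directives from a page's HTML with the standard library. `RobotsMetaParser` and `robots_meta_directives` are hypothetical names, and the sketch does not look at the `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for token in content.split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_meta_directives(html: str) -> set:
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(sorted(robots_meta_directives(page)))
```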
How to Block AI Crawlers
Many website owners now want to block AI training crawlers while keeping search engine access. Here are the key user-agent strings:
# Block OpenAI's crawler
User-agent: GPTBot
Disallow: /
# Block Google's AI training (but keep search indexing)
User-agent: Google-Extended
Disallow: /
# Block Common Crawl (used by many AI models)
User-agent: CCBot
Disallow: /
# Block Anthropic's crawlers (ClaudeBot is the current token, anthropic-ai an older one)
User-agent: ClaudeBot
User-agent: anthropic-ai
Disallow: /
How to Test Your Robots.txt File
After generating your robots.txt file with our free robots.txt generator above, validate it using these methods:
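- Google Search Console: Navigate to Settings → robots.txt to see the robots.txt files Google has fetched for your site and any errors or warnings. To check whether a specific URL is blocked, use the URL Inspection tool.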
- Browser check: Visit `yourdomain.com/robots.txt` directly to verify the file is accessible and correctly formatted.
- Google Rich Results Test: Enter a URL to see if Googlebot can access the page or if robots.txt is blocking it.
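Alongside the manual checks above, rules can be tested offline before the file is deployed. The sketch below runs a few expected outcomes against a robots.txt that blocks AI crawlers, using Python's `urllib.robotparser`. `check_rules`, `ROBOTS_TXT`, and `CASES` are hypothetical names, and the standard-library parser is only an approximation of how each real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /
"""

# Each case is (user agent, URL, whether crawling should be allowed).
CASES = [
    ("GPTBot", "https://example.com/", False),
    ("CCBot", "https://example.com/blog/", False),
    ("anthropic-ai", "https://example.com/pricing", False),
    ("Googlebot", "https://example.com/blog/", True),
]


def check_rules(robots_txt, cases):
    """Return the cases whose actual result differs from the expected one."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    failures = []
    for agent, url, expected in cases:
        actual = parser.can_fetch(agent, url)
        if actual != expected:
            failures.append((agent, url, expected, actual))
    return failures


failures = check_rules(ROBOTS_TXT, CASES)
print("all checks passed" if not failures else failures)
```

Keeping a list of cases like this next to the file makes it harder to block search crawlers by accident when the rules change.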
Use this free robots.txt generator tool whenever you update your site structure, launch new sections, or need to block specific crawlers. A well-maintained robots.txt file is one of the simplest ways to improve your technical SEO.
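If you prefer to regenerate the file from code when your site structure changes, the output format is simple enough to build by hand. The following is a minimal sketch, not how the generator on this page is implemented. `build_robots_txt` and the shape of its `groups` argument are assumptions made for illustration.

```python
def build_robots_txt(groups, sitemaps=()):
    """Build robots.txt text from rule groups and sitemap URLs.

    `groups` is a list of dicts with a "user_agent" key and optional
    "allow" and "disallow" lists of paths. Each group is written as one
    block, followed by a blank line; sitemap lines come last.
    """
    lines = []
    for group in groups:
        lines.append(f"User-agent: {group['user_agent']}")
        for path in group.get("allow", []):
            lines.append(f"Allow: {path}")
        for path in group.get("disallow", []):
            lines.append(f"Disallow: {path}")
        lines.append("")  # blank line separates groups
    for url in sitemaps:
        lines.append(f"Sitemap: {url}")
    # End the file with exactly one newline.
    return "\n".join(lines).rstrip("\n") + "\n"


robots_txt = build_robots_txt(
    groups=[
        {"user_agent": "*", "allow": ["/admin/public/"], "disallow": ["/admin/"]},
        {"user_agent": "GPTBot", "disallow": ["/"]},
    ],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(robots_txt)
```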