Sitemap: http://idahopotato.com/sitemap.xml

# ===================
# Global Rule Set
# ===================
User-agent: *
Crawl-delay: 10
Disallow: /sp-revision
Disallow: /tag/
Disallow: /recipes/tag/
Disallow: /gallery/show/
Disallow: /dr-potato/tag/
Disallow: /cgi-bin/
Disallow: /admin/

# ===================
# Major Search Bots – Allowed but throttled
# ===================
User-agent: Googlebot
Crawl-delay: 10

User-agent: Bingbot
Crawl-delay: 10

User-agent: Applebot
Crawl-delay: 15

User-agent: Slurp
Crawl-delay: 10

# ===================
# Social + Previews
# ===================
User-agent: facebookexternalhit
Crawl-delay: 20

User-agent: Facebot
Crawl-delay: 20

User-agent: Twitterbot
Crawl-delay: 15

User-agent: LinkedInBot
Crawl-delay: 20

# ===================
# Aggressive or SEO Bots – Blocked
# ===================
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: dotbot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: Exabot
Disallow: /

User-agent: XoviBot
Disallow: /

User-agent: Riddler
Disallow: /

User-agent: trendictionbot
Disallow: /

User-agent: Genieo
Disallow: /

User-agent: seoscanners.net
Disallow: /

User-agent: spbot
Disallow: /

User-agent: Site24x7
Disallow: /

User-agent: Meta-external
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: bytedance
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: OpenAI
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Wotbox
Disallow: /

User-agent: crawler4j
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: Synapse
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: worldwebheritage.org
Disallow: /

# ===================
# Archival or Monitoring – Disallowed (if not needed)
# ===================
User-agent: pingdom
Disallow: /

User-agent: Siteimprove
Disallow: /

User-agent: msnbot-media
Disallow: /

User-agent: ia_archiver
Disallow: /

# ===================
# Notes:
# - Legit bots (Google, Bing, etc.) are allowed with crawl delay.
# - SEO, AI, and scraping bots are denied.
# - Archive/monitoring tools are blocked unless required.
# - Any of these bots ignoring rules should also be blocked in .htaccess.