# HealthBase robots.txt
# https://healthbase.io

# ── Allow all legitimate crawlers ──
User-agent: *
Allow: /
# Block query strings and non-content paths
Disallow: /*?*
Disallow: /*.json$

# ── Google ──
User-agent: Googlebot
Allow: /
Crawl-delay: 1

User-agent: Googlebot-Image
Allow: /

# ── Bing ──
User-agent: Bingbot
Allow: /
Crawl-delay: 2

# ── AI Answer Engines & Research Crawlers ──
# Explicitly allow AI/LLM crawlers for GEO (Generative Engine Optimization)
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: YouBot
Allow: /

# ── Social & Preview Crawlers ──
User-agent: Twitterbot
Allow: /

User-agent: facebookexternalhit
Allow: /

User-agent: LinkedInBot
Allow: /

# ── Research & Academic Crawlers ──
User-agent: Semanticscholar
Allow: /

User-agent: ia_archiver
Allow: /

# ── Block Known Bad Actors / Scrapers ──
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: PetalBot
Disallow: /

# ── Sitemaps ──
Sitemap: https://healthbase.io/sitemap.xml