Perplexity's Crawlers: PerplexityBot, Perplexity-User, and the Stealth-Crawling Controversy

PerplexityBot indexes pages for citations; Perplexity-User fetches pages that users ask about (and ignores robots.txt by design). Also covered: the August 2025 Cloudflare report that Perplexity crawled sites that had blocked it.

By Andrej Ruckij · June 17, 2026 · 3 min read

TL;DR: PerplexityBot indexes content so Perplexity can cite it (allow it). Perplexity-User fetches a page a user asked about and, by Perplexity’s own documentation, generally ignores robots.txt because the fetch is user-triggered. Both publish IP ranges. One caution: in August 2025 Cloudflare reported that Perplexity was crawling sites that had blocked it, by rotating user-agents.

Part of the AI crawler directory. Perplexity is the one major vendor where “does it comply?” has a genuinely complicated answer.

The two Perplexity bots

User-agent | Job | Respects robots.txt | IP ranges | Recommendation
PerplexityBot | Index for Perplexity citations | Yes | perplexity.com/perplexitybot.json | Allow (drives citations)
Perplexity-User | Fetch a page a user asked about | No (by design) | perplexity.com/perplexity-user.json | Allow (it’s a visitor)

PerplexityBot — allow it

PerplexityBot is the search/retrieval crawler that builds Perplexity’s citation index. Allow it for the standard reason: it’s how you get cited (with a link) in Perplexity answers. Perplexity states it’s not used for training.
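If you want to confirm that your robots.txt really does allow PerplexityBot, a quick check with Python's standard-library parser can help. This is a minimal sketch: the sample rules, the `ExampleScraper` agent, and the `is_allowed` helper are hypothetical, and the check only tells you what the file says, not whether any crawler obeys it.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: allow PerplexityBot, block a made-up crawler.
SAMPLE_ROBOTS = """\
User-agent: PerplexityBot
Allow: /

User-agent: ExampleScraper
Disallow: /
"""

if __name__ == "__main__":
    for agent in ("PerplexityBot", "ExampleScraper"):
        verdict = is_allowed(SAMPLE_ROBOTS, agent, "https://example.com/post")
        print(f"{agent}: {'allowed' if verdict else 'blocked'}")
```

To test your own site, paste your live robots.txt contents in place of `SAMPLE_ROBOTS`.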

Perplexity-User — ignores robots.txt on purpose

Here’s a wrinkle worth knowing: Perplexity’s own docs say Perplexity-User generally ignores robots.txt, on the rationale that it’s acting on behalf of a specific human who asked for that page — the same logic that says you should never block user-fetch bots anyway. So in practice this is fine (you wouldn’t block a visitor), but it’s a notable departure from the “all reputable bots honor robots.txt” pattern.

The August 2025 stealth-crawling report

The bigger caution: in August 2025, Cloudflare reported observing Perplexity crawling sites that had explicitly disallowed it — rotating user-agents and source networks, and using a generic “Chrome on macOS” identity to bypass blocks tied to its declared crawler. Cloudflare de-listed Perplexity from its verified-bots list and added detection heuristics. Perplexity disputed the findings, calling the report a “sales pitch.”

The lesson, regardless of the dispute: even a named, reputable bot’s compliance can’t be assumed. If you need to actually keep Perplexity (or anything) out, robots.txt isn’t enough — you need firewall-level enforcement and IP-range verification. See robots-txt-vs-waf-ai-bots and can-robots-txt-stop-ai-scrapers.

User-agent: PerplexityBot
Allow: /

Allow PerplexityBot for citations; you wouldn’t block Perplexity-User (it’s a visitor) and blocking it via robots.txt wouldn’t work anyway. If you have a specific reason to exclude Perplexity entirely, do it at the firewall and verify by IP range — not in robots.txt alone.
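IP-range verification could look roughly like the sketch below. It assumes the published range files are JSON with a `prefixes` list of `ipv4Prefix` / `ipv6Prefix` entries; check the live files before relying on that shape. The sample ranges are documentation-only addresses, not Perplexity's real ones, and the helper names are hypothetical.

```python
import ipaddress

# ASSUMED shape of perplexitybot.json / perplexity-user.json.
# The addresses are reserved documentation ranges, not real Perplexity IPs.
SAMPLE_RANGES = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv6Prefix": "2001:db8::/32"},
    ]
}

DECLARED_AGENTS = ("perplexitybot", "perplexity-user")


def load_networks(doc: dict) -> list:
    """Collect every CIDR block listed under 'prefixes'."""
    networks = []
    for entry in doc.get("prefixes", []):
        for key in ("ipv4Prefix", "ipv6Prefix"):
            if key in entry:
                networks.append(ipaddress.ip_network(entry[key]))
    return networks


def ip_in_ranges(ip: str, networks: list) -> bool:
    """True if ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def classify(user_agent: str, ip: str, networks: list) -> str:
    """Cross-check what the user-agent claims against where the request came from."""
    claims = any(name in user_agent.lower() for name in DECLARED_AGENTS)
    in_range = ip_in_ranges(ip, networks)
    if claims and in_range:
        return "verified"
    if claims:
        return "unverified claim"  # name matches, source IP does not
    if in_range:
        return "undeclared"  # published IP, but no declared user-agent
    return "other"


if __name__ == "__main__":
    nets = load_networks(SAMPLE_RANGES)
    ua = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"
    print(classify(ua, "192.0.2.10", nets))
    print(classify(ua, "198.51.100.7", nets))
```

A check like this only catches requests that use a declared name or a published address. Traffic of the kind Cloudflare described, with a generic browser identity from other networks, would land in "other", which is why enforcement still belongs at the firewall.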

Key takeaways

  • PerplexityBot (allow — citations) and Perplexity-User (visitor; ignores robots.txt by design).
  • Both publish IP ranges for verification.
  • August 2025: Cloudflare reported Perplexity stealth-crawling blocked sites and de-listed it as a verified bot — a reminder that robots.txt compliance can’t be assumed and enforcement is a firewall job.
