Perplexity's Crawlers: PerplexityBot, Perplexity-User, and the Stealth-Crawling Controversy
PerplexityBot indexes for citations; Perplexity-User fetches pages users ask about (and ignores robots.txt by design). Plus the August 2025 Cloudflare report that Perplexity crawled sites that blocked it.
By Andrej Ruckij · June 17, 2026
TL;DR: PerplexityBot indexes content so Perplexity can cite it, so allow it. Perplexity-User fetches a page a user asked about and, by Perplexity’s own documentation, generally ignores robots.txt because the fetch is user-triggered. Both publish IP ranges. One caution: in August 2025 Cloudflare reported that Perplexity crawled sites that had blocked it, rotating user-agents to get past the blocks.
Part of the AI crawler directory cluster. Perplexity is the one major vendor where “does it comply?” has a genuinely complicated answer.
The two Perplexity bots
| User-agent | Job | Respects robots.txt | IP ranges | Recommendation |
|---|---|---|---|---|
| PerplexityBot | Index for Perplexity citations | Yes | perplexity.com/perplexitybot.json | Allow (drives citations) |
| Perplexity-User | Fetch a page a user asked about | No (by design) | perplexity.com/perplexity-user.json | Allow (it’s a visitor) |
PerplexityBot — allow it
PerplexityBot is the search/retrieval crawler that builds Perplexity’s citation index. Allow it for the standard reason: it’s how you get cited (with a link) in Perplexity answers. Perplexity states it’s not used for training.
Perplexity-User — ignores robots.txt on purpose
Here’s a wrinkle worth knowing: Perplexity’s own docs say Perplexity-User generally ignores robots.txt, on the rationale that it’s acting on behalf of a specific human who asked for that page — the same logic that says you should never block user-fetch bots anyway. So in practice this is fine (you wouldn’t block a visitor), but it’s a notable departure from the “all reputable bots honor robots.txt” pattern.
The August 2025 stealth-crawling report
The bigger caution: in August 2025, Cloudflare reported observing Perplexity crawling sites that had explicitly disallowed it — rotating user-agents and source networks, and using a generic “Chrome on macOS” identity to bypass blocks tied to its declared crawler. Cloudflare de-listed Perplexity from its verified-bots list and added detection heuristics. Perplexity disputed the findings, calling the report a “sales pitch.”
The lesson, regardless of the dispute: even a named, reputable bot’s compliance can’t be assumed. If you need to actually keep Perplexity (or anything) out, robots.txt isn’t enough — you need firewall-level enforcement and IP-range verification. See robots-txt-vs-waf-ai-bots and can-robots-txt-stop-ai-scrapers.
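As a rough illustration of IP-range verification, the sketch below checks a request's source address against a list of CIDR prefixes. It assumes the published JSON files contain a `prefixes` list with `ipv4Prefix` / `ipv6Prefix` keys (a common layout for crawler IP lists, not confirmed here). The sample prefixes are reserved documentation ranges, not Perplexity's, and `load_networks`, `is_declared_ip`, and `classify` are hypothetical helper names.

```python
import ipaddress

# Hypothetical sample in the shape the published files are ASSUMED to use.
# These prefixes are reserved documentation ranges, not real Perplexity IPs.
SAMPLE_RANGES = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv4Prefix": "198.51.100.0/24"},
    ]
}


def load_networks(ranges):
    """Collect every CIDR prefix in the parsed JSON as a network object."""
    networks = []
    for entry in ranges.get("prefixes", []):
        for key in ("ipv4Prefix", "ipv6Prefix"):
            if key in entry:
                networks.append(ipaddress.ip_network(entry[key]))
    return networks


def is_declared_ip(remote_ip, networks):
    """Return True if remote_ip falls inside any declared network."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in networks)


def classify(user_agent, remote_ip, networks):
    """Compare what the user-agent claims with where the request came from."""
    ua = user_agent.lower()
    claims = "perplexitybot" in ua or "perplexity-user" in ua
    if claims and is_declared_ip(remote_ip, networks):
        return "verified"
    if claims:
        return "claimed-but-unverified"
    return "no-perplexity-claim"


networks = load_networks(SAMPLE_RANGES)
print(classify("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "192.0.2.10", networks))
print(classify("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "203.0.113.5", networks))
```

A check like this only confirms that traffic claiming to be Perplexity comes from a declared range. It does not catch a crawler that presents a generic browser user-agent from undeclared networks, which is the behavior Cloudflare described; that case needs firewall-level detection.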
Recommended setup
User-agent: PerplexityBot
Allow: /
Allow PerplexityBot for citations. There is no rule for Perplexity-User: you wouldn’t block a visitor, and blocking it via robots.txt wouldn’t work anyway. If you have a specific reason to exclude Perplexity entirely, do it at the firewall and verify by IP range, not in robots.txt alone.
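To sanity-check a rule set before deploying it, a minimal sketch using Python's standard `urllib.robotparser` can report what a compliant crawler would conclude. The `BLOCKING` variant, the `example.com` URL, and the helper name `compliant_crawler_may_fetch` are hypothetical and included only for contrast.

```python
from urllib.robotparser import RobotFileParser

# The setup recommended above.
RECOMMENDED = """\
User-agent: PerplexityBot
Allow: /
"""

# Hypothetical contrast case: a site that tries to exclude the crawler.
BLOCKING = """\
User-agent: PerplexityBot
Disallow: /
"""


def compliant_crawler_may_fetch(robots_txt, user_agent, url):
    """Report what a crawler that honors robots.txt would decide."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/pricing"
print(compliant_crawler_may_fetch(RECOMMENDED, "PerplexityBot", url))
print(compliant_crawler_may_fetch(BLOCKING, "PerplexityBot", url))
```

The result describes the decision of a crawler that honors the file. It says nothing about a fetcher that skips robots.txt, so it is a configuration check rather than enforcement.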
Key takeaways
- PerplexityBot (allow, for citations) and Perplexity-User (visitor; ignores robots.txt by design).
- Both publish IP ranges for verification.
- August 2025: Cloudflare reported Perplexity stealth-crawling blocked sites and de-listed it as a verified bot — a reminder that robots.txt compliance can’t be assumed and enforcement is a firewall job.
Related articles
- ai-crawler-user-agents-directory — the full cross-vendor bot table
- robots-txt-vs-waf-ai-bots — why a firewall, not robots.txt, enforces
- can-robots-txt-stop-ai-scrapers — the non-compliant-crawler problem
- glossary/bytespider — the other canonical non-compliant case