Pages tagged "waf"
3 pages tagged with waf.
- AI Crawler Access Control — Bot Taxonomy, robots.txt vs WAF
  How to decide which AI bots to allow or block: the training / retrieval / user-fetch taxonomy, why a WAF enforces where robots.txt only requests, and the current (2026) user-agent strings for OpenAI, Anthropic, Google, Perplexity, and Meta crawlers.
- Bytespider — Definition
  Bytespider is the web crawler operated by ByteDance (TikTok's parent company), widely reported to ignore robots.txt and crawl aggressively. It's the canonical example of why robots.txt alone can't stop a non-compliant AI scraper.
- WAF (Web Application Firewall) — Definition
  A WAF is a firewall that inspects web requests at the edge and blocks unwanted ones before they reach your server. For AI bots it's the enforcement layer that robots.txt isn't: a WAF acts, where robots.txt only asks.