Google's AI Crawlers: Google-Extended, Google-CloudVertexBot, and Gemini
Google's AI crawling is confusing because Google-Extended isn't a crawler — it's an opt-out token. Here's how Google-Extended, Google-CloudVertexBot, and Googlebot relate to Gemini training and AI features.
By Andrej Ruckij · June 17, 2026
TL;DR: Google’s AI access is confusing because the main control, Google-Extended, is not a crawler — it’s an opt-out token that tells Google not to use already-crawled content for training Gemini/Vertex. It makes no requests and has no IP ranges. Google-CloudVertexBot is a real crawler. And blocking Google-Extended does not affect your Google Search ranking (that’s Googlebot).
This article is part of the cluster under the AI crawler directory. Google is the most misunderstood vendor in it, because its primary AI control behaves unlike every other bot on the list.
Google-Extended is a token, not a crawler
This is the key thing to understand. Google-Extended does not crawl anything. It has no user-agent string of its own and makes no HTTP requests. It’s a robots.txt opt-out signal: adding it tells Google not to use content (that Googlebot already fetched for Search) to train Gemini and Vertex AI.
User-agent: Google-Extended
Disallow: /
What this means in practice:
- It changes downstream data use, not crawling. Googlebot still crawls and ranks you normally.
- Blocking Google-Extended has zero effect on your Google Search ranking — a common fear, and an unfounded one.
- There are no IP ranges to verify, because nothing is making requests.
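The independence of the two tokens can be checked locally. The following is a minimal sketch using Python’s standard urllib.robotparser, which only approximates how Google itself interprets robots.txt. The is_allowed helper and the example.com URL are illustrative assumptions, not part of any Google tooling.

```python
from urllib.robotparser import RobotFileParser

# The opt-out from the section above, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, token: str, url: str) -> bool:
    """Return True if `token` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration; it uses Python's generic parser,
    not Google's own matching logic.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


if __name__ == "__main__":
    url = "https://example.com/products/widget"  # placeholder URL
    for token in ("Googlebot", "Google-Extended"):
        print(token, "allowed:", is_allowed(ROBOTS_TXT, token, url))
```

Under these rules the parser reports Googlebot as allowed and Google-Extended as disallowed, which matches the point above: the opt-out group does not apply to the Search crawler.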
Setting aside the adjacent confusion covered in glossary/llms-txt, this distinction between an opt-out token and a crawler is the single most common Google AI mistake.
Google-CloudVertexBot — the real crawler
Unlike Google-Extended, Google-CloudVertexBot is an actual crawler. It fetches content for Google Cloud Vertex AI Search (typically when a Cloud customer builds a search app over sites they own/are entitled to). It’s controllable via robots.txt with its own token. Most general site owners won’t need to think about it; it matters mainly in Cloud/enterprise contexts.
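To see whether Google-CloudVertexBot actually visits a site, one option is to count user-agent matches in access logs. This is a rough sketch: it assumes the crawler’s token appears as a substring of the logged User-Agent value, and the sample log lines (including their simplified user-agent strings and documentation-range IPs) are invented for illustration. A substring match alone does not prove a request really came from Google.

```python
from collections import Counter

# Tokens to look for in each log line. Google-Extended is included only to
# illustrate that it should never match, since it sends no requests.
TOKENS = ("Google-CloudVertexBot", "Googlebot", "Google-Extended")

# Hypothetical, simplified log lines; real user-agent strings are longer.
SAMPLE_LINES = [
    '203.0.113.5 "GET / HTTP/1.1" 200 "Mozilla/5.0 (compatible; Google-CloudVertexBot)"',
    '203.0.113.9 "GET /a HTTP/1.1" 200 "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '198.51.100.7 "GET /b HTTP/1.1" 200 "Mozilla/5.0 (X11; Linux x86_64)"',
]


def count_google_tokens(log_lines):
    """Count log lines containing each token (case-insensitive substring)."""
    counts = Counter({token: 0 for token in TOKENS})
    for line in log_lines:
        lowered = line.lower()
        for token in TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # attribute each line to at most one token
    return counts


if __name__ == "__main__":
    for token, n in count_google_tokens(SAMPLE_LINES).items():
        print(f"{token}: {n}")
```

If the Google-CloudVertexBot count stays at zero over a representative log window, the crawler is probably not relevant to the site, which is the common case outside Cloud/enterprise setups.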
Where Gemini’s data comes from
Gemini’s training/grounding draws on Google’s broader crawling (governed by Googlebot + the Google-Extended opt-out). Separately — and relevant for ecommerce — Gemini’s product recommendations read your Google Merchant Center feed directly, which is why product-feed quality matters for Gemini (see chatgpt-shopping-is-google-shopping for the cross-engine feed picture).
Recommended setup
- To opt out of Gemini/Vertex training: add a User-agent: Google-Extended group with Disallow: / to robots.txt. This won’t hurt Search.
- Leave Googlebot allowed: blocking it would remove you from Google Search entirely, which is almost never what you want.
- Google-CloudVertexBot: decide only if it’s relevant to your Cloud setup.
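The setup above can be turned into a small generator. This is a sketch under stated assumptions: build_robots_txt and its two flags are hypothetical names, and the output is meant to be merged by hand into an existing robots.txt, not to overwrite one.

```python
def build_robots_txt(opt_out_of_training: bool = True,
                     block_vertex_bot: bool = False) -> str:
    """Assemble robots.txt groups for the recommended setup.

    Hypothetical helper. No Googlebot group is emitted on purpose: with no
    rule naming it, Googlebot stays allowed and Search is unaffected.
    """
    groups = []
    if opt_out_of_training:
        groups.append("User-agent: Google-Extended\nDisallow: /")
    if block_vertex_bot:
        groups.append("User-agent: Google-CloudVertexBot\nDisallow: /")
    return "\n\n".join(groups) + "\n" if groups else ""


if __name__ == "__main__":
    # Default policy: opt out of training, leave everything else alone.
    print(build_robots_txt(), end="")
```

Keeping Googlebot out of the generated text is a deliberate choice: adding a Googlebot-specific group to a file that already has wildcard rules could change which rules Googlebot follows.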
Key takeaways
- Google-Extended is an opt-out token, not a crawler: no UA, no requests, no IP ranges.
- Blocking Google-Extended opts you out of Gemini/Vertex training and does not affect Google Search ranking.
- Google-CloudVertexBot is a real crawler, mostly relevant in Cloud/enterprise contexts.
- Gemini reads your Merchant Center feed directly for product recommendations.
Related articles
- ai-crawler-user-agents-directory — the full cross-vendor bot table
- which-ai-bots-to-block — the overall allow/block policy
- does-blocking-ai-bots-hurt-seo — why blocking AI training doesn’t hurt Google SEO
- chatgpt-shopping-is-google-shopping — Gemini’s feed dependency in the cross-engine picture