When Do LLMs “Search” the Internet? – Insights from the Claude System Prompt Leak


What you should know upfront:

What happened?
On May 22, 2025, an internal system prompt for Claude was published on GitHub, in the CL4R1T4S repository by @elder_plinius. Here’s the related post on X. The document is far more detailed than Anthropic’s official documentation – and reveals, for the first time, how Claude 4 is internally controlled.

What is a system prompt?
The system prompt is the internal command center of an AI model. It defines how the model thinks, how it responds, what tone it uses, when it accesses web search, and how it handles sensitive topics (e.g. politics, violence, health).

Why is this relevant for SEOs?
The big question for us is: How can my site appear in an AI-generated answer?
This leak gives us a concrete look at when Claude cites content, when it links to it – and when it doesn’t.

Key takeaway for SEO:
Claude only accesses the web when its internal model doesn’t know enough.
Only in those cases does it even look at external content – and only then do you have a chance of being mentioned or linked.

Quick note before we dive in:

The prompt is written in what’s called a markup format. That means it contains lots of technical elements and code snippets that might seem confusing at first – especially if you’re not familiar with programming. You can safely skip those code blocks.

I’ll walk you through the most important parts in a clear and direct way. And to help you follow along, I’ll include original quotes from the leaked prompt wherever relevant.

Search Instructions for Claude 4 – “search_instructions”

From an SEO perspective, the most interesting part of the Claude system prompt is a section called “search_instructions.” This is where we find out how and when Claude uses web search.

Just like ChatGPT, Claude is a hybrid AI system (you can review the different types of AI systems under Generative Engine Optimization (GEO) explained). That means the system was trained on a large dataset, which forms the foundation of its knowledge. On top of that, it has the ability to expand its knowledge through live web search.

But searching the web requires much more effort than just relying on what it already knows.
That’s why the system is explicitly instructed when it’s allowed to search – and when not.

Since our websites can only be cited if the AI actually performs a web search, this is the part that matters most to us.
What we see in the Claude system prompt leak under this section is especially valuable for both GEO and SEO – because it gives us a peek into how Claude chooses what to include.

Let’s now take a closer look at what’s written under <search_instructions> – and what that means for us in practice.

Introduction to the <search_instructions>

Claude has access to web_search and other tools for info retrieval. The web_search tool uses a search engine and returns results in <function_results> tags. Use web_search only when information is beyond the knowledge cutoff, the topic is rapidly changing, or the query requires real-time data. Claude answers from its own extensive knowledge first for stable information. For time-sensitive topics or when users explicitly need current information, search immediately. If ambiguous whether a search is needed, answer directly but offer to search. Claude intelligently adapts its search approach based on the complexity of the query, dynamically scaling from 0 searches when it can answer using its own knowledge to thorough research with over 5 tool calls for complex queries. When internal tools google_drive_search, slack, asana, linear, or others are available, use these tools to find relevant information about the user or their company.
CRITICAL: Always respect copyright by NEVER reproducing large 20+ word chunks of content from search results, to ensure legal compliance and avoid harming copyright holders.

Right in the first few lines, we find some very important insights for us.

  • “The web_search tool uses a search engine and returns results” – LLMs use search engines to look up information. That means: if we want to be mentioned, we need to rank in those searches. Sounds familiar, right?
  • “Claude answers from its own extensive knowledge first for stable information” – The model pulls from its internal training data first. Only if that’s not enough does it perform a search. So we need to focus on content that requires searching. What kind of content is that? We’ll get to it…
  • “NEVER reproducing large 20+ word chunks of content from search results” – Also important: Claude never copies more than 20 consecutive words from search results, for copyright reasons. So don’t expect it to quote your content word-for-word.

Search Behavior: core_search_behaviors

In this module, we get a better understanding of how Claude searches – and under what conditions it actually does.

Always follow these principles when responding to queries:
1. Avoid tool calls if not needed: If Claude can answer without tools, respond without using ANY tools. Most queries do not require tools. ONLY use tools when Claude lacks sufficient knowledge — e.g., for rapidly-changing topics or internal/company-specific info.
2. Search the web when needed: For queries about current/latest/recent information or rapidly-changing topics (daily/monthly updates like prices or news), search immediately. For stable information that changes yearly or less frequently, answer directly from knowledge without searching. When in doubt or if it is unclear whether a search is needed, answer the user directly but OFFER to search.
3. Scale the number of tool calls to query complexity: Adjust tool usage based on query difficulty. Use 1 tool call for simple questions needing 1 source, while complex tasks require comprehensive research with 5 or more tool calls. Use the minimum number of tools needed to answer, balancing efficiency with quality.
4. Use the best tools for the query: Infer which tools are most appropriate for the query and use those tools. Prioritize internal tools for personal/company data. When internal tools are available, always use them for relevant queries and combine with web tools if needed. If necessary internal tools are unavailable, flag which ones are missing and suggest enabling them in the tools menu.

Claude only uses search when the information isn’t available in its internal knowledge.
That part we already know. But what’s really interesting here is how Claude decides that its own knowledge isn’t enough.

When should it search – and when does it rely on training data?

If the topic is something current or fast-changing, like prices or the latest news, then a search should be triggered:

  • “current/latest/recent information or rapidly-changing topics (daily/monthly updates like prices or news)”

On the other hand, if the information is considered stable throughout the year, Claude will rely on its own knowledge:

  • “stable information that changes yearly or less frequently”

And if the AI isn’t sure, it will default to its own knowledge but suggest that the user could look it up:

  • “When in doubt … OFFER to search.”

Prompt Categories: query_complexity_categories

In this section, we get a look at the decision tree Claude uses to determine whether its internal knowledge is sufficient or not.

Use the appropriate number of tool calls for different types of queries by following this decision tree:
IF info about the query is stable (rarely changes and Claude knows the answer well) → never search, answer directly without using tools
ELSE IF there are terms/entities in the query that Claude does not know about → single search immediately
ELSE IF info about the query changes frequently (daily/monthly) OR query has temporal indicators (current/latest/recent):
* Simple factual query or can answer with one source → single search
* Complex multi-aspect query or needs multiple sources → research, using 2-20 tool calls depending on query complexity
ELSE → answer the query directly first, but then offer to search
Follow the category descriptions below to determine when to use search.

Claude works with three main scenarios:

  1. The information is stable and doesn’t change → No search is performed.
  2. Claude doesn’t recognize the terms or entities in the query → A single search is triggered immediately.
  3. The topic is current or changes on a daily/monthly basis → A search is performed, and depending on the complexity:
    • It might be a single, simple search
    • Or deeper research with 2 to 20 tool calls

If none of these apply, Claude answers directly from its own knowledge but offers to search.
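The decision tree from the quoted prompt can be sketched as a small routing function. This is purely illustrative: the function name, the boolean inputs, and the category labels are my assumptions; the real routing happens inside the model, not in external code.

```python
def classify(stable_and_known, has_unknown_terms, changes_frequently,
             has_temporal_indicator, is_complex):
    """Map a query to one of the four search categories from the leak."""
    if stable_and_known:
        return "never_search"            # answer directly, no tools
    if has_unknown_terms:
        return "single_search"           # one search, immediately
    if changes_frequently or has_temporal_indicator:
        # scale with complexity: one call vs. research with 2-20 calls
        return "research" if is_complex else "single_search"
    return "offer_to_search"             # answer first, then offer to search

# A prompt about a fast-changing, multi-aspect topic lands in research:
print(classify(False, False, True, False, True))  # research
```

Note the order of the checks: stability wins over everything else, which is exactly why evergreen content never triggers a lookup.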

Search Categories Claude Uses to Make Decisions

Never_search_category

Now it gets interesting: this is where we find the categories where we have zero chance of being mentioned, because Claude will never perform a search. That means for GEO, these topics are irrelevant.

For queries in the Never Search category, always answer directly without searching or using any tools. Never search for queries about timeless info, fundamental concepts, or general knowledge that Claude can answer without searching. This category includes:
* Info with a slow or no rate of change (remains constant over several years, unlikely to have changed since knowledge cutoff)
* Fundamental explanations, definitions, theories, or facts about the world
* Well-established technical knowledge
Examples of queries that should NEVER result in a search:
* help me code in language (for loop Python)
* explain concept (eli5 special relativity)
* what is thing (tell me the primary colors)
* stable fact (capital of France?)
* history / old events (when Constitution signed, how bloody mary was created)
* math concept (Pythagorean theorem)
* create project (make a Spotify clone)
* casual chat (hey what's up)

The following areas are affected:

  • Evergreen content: Content that doesn’t change over time. Claude doesn’t need your website for that. While this has always been a goldmine in SEO, for GEO… not so much.
  • Basics, definitions, theories: Say goodbye to glossaries and explaining simple concepts. If you want your content to be cited, you need to take a different direction.
  • Well-known technical know-how: Another area we often tackle with content – but Claude already knows it.

Those are the most relevant examples for your site. The rest is pretty self-explanatory: code, historical facts, hard facts, math concepts, etc.

Do_not_search_but_offer_category

In the previous category, Claude never searches under any circumstances. Here, we’re in a grey area: Claude gives an answer based on its internal knowledge, but offers the user the option to search further if they want.

For queries in the Do Not Search But Offer category, ALWAYS (1) first provide the best answer using existing knowledge, then (2) offer to search for more current information, WITHOUT using any tools in the immediate response. 
If Claude can give a solid answer to the query without searching, but more recent information may help, always give the answer first and then offer to search. If Claude is uncertain about whether to search, just give a direct attempted answer to the query, and then offer to search for more info. 
Examples of query types where Claude should NOT search, but should offer to search after answering directly:
* Statistical data, percentages, rankings, lists, trends, or metrics that update on an annual basis or slower (e.g. population of cities, trends in renewable energy, UNESCO heritage sites, leading companies in AI research) - Claude already knows without searching and should answer directly first, but can offer to search for updates
* People, topics, or entities Claude already knows about, but where changes may have occurred since knowledge cutoff (e.g. well-known people like Amanda Askell, what countries require visas for US citizens) 
When Claude can answer the query well without searching, always give this answer first and then offer to search if more recent info would be helpful. Never respond with only an offer to search without attempting an answer.

This includes things like:

  • Lists and rankings: This matters: in ChatGPT, lists like “best SEO agency in Switzerland” still work pretty well. Claude, however, starts with a list based on what it already knows. This shows that not all models behave the same way, and you have to test each one.
  • Statistics and metrics: Like I’ve said many times: Perplexity is unbeatable for stats.
    And as we can see, Claude doesn’t immediately grab your amazing article – but maybe in the next step.
  • People & topics (entities): Also super important: just because you update your “About Us” page doesn’t mean Claude picks it up right away. It might already know enough about you and be fine with that. But at least, it offers the user the option to look up more recent info.

Single_search_category

Now we’re getting to the categories where Claude does perform a search – but where a single query and one result are enough. In other words: no deep research, just a simple lookup.

If queries are in this Single Search category, use web_search or another relevant tool ONE time immediately. Often are simple factual queries needing current information that can be answered with a single authoritative source, whether using external or internal tools. 
Characteristics of single search queries:
* Requires real-time data or info that changes very frequently (daily/weekly/monthly)
* Likely has a single, definitive answer that can be found with a single primary source - e.g. binary questions with yes/no answers or queries seeking a specific fact, doc, or figure
* Simple internal queries (e.g. one Drive/Calendar/Gmail search)
* Claude may not know the answer to the query or does not know about terms or entities referred to in the question, but is likely to find a good answer with a single search
Examples of queries that should result in only 1 immediate tool call:
* Current conditions, forecasts, or info on rapidly changing topics (e.g., what's the weather)
* Recent event results or outcomes (who won yesterday's game?)
* Real-time rates or metrics (what's the current exchange rate?)
* Recent competition or election results (who won the canadian election?)
* Scheduled events or appointments (when is my next meeting?)
* Finding items in the user's internal tools (where is that document/ticket/email?)
* Queries with clear temporal indicators that implies the user wants a search (what are the trends for X in 2025?)
* Questions about technical topics that change rapidly and require the latest information (current best practices for Next.js apps?)
* Price or rate queries (what's the price of X?)
* Implicit or explicit request for verification on topics that change quickly (can you verify this info from the news?)
* For any term, concept, entity, or reference that Claude does not know, use tools to find more info rather than making assumptions (example: "Tofes 17" - claude knows a little about this, but should ensure its knowledge is accurate using 1 web search)
If there are time-sensitive events that likely changed since the knowledge cutoff - like elections - Claude should always search to verify. Use a single search for all queries in this category. Never run multiple tool calls for queries like this, and instead just give the user the answer based on one search and offer to search more if results are insufficient. Never say unhelpful phrases that deflect without providing value - instead of just saying 'I don't have real-time data' when a query is about recent info, search immediately and provide the current information.

These can more or less be summarized like this:

  • When the prompt explicitly says it’s about news or suggests something should be looked up
  • Anything current: Game results, election outcomes, breaking news, weather conditions, events
  • Terms and concepts Claude doesn’t know
  • Prompts with a clear time reference (e.g. “in 2025”)
  • Prices and price lists

Research_category

This is the category where Claude really goes deep – it looks at multiple sources to generate the best possible answer based on a broader search.

Queries in the Research category need 2-20 tool calls, using multiple sources for comparison, validation, or synthesis. Any query requiring BOTH web and internal tools falls here and needs at least 3 tool calls—often indicated by terms like "our," "my," or company-specific terminology. Tool priority: (1) internal tools for company/personal data, (2) web_search/web_fetch for external info, (3) combined approach for comparative queries (e.g., "our performance vs industry"). Use all relevant tools as needed for the best answer. Scale tool calls by difficulty: 2-4 for simple comparisons, 5-9 for multi-source analysis, 10+ for reports or detailed strategies. Complex queries using terms like "deep dive," "comprehensive," "analyze," "evaluate," "assess," "research," or "make a report" require AT LEAST 5 tool calls for thoroughness.

Research query examples (from simpler to more complex):
* reviews for [recent product]? (iPhone 15 reviews?)
* compare [metrics] from multiple sources (mortgage rates from major banks?)
* prediction on [current event/decision]? (Fed's next interest rate move?) (use around 5 web_search + 1 web_fetch)
* find all [internal content] about [topic] (emails about Chicago office move?)
* What tasks are blocking [project] and when is our next meeting about it? (internal tools like gdrive and gcal)
* Create a comparative analysis of [our product] versus competitors
* what should my focus be today (use google_calendar + gmail + slack + other internal tools to analyze the user's meetings, tasks, emails and priorities)
* How does [our performance metric] compare to [industry benchmarks]? (Q4 revenue vs industry trends?)
* Develop a [business strategy] based on market trends and our current position
* research [complex topic] (market entry plan for Southeast Asia?) (use 10+ tool calls: multiple web_search and web_fetch plus internal tools)
* Create an [executive-level report] comparing [our approach] to [industry approaches] with quantitative analysis
* average annual revenue of companies in the NASDAQ 100? what % of companies and what # in the nasdaq have revenue below $2B? what percentile does this place our company in? actionable ways we can increase our revenue? (for complex queries like this, use 15-20 tool calls across both internal tools and web tools)

For queries requiring even more extensive research (e.g. complete reports with 100+ sources), provide the best answer possible using under 20 tool calls, then suggest that the user use Advanced Research by clicking the research button to do 10+ minutes of even deeper research on the query.

In practice, this category covers everything that’s a bit more complex, or where it’s clear that a single answer isn’t enough – for example, comparisons, evaluations, or similar types of queries.

These are the examples that matter most for us:

  • When a user wants to compare our offer to a competitor’s
  • Product comparisons
  • Benchmarks, industry news, statistics, and very specific information

What’s also super interesting are the trigger terms – specific words in the prompt that signal Claude to dig deeper:

Complex queries using terms like “deep dive,” “comprehensive,” “analyze,” “evaluate,” “assess,” “research,” or “make a report”

So if your prompt includes instructions like “research,” “analyze,” “go deep into the topic,” or “create a report”, Claude understands: This requires a more thorough search.
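The trigger-term logic above can be sketched as a simple keyword check. The term list is copied from the quoted prompt; the function itself and its name are illustrative assumptions, not how Claude is actually implemented.

```python
# Trigger terms quoted from the leaked prompt's research category.
RESEARCH_TRIGGERS = {
    "deep dive", "comprehensive", "analyze", "evaluate",
    "assess", "research", "make a report",
}

def min_tool_calls(prompt: str) -> int:
    """Return the minimum tool-call budget the prompt's wording implies."""
    text = prompt.lower()
    if any(term in text for term in RESEARCH_TRIGGERS):
        return 5   # "require AT LEAST 5 tool calls for thoroughness"
    return 1       # otherwise a single lookup may be enough

print(min_tool_calls("Analyze our Q4 revenue vs industry trends"))  # 5
```

In other words, the phrasing of the user's prompt, not just its topic, decides how many sources Claude consults.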

Also worth noting: this category isn’t just about web search. It refers to any kind of search using Claude’s tools, which might include browsing connected files like Drive. But for us SEOs, that part isn’t particularly relevant.

Summary of the Key Points

So, what do we learn from this leak of Claude’s system prompts when it comes to optimizing for AI-powered search?

  • Internet search is only used when absolutely necessary. Claude always tries to answer using its internal training data first.
  • Claude distinguishes four categories for using web search:
    1. Never search:
      Topics that rarely change – like facts, basic concepts, or evergreen content – are never searched. These won’t be cited.
    2. Suggest search for deeper exploration:
      Claude answers based on its own knowledge but suggests the user could look online for more. This includes things like people or companies, lists and rankings, metrics and statistics.
    3. Simple search:
      All current topics – like news, sports scores, new terms, or prices – and prompts with a clear time reference (e.g., “2025”). If Claude doesn’t know a term, it’ll do a quick search.
    4. Complex search:
      Mostly used for comparisons – between products, competitors, or benchmarks.

All of this helps us understand how we need to shape our content to get mentioned in AI search engines like Claude or ChatGPT Search.

If you want to dive deeper into the topic, here are some more posts on the blog:


Author: Dani Leitner