When do LLMs “search” the Internet? Answers from the leaked Claude system prompt

What you should know before we start:

What happened?
On May 22, 2025, an internal Claude system prompt was published on GitHub, in @elder_plinius's CL4R1T4S repository. There is also a corresponding post on X. The document is far more detailed than Anthropic's official documentation, and for the first time it gives a detailed look at how Claude 4 is instructed to behave.

What is a system prompt?
The system prompt is the internal control center of an AI model. It defines how the model thinks, how it responds, what tone it uses, when it turns to web search, and how it handles sensitive topics (such as politics, violence, or health).

Why is this interesting for SEO?
The big question for us is: how does my page end up in an AI answer? This leak gives us, for the first time, a concrete idea of when Claude mentions content, when it links, and when it does not.

Key point for SEO:
Claude only accesses the web when its internal knowledge is not enough.
Only then does it analyze external content, and only then do you have a real chance of being mentioned or linked.

A small warning before we start:

The prompt is written in a markup format with XML-style tags. This means it contains many technical elements and code fragments that can look complicated at first glance, especially if you have no programming experience. You can skip these blocks without any problem.

I will explain the most relevant parts clearly and directly. So that you can follow along, I will include the original text from the leak from time to time.

Search instructions for Claude 4: “search_instructions”

From an SEO point of view, the most interesting part of Claude's system prompt is a section called “search_instructions”. It defines how and when Claude uses online search.

Like ChatGPT, Claude is a hybrid AI system (you can review the different types of systems in the section on SEO optimization for AI (GEO)). This means the system was trained on large volumes of data, which form the basis of its knowledge. In addition, it can extend that knowledge through web searches.

However, searching the Internet takes far more effort than simply using knowledge the model already has. That is why the system receives very specific instructions on when it should turn to external search, and when it should not.

Since our web pages can only be cited if the model actually runs an online search, this is the part that interests us most. What we find in this section of the leaked Claude prompt is especially valuable for GEO and SEO, because it gives us a clear idea of when Claude goes looking for content at all.

Now let's take a closer look at what exactly <search_instructions> says, and what that means for us in practice.

Introduction to the <search_instructions>

Claude has access to web_search and other tools for info retrieval. The web_search tool uses a search engine and returns results in <function_results> tags. Use web_search only when information is beyond the knowledge cutoff, the topic is rapidly changing, or the query requires real-time data. Claude answers from its own extensive knowledge first for stable information. For time-sensitive topics or when users explicitly need current information, search immediately. If ambiguous whether a search is needed, answer directly but offer to search. Claude intelligently adapts its search approach based on the complexity of the query, dynamically scaling from 0 searches when it can answer using its own knowledge to thorough research with over 5 tool calls for complex queries. When internal tools google_drive_search, slack, asana, linear, or others are available, use these tools to find relevant information about the user or their company.
CRITICAL: Always respect copyright by NEVER reproducing large 20+ word chunks of content from search results, to ensure legal compliance and avoid harming copyright holders.

The very first lines already contain information that is highly relevant for us.

Text	What does it mean for us?
“The web_search tool uses a search engine and returns results”	LLMs use search engines to find information. That means: if we want to be mentioned, we have to rank in those searches. Sounds familiar, right?
“Claude answers from its own extensive knowledge first for stable information”	The model's internal knowledge is consulted first. Only if that is not enough does a search take place. So we need to focus on content that actually has to be searched for. Which content is that? We are getting there…
“NEVER reproducing large 20+ word chunks of content from search results”	Also important: Claude never reproduces passages of 20 or more consecutive words from an external source, to avoid copyright problems. So do not expect Claude to copy and paste your content literally.
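
If you want to check how much of your own text an AI answer repeats word for word, here is a minimal Python sketch. It is my own illustration, not something from the leak: the function names and the way words are split are assumptions, and it says nothing about how Claude enforces the rule internally. It simply measures the longest run of consecutive words that appears in both texts and compares it with the 20-word mark from the prompt.

```python
import re

# Threshold taken from the prompt's wording "20+ word chunks";
# the constant name itself is hypothetical.
MAX_VERBATIM_WORDS = 20


def tokenize(text: str) -> list[str]:
    """Split text into lowercase words (a simple assumption, not Claude's tokenizer)."""
    return re.findall(r"\w+", text.lower())


def longest_shared_word_run(source: str, answer: str) -> int:
    """Return the length, in words, of the longest word sequence found in both texts."""
    a, b = tokenize(source), tokenize(answer)
    best = 0
    prev = [0] * (len(b) + 1)  # run lengths ending at the previous source word
    for word_a in a:
        curr = [0] * (len(b) + 1)
        for j, word_b in enumerate(b, start=1):
            if word_a == word_b:
                # Extend the run that ended one word earlier in both texts.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def exceeds_quote_limit(source: str, answer: str, limit: int = MAX_VERBATIM_WORDS) -> bool:
    """True if the answer repeats `limit` or more consecutive words from the source."""
    return longest_shared_word_run(source, answer) >= limit
```

In practice you would pass the text of your page as `source` and the AI answer as `answer`. A result well below 20 is what the prompt leads us to expect.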

Search behavior: core_search_behaviors

This module gives us a better understanding of how Claude searches and under what conditions it does so.

Always follow these principles when responding to queries:
1. Avoid tool calls if not needed: If Claude can answer without tools, respond without using ANY tools. Most queries do not require tools. ONLY use tools when Claude lacks sufficient knowledge — e.g., for rapidly-changing topics or internal/company-specific info.
2. Search the web when needed: For queries about current/latest/recent information or rapidly-changing topics (daily/monthly updates like prices or news), search immediately. For stable information that changes yearly or less frequently, answer directly from knowledge without searching. When in doubt or if it is unclear whether a search is needed, answer the user directly but OFFER to search.
3. Scale the number of tool calls to query complexity: Adjust tool usage based on query difficulty. Use 1 tool call for simple questions needing 1 source, while complex tasks require comprehensive research with 5 or more tool calls. Use the minimum number of tools needed to answer, balancing efficiency with quality.
4. Use the best tools for the query: Infer which tools are most appropriate for the query and use those tools. Prioritize internal tools for personal/company data. When internal tools are available, always use them for relevant queries and combine with web tools if needed. If necessary internal tools are unavailable, flag which ones are missing and suggest enabling them in the tools menu.
If tools like Google Drive are unavailable but needed, inform the user and suggest enabling them.

Claude only uses search when the information is not part of its internal knowledge. We already knew that. What is really interesting here is how Claude decides that its own knowledge is not enough.

When should it search, and when does it fall back on its training data?

If the topic is current or changes quickly, such as prices or recent news, it should run a search:

“current/latest/recent information or rapidly-changing topics (daily/monthly updates like prices or news)”

If, on the other hand, the information is stable and changes once a year or less often, Claude uses its own knowledge:

“stable information that changes yearly or less frequently”

And if the AI is not sure, it first answers from its internal knowledge and then offers to run a search for the user:

“When in doubt … OFFER to search.”

Prompt categories: query_complexity_categories

In this section we see the decision tree Claude uses to determine whether its internal knowledge is sufficient or not.

Use the appropriate number of tool calls for different types of queries by following this decision tree:
IF info about the query is stable (rarely changes and Claude knows the answer well) → never search, answer directly without using tools
ELSE IF there are terms/entities in the query that Claude does not know about → single search immediately
ELSE IF info about the query changes frequently (daily/monthly) OR query has temporal indicators (current/latest/recent):
* Simple factual query or can answer with one source → single search
* Complex multi-aspect query or needs multiple sources → research, using 2-20 tool calls depending on query complexity
ELSE → answer the query directly first, but then offer to search
Follow the category descriptions below to determine when to use search.

Claude distinguishes four possible cases:

The information is stable and Claude knows it well → No search is performed
Claude does not know the terms or entities in the query → A single search is run immediately
The topic is current or changes on a daily/monthly basis → A search is run, and depending on complexity:
- It can be a single search, when one source is enough
- Or a more complex research process with 2 to 20 tool calls
None of the above applies → Claude answers directly first and then offers to search
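
To make the order of these checks easier to follow, the decision tree can be sketched in a few lines of Python. This is only my reading of the leaked text, under the assumption that each condition can be reduced to a yes/no signal. The names `Action` and `choose_action` are hypothetical and do not appear in the prompt.

```python
from enum import Enum


class Action(Enum):
    ANSWER_DIRECTLY = "never search, answer directly"
    SINGLE_SEARCH = "single search"
    RESEARCH = "research with 2-20 tool calls"
    ANSWER_THEN_OFFER = "answer directly, then offer to search"


def choose_action(
    is_stable_and_known: bool,
    has_unknown_terms: bool,
    changes_often_or_temporal: bool,
    needs_multiple_sources: bool = False,
) -> Action:
    """Walk the branches in the same order as the leaked decision tree."""
    if is_stable_and_known:
        return Action.ANSWER_DIRECTLY
    if has_unknown_terms:
        return Action.SINGLE_SEARCH
    if changes_often_or_temporal:
        # Simple factual query -> one search; multi-aspect query -> research.
        return Action.RESEARCH if needs_multiple_sources else Action.SINGLE_SEARCH
    # Fallback branch: none of the conditions above applied.
    return Action.ANSWER_THEN_OFFER


# Example: "mortgage rates from major banks?" changes often and needs several sources.
print(choose_action(False, False, True, needs_multiple_sources=True).value)
```

The order matters: a stable, well-known topic ends the check right away, so a page about such a topic never reaches the branches where a search happens.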

Search categories Claude uses to make decisions

Never_search_category

Now it gets interesting: these are the categories in which we have no chance of being mentioned, because Claude never runs a search. For GEO, that makes these topics effectively irrelevant, at least as far as Claude is concerned.

For queries in the Never Search category, always answer directly without searching or using any tools. Never search for queries about timeless info, fundamental concepts, or general knowledge that Claude can answer without searching. This category includes:
* Info with a slow or no rate of change (remains constant over several years, unlikely to have changed since knowledge cutoff)
* Fundamental explanations, definitions, theories, or facts about the world
* Well-established technical knowledge
Examples of queries that should NEVER result in a search:
* help me code in language (for loop Python)
* explain concept (eli5 special relativity)
* what is thing (tell me the primary colors)
* stable fact (capital of France?)
* history / old events (when Constitution signed, how bloody mary was created)
* math concept (Pythagorean theorem)
* create project (make a Spotify clone)
* casual chat (hey what's up)

The affected areas are:

Evergreen content: That is, content that does not change over time. Claude does not need your website for that. These have always been star topics in SEO, but in GEO they have little value.
Basic concepts, definitions, theories: Bye bye glossaries and explanations of simple concepts. If you want your content to be cited, you need a completely different approach.
General or widely known technical knowledge: Also something we usually cover with content… but Claude already knows it.

These are the areas most relevant to our websites. The rest is fairly obvious: code, history, fixed facts, math concepts, and so on.

Do_not_search_but_offer_category

While the previous category never triggers a search under any circumstances, here we enter a gray area: Claude first produces an answer from its own knowledge, but offers to complement it with a search if the user wants.

For queries in the Do Not Search But Offer category, ALWAYS (1) first provide the best answer using existing knowledge, then (2) offer to search for more current information, WITHOUT using any tools in the immediate response. 
If Claude can give a solid answer to the query without searching, but more recent information may help, always give the answer first and then offer to search. If Claude is uncertain about whether to search, just give a direct attempted answer to the query, and then offer to search for more info. 
Examples of query types where Claude should NOT search, but should offer to search after answering directly:
* Statistical data, percentages, rankings, lists, trends, or metrics that update on an annual basis or slower (e.g. population of cities, trends in renewable energy, UNESCO heritage sites, leading companies in AI research) - Claude already knows without searching and should answer directly first, but can offer to search for updates
* People, topics, or entities Claude already knows about, but where changes may have occurred since knowledge cutoff (e.g. well-known people like Amanda Askell, what countries require visas for US citizens) 
When Claude can answer the query well without searching, always give this answer first and then offer to search if more recent info would be helpful. Never respond with only an offer to search without attempting an answer.

This applies to topics such as:

People and topics (entities): Also very important: just because you update your “About us” page does not mean Claude is already using that data. It may already know enough about you and be satisfied with what it has. But at least it offers the user a search for more information.
Lists and rankings: This is key: in ChatGPT, lists such as “best SEO agencies in Switzerland” work quite well. Claude, however, first builds a list from its internal knowledge. This shows that every model behaves differently, and that you have to experiment.
Statistics and metrics: As I have said many times, Perplexity is hard to beat for statistics. And as we see here, Claude does not go straight to your article, but it may do so in a second step.

Single_search_category

Here we get to the categories in which Claude does run a search, but where one search and a single authoritative source are enough. This is not in-depth research, but a quick, targeted lookup.

If queries are in this Single Search category, use web_search or another relevant tool ONE time immediately. Often are simple factual queries needing current information that can be answered with a single authoritative source, whether using external or internal tools. 
Characteristics of single search queries:
* Requires real-time data or info that changes very frequently (daily/weekly/monthly)
* Likely has a single, definitive answer that can be found with a single primary source - e.g. binary questions with yes/no answers or queries seeking a specific fact, doc, or figure
* Simple internal queries (e.g. one Drive/Calendar/Gmail search)
* Claude may not know the answer to the query or does not know about terms or entities referred to in the question, but is likely to find a good answer with a single search
Examples of queries that should result in only 1 immediate tool call:
* Current conditions, forecasts, or info on rapidly changing topics (e.g., what's the weather)
* Recent event results or outcomes (who won yesterday's game?)
* Real-time rates or metrics (what's the current exchange rate?)
* Recent competition or election results (who won the canadian election?)
* Scheduled events or appointments (when is my next meeting?)
* Finding items in the user's internal tools (where is that document/ticket/email?)
* Queries with clear temporal indicators that implies the user wants a search (what are the trends for X in 2025?)
* Questions about technical topics that change rapidly and require the latest information (current best practices for Next.js apps?)
* Price or rate queries (what's the price of X?)
* Implicit or explicit request for verification on topics that change quickly (can you verify this info from the news?)
* For any term, concept, entity, or reference that Claude does not know, use tools to find more info rather than making assumptions (example: "Tofes 17" - claude knows a little about this, but should ensure its knowledge is accurate using 1 web search)
If there are time-sensitive events that likely changed since the knowledge cutoff - like elections - Claude should always search to verify.
Use a single search for all queries in this category. Never run multiple tool calls for queries like this, and instead just give the user the answer based on one search and offer to search more if results are insufficient. Never say unhelpful phrases that deflect without providing value - instead of just saying 'I don't have real-time data' when a query is about recent info, search immediately and provide the current information.

We can sum it up roughly like this:

Everything that is current: match or election results, news, current conditions such as the weather, events
Terms and concepts Claude does not know
Prompts with a clear temporal indicator (for example: “in 2025”)
Prices and rates
Prompts that explicitly ask for verification against the news or otherwise signal that a search is wanted
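
If you want to test your own target prompts for such temporal indicators, a rough check could look like the following Python sketch. It is only an approximation on my part: the words “current”, “latest”, and “recent” come from the leak, while “today”, “yesterday”, and the year pattern are my own assumptions, and the function name is hypothetical. How Claude actually recognizes these signals is not described in the prompt.

```python
import re

# "current", "latest" and "recent" are named in the leaked prompt;
# "today" and "yesterday" are additions of mine for illustration.
TEMPORAL_WORDS = ("current", "latest", "recent", "today", "yesterday")

# Assumption: any four-digit year starting with 20 counts as a temporal hint.
# This is crude, since an old year such as 2010 would also match.
YEAR_PATTERN = re.compile(r"\b20\d{2}\b")


def has_temporal_indicator(prompt: str) -> bool:
    """Return True if the prompt contains a temporal word or a year like 2025."""
    text = prompt.lower()
    if YEAR_PATTERN.search(text):
        return True
    # Whole-word match, so "current" does not fire on "currently" or "recurrent".
    return any(re.search(rf"\b{re.escape(word)}\b", text) for word in TEMPORAL_WORDS)


for example in (
    "what are the trends for X in 2025?",
    "who won yesterday's game?",
    "explain the Pythagorean theorem",
):
    print(example, "->", has_temporal_indicator(example))
```

The first two examples are taken from the leak and are flagged by this check. The third belongs to the never-search examples and is not.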

Research_category

This is where Claude really digs in: it checks several sources and pages to produce the best possible answer from a broader search.

Queries in the Research category need 2-20 tool calls, using multiple sources for comparison, validation, or synthesis. Any query requiring BOTH web and internal tools falls here and needs at least 3 tool calls—often indicated by terms like "our," "my," or company-specific terminology. Tool priority: (1) internal tools for company/personal data, (2) web_search/web_fetch for external info, (3) combined approach for comparative queries (e.g., "our performance vs industry"). Use all relevant tools as needed for the best answer. Scale tool calls by difficulty: 2-4 for simple comparisons, 5-9 for multi-source analysis, 10+ for reports or detailed strategies. Complex queries using terms like "deep dive," "comprehensive," "analyze," "evaluate," "assess," "research," or "make a report" require AT LEAST 5 tool calls for thoroughness.

Research query examples (from simpler to more complex):
* reviews for [recent product]? (iPhone 15 reviews?)
* compare [metrics] from multiple sources (mortgage rates from major banks?)
* prediction on [current event/decision]? (Fed's next interest rate move?) (use around 5 web_search + 1 web_fetch)
* find all [internal content] about [topic] (emails about Chicago office move?)
* What tasks are blocking [project] and when is our next meeting about it? (internal tools like gdrive and gcal)
* Create a comparative analysis of [our product] versus competitors
* what should my focus be today (use google_calendar + gmail + slack + other internal tools to analyze the user's meetings, tasks, emails and priorities)
* How does [our performance metric] compare to [industry benchmarks]? (Q4 revenue vs industry trends?)
* Develop a [business strategy] based on market trends and our current position
* research [complex topic] (market entry plan for Southeast Asia?) (use 10+ tool calls: multiple web_search and web_fetch plus internal tools)
* Create an [executive-level report] comparing [our approach] to [industry approaches] with quantitative analysis
* average annual revenue of companies in the NASDAQ 100? what % of companies and what # in the nasdaq have revenue below $2B? what percentile does this place our company in? actionable ways we can increase our revenue? (for complex queries like this, use 15-20 tool calls across both internal tools and web tools)

For queries requiring even more extensive research (e.g. complete reports with 100+ sources), provide the best answer possible using under 20 tool calls, then suggest that the user use Advanced Research by clicking the research button to do 10+ minutes of even deeper research on the query.

This covers everything that is a bit more complex, or where it is clear that a single source is not enough: comparisons, evaluations, and similar queries.

The most interesting examples for us are, for instance:

When a customer wants to compare our offer with a competitor's
Comparisons between products
Benchmarks, industry news, statistics, and very specific information

Something else that is very interesting are the “terms”, meaning the words that trigger a deeper search, such as:

Complex queries using terms like "deep dive," "comprehensive," "analyze," "evaluate," "assess," "research," or "make a report"

So if our prompt uses expressions like “research”, “analyze”, “do a deep dive into the topic”, or “make a report”, Claude reads that as: Careful! I need to dig deeper here. According to the leak, such queries require at least 5 tool calls.
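
The trigger terms and the tool-call ranges from the Research category can be put together in a small Python sketch. Treat it as an illustration under my own assumptions: the term list and the ranges (2-4, 5-9, 10+ up to 20) are taken from the leak, but the function names and the simple substring matching are hypothetical and say nothing about how Claude actually detects these terms.

```python
from typing import Optional

# Terms quoted in the leaked Research category.
DEPTH_TERMS = (
    "deep dive", "comprehensive", "analyze", "evaluate",
    "assess", "research", "make a report",
)


def find_depth_terms(prompt: str) -> list[str]:
    """Return the depth terms found in the prompt (crude substring matching)."""
    text = prompt.lower()
    return [term for term in DEPTH_TERMS if term in text]


def minimum_tool_calls(prompt: str) -> Optional[int]:
    """Return 5 if a depth term is present ("AT LEAST 5 tool calls"), else None."""
    return 5 if find_depth_terms(prompt) else None


def research_tier(tool_calls: int) -> str:
    """Map a number of tool calls to the tier described in the leak."""
    if 2 <= tool_calls <= 4:
        return "simple comparison"
    if 5 <= tool_calls <= 9:
        return "multi-source analysis"
    if 10 <= tool_calls <= 20:
        return "report or detailed strategy"
    raise ValueError("the Research category covers 2 to 20 tool calls")


prompt = "Do a deep dive and analyze our competitors"
print(find_depth_terms(prompt), minimum_tool_calls(prompt))
```

Because the matching is by substring, “assess” also fires on “assessment”. That is good enough for a quick look at your own prompts, but not for anything precise.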

And something you may also have noticed: this part of the prompt is not only about searching the Internet, but about searching more generally with all of Claude's tools, for example in a connected Drive. For us SEOs, though, that is not so relevant.

Summary of the key points

So what do we learn from this leak of Claude's system prompt for optimizing our content for AI search engines?

Web search is only used when it is really needed. Claude always tries first to answer with the knowledge it acquired during training.
Claude distinguishes four categories for deciding when to search the web:
1. Never search:
  Topics that change very little, such as facts, basic concepts, or evergreen content, are not searched. This content does not get cited.
2. Offer a search to go deeper:
  Claude answers from its own knowledge but offers to look for more information online if the user wants. This covers people and companies, lists and rankings, metrics and statistics.
3. Single search:
  Current topics such as news, sports results, or prices, as well as prompts with a clear temporal reference (for example, “2025”). If Claude does not know a term, it also runs one quick search.
4. Research:
  Mainly for comparisons, for example between products, competitors, or benchmarks. Claude uses 2 to 20 tool calls here.

All of this helps us understand how we should optimize our content to be mentioned in AI search tools like Claude or ChatGPT Search.
