Field Notes 9 min read

How AI Agents Use Google Search (And What It Means For Your Content)

We caught AI agents querying Google for our content. One of them leaked its own system prompt straight into our Search Console. Here's what 222 Microsoft Copilot citations, a Perplexity-style exclusion list, and a broken agent template reveal about how AI systems read, cite and recommend brands.

Paul Byrne, Founder 15 May 2026 Last verified: 15 May 2026

What we found in our Search Console this week

On Friday morning, while we were reviewing our own Google Search Console, three things lined up that are worth pulling apart in public. First, a literal AI agent system prompt logged as a search query against our domain, with the variables unfilled and the template exposed. Second, a query using a Perplexity-style negative-site operator string that strips Reddit, Twitter, TripAdvisor, Booking, YouTube and most user-generated platforms from the results. Third, a separate dashboard inside Bing Webmaster Tools showing 222 Microsoft Copilot citations across the same three months, all clustered on one of our blog posts, with almost zero Google clicks on that page to show for them.

Put together, these are not isolated signals. They describe a working AI retrieval pipeline that runs against your content whether you optimise for it or not. The patterns are visible if you look. Below is what we think they mean, and what we are doing about them.

The leaked agent prompt

This is the literal query that appeared in our Search Console performance report, with six impressions at an average position of 11. We have lightly reformatted it for readability but every word is verbatim:

based on this user profile, generate 8 specific news search queries that would find the most relevant strategic intelligence content from the last 24 hours. user profile: - industry: undefined - role: undefined - organization: undefined - focus areas: str…

This is not a question a person typed into Google. It is an AI agent's system prompt. The agent was built to take a user profile, ask a language model to generate eight news search queries, run those queries on Google, retrieve the top results, and synthesise a strategic intelligence briefing. Every variable in the template (industry, role, organisation, focus areas) was unfilled. The token "str" at the end, where the logged query is cut off, looks like a truncated type placeholder such as a Python type hint, which would give away how the prompt was templated.

What happened mechanically: the agent's input layer failed, the variables never populated, and instead of bailing, the workflow submitted the template itself to Google. Google logged it. We appeared in the result set, probably because the page that surfaced was one of our guides on strategic search intelligence, and the topical embedding matched.
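
The guard that was evidently missing can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical pipeline that fills a prompt from a profile dictionary; `PROFILE_FIELDS`, `UnfilledTemplateError` and `build_prompt` are our own names, not anything recovered from the agent in question.

```python
# Hypothetical field names, modelled on the variables visible in the leaked query.
PROFILE_FIELDS = ("industry", "role", "organization", "focus_areas")

# Values that mean "nothing was filled in", compared case-insensitively.
PLACEHOLDER_VALUES = {"", "undefined", "null", "none"}


class UnfilledTemplateError(ValueError):
    """Raised when a profile field is missing or still a placeholder."""


def build_prompt(profile: dict) -> str:
    """Return the filled prompt, or raise before anything reaches a search engine."""
    missing = [
        field
        for field in PROFILE_FIELDS
        if str(profile.get(field, "")).strip().lower() in PLACEHOLDER_VALUES
    ]
    if missing:
        raise UnfilledTemplateError(f"unfilled profile fields: {', '.join(missing)}")
    return (
        "Based on this user profile, generate 8 specific news search queries. "
        f"Industry: {profile['industry']}. Role: {profile['role']}. "
        f"Organization: {profile['organization']}. "
        f"Focus areas: {profile['focus_areas']}."
    )
```

With a check like this, an empty profile stops the workflow with an error instead of sending the raw template to Google.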

The interesting part is not the bug. The interesting part is that this workflow exists, runs at scale, and is one of an unknown number of agentic retrieval tools currently using Google as their search index. Every one of them is, somewhere, asking questions like the ones your customers ask, and reading the answers from sites like yours.

The exclusion list

The second query in our Search Console looked like this:

"ai overviews" -site:reddit.com -site:twitter.com -site:x.com -site:wykop.pl -site:tripadvisor.com -site:youtube.com -site:yelp.com -site:booking.com -site:facebook.com -site:instagram.com -site:tiktok.com

That is a negative-site operator string of the kind used by research-mode retrieval agents (Perplexity supports per-query domain filters via its Sonar API; ChatGPT Atlas and similar tools support equivalent configuration). It is not anyone's default behaviour. It is a configured exclusion: whoever ran this query did not want Reddit threads, tweets, YouTube comments, TripAdvisor reviews, Booking.com listings or Facebook posts in the result set. They wanted editorial sources, expert blogs, and original analysis.
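
For readers who have not seen these strings built, here is a minimal sketch of how an exclusion query could be assembled and read back. It uses plain string handling and makes no claim about how Perplexity or any other tool does it internally; `with_exclusions` and `parse_exclusions` are illustrative names.

```python
def with_exclusions(query: str, blocked_domains: list[str]) -> str:
    """Append one -site: operator per blocked domain to a search query."""
    operators = " ".join(f"-site:{domain}" for domain in blocked_domains)
    return f"{query} {operators}".strip()


def parse_exclusions(query: str) -> list[str]:
    """Recover the excluded domains from a logged query string."""
    prefix = "-site:"
    return [token[len(prefix):] for token in query.split() if token.startswith(prefix)]
```

Running `parse_exclusions` over the query we logged returns the eleven domains the agent was told to ignore, which is a quick way to compare exclusion lists across queries.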

That is a specific worldview about what counts as a trustworthy source for an AI to cite. It also describes, almost exactly, the kind of content SearchIntel publishes, and the kind that won us 222 Microsoft Copilot citations on a single page.

222 Copilot citations on one page

Our blog post "Why Brands Don't Show Up In AI Answers" has, over the 90 days to 12 May 2026, generated 222 Microsoft Copilot citations in Bing Webmaster Tools against just 3 clicks from Google.

That is the AI search era in one paragraph. The page is doing its job, but the job is not driving Google clicks. The job is being the source of truth that an AI assistant cites when a buyer asks a category question. The user gets the answer inside Copilot, often never visits the site, and yet our positioning lands directly inside the conversation they are having with the AI. Industry-wide, this gap is widening: Copilot accounts for roughly 3% of AI referral traffic versus ChatGPT's 87%, but converts a higher share of the traffic it does send.

Why this matters commercially. The brand recognition is real. The click is not. If you measure AI search only through Google Analytics, this entire channel is invisible to you. You need Bing Webmaster Tools' AI Performance view, and equivalent surfaces from each AI provider, to see what is actually happening.

What the patterns tell us

Across the three signals (the leaked prompt, the exclusion list, the Copilot citation cluster) six things are now provable, not theoretical.

1. AI agents are a measurable share of your impressions

Search Console reports impressions, not just human eyeballs. Every time an AI agent runs a query and your page appears in the result set it consumes, that counts. We can see ours. You can see yours. That partly explains why high-impression, low-click pages are increasingly common across the web in 2026. Some of those impressions are agentic, and agents do not click.

2. Agents read passages, not pages

The Copilot grounding queries we are cited for are paragraph-shaped, not page-shaped: "why brands missing AI generated answers". The agent extracts a passage that answers the question and quotes it. Pages that lead each section with a one-sentence definition get cited more than pages that bury the answer.

3. UGC weighting varies massively by platform

The operator string above is a category-defining pattern in research mode across Perplexity, ChatGPT browsing and several agentic frameworks. UGC weighting in the underlying answers is not uniform. According to 5W's AI Platform Citation Source Index (2026), Reddit and Wikipedia together drive over 25% of ChatGPT citations in the US, while Reddit's share on Google Gemini sat near 0.1% in early 2026. If your strategy is to seed Reddit and rely on community proof to win AI mentions, you are optimising for the ChatGPT half of the AI surface while being almost invisible on Gemini and excluded outright by configured research agents.

4. Editorial and structured sources are favoured

What gets through the negative-site filter is publishers, expert blogs, original research, .edu, .gov, and first-party brand sites with substantive content. Independent research backs this up: Norg AI's citation analysis finds that listicles account for roughly half of top AI citations and that tables increase a page's citation rate by about 2.5× compared with unstructured prose. The "answer economy" is, in practice, an editorial and structured-content economy.

5. Citations decouple from clicks

The ratio of 222 Copilot citations to 3 Google clicks tells you the unit economics of AI search. If you are still buying audience through click-through performance, you will systematically under-invest in the AI surface where the brand impression actually lands.

6. The signals are visible, but only if you look

None of this requires special tooling. Google Search Console, Bing Webmaster Tools' AI Performance tab, and a careful read of long-tail queries with unusual operator syntax. That is the stack. We use these every week. Most brands do not look.
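
The weekly read can be partly automated. The sketch below flags queries that look agent-generated, using three heuristics drawn from the two queries described in this post. The thresholds (three or more `-site:` operators, fifteen or more words) are our assumptions and not a validated classifier; treat the output as a shortlist to read by hand.

```python
import re

# Unfilled-variable markers: bare "undefined"/"null", or {{ ... }} placeholders.
TEMPLATE_MARKERS = re.compile(r"\b(undefined|null)\b|\{\{.*?\}\}", re.IGNORECASE)


def agent_signals(query: str) -> list[str]:
    """Return the names of the agent-style heuristics that a query trips."""
    signals = []
    if query.count("-site:") >= 3:        # configured exclusion list
        signals.append("exclusion_list")
    if TEMPLATE_MARKERS.search(query):    # template submitted with empty variables
        signals.append("unfilled_template")
    if len(query.split()) >= 15:          # prompt-length rather than search-length
        signals.append("prompt_length")
    return signals
```

Apply `agent_signals` to each row of a Search Console query export and keep the rows where the list is non-empty.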

What AI agents look for in content

From the citations we win, the citations we lose, and the patterns visible in agent-style queries, five content traits keep showing up:

| Trait | What it looks like on the page | Why agents reward it |
| --- | --- | --- |
| Self-contained passage | A 50–150 word answer immediately below the H2 that resolves the heading completely | The agent extracts the passage as a quotable unit; it does not need the rest of the page |
| Question-shaped heading | H2 phrased the way a real prompt is phrased: "Why do brands not appear in AI answers?" | Direct semantic match to the agent's retrieval query |
| Inline date stamp | "As of May 2026…" inside the body, not just in metadata | Agents prioritise recency and need a textual signal of when the data is from |
| Named source attribution | "According to BrightEdge's 2026 research…" with a link to the original | Agents build trust through the citation chain; quoted, attributed claims are easier to re-cite |
| Original numbers | "Across 75 UK brands we scored in 2026, the median was 62/100" | Data the agent cannot find elsewhere increases the citation value of the page |
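
Three of these traits are mechanical enough to check with a script. The following is a rough sketch, assuming you already have each section's heading and first paragraph as strings; the date pattern only recognises "As of" and "Last verified" followed by a month and year, which is our simplification, and attribution and originality still need a human read.

```python
import re

# Matches stamps such as "As of May 2026" or "Last verified 15 May 2026".
DATE_STAMP = re.compile(
    r"\b(?:as of|last verified)\s+(?:\d{1,2}\s+)?"
    r"(?:january|february|march|april|may|june|july|august|"
    r"september|october|november|december)\s+\d{4}\b",
    re.IGNORECASE,
)


def check_section(heading: str, first_paragraph: str) -> dict[str, bool]:
    """Score one section against three traits from the table above."""
    word_count = len(first_paragraph.split())
    return {
        "question_heading": heading.strip().endswith("?"),
        "self_contained_length": 50 <= word_count <= 150,
        "inline_date_stamp": bool(DATE_STAMP.search(first_paragraph)),
    }
```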

The manipulable signals: a tier list

If you want more of your content cited by AI agents and AI Overviews (which is the underlying economic question), here is what is in your control today, ordered by impact and effort.

Tier 1: high impact, low effort

  1. Rewrite H2s as questions. Every section heading becomes the exact phrasing of a real prompt. "Brand mention rate" becomes "How do you measure brand mention rate in AI answers?". Five minutes per page. Big lift.
  2. Lead each section with a definition. First sentence under every H2 must answer the heading completely. Move context, caveats and examples beneath. Agents pull the first sentence.
  3. Add FAQPage and Article schema. Microsoft, Google and Perplexity all read structured data. Five questions and answers in JSON-LD give you five extra citation surfaces per page.
  4. Stamp dates inline. "Last verified May 2026" near the top, and dated qualifications in body copy. Freshness is a retrieval signal even when the underlying claim is timeless.
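
For the schema item, a minimal sketch of generating FAQPage JSON-LD from a list of question and answer pairs. The property names follow the schema.org FAQPage vocabulary; the function name `faq_jsonld` and any questions you pass in are your own.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build a schema.org FAQPage JSON-LD document from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)
```

Place the returned string inside a `<script type="application/ld+json">` tag in the page head, and keep the answers identical to the visible copy on the page.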

Tier 2: medium impact, medium effort

  1. Publish original numbers. Run your own audit. Share the results. SearchIntel's 75-brand benchmark is exactly this. It gives every page a data point no competitor can replicate. Your data does not need to be huge. It needs to be specific and yours.
  2. Quote named experts. Two or three named industry sources per piece. Agents weight content with attributed claims higher than free-floating opinion.
  3. Show methodology. Publish the prompts, models, sample sizes and dates behind any data you cite. Methodology transparency is the highest-trust signal for an agent assessing whether to cite you.
  4. Cluster, then cross-link. A single guide ranks badly; a hub with eight cluster pages cross-linking back to it builds topical authority. Each cluster page can win its own long-tail; the hub absorbs the equity.

Tier 3: structural plays

  1. Register with Bing Webmaster Tools. Microsoft Copilot is the only major AI surface that currently exposes a citation dashboard. ChatGPT and Gemini do not. That is why Copilot citation data is visible to us at all, and why Bing Webmaster Tools is the cheapest measurement upgrade an AI search programme can make.
  2. Earn citations in editorial publishers. The Perplexity-style exclusion list passes editorial and expert sources through. Trade press placements, industry publication features and authoritative third-party reviews compound this advantage.
  3. Build an llms.txt and keep it clean. Optional, but signals AI-friendliness and gives a defined entry surface for agentic crawlers.
  4. Avoid writing like Reddit. The same exclusion list that strips out UGC will treat your content the same way if it reads like a forum reply. Editorial register, named author, structured headings.
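
For the llms.txt item, here is a small sketch that renders the file from structured data so it stays clean as pages change. It follows the layout in the public llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of links); the function name is ours and any site name, section titles and URLs you pass in are placeholders for your own.

```python
def render_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render llms.txt text: H1 title, blockquote summary, H2 sections of links."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for section_title, links in sections.items():
        lines.append(f"## {section_title}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single newline at end of file.
    return "\n".join(lines).rstrip() + "\n"
```

Generating the file from the same source that builds your sitemap is one way to avoid stale or broken entries.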

The two things this does not mean

Two clarifications, because every claim about AI search currently lives downstream of two myths.

This is not "SEO is dead." Google still drives most search traffic. Microsoft Copilot sends measurable but small click volume (around 3% of AI referral traffic by 2026 estimates). AI search visibility is a parallel channel, not a replacement, and brands that pivot away from organic search to chase AI citations will end up with neither. The thing to do is measure both, in their own terms.

This is not a black box. The signals above are observable, measurable and improvable. The agent's exclusion list is in plain text. The Copilot citation count is in a dashboard Microsoft gives you free. The patterns repeat across our client work and our own site. Anyone willing to look weekly at the data can see them.

What we are doing about it

Three things, in this order:

  1. Rewriting our highest-impression, lowest-click pages using the Tier 1 patterns above. The biggest single opportunity on our site is our guide on AI search visibility, which sits at 3,733 impressions a month and position 56. That is exactly the profile of a page that ranks because the topic is right but loses the click because the page is not yet structured for the answer the user actually wants.
  2. Publishing more original data. Our 75-brand benchmark programme exists precisely because original numbers are the highest-leverage form of citable content.
  3. Watching the agent queries weekly. Every Friday we now read the long-tail queries in our own Search Console for patterns of agentic retrieval. We will publish what we find here.
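
The first step, picking which pages to rewrite, can be sketched as a filter over a Search Console page export. The cut-offs below (at least 1,000 impressions, click-through rate of 1% or less) are assumptions for illustration, not the thresholds we use, and `PageStats` is a hypothetical container for the exported columns.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    impressions: int
    clicks: int
    position: float


def rewrite_candidates(
    pages: list[PageStats],
    min_impressions: int = 1000,
    max_ctr: float = 0.01,
) -> list[PageStats]:
    """Return high-impression, low-click pages, largest impression count first."""
    picked = [
        page
        for page in pages
        if page.impressions > 0
        and page.impressions >= min_impressions
        and page.clicks / page.impressions <= max_ctr
    ]
    return sorted(picked, key=lambda page: page.impressions, reverse=True)
```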

The short version

AI agents are using Google to read your content. They do not click. They cite, sometimes hundreds of times, without sending a single visit. The patterns of what they prefer are visible if you look: structured, dated, attributed, original. The brands that take this seriously are already writing for the agent as well as the human. The ones that aren't will keep watching their click-through rate drop while their share of citation quietly belongs to someone else.

If you want to know whether AI assistants are recommending your brand or your competitors today, you can run a free AI visibility check here. If you want a full diagnostic (Copilot citation tracking, agent-query analysis and a 12-month roadmap) that is what our 45-day diagnostic covers.
