[Research] GEO Isn't Established Yet: Start by Measuring

More people now shop after consulting ChatGPT or Gemini, and getting named in those AI answers, known as GEO (Generative Engine Optimization), is drawing real attention. The bottom line, though, is that GEO isn't an established field yet. There's no settled know-how of "do this and it works." What the research shows isn't a single winning move but that methods differ sharply in how much they help, and that the same move can help or hurt depending on where a site already ranks. So pouring money into a trending GEO package is close to a gamble. This article explains, in plain terms, why GEO isn't settled yet, what the variance in the research actually looks like, and the most realistic move right now: building a foundation to measure whether a move worked before you chase new tactics.

Table of contents

What GEO is and why it is not settled yet
The research shows variance, not a single method
Before you jump on a GEO package
Build a measurable foundation before tactics
How RevenueScope helps
FAQ

Summary / References / Related articles

TL;DR

GEO (Generative Engine Optimization) is the effort to get your site named in AI answers. But it's a young field, with no settled know-how of "do this and it works."
The research shows not a single winning move but variance. Methods differ widely in how much they help, and the same move can give a big lift to a low-ranked site yet backfire on one that already ranks at the top.
So pouring money into an "AI-pleasing package" is close to a gamble. When everyone plays the same move the effect fades, and inflated language erodes reader trust too.
The most realistic move right now is to have, before any new tactic, a foundation that measures whether a move worked. AI traffic hides inside "Direct" in GA4, so separating it out is the starting point.

1. What GEO is and why it is not settled yet

Bottom line: GEO is a young field, and there's no settled know-how yet of "do this and it works."

GEO stands for "Generative Engine Optimization": the effort to get your site or products named inside answers from AIs like ChatGPT and Gemini. Search is shifting from picking among blue links yourself to asking an AI and getting an answer. So "can you show up in the AI's answer?" becomes the new contest, and aiming for it is GEO. The goal itself is clear.

The problem is that the "moves that work" aren't settled. Plenty of methods get talked about as ways to please AI, but whether they actually work, and where and by how much, is still being worked out. As the chart below shows, the shape of search has changed, everyone is hunting for moves, and a shared answer hasn't emerged. That's where we stand.

One study is cited especially often. Titled "GEO: Generative Engine Optimization," it tested nine ways of rewriting content to make a site more likely to appear in AI answers and measured how much each one helps [1]. It's a peer-reviewed study presented at a conference (ACM SIGKDD), and it's widely referenced as the origin point that popularized the term GEO. What it showed wasn't a single answer of "just do this and you win," but something more nuanced and more useful in practice. We look at it in the next section. For how AI actually decides which sites to surface, see What Makes AI Pick a Store.

2. The research shows variance, not a single method

Bottom line: the research shows not a single right move, but that methods differ sharply in how much they help.

The study reports that the rewrites it tested lifted visibility in AI answers by up to about 40% overall [1]. But here's the crux. As the chart below shows, how much each method helped was spread wide (the chart keeps the study's rank order and shows the size of the effect on a relative scale). Backing claims with cited sources and quoted evidence helped a lot, while leaning on unique wording helped little. Even under the single word "optimization," the payoff differs wildly by method.

What matters even more is that the same move doesn't help by a fixed amount. The study reports that the same move, citing sources, strongly lifted visibility for a low-ranked site yet backfired on one already at the top. In numbers, that's roughly a doubling (over +100%) for the low-ranked site and about a 30% drop for the top one [1]. The study also shows that keyword stuffing tends not to help and, in some domains, lowers visibility. In other words, more is not always better. Change the method, the ranking, or the category, and the payoff can flip.

One caveat: the study ran its tests across several topics, but the numbers describe that test and its subjects. Read the charts here as a relative, illustrative guide. The core finding, that there's no single move that works and the payoff varies by method, rank, and category, likely applies broadly to e-commerce and service sites, though how strongly it applies varies by category. Don't take it as "a universal law proven by research"; confirm with your own site's data. For how to gauge how visible your brand is in AI search, see Measuring Your Brand's Visibility in AI Search.

3. Before you jump on a GEO package

Bottom line: with the payoff this variable, pouring money into a "do this and AI loves you" package is close to a gamble.

If the effect varies by method and rank, then "doing the bundle of tactics everyone recommends" can miss entirely on your own site. The chart below places GEO tactics on two axes: how certain the effect is, and whether you can measure it for your own site. Legitimate stacking (carefully citing sources and evidence, building reviews and reputation) can be moved to the top right, where it works and you can measure it, once you have a foundation. Meanwhile, dropping in a trending package as-is sits in the bottom left, where the effect is uncertain and you can't tell whether it worked. In other words, it's a gamble.

There's another easy-to-miss trap. Even if a move works, the gain thins out when everyone plays it. You might get ahead temporarily with inflated claims, but if everyone clones the same template the effect levels off, and inflated language erodes reader trust. For the structure behind why AI's pick tilts toward big brands, and a realistic way to break it, see Why AI Recommends Big Brands. The move to make is not to copy everyone else but to stack legitimate strengths only you can claim. And, this is the crux, to be in a state where you can measure whether a move actually worked.

4. Build a measurable foundation before tactics

Bottom line: the move to make now is, before any new GEO tactic, to have a foundation that measures whether a move worked.

The reason is simple. With the payoff varying by method, rank, and category, you can't know which move works for your site until you try it and measure. Add tactics without measuring, and you can never separate what worked from what was wasted. As the table below shows, whether you have a measurement foundation decides whether the same GEO tactic lets you "choose what to keep by numbers" or leaves you "running on gut." Measurement comes first and tactics come next. Keeping that order is the shortest path.
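To make "choose what to keep by numbers" concrete, here is a minimal sketch of a before/after comparison per page. The page names, the `PageStats` type, and all figures are hypothetical and exist only for illustration; it assumes you already have AI-referred sessions and revenue per page for two comparable periods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageStats:
    ai_sessions: int  # sessions attributed to AI referrals in the period
    revenue: int      # revenue from those sessions, in yen


def revenue_per_session(stats: PageStats) -> float:
    """Revenue per session (RPS); 0.0 when there were no sessions."""
    return stats.revenue / stats.ai_sessions if stats.ai_sessions else 0.0


def compare_periods(before: dict, after: dict) -> dict:
    """Per-page change in AI sessions, revenue, and RPS between two periods."""
    empty = PageStats(0, 0)
    report = {}
    for page in sorted(set(before) | set(after)):
        b = before.get(page, empty)
        a = after.get(page, empty)
        report[page] = {
            "sessions_delta": a.ai_sessions - b.ai_sessions,
            "revenue_delta": a.revenue - b.revenue,
            "rps_delta": round(revenue_per_session(a) - revenue_per_session(b), 2),
        }
    return report


# Hypothetical figures: the period before and after one GEO change.
before = {
    "comparison-guide": PageStats(80, 100_000),
    "how-to": PageStats(100, 60_000),
}
after = {
    "comparison-guide": PageStats(88, 124_960),
    "how-to": PageStats(124, 66_960),
}

report = compare_periods(before, after)
for page, deltas in report.items():
    print(page, deltas)
```

In this made-up data the how-to page gains the most sessions while its revenue per session falls, which is the kind of difference that stays invisible without a baseline. A plain before/after comparison like this does not rule out seasonality or other changes made in the same period.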

But this "measuring" is the hard part. Traffic from AI answers often carries no origin tag on the link, so in GA4 it gets lost in "Direct" or unknown origin. That means even when people arrive via AI and buy, separating those visits out as AI-driven is structurally hard with the defaults. An "AI Assistant" channel was added in 2026, but it only picks up the share that passed a tag; the rest stays buried in "Direct." For how AI traffic hides inside "Direct" and how to spot it, see AI Traffic Hidden in 'Direct': How to Spot It and Link It to Revenue. That's exactly why getting this split in place first is the starting point for GEO.
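As a rough illustration of why the split is only partial, the sketch below labels a session from its referrer or `utm_source` value. The host list, the `classify_session` function, and the "Direct" / "Other" labels are assumptions made for this example; they are not GA4's channel rules or RevenueScope's method, and a real list would need to be maintained as the engines change their domains.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed mapping of referrer hosts to AI engines (illustrative, not exhaustive).
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "claude.ai": "Claude",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}


def match_engine(host: str) -> Optional[str]:
    """Return the engine name if host is a known AI host or a subdomain of one."""
    host = host.strip().lower()
    if host.startswith("www."):
        host = host[4:]
    for ai_host, engine in AI_REFERRER_HOSTS.items():
        if host == ai_host or host.endswith("." + ai_host):
            return engine
    return None


def classify_session(referrer: Optional[str], utm_source: Optional[str] = None) -> str:
    """Label a session as an AI engine, "Direct", or "Other"."""
    if utm_source:
        # A tag on the link is the clearest signal when it is present.
        return match_engine(utm_source) or "Other"
    if not referrer:
        # No tag and no referrer: this is the pile where untagged AI visits end up.
        return "Direct"
    host = urlparse(referrer).hostname or ""
    return match_engine(host) or "Other"


print(classify_session("https://chatgpt.com/"))
print(classify_session(None, utm_source="chatgpt.com"))
print(classify_session(""))
```

The last call shows the limitation described above: a visit that arrives with neither a tag nor a referrer can only be labeled "Direct", whatever its true origin.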

How RevenueScope helps

Bottom line: in GA4 and by hand, AI traffic gets lost in "Direct" and is hard to separate, and tracking every month whether it drove revenue is laborious. This is where RevenueScope comes in.

RevenueScope is a tool that comes with the aggregations needed for e-commerce revenue decisions built in. It separates AI-referred traffic out of the "Direct" pile, excludes bots, and lines up traffic, revenue per session (RPS), and revenue by page and by citing engine (ChatGPT, Claude, Perplexity, Gemini, and others). It also surfaces pages that could plausibly be cited by AI but are being missed. Seeing your current position in numbers first ("where, how much AI traffic is arriving, and whether it's buying") is the foundation. Ask the AI, and it answers like this (figures shown are demo data).

Pages arriving via AI, and their revenue:

Page	AI traffic	Revenue per session (RPS)	Revenue
Product comparison guide	88	¥1,420	¥124,960
Featured product page	36	¥3,980	¥143,280
How-to / helpful article	124	¥540	¥66,960

Pages that could plausibly be cited by AI but are being missed:

Pages possibly being missed	Content	AI traffic
Older comparison article	Answers the question directly	0
Case study page	Includes concrete numbers	2

The point of the first table is that traffic count and revenue aren't in the same order. The how-to article draws the most AI traffic, but its revenue per session is low. The featured product page gets less traffic, but its visitors buy well, so it leads on revenue. The second table lists pages that look citation-worthy yet are still missing traffic. If you make a GEO move, starting from this current position and comparing "which move moved which page's AI traffic and revenue" lets you choose what to keep and what to drop by numbers, not gut. The trap that growing AI traffic isn't the same as effect is covered in A Rise in AI Traffic Isn't the Effect: How to Spot the Real Gain.
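The ordering mismatch can be checked directly from the demo figures in the first table. The short sketch below recomputes revenue per session (revenue divided by AI sessions) and ranks the three pages by traffic, by revenue, and by RPS; the variable and function names are made up for this example.

```python
# Demo rows from the first table: (page, AI traffic, revenue in yen).
pages = [
    ("Product comparison guide", 88, 124_960),
    ("Featured product page", 36, 143_280),
    ("How-to / helpful article", 124, 66_960),
]


def rps(sessions: int, revenue: int) -> float:
    """Revenue per session; 0.0 when there is no traffic."""
    return revenue / sessions if sessions else 0.0


def ranked(rows, key):
    """Page names sorted from highest to lowest by the given key."""
    return [row[0] for row in sorted(rows, key=key, reverse=True)]


by_traffic = ranked(pages, key=lambda row: row[1])
by_revenue = ranked(pages, key=lambda row: row[2])
by_rps = ranked(pages, key=lambda row: rps(row[1], row[2]))

print("By AI traffic:", by_traffic)
print("By revenue:   ", by_revenue)
print("By RPS:       ", by_rps)
```

On this demo data the how-to article comes first by traffic and last by revenue, while the featured product page is the reverse, so a ranking by visit count alone would point at the wrong page.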

To be clear: RevenueScope counts only the click traffic that actually arrived and its revenue. It does not measure exposure where your name merely appeared (no click), or the total visibility of "how much you're mentioned in ChatGPT." And because it identifies AI traffic from referrer and access signals, it can't catch all of the traffic that leaves no tag at all. It does not calculate gross margin or inventory. What RevenueScope takes over is preparing the material: splitting out the share it can catch by page and engine, lined up with revenue, without monthly manual work. Which GEO move to make is up to you.

FAQ

Q. Should I start GEO in earnest right now?

A. Pouring money into a package in a hurry isn't recommended. The research reports that the payoff varies widely by method, rank, and category, and that the same move can backfire. It's more realistic to start with legitimate stacking that's known to work (citing sources and evidence, building reviews and reputation) while setting up a state where you can measure whether a move worked. Add tactics without measuring, and you lose the ability to separate what worked.

Q. Can I win by cloning an "AI-pleasing" template?

A. It may help in part, but it won't win on its own. When everyone plays the same move the effect levels off, and inflated language erodes reader trust. The research even shows that keyword stuffing tends not to help and, in some domains, lowers visibility. The better move is not to copy everyone else but to stack legitimate strengths only you can claim, then measure and keep the moves that work.

Q. Can I measure GEO's effect in GA4?

A. With the defaults, it's hard. Traffic from AI answers often carries no origin tag, so GA4 lumps it into "Direct" or unknown origin. You can check once by hand, but tracking it page by page and month by month is structurally laborious. Separating AI traffic out and being able to see it by revenue rather than visit count is the foundation for measuring effect.

Conclusion

GEO (Generative Engine Optimization) is the effort to get your site named in AI answers. The goal is clear, but as a field it isn't established yet. What the research shows isn't a single winning move, but that the payoff varies widely by method, rank, and category — and that the same move can backfire.

So pouring money into a trending package is close to a gamble. When everyone plays the same move the effect fades, and inflated language erodes trust. The move to make is stacking legitimate strengths only you can claim. And, before that, having a foundation that measures whether a move worked.

AI traffic hides inside "Direct" in GA4 and is hard to measure; only by splitting it out can you see which GEO move drove which page's revenue. Before jumping on tactics, build a foundation you can measure with. Keeping that order is the shortest path.

References

[1] Aggarwal et al., "GEO: Generative Engine Optimization," ACM SIGKDD (2024).

Related articles

[Research] What Makes AI Recommend Your Site: It Comes Down to Reviews and Reputation

[Research] How Visible Is Your Brand in AI Search? You Can Measure It

[Research] Why AI Recommends Big Brands: A Single Rating Can Flip It

When AI Traffic Hides in 'Direct': Read It by Revenue, Not Visits