The Quiet Backbone of AI: Taxonomies in an Agentic World

When two humans negotiate an advertising deal, ambiguity is manageable. A media buyer says “I want lifestyle content targeting women 21–45 interested in beauty,” and a publisher sales rep applies judgment, asks clarifying questions, and lands on something reasonable. When two AI agents negotiate that same deal — autonomously, across multiple buyers and sellers — ambiguity becomes a systematic failure mode. In a negotiation between agents there are only inputs and outputs: what was specified, and how each agent’s probability distribution happened to interpret it.

To avoid ambiguity in campaign constraints and targeting, we shouldn’t rely solely on natural language interpretations. Taxonomy guardrails provide a stable foundation on which the emerging agentic advertising ecosystem can build.

The Agentic Advertising Problem

The Agentic Ad Management Protocols (AAMP) framework addresses the industry’s goal of testing agentic media buying and describes how existing specifications can expedite this development in the agentic ecosystem. The AAMP framework rests on three pillars: Trust and Transparency, Agentic Protocols, and Agentic Foundations. The Agentic Protocols pillar, which includes Taxonomy Guardrails, addresses one of the most fundamental challenges of multi-agent systems: getting agents that were trained differently, fine-tuned differently, and operated by different companies to understand the same thing when they exchange a brief.

LLMs generate outputs as probability distributions over language — they do not look up facts, they predict likely completions. When a buyer agent sends “women aged 21–45 interested in wellness” to a seller agent, that phrase is not a specification. It is a prompt, and different models will complete it differently. The buyer’s model may anchor on 30–40 as the “prime wellness demographic.” The seller’s model may interpret “wellness” as fitness-focused, serving the ad on running content rather than beauty content. Neither agent is wrong by its own internal logic, but the outcome is an ad mismatch that no human explicitly approved.

Taxonomy IDs replace probabilistic interpretation with deterministic lookup. They remove the interpretation layer from agent-to-agent negotiation entirely.
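As a minimal sketch in Python (node names and IDs mirror examples used later in this article; the table excerpt is illustrative, not the full taxonomy), the difference is the difference between a dictionary lookup and a model completion:

```python
# Hypothetical excerpt of a shared taxonomy table: every participant
# holds the same ID -> node mapping, so resolution is deterministic.
CONTENT_TAXONOMY = {
    22: "Automotive > Auto Type > Green Vehicles",
    25: "Automotive > Car Culture",
    553: "Beauty",
}

def resolve(node_id: int) -> str:
    """Deterministic lookup: the same ID resolves to the same node for
    every agent, every model, and every model version."""
    return CONTENT_TAXONOMY[node_id]

# Buyer and seller agents resolve ID 22 identically; there is no
# interpretation step for a probability distribution to get wrong.
assert resolve(22) == "Automotive > Auto Type > Green Vehicles"
```

A phrase like “green cars,” by contrast, is re-embedded and re-interpreted by each model it passes through.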

The IAB Taxonomies: A Shared Semantic Contract

The IAB Tech Lab maintains three taxonomies that standardize the description of inventory between buyers and sellers: the Content, Ad Product, and Audience Taxonomies. These function as shared languages that let buyers, sellers, and intermediaries describe what content is about, what is being advertised, and who an audience is in a consistent, machine-readable way.

The Content Taxonomy describes the “aboutness” of pages, apps, and videos, which publishers and platforms use for targeting, brand suitability, and measurement – for example, Automotive > Auto Type > Green Vehicles [ID: 22] is a distinct and unambiguous node, separate from Automotive > Car Culture [ID: 25], even though both involve cars.

The Ad Product Taxonomy describes how the product or service being advertised is labeled. This helps with further contextual targeting and provides a consistent schema for adhering to brand suitability categories – for example, Alcohol > Wine [ID: 1007] and Alcohol > Beer [ID: 1004] are distinct nodes that a publisher can allow or block independently.

The Audience Taxonomy adds a common naming convention for segments based on demographic, interest, and purchase-intent attributes. Demographic > Gender > Female [ID: 49] combined with Demographic > Age Range > 30–34 [ID: 6] is an exact specification. “Women in their prime spending years” is not.

Together, these three taxonomies form a shared vocabulary that allows buyers, sellers, and intermediaries to describe the same transaction using the same terms — regardless of which LLM, which platform, or which version of a model is doing the reasoning.

What This Looks Like in Practice

Consider a lipstick brand running a campaign. Their requirements: reach women aged 21–45 on beauty content, with no placements on alcohol or cannabis content. In natural language, this brief travels through a buyer’s agent, a seller’s agent, an audience agent, and a publisher’s ad platform. That is four hops, each with its own LLM re-processing the context. At each hop, the brief is paraphrased slightly. “Lifestyle content” creeps in as a synonym for “beauty content.” “No adult content” gets interpreted inconsistently across models. By the final hop, the ad is served on a wine review page to an audience tagged as “women 28–42.” Every agent followed the brief as it understood it.

With taxonomy IDs, the buyer agent transmits:

  Audience:  Female [49] + Age 21-24 [4] + Age 25-29 [5] + Age 30-34 [6] + Age 35-39 [7] + Age 40-44 [8]

  Content:   Beauty [553] + Skin Care [559]

  Ad Product Blocklist: Alcohol [1002] + Cannabis [1049] + Adult Products and Services [1001]

These integers pass through every hop unchanged. The publisher’s ad server performs boolean matching — not inference, not interpretation. The wine review page carries Ad Product ID [1002] in the blocklist. The impression is rejected automatically, at every node in the chain, consistently.
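A hedged sketch of that boolean matching in Python (the IDs follow the lipstick example above; the function shape is illustrative, not a real ad-server API):

```python
# Campaign constraints as plain integer sets, exactly as transmitted.
AUDIENCE_IDS = {49, 4, 5, 6, 7, 8}       # Female + age bands 21-44
CONTENT_ALLOW = {553, 559}               # Beauty, Skin Care
AD_PRODUCT_BLOCK = {1002, 1049, 1001}    # Alcohol, Cannabis, Adult

def eligible(page_content_ids: set[int],
             page_ad_product_ids: set[int],
             user_segment_ids: set[int]) -> bool:
    """Pure set algebra: no inference, no interpretation."""
    if page_ad_product_ids & AD_PRODUCT_BLOCK:    # any blocked category?
        return False
    if not (page_content_ids & CONTENT_ALLOW):    # on allowed content?
        return False
    return bool(user_segment_ids & AUDIENCE_IDS)  # in the target audience?

# Wine review page carries Alcohol [1002]: rejected at every hop.
assert eligible({553}, {1002}, {49, 6}) is False
# Skin-care page, no blocked ad products, woman 30-34: eligible.
assert eligible({559}, set(), {49, 6}) is True
```

Because the decision is set intersection rather than generation, every node in the chain reaches the same answer from the same integers.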

This is what deterministic interoperability looks like.

Let’s take a look at another concrete example.

The brief: “Avoid fuel and gasoline related content because we’re an electric vehicle brand running a sustainability campaign.”

The problem: “Gasoline Prices” and “Auto Parts” both sit in the automotive cluster in embedding space. The word “fuel” appears in both “fuel economy” (positive for EV) and “fuel prices” (what the campaign wants to avoid). An LLM parsing this brief probabilistically cannot reliably distinguish them.

With taxonomy:

  • Block: Content Taxonomy > Automotive > Auto Insurance [ID: 31] (conflation risk)
  • Allow: Content Taxonomy > Automotive > Auto Type > Green Vehicles [ID: 22]
  • Allow: Content Taxonomy > Automotive > Auto Technology [ID: 37]

The taxonomy makes the distinction crisp and unambiguous. The EV ad runs on electric vehicle review pages and not on “Best Gas Prices Near You” articles.
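One way to see the difference, sketched in Python with illustrative page titles and tags (the IDs follow the allow/block list above):

```python
BLOCK_IDS = {31}        # the conflation-risk node from the list above
ALLOW_IDS = {22, 37}    # Green Vehicles, Auto Technology

# Hypothetical pages, each tagged with Content Taxonomy IDs at publish time.
pages = {
    "EV review: real-world fuel economy vs. gasoline cars": {22},
    "Best gas prices near you": {31},
}

# Keyword filter: both titles mention fuel or gas, so a naive string
# (or embedding-proximity) match flags both -- ambiguous.
keyword_hits = [t for t in pages if "fuel" in t.lower() or "gas" in t.lower()]
assert len(keyword_hits) == 2

# ID filter: crisp. Only the Green Vehicles page survives.
id_eligible = [t for t, ids in pages.items()
               if ids & ALLOW_IDS and not ids & BLOCK_IDS]
assert id_eligible == ["EV review: real-world fuel economy vs. gasoline cars"]
```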

Why This Matters for LLM Reasoning Quality (Beyond Matching)

Beyond pure matching, there is a reasoning quality argument. When you give an LLM a structured, taxonomy-grounded context window, its chain-of-thought reasoning is more precise.

Unstructured context:

“User likes beauty and health content, is a woman in her 30s”

Structured context:

"User segment: IAB Audience 1.1 > Demographic > Gender > Female [ID:49]; Demographic > Age Range > 30–34 [ID:6]; Interest > Style & Fashion > Beauty & Personal Care [ID:677]"

The second version gives the LLM explicit category labels to reason over. It is less likely to drift into adjacent categories (e.g., “health → fitness → sports → active lifestyle → outdoor gear”). The taxonomy acts as retrieval grounding by constraining what the model is allowed to infer rather than what it finds plausible.

This is why RAG (Retrieval Augmented Generation) systems that pre-tag documents with taxonomy labels outperform pure semantic search for precision. Taxonomy-tagged retrieval combines the best of exact-match and semantic reasoning.
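A minimal sketch of that hybrid, assuming documents are tagged with taxonomy IDs at ingest; the similarity function here is a word-overlap stand-in for a real embedding model:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    content_ids: frozenset  # IAB Content Taxonomy IDs assigned at ingest

def similarity(query: str, doc: Doc) -> float:
    """Placeholder for cosine similarity over embeddings (Jaccard on words)."""
    q, d = set(query.lower().split()), set(doc.text.lower().split())
    return len(q & d) / max(len(q | d), 1)

def retrieve(query: str, required_ids: set, docs: list) -> list:
    # Hard constraint first: exact ID match, so results cannot drift
    # into adjacent categories the way pure semantic search can.
    candidates = [d for d in docs if d.content_ids & required_ids]
    # Soft ranking second: semantic relevance within the allowed set.
    return sorted(candidates, key=lambda d: similarity(query, d), reverse=True)
```

The design choice is the point: the taxonomy filter bounds what can be returned, and semantic scoring only orders what the filter admits.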

A More Granular Taxonomy Means Better Outcomes

The precision of taxonomy labels has direct downstream effects on campaign performance, not just technical correctness. A more granular taxonomy lets AI systems understand content at a contextual level that coarse natural language descriptions cannot reach. An electric vehicle manufacturer blocking “fuel-related content” in natural language will get inconsistent results. “Fuel economy tips” and “gasoline prices” sit close together in LLM embedding space even though one is relevant to EV buyers and the other is not. Content Taxonomy ID 22 (Green Vehicles) and a targeted blocklist of petroleum-adjacent IDs make that distinction explicit and auditable.

Consistent taxonomy labeling also improves measurement. When every participant in the ecosystem uses the same content and audience labels, attribution, reach, and frequency calculations are based on the same definitions. This matters especially as AI agents begin to optimize campaigns autonomously. They need a stable, shared ground truth to optimize against.

Honest Limitations

Taxonomies, while providing a more deterministic and accurate foundation, may fall short in some scenarios.

They update slowly relative to how fast new content formats and product categories emerge. For example, “AI-generated video content” and “creator economy” content have limited taxonomy coverage today. 

They are only as accurate as the entities doing the tagging, and self-tagging creates incentive problems that the industry has not yet fully addressed with verification standards. 

And for soft targeting preferences such as “premium editorial feel” or “brand-safe but culturally relevant,” natural language and LLM semantic reasoning remain more expressive than any fixed taxonomy can be.

These limitations reinforce the need for better taxonomies, better tagging practices, and hybrid architectures that use taxonomy IDs for hard constraints while allowing natural language and LLM reasoning for discovery, enrichment, and novel categories. The limitations should not be an excuse for abandoning structured classification in favor of natural language alone.

Conclusion

As advertising systems move from generating outputs to taking autonomous actions such as planning and buying media, optimizing, and reporting without humans in the loop, the stakes of semantic ambiguity increase dramatically. A misinterpreted brief between two humans is a recoverable mistake. A misinterpreted brief compounding across five autonomous agents, running thousands of times per second, may not be recoverable, and the harm can be lasting.

In an agentic advertising ecosystem, shared taxonomies are not optional infrastructure. They are the semantic contract that makes interoperability possible. They are the reason a buyer’s agent and a seller’s agent, trained by different companies, on different data, optimizing for different objectives, can still mean the same thing when they exchange a brief. Without them, every agent-to-agent negotiation is a probabilistic guessing game, and the compounding errors across the supply chain will produce outcomes that no single agent intended and no human approved.

As models grow more capable, they do not outgrow the need for shared structure. They become more dependent on it.

Katie Shell
Associate Product Manager
IAB Tech Lab