Principle 8: Entity Identity

An entity's name is not its identity. Identity in the context of AI systems is the complete, machine-readable declaration of what an entity is, how it is uniquely identified, and how all of its representations across the web connect to a single canonical profile. When identity is incomplete or fragmented across sources, AI systems cannot build a unified representation, and the entity loses coherence in AI-generated responses.

Name vs. Identity vs. Disambiguated Profile

These three concepts are often confused, but they serve distinct roles in how AI systems process and represent entities. Understanding the difference is essential to building an entity that AI systems can recognize, trust, and recommend.

The Entity Name

The name is the human-readable label attached to an entity. "Acme Corp" is a name. Names are inherently ambiguous. Multiple entities can share the same name. A single entity can be referenced by different name variations across different platforms. AI systems cannot rely on names alone to identify an entity because names are not unique identifiers.

Name variations are common and expected: the legal name ("Acme Corporation Inc."), the brand name ("Acme Corp"), informal abbreviations ("Acme"), and stylized versions ("ACME"). AI systems need to know that all of these refer to the same entity. Without explicit declaration, the system may treat each variant as a separate entity or fail to connect information across sources that use different variants.
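To make the variant mapping concrete, here is a minimal Python sketch of how declared variants might be folded into one canonical name. It assumes the name fields of an Organization record like the Acme example. The helpers `normalize`, `build_alias_index`, and `resolve` are hypothetical names for illustration, not part of any library or of how any particular AI system works.

```python
import re

def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return re.sub(r"\s+", " ", cleaned).strip()

def build_alias_index(entity: dict) -> dict:
    """Map every declared name variant to the entity's canonical name."""
    canonical = entity["name"]
    variants = [canonical, entity.get("legalName", "")]
    alt = entity.get("alternateName", [])
    variants += [alt] if isinstance(alt, str) else list(alt)
    return {normalize(v): canonical for v in variants if v}

def resolve(mention: str, index: dict):
    """Return the canonical name for a mention, or None if undeclared."""
    return index.get(normalize(mention))

acme = {
    "name": "Acme Corp",
    "legalName": "Acme Corporation Inc.",
    "alternateName": ["Acme", "ACME"],
}
index = build_alias_index(acme)
print(resolve("ACME CORPORATION, INC.", index))  # Acme Corp
print(resolve("Acme Corporation", index))        # None
```

The second lookup fails because "Acme Corporation" was never declared as a variant, which is the gap an explicit declaration is meant to close.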

The Declared Identity

The declared identity is the structured, machine-readable set of facts that defines what an entity is. It goes far beyond the name. It includes the entity's type (organization, person, product), its core attributes (founding date, headquarters, industry classification such as a NAICS code), its relationships (founder, parent organization, subsidiaries), and its unique identifiers (tax ID, DUNS number). The declared identity lives in your Schema.org markup and serves as the canonical reference that all other sources should match.

identity-declaration.json

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Corp",
  "legalName": "Acme Corporation Inc.",
  "alternateName": ["Acme", "ACME"],
  "url": "https://www.acmecorp.com",
  "description": "Enterprise predictive analytics platform for supply chain optimization",
  "foundingDate": "2018-03-15",
  "founder": {
    "@type": "Person",
    "name": "Sarah Chen",
    "jobTitle": "CEO"
  },
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "San Francisco",
    "addressRegion": "CA",
    "addressCountry": "US"
  },
  "iso6523Code": "0060:DUNS_NUMBER",
  "taxID": "XX-XXXXXXX",
  "naics": "511210"
}

Notice the elements that go beyond a simple name. legalName declares the formal legal name. alternateName lists known variants so AI systems can connect them. iso6523Code (here a DUNS number, which ISO 6523 assigns the scheme prefix 0060) and taxID provide registry-issued identifiers: a DUNS number is unique worldwide, while a tax ID is unique within its issuing jurisdiction. naics places the entity in a specific industry classification. Together, these fields create a machine-readable identity that is far more precise than a name string.
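The iso6523Code value packs two things into one string: a four-digit scheme prefix (the International Code Designator, or ICD) and the identifier itself. The Python sketch below shows one way to split them. It assumes the two-part `ICD:identifier` form used in the example above, so an optional third segment would stay attached to the identifier. The three ICDs listed are a small sample, `parse_iso6523` is a hypothetical helper, and the DUNS value is made up.

```python
# A few ISO 6523 International Code Designators (ICDs); not an exhaustive list.
ICD_SCHEMES = {
    "0060": "DUNS",
    "0088": "GLN",
    "0199": "LEI",
}

def parse_iso6523(code: str) -> tuple:
    """Split 'ICD:identifier' into (scheme label, identifier)."""
    icd, sep, identifier = code.partition(":")
    if not sep or not icd or not identifier:
        raise ValueError(f"expected 'ICD:identifier', got {code!r}")
    # Unknown prefixes are reported as-is rather than guessed at.
    return ICD_SCHEMES.get(icd, f"ICD {icd}"), identifier

print(parse_iso6523("0060:123456789"))  # ('DUNS', '123456789')
```

A check like this is useful mainly for catching a missing or mistyped prefix before the markup is published.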

The Disambiguated Profile

The disambiguated profile is the resolved, unified representation that AI systems construct after cross-referencing an entity's declared identity against all available sources. It is the result of disambiguation — the process described in Principle 3: Disambiguation. A successfully disambiguated profile means the AI system has high confidence that it knows exactly which entity is being discussed, can connect all relevant information to that single entity, and can distinguish it from other entities with similar names.

The disambiguated profile is not something you create directly. It is something AI systems build from the evidence you provide. Your role is to ensure that the evidence — your declared identity plus the consistent facts across all external platforms — gives the AI system everything it needs to construct an accurate, unified profile.

Identity Is the Foundation of Recognition

Every other AEO principle depends on the AI system knowing which entity you are. Authority signals are meaningless if they cannot be attributed to the correct entity. Multi-source confirmation fails if sources reference different identity fragments. Freshness signals lose value if the AI cannot connect updated content to a persistent entity identity. Get identity right first, and every other principle becomes more effective.

How Fragmented Identity Breaks AI Understanding

When an entity's identity is inconsistent across sources, AI systems encounter a problem that is structurally identical to the problem librarians face when cataloging books by an author who publishes under multiple names. Without a clear authority record, the works get scattered across multiple entries and the author's full body of work becomes invisible.

fragmented-identity-problem.txt

Source: website
  Name: "Acme Corp"
  Type: "Enterprise analytics platform"
  Founded: "2018"
  HQ: "San Francisco"

Source: Crunchbase
  Name: "Acme Corporation"
  Type: "Business intelligence software"
  Founded: "March 2018"
  HQ: "San Francisco, CA"

Source: LinkedIn
  Name: "Acme Corporation Inc."
  Type: "Information Technology & Services"
  Founded: "2018"
  HQ: "San Francisco, California"

Source: Press release (2023)
  Name: "ACME"
  Type: "AI-powered analytics"
  Founded: "2017" (date founders began working on concept)
  HQ: "SF Bay Area"

AI System's Internal State:
  -> 4 potential name variants detected
  -> 4 different category descriptions
  -> 2 conflicting founding dates
  -> Result: LOW CONFIDENCE — entity fragments into partial representations

In this example, four sources present four overlapping but inconsistent identity representations. The AI system cannot determine whether "Acme Corp," "Acme Corporation," "Acme Corporation Inc.," and "ACME" are the same entity, related entities, or different entities entirely. The conflicting founding dates compound the problem. The different category descriptions prevent the system from confidently classifying the entity.

The result is entity fragmentation: instead of one high-confidence entity representation, the AI system maintains multiple low-confidence partial representations. Queries that should surface this entity may instead return competitors whose identity is cleanly resolved, or the AI may present conflicting information with hedging language that undermines user trust.
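The same kind of comparison can be run on your own profiles before an AI system encounters them. The following Python sketch collects the distinct values each source reports for a field and flags any field where sources disagree. The source names and values mirror the fragmented example above, with founding dates reduced to a year as a simplifying assumption. `find_conflicts` is a hypothetical helper, not a model of how any AI system resolves entities.

```python
from collections import defaultdict

def find_conflicts(sources: dict) -> dict:
    """Return, for each field, the distinct values seen when sources disagree."""
    seen = defaultdict(set)
    for facts in sources.values():
        for field, value in facts.items():
            seen[field].add(value)
    return {field: sorted(values) for field, values in seen.items() if len(values) > 1}

sources = {
    "website": {"name": "Acme Corp", "founded": "2018"},
    "Crunchbase": {"name": "Acme Corporation", "founded": "2018"},
    "LinkedIn": {"name": "Acme Corporation Inc.", "founded": "2018"},
    "Press release": {"name": "ACME", "founded": "2017"},
}
conflicts = find_conflicts(sources)
for field, values in conflicts.items():
    print(f"{field}: {len(values)} distinct values -> {values}")
```

Because the comparison is exact-string, every name variant counts as a disagreement. In practice you would first map declared variants to the canonical name so that only undeclared variants and genuine factual conflicts, such as the 2017 founding date, remain.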

The Resolved State

resolved-identity-state.txt

Source: website (canonical)
  Name: "Acme Corp"
  legalName: "Acme Corporation Inc."
  alternateName: ["Acme", "ACME"]
  Type: "Enterprise predictive analytics platform"
  Founded: "2018-03-15"
  HQ: "San Francisco, CA, US"

Source: Crunchbase -> matches canonical
Source: LinkedIn -> matches canonical
Source: Wikipedia -> matches canonical
Source: Wikidata -> matches canonical (structured)
Source: Press coverage -> references consistent identity

AI System's Internal State:
  -> Single canonical name with known variants
  -> Consistent category across sources
  -> Single founding date confirmed by 5+ sources
  -> Result: HIGH CONFIDENCE — unified entity representation

When identity is properly declared and consistently maintained across sources, the AI system can collapse all representations into a single high-confidence profile. The canonical name is established, known variants are mapped to it, and every source confirms the same core facts. This is the state that produces confident, assertive AI responses about your entity.

The sameAs Network

The Schema.org sameAs property is the primary mechanism for declaring entity identity connections across the web. Each URL in your sameAs array tells AI systems: "This external profile refers to the same entity as this website." The array creates a network of linked identity nodes that AI systems can traverse to build a complete picture of who you are.

sameas-network.json

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Corp",
  "url": "https://www.acmecorp.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Acme_Corp",
    "https://www.wikidata.org/wiki/Q12345678",
    "https://www.crunchbase.com/organization/acme-corp",
    "https://www.linkedin.com/company/acme-corp",
    "https://twitter.com/acmecorp",
    "https://github.com/acmecorp",
    "https://www.bloomberg.com/profile/company/ACM:US",
    "https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=XXXXXXXXXX",
    "https://www.google.com/maps?cid=1234567890",
    "https://www.youtube.com/@acmecorp"
  ]
}

What Makes an Effective sameAs Network

An effective sameAs network has three properties: comprehensiveness, accuracy, and currency.

Comprehensiveness means including every major platform where your entity has a verified presence. Missing a platform means the AI system must infer the connection rather than knowing it. Each platform you include adds a confirmed identity node that strengthens the overall profile.

Accuracy means every URL in the array resolves to an active profile that contains facts matching your declared identity. A sameAs link to a Crunchbase profile that lists a different founding date than your website does not strengthen identity — it introduces a conflict that the AI system must resolve. Every linked profile must match.

Currency means keeping the array up to date. Dead links, redirects, and defunct profiles weaken the signal. When you close an account on a platform, remove it from the array. When you create a new profile, add it. Treat the sameAs array as a living document that requires periodic review.
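The currency check lends itself to automation. Below is a minimal Python sketch that labels each sameAs URL as ok, redirected, or dead using only the standard library. The function names and the three labels are assumptions chosen for illustration, and the demonstration at the end uses a stubbed fetcher with hypothetical URLs so it runs offline.

```python
import urllib.error
import urllib.request

def probe(url: str, timeout: float = 10.0) -> tuple:
    """Fetch a URL and return (status code or None, final URL after redirects)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as err:  # 4xx and 5xx responses
        return err.code, url
    except OSError:  # DNS failures, refused connections, timeouts
        return None, url

def classify(url: str, status, final_url: str) -> str:
    """Label a sameAs link as 'ok', 'redirect', or 'dead'."""
    if status is None or status >= 400:
        return "dead"
    if final_url.rstrip("/") != url.rstrip("/"):
        return "redirect"
    return "ok"

def audit_same_as(urls, fetch=probe) -> dict:
    """Classify every sameAs URL; `fetch` can be swapped out for testing."""
    return {url: classify(url, *fetch(url)) for url in urls}

# Offline demonstration with a stubbed fetcher (hypothetical URLs and responses).
stub = {
    "https://example.com/a": (200, "https://example.com/a"),
    "https://example.com/b": (200, "https://example.com/moved"),
    "https://example.com/c": (404, "https://example.com/c"),
}
result = audit_same_as(stub, fetch=stub.get)
print(result)
```

Some platforms reject automated requests, so treat a "dead" label as a prompt for a manual check rather than proof that the profile is gone. A link that resolves still needs its facts compared against your declared identity to satisfy the accuracy property.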

Priority Platforms for sameAs

Not all platforms carry equal weight. Prioritize knowledge bases (Wikipedia, Wikidata) and business intelligence platforms (Crunchbase, Bloomberg), which are widely used as reference sources for entity resolution. Professional networks (LinkedIn) and government registries (SEC EDGAR) add institutional trust. Social media profiles add useful but lower-weight confirmation.
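When an audit covers many profiles, it helps to work through them in priority order. The Python sketch below sorts sameAs URLs into the three tiers described above by matching hostnames. The tier numbers and host lists are assumptions taken from this section's examples, not a published ranking, and `tier_of` and `rank_same_as` are hypothetical helpers.

```python
from urllib.parse import urlparse

# Tier 1 is highest priority. Host lists are illustrative, not exhaustive.
TIERS = {
    1: {"wikipedia.org", "wikidata.org", "crunchbase.com", "bloomberg.com"},
    2: {"linkedin.com", "sec.gov"},
}
DEFAULT_TIER = 3  # social media and anything unlisted

def tier_of(url: str) -> int:
    """Return the priority tier for a URL based on its hostname."""
    host = (urlparse(url).hostname or "").lower()
    for tier, domains in TIERS.items():
        if any(host == d or host.endswith("." + d) for d in domains):
            return tier
    return DEFAULT_TIER

def rank_same_as(urls):
    """Sort sameAs URLs by tier, keeping the original order within a tier."""
    return sorted(urls, key=tier_of)

urls = [
    "https://twitter.com/acmecorp",
    "https://www.linkedin.com/company/acme-corp",
    "https://en.wikipedia.org/wiki/Acme_Corp",
]
ranked = rank_same_as(urls)
print(ranked)
```

The ranking is meant to order audit work, so that the highest-weight profiles are verified and corrected first. It is not meant to suggest that the order of entries in the sameAs array carries meaning.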

Person Entities and Identity

Entity identity applies to people as well as organizations. Founders, executives, and subject matter experts who appear in content need their own declared identities. When a person is referenced across your website, LinkedIn, Google Scholar, and press coverage, the same identity principles apply: the AI system needs to know that all references point to the same individual.

person-identity-schema.json

{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Dr. Sarah Chen",
  "givenName": "Sarah",
  "familyName": "Chen",
  "honorificPrefix": "Dr.",
  "jobTitle": "CEO & Co-Founder",
  "worksFor": {
    "@type": "Organization",
    "name": "Acme Corp",
    "url": "https://www.acmecorp.com"
  },
  "alumniOf": [
    {
      "@type": "CollegeOrUniversity",
      "name": "MIT",
      "department": {
        "@type": "Organization",
        "name": "Computer Science"
      }
    }
  ],
  "sameAs": [
    "https://www.linkedin.com/in/sarahchen",
    "https://scholar.google.com/citations?user=XXXXXX",
    "https://orcid.org/0000-0002-1234-5678",
    "https://twitter.com/sarahchen"
  ],
  "knowsAbout": [
    "Predictive analytics",
    "Supply chain optimization",
    "Machine learning"
  ]
}

Person identity is particularly important for Principle 5: Authority Signals. Expert authorship markup connects content authority to a verified person entity. Without clear person identity, the credentials of your authors cannot be verified by AI systems, and the content loses the expert authority signal.

Identity Verification Checklist

Use this checklist to audit your entity identity, from your own structured data through to external profiles. Every item contributes to the AI system's ability to construct a unified, high-confidence entity profile.

Canonical name declared in Schema.org markup. Your name, legalName, and alternateName fields are present and accurate in your Organization schema.
Unique identifiers included. Tax ID, DUNS number, or other globally unique identifiers are present in your structured data where applicable.
sameAs array is comprehensive and current. Every major platform profile is linked, every URL resolves, and every linked profile contains matching facts.
All name variants are mapped. The alternateName field lists every known abbreviation, acronym, or variant that appears in external sources.
Industry classification is explicit. NAICS code, industry keywords in knowsAbout, or equivalent classification is present in your structured data.
Person entities have declared identities. Key people associated with the organization have their own Schema.org Person markup with sameAs connections to their external profiles.
External profiles match the canonical identity. Every Crunchbase field, LinkedIn detail, Wikipedia fact, and directory listing matches your website's declared identity exactly. This aligns with Principle 2: Entity Consistency.
No orphaned identity fragments exist. No platform profiles with outdated information, no press mentions with incorrect facts, and no directory listings with stale data remain uncorrected.
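The items in this checklist that concern your own markup can be checked mechanically. Here is a minimal Python sketch, assuming the Organization record has already been parsed from JSON-LD into a dictionary. The item labels and the `audit_identity` helper are hypothetical, and a non-empty field is treated as a pass, which says nothing about whether the value is correct.

```python
def audit_identity(org: dict) -> dict:
    """Return a pass/fail flag for each checklist item that markup alone can show."""
    def filled(key):
        return bool(org.get(key))

    return {
        "canonical name": filled("name") and filled("legalName"),
        "name variants": filled("alternateName"),
        "unique identifier": filled("taxID") or filled("iso6523Code"),
        "sameAs network": filled("sameAs"),
        "industry classification": filled("naics") or filled("knowsAbout"),
    }

org = {
    "name": "Acme Corp",
    "legalName": "Acme Corporation Inc.",
    "alternateName": ["Acme", "ACME"],
    "taxID": "XX-XXXXXXX",
    "naics": "511210",
}
report = audit_identity(org)
missing = [item for item, ok in report.items() if not ok]
print(missing)  # ['sameAs network']
```

The remaining items, such as whether external profiles match the canonical identity and whether orphaned fragments exist, depend on data outside your markup and still need a source-by-source review.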

Identity Drift Is Continuous

Entity identity is not a set-it-and-forget-it task. Platforms update their data formats, profiles get auto-modified, press coverage introduces new name variants, and organizational changes create new facts that must be propagated. Schedule quarterly identity audits to catch drift before it compounds into fragmentation. See Principle 7: Freshness for guidance on maintaining current information across sources.

Entity identity is the foundational declaration that makes every other AEO principle effective. Without a clear, consistent, machine-readable identity, AI systems cannot reliably attribute authority signals, confirm facts across sources, or distinguish your entity from competitors. Invest in getting identity right — declare it explicitly in structured data, connect it through sameAs networks, and maintain it across every platform where your entity appears. For related principles, see Principle 2: Entity Consistency and Principle 3: Disambiguation.