You don’t have an Agentforce deployment problem. You have a data problem that Agentforce is about to expose. Before you configure a single Topic or Action, run this audit.
The Agent Doesn’t Know Your Data Is Broken
I keep repeating this to clients when they’re excited about Agentforce: an AI agent is only as smart as the data it can retrieve, process, and act on.
The gap between “our Salesforce instance is good enough for our team” and “our Salesforce is good enough for an AI agent” is much wider than most organizations expect.
Framework
Four Layers. One Audit. Build Confidence or Kill False Confidence.
The framework is sequential by design. Each layer compounds on the one before it. You can have a perfectly clean Knowledge base, but if your Account and Case records are polluted, the agent will still fail to serve the right content to the right customer.
- Duplicate & Dirty Records: Agent’s Worst Enemy
- Agent-Critical Fields
- Knowledge Base Gaps
- Object & Relationship Model
1. Duplicate & Dirty Records: Agent’s Worst Enemy
Deduplication has always been a data hygiene concern. With Agentforce, it becomes a reasoning integrity concern. When an agent retrieves two Account records for the same company, it may blend their data, respond based on the wrong one, or present conflicting information to a customer in the same conversation.
What to Audit:
- Duplicate Accounts by Name + Billing City
- Duplicate Contacts by Email
- Test/Demo Records in Production
- Cases with Null AccountId
- Contacts with Null AccountId
- Stale Open Cases (no activity > 90 days)
- …
Every customer/industry will have a different definition of what data completeness means to them.
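To make the first check concrete, here is a minimal sketch of a Name + Billing City duplicate scan. It assumes you have already exported Account rows into plain dictionaries; the function names and sample records are hypothetical, and the matching is exact-after-normalization, not the fuzzy matching that Salesforce Duplicate Management performs.

```python
from collections import defaultdict

def normalize(value):
    """Lowercase, trim, and collapse whitespace so near-identical values match."""
    return " ".join((value or "").lower().split())

def find_duplicate_accounts(accounts):
    """Group accounts by normalized Name + BillingCity; keep groups with 2+ records."""
    groups = defaultdict(list)
    for acc in accounts:
        key = (normalize(acc.get("Name")), normalize(acc.get("BillingCity")))
        if not key[0]:
            continue  # skip records with no name rather than grouping all blanks together
        groups[key].append(acc["Id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Hypothetical sample rows standing in for an Account export
accounts = [
    {"Id": "001A", "Name": "Acme Corp", "BillingCity": "Austin"},
    {"Id": "001B", "Name": "acme  corp ", "BillingCity": "AUSTIN"},
    {"Id": "001C", "Name": "Globex", "BillingCity": "Austin"},
]
duplicates = find_duplicate_accounts(accounts)
print(duplicates)
```

The same grouping idea applies to Contacts by Email: swap the key for the normalized email address.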
2. Missing Fields: What the Agent Needs that Nobody Filled In
This is the most underestimated layer. Agents reason through slot-filling – they identify what information they need to complete a task, then look for those values in your data. If the field is null, the agent either halts, guesses, or falls back to a generic response.
The critical point: agents don’t care if a field was optional in your data entry design. If the agent’s Topic requires Account.Industry to route the right response, and 60% of your Accounts have no Industry, it’s a broken experience 60% of the time.
High-risk fields by use case:
| Account | Contact | Case | Opportunity |
| Industry | | Priority | StageName |
| Type | AccountId | Type | CloseDate |
| Owner | Title | Origin | Amount |
| BillingCountry | MailingCountry | Reason | Type |
| Phone | LeadSource | Description | NextStep |
3. Knowledge Base Gaps: The Agent’s Answer Library
For service-oriented agents, Salesforce Knowledge is the primary grounding source. The agent searches Knowledge to find relevant articles, extracts content, and synthesizes an answer. If your Knowledge base doesn’t cover the topics customers actually ask about, the agent has nothing to retrieve. No retrieval = no grounded response = hallucination risk or escalation.
This is a coverage and quality audit, not just a quantity audit. A thousand articles that are outdated or miscategorized are worse than 200 articles that are current and well-structured.
Knowledge audit checklist
- Published vs Draft article ratio
- Article last-modified date distribution
- Top 20 case types vs Knowledge coverage
- Data Category taxonomy alignment
- Channel visibility configuration
- Article body length and structure
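The "top case types vs Knowledge coverage" check can be sketched as a simple set comparison. This assumes each article carries a topic label that matches a case type by name, which is a simplification: in a real org the mapping usually goes through Data Categories or a manual review. The function name and sample values are hypothetical.

```python
from collections import Counter

def coverage_gaps(case_types, article_topics, top_n=20):
    """Return (coverage_pct, uncovered) for the top_n most frequent case types."""
    top = [t for t, _ in Counter(case_types).most_common(top_n)]
    covered_topics = {t.lower() for t in article_topics}
    uncovered = [t for t in top if t.lower() not in covered_topics]
    pct = round(100.0 * (len(top) - len(uncovered)) / len(top), 1) if top else 0.0
    return pct, uncovered

# Hypothetical data: one entry per case, and one topic label per published article
case_types = ["Billing"] * 5 + ["Login"] * 3 + ["Shipping"] * 2 + ["Returns"]
article_topics = ["billing", "Shipping"]

pct, uncovered = coverage_gaps(case_types, article_topics, top_n=3)
print(pct, uncovered)  # 66.7 ['Login']
```

The `uncovered` list is the input to a content sprint: each entry is a high-volume case type with no article behind it.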
4. Object & Relationship Model: Can the Agent Navigate Your Org?
This is the layer most architects think about last but should think about first. Agents traverse your object model using relationships – lookups and master-detail fields – to find context. An agent answering “What’s the status of my order?” needs to walk: Contact -> Account -> Order -> Order Item. Every broken link in that chain kills the answer.
This isn’t just about standard objects. If you have custom objects that hold critical customer, product or entitlement data – and those objects aren’t properly related to the standard object graph – your agent is working with an incomplete map.
Relationship model audit:
- Cases without Contact (ContactId null)
- Entitlement records not linked to Accounts/Contacts
- Custom product/order objects without Account FK
- Junction objects for M:M relationships
- Cross-object traversal depth > 5 hops
- Assets not linked to Accounts
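To test a traversal path before an agent depends on it, you can walk the lookups on exported data and report the first hop that fails. The sketch below follows lookups from child to parent (Order Item to Order to Account), which is the reverse direction of the customer-facing path described above but exercises the same links. `walk_path`, the `lookup` structure, and the sample records are all hypothetical.

```python
def walk_path(record, path, lookup):
    """Follow a chain of (foreign_key_field, target_object) hops from a starting record.

    Returns (True, final_record) when every hop resolves, or
    (False, failed_field) at the first null or dangling reference.
    """
    current = record
    for fk_field, target in path:
        target_id = current.get(fk_field)
        nxt = lookup.get(target, {}).get(target_id)
        if nxt is None:
            return False, fk_field
        current = nxt
    return True, current

# Hypothetical export, indexed by object name and then by record Id
lookup = {
    "Order": {
        "801A": {"Id": "801A", "AccountId": "001A"},
        "801B": {"Id": "801B", "AccountId": None},  # orphaned order
    },
    "Account": {"001A": {"Id": "001A", "Name": "Acme Corp"}},
}
path = [("OrderId", "Order"), ("AccountId", "Account")]

ok_result = walk_path({"Id": "802A", "OrderId": "801A"}, path, lookup)
broken_result = walk_path({"Id": "802B", "OrderId": "801B"}, path, lookup)
print(ok_result)      # (True, {'Id': '001A', 'Name': 'Acme Corp'})
print(broken_result)  # (False, 'AccountId')
```

Counting failures per field across all records shows which link in the chain breaks most often, and `len(path)` gives the hop count to compare against the depth limit in the checklist.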
Scoring:
Your Agentforce Data Readiness Scorecard:
After running the audit, score your org across the four layers. This is the same scoring model used in the Agentforce Reality Check tool – so if you’ve run the tool, your layer scores will already be visible there.
| Layer | Green | Amber | Red |
| Duplicates & Dirty Data | < 10 dupes; no test records | 10-50 dupes; some test records | 50+ dupes or active test records |
| Agent-Critical Fields | > 85% populated; clean picklists | 70-85% populated; some picklist drift | < 70% on any critical field or taxonomy chaos |
| Knowledge Base | > 80% of top cases covered; modified < 1 yr | 60-80% coverage; some stale articles | < 60% coverage or majority of articles > 2 yr old |
| Object Model | All standard FKs intact; custom objects mapped | Some orphaned records; custom gaps identified | Significant orphaned records or broken traversal paths |
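The percentage thresholds in the scorecard translate directly into code. This is a sketch of Layers 2 and 3 only, not the implementation used in the Agentforce Reality Check tool. It assumes the worst-populated critical field decides the Layer 2 rating (following "on any critical field" in the Red column), and it leaves out the qualitative checks such as picklist drift and article age. Function names are hypothetical.

```python
def rate_percentage(value, green_above, red_below):
    """Map a percentage to Green/Amber/Red; values on a boundary fall to Amber."""
    if value > green_above:
        return "Green"
    if value < red_below:
        return "Red"
    return "Amber"

def rate_fields(population_by_field):
    """Layer 2: rate by the worst-populated critical field (assumption, see lead-in)."""
    worst = min(population_by_field.values())
    return rate_percentage(worst, green_above=85, red_below=70)

def rate_knowledge(coverage_pct):
    """Layer 3: rate by coverage of top case types; article age is not checked here."""
    return rate_percentage(coverage_pct, green_above=80, red_below=60)

print(rate_fields({"Industry": 40.0, "Type": 92.0}))  # Red
print(rate_fields({"Industry": 88.0, "Type": 92.0}))  # Green
print(rate_knowledge(66.7))                           # Amber
```

The inputs can come from any population or coverage report; hypothetical percentages are used here.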
What To Do With Your Score
A data audit without a remediation plan is just a list of problems. Here’s how to translate your scores into a sequenced action plan:
- If you’re Red on Layer 1 or 2 (Records & Fields)
This is a data governance engagement before it’s an Agentforce engagement. Run a formal deduplication project using Salesforce’s native Duplicate Management rules or a third-party tool like Cloudingo or DemandTools. Establish field population standards and enforce them via Validation Rules before agent deployment.
Timeline signal: expect 4-8 weeks for a serious dedup project on a mid-size org. Don’t rush it.
- If you’re Red or Amber on Layer 3 (Knowledge)
Prioritize article creation over agent configuration. Use the coverage gap matrix (top 30 case reasons × article existence) to drive a focused content sprint. Assign article owners. Set a minimum quality bar: published, < 2 years old, correct Data Category, visible on the right channel.
Timeline signal: a focused content sprint for 20-30 articles typically takes 3-4 weeks with one dedicated author.
- If you’re Red or Amber on Layer 4 (Object Model)
This is architect territory. Map the traversal paths your planned agents need. Identify breaks. For each break, decide: fix the relationship (correct path), create a Flow-based Action that materializes the data for the agent (workaround path), or defer the use case (strategic deferral). Don’t let object model debt block all agent use cases; sequence around it.





