Not all content is created equal when it comes to AI training. Some types of content dramatically improve your agent's answers — others add noise and can actively make it worse. This guide ranks content types by effectiveness so you know exactly what to prioritise.
Highest-impact content types
1. FAQ Q&A pairs — the gold standard
Manual Q&A entries are the single most effective content type. They give the AI an exact question matched to a specific answer, eliminating ambiguity. When someone asks "Do you offer refunds?", a manual Q&A pair that asks that same question is retrieved and used with very high confidence.
How many to write: Start with 20–30. Add more as you discover what visitors actually ask. Fifty or more well-written Q&As create a very robust agent.
Format tip: Write the question the way visitors actually ask it — casual, short, conversational. Not "What is the company's refund policy?" but "Do I get a refund if I'm not happy?"
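If you draft your Q&A pairs in a spreadsheet or script before adding them, a quick check can catch weak entries early. The sketch below is only an illustration: the `QAPair` class, the `review_pairs` helper, and the 12-word limit are assumptions made for this example, not part of any particular platform.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str  # phrased the way a visitor would type it
    answer: str    # one specific, self-contained answer


def review_pairs(pairs, max_question_words=12):
    """Return (question, problem) tuples for pairs that need another look.

    The 12-word default is an arbitrary illustrative threshold.
    """
    problems = []
    seen = set()
    for pair in pairs:
        key = pair.question.strip().lower()
        if not pair.answer.strip():
            problems.append((pair.question, "empty answer"))
        if len(pair.question.split()) > max_question_words:
            problems.append((pair.question, "question is long"))
        if key in seen:
            problems.append((pair.question, "duplicate question"))
        seen.add(key)
    return problems


# Placeholder entries; replace the answer text with your own policy.
pairs = [
    QAPair("Do I get a refund if I'm not happy?", "<your refund terms here>"),
    QAPair("What happens after I sign up?", ""),
]

for question, problem in review_pairs(pairs):
    print(f"{problem}: {question}")
```

Here the second pair is flagged because its answer is empty. The checks are deliberately simple; they complement, rather than replace, reading each pair as a visitor would.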
2. Specific pricing information
Pricing questions are where agents most often underperform. Content that says "competitive pricing" or "contact us for a quote" gives the AI nothing to work with, so it gives the visitor nothing useful in return.
What works: Starting prices, package names, what's included in each, any add-ons. Even a range ("£500–£2,000 depending on scope") is dramatically better than nothing.
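One way to keep pricing content specific is to hold it as structured data and generate the knowledge-base text from it. The following is a minimal sketch under that assumption; the `Package` class, the `describe` function, and the package name and inclusions are hypothetical placeholders, and only the price range comes from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Package:
    name: str
    price_from: int                 # starting price in pounds
    price_to: Optional[int] = None  # upper end of the range, None if fixed
    includes: List[str] = field(default_factory=list)


def describe(package):
    """Render one package as a short, specific knowledge-base entry."""
    if package.price_to is None:
        price = f"£{package.price_from:,}"
    else:
        price = (f"£{package.price_from:,} to £{package.price_to:,} "
                 "depending on scope")
    included = ", ".join(package.includes)
    return f"{package.name}: {price}. Includes {included}."


# Hypothetical package; substitute your own names and inclusions.
starter = Package("Starter", 500, 2000, ["item A", "item B"])
print(describe(starter))
```

Each rendered entry names the package, states a price or range, and lists what is included, which are the details the AI needs to answer a pricing question.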
3. Service and product descriptions
Detailed descriptions of exactly what you offer — not marketing fluff, but actual specifics. What's included? What's the outcome? Who is it for? How long does it take?
One focused description per service or product works better than one giant page covering everything.
4. Process and how-it-works content
"What happens after I sign up?", "How does the process work?", "What do I need to provide?" — these questions are extremely common and easily answered if you have step-by-step process content in your knowledge base.
Medium-impact content
- Testimonials and case studies — useful for trust-based questions ("Have you worked with businesses like mine?")
- About page content — good for "Who are you?" and "How long have you been in business?"
- Blog articles — only useful if they contain specific, factual information relevant to customer questions. Most blog content is too broad to be useful in a chat context.
Low-impact or harmful content types
- Generic marketing copy — "We're passionate about delivering excellence" contains zero useful information for answering visitor questions. The AI can't answer "How much does it cost?" from marketing fluff.
- Cookie notices and legal pages — actively harmful if included. Remove these from your knowledge base.
- Navigation text and headers — meaningless without context.
- Duplicate content — if the same information appears multiple times with slightly different wording, the AI can get confused about which version is correct.
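Near-duplicate entries can be hard to spot by eye. As a rough sketch, the snippet below compares every pair of entries using `difflib.SequenceMatcher` from the Python standard library and reports pairs that are very similar. The `find_near_duplicates` name and the 0.8 threshold are assumptions chosen for illustration, and the sample entries are made up.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_near_duplicates(entries, threshold=0.8):
    """Return (i, j, ratio) for entry pairs whose text is very similar.

    The threshold is an illustrative default; tune it on your own content.
    """
    def normalise(text):
        # Lowercase and collapse whitespace so trivial differences are ignored.
        return " ".join(text.lower().split())

    cleaned = [normalise(entry) for entry in entries]
    hits = []
    for (i, a), (j, b) in combinations(enumerate(cleaned), 2):
        ratio = SequenceMatcher(None, a, b, autojunk=False).ratio()
        if ratio >= threshold:
            hits.append((i, j, round(ratio, 2)))
    return hits


# Made-up sample entries for demonstration only.
entries = [
    "We offer refunds within 30 days.",
    "we offer  refunds within 30 days.",
    "Our office is in the city centre.",
]

for i, j, ratio in find_near_duplicates(entries):
    print(f"Entries {i} and {j} look alike (similarity {ratio})")
```

Comparing every pair is fine for a few hundred entries but slows down quickly beyond that. Flagged pairs still need a human decision about which version to keep.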