ChatForge agents typically respond in 1–3 seconds. If you're experiencing consistently slow responses, there's usually a specific cause. Here's how to diagnose and fix it.
What counts as "slow"?
As a rough guide: under 2 seconds is excellent, 2–4 seconds is acceptable, and 4 seconds or more is worth investigating. The very first message in a new session may take a second or two longer while the agent initialises; this is normal.
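If you log response times yourself, the bands above can be applied with a few lines of Python. This is only a sketch: the function names are illustrative and not part of ChatForge, and the timings in the example are made up. It classifies the median of several measurements so that one slow first message does not skew the result.

```python
from statistics import median


def classify_response_time(seconds: float) -> str:
    """Map one response time (in seconds) onto the bands in this article."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    if seconds < 2:
        return "excellent"
    if seconds < 4:
        return "acceptable"
    return "investigate"  # 4 seconds or more


def classify_samples(samples: list[float]) -> str:
    """Classify the median of several measurements."""
    if not samples:
        raise ValueError("need at least one measurement")
    return classify_response_time(median(samples))


# Made-up timings: the 5.2 s first message is an outlier, the median is 2.1 s.
print(classify_samples([5.2, 1.8, 2.1, 1.9, 2.4]))  # acceptable
```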
Cause 1 — Very large knowledge base
If your knowledge base has thousands of pages or very large documents, the retrieval step takes longer. Signs: responses are consistently slow across all questions, not just specific ones.
Fix: Audit and clean your knowledge base. Remove duplicate content, irrelevant pages, and boilerplate. A focused knowledge base of 50–200 high-quality entries is searched faster than a bloated one of 2,000 vague pages.
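If you keep a local copy of your knowledge base content, a short script can flag duplicate entries before you remove them by hand. The sketch below assumes each entry is a dictionary with "title" and "text" keys; that shape is an assumption for illustration, not a ChatForge export format. It only catches entries whose text is identical after lower-casing and collapsing whitespace, so near-duplicates still need a manual review.

```python
import hashlib
from collections import defaultdict


def find_duplicate_entries(entries: list[dict]) -> list[list[str]]:
    """Group entry titles whose normalised text is identical."""
    groups = defaultdict(list)
    for entry in entries:
        # Lower-case and collapse whitespace so trivial differences are ignored.
        normalised = " ".join(entry["text"].lower().split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        groups[digest].append(entry["title"])
    # Keep only the groups that contain more than one entry.
    return [titles for titles in groups.values() if len(titles) > 1]


# Hypothetical entries, for illustration only.
entries = [
    {"title": "Refund policy", "text": "Refunds are issued within 14 days."},
    {"title": "Shipping", "text": "Orders ship in 2 days."},
    {"title": "Refunds (copy)", "text": "refunds are  issued within 14 days. "},
]
print(find_duplicate_entries(entries))  # [['Refund policy', 'Refunds (copy)']]
```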
Cause 2 — Response length is set too high
A high max_tokens setting (e.g. 2,000+) allows the model to generate much longer replies, and longer replies take longer to produce. Most chat responses don't need more than 300–400 tokens.
Fix: In agent settings, lower the max response length to 300–400 tokens. Also add to your system prompt: "Keep all responses to 2–3 sentences." Shorter replies generate faster and are often clearer as well.
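To see why the limit matters, note that generation time grows roughly in proportion to the number of tokens produced. The sketch below estimates a worst case for a given limit. The generation speed of 200 tokens per second is a placeholder assumption, not a measured ChatForge figure, and real replies often stop well before the limit.

```python
def worst_case_generation_seconds(max_tokens: int, tokens_per_second: float) -> float:
    """Upper bound on generation time if the reply uses the full token limit."""
    if max_tokens < 0:
        raise ValueError("max_tokens cannot be negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return max_tokens / tokens_per_second


# Placeholder speed, assumed for illustration only.
ASSUMED_TOKENS_PER_SECOND = 200.0

for limit in (2000, 400):
    seconds = worst_case_generation_seconds(limit, ASSUMED_TOKENS_PER_SECOND)
    print(f"max_tokens={limit}: up to {seconds:.1f} s")
# max_tokens=2000: up to 10.0 s
# max_tokens=400: up to 2.0 s
```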
Cause 3 — Network or device issue
If responses are slow on your own device but not on other people's, the cause is more likely your local network or device than the platform.
Fix: Test the widget on a different device and internet connection. Check our status page for any reported platform slowdowns.
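When comparing devices or connections, it helps to take several timings on each and compare medians rather than single impressions. The helper below is a generic sketch: it times any zero-argument Python callable, and the commented-out request uses a placeholder URL that you would replace with a page you want to test. It measures only what the callable does (for example a network round trip), not the agent's own processing.

```python
import time
from statistics import median


def measure_seconds(action, runs: int = 5) -> list[float]:
    """Run `action` (a zero-argument callable) `runs` times and time each run."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return timings


# To time a real request, swap in something like this (placeholder URL):
#   import urllib.request
#   action = lambda: urllib.request.urlopen("https://example.com/").read()
action = lambda: time.sleep(0.01)  # stand-in so the sketch runs offline

timings = measure_seconds(action, runs=3)
print(f"median over {len(timings)} runs: {median(timings):.3f} s")
```

Run the same script on a second device or connection; a clearly higher median on one side points to that device or network.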