My Website Didn't Scrape Properly — ChatForge Help Center

Website scraping doesn't always capture everything you expect. Here are the most common scraping issues and exactly how to fix each one.

Issue 1 — Important pages didn't get imported

Why it happens:

The page loads its content dynamically via JavaScript (common on modern websites built with single-page application, or SPA, frameworks)
The page is behind a login, paywall, or form submission
The page URL doesn't appear in any navigation link from your homepage
The page is blocked by your website's robots.txt file

Fix: Manually add the specific page URL in the Knowledge Base tab under "Add Website" → "Add Specific Page". You can add individual URLs one at a time.
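
If you suspect robots.txt is the cause, you can check it yourself before adding the page manually. The Python sketch below uses only the standard library and tests a URL against a copy of your robots.txt rules. The function name, the sample rules, and the example.com URLs are illustrative. The user agent that ChatForge's scraper sends is not documented here, so the sketch assumes the generic "*" rules apply.

```python
from urllib.robotparser import RobotFileParser


def is_blocked_by_robots(robots_txt: str, page_url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules disallow page_url.

    user_agent defaults to "*" because the scraper's real user agent
    is an unknown here (assumption).
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, page_url)


# Sample rules for illustration; paste the contents of your own
# https://your-domain/robots.txt here instead.
sample_rules = """\
User-agent: *
Disallow: /account/
"""

print(is_blocked_by_robots(sample_rules, "https://example.com/account/billing"))  # True
print(is_blocked_by_robots(sample_rules, "https://example.com/pricing"))          # False
```

A result of True suggests the page is disallowed for generic crawlers, which would explain why it was skipped.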

Issue 2 — Content was scraped but is too vague

Why it happens: Your website has marketing copy ("We provide excellent service") rather than specific, factual content that the AI can use to answer questions.

Fix: This is a content quality problem, not a scraping problem. Two options:

Add manual Q&A entries for each question that needs a specific answer
Improve your website copy to include specific information, then re-scrape

Issue 3 — Irrelevant pages got imported

Why it happens: The scraper imports every public page it finds, including cookie policies, privacy policies, and legal disclaimers, along with repeated navigation text.

Fix: In the Knowledge Base tab, click on each scraped source and review what was imported. Delete any entries that contain irrelevant boilerplate. You can identify these quickly by looking at the preview text.
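
If you have a long list of imported sources, a quick script can help you decide which ones to open first. The sketch below assumes you have copied the imported page URLs into a list yourself (it does not call any ChatForge API). It flags URLs whose path contains a word that usually signals boilerplate. The keyword list and function name are illustrative, so adjust them to match your site.

```python
from urllib.parse import urlparse

# Path fragments that usually indicate boilerplate pages (assumed list;
# extend it with whatever your own site uses).
BOILERPLATE_HINTS = ("cookie", "privacy", "legal", "disclaimer")


def flag_boilerplate(urls):
    """Return the URLs whose path contains one of the boilerplate hints."""
    flagged = []
    for url in urls:
        path = urlparse(url).path.lower()
        if any(hint in path for hint in BOILERPLATE_HINTS):
            flagged.append(url)
    return flagged


# Hypothetical list of imported sources, copied by hand.
imported = [
    "https://example.com/pricing",
    "https://example.com/cookie-policy",
    "https://example.com/legal/disclaimer",
    "https://example.com/faq",
]

for url in flag_boilerplate(imported):
    print("Review for deletion:", url)
```

The script only suggests candidates. Still check the preview text before deleting anything, since a matching URL can occasionally hold useful content.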

Issue 4 — Scrape seems to have failed or timed out

Signs: No pages were imported, the import shows 0 pages, or the scrape hangs indefinitely.

Fix:

Check that your URL includes https:// — missing the protocol is a common cause of scrape failure
Try scraping again — temporary connectivity issues can cause one-off failures
Try adding individual page URLs manually if the full domain scrape continues to fail
Check if your website is accessible — sometimes maintenance mode or password protection blocks scraping
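
The first and last checks above can be done from your own computer. The Python sketch below uses only the standard library: one helper adds https:// when the protocol is missing, and another reports whether the site responds. Both function names are illustrative. The access check runs from your machine rather than from ChatForge's scraper, so treat its result as a hint rather than proof.

```python
import urllib.error
import urllib.request


def ensure_protocol(raw_url: str) -> str:
    """Return the URL with https:// prepended if no protocol is present."""
    url = raw_url.strip()
    if url.lower().startswith(("http://", "https://")):
        return url
    return "https://" + url.lstrip("/")


def describe_access(url: str, timeout: float = 10.0) -> str:
    """Fetch the URL once and summarize the outcome as a short string."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"reachable (HTTP {response.status})"
    except urllib.error.HTTPError as err:
        # 401 or 403 often means password protection; 503 often means
        # maintenance mode.
        return f"blocked or erroring (HTTP {err.code})"
    except OSError as err:
        # Covers DNS failures, refused connections, and timeouts.
        return f"unreachable ({err})"


print(ensure_protocol("example.com/pricing"))    # https://example.com/pricing
print(ensure_protocol(" https://example.com "))  # https://example.com
```

To test a live site, call describe_access(ensure_protocol("your-domain.com")) and read the result. An HTTP 401, 403, or 503 points to password protection or maintenance mode.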

💡 After any scrape, always review the Knowledge Base. Spend 5 minutes looking at what was imported. This is the fastest way to spot issues (missing pages, irrelevant content, or thin copy) before they affect the agent's answers.
