Website scraping is the fastest way to build your agent's knowledge base. In under a minute, ChatForge reads the public pages of your website and turns them into knowledge your agent can draw on when answering visitor questions.
What does "scraping" mean?
When you add a website URL, ChatForge's crawler visits each public page of your site, reads the text content, and stores it in your agent's knowledge base. It works like a search engine indexing your site, except that instead of powering search results, it powers intelligent conversation.
Pages it typically reads: Home, About, Services/Products, Pricing, FAQ, Contact, Blog posts.
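To make "reads the text content" more concrete, here is a minimal Python sketch of how visible text can be pulled out of a page's HTML. It is an illustration only, not ChatForge's actual crawler: the names `VisibleTextExtractor` and `extract_text` are hypothetical, and the sketch assumes the page's HTML has already been downloaded.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects readable text and image alt text, skipping scripts and styles."""

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a tag whose content is ignored
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "img":
            # Images themselves carry no text; keep only the alt attribute.
            alt = dict(attrs).get("alt")
            if alt:
                self.chunks.append(alt.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Return the visible text of one HTML page as a single string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical sample page, used only for illustration.
sample_page = (
    "<html><head><style>p{margin:0}</style></head><body>"
    "<h1>Our services</h1><p>We repair bikes.</p>"
    '<img src="shop.png" alt="Workshop photo">'
    "<script>var tracking = 1;</script></body></html>"
)
print(extract_text(sample_page))
```

A plain HTML parser like this one never runs scripts, so text that a page adds with JavaScript after loading would not show up in the output.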
How to add your website
What gets scraped (and what doesn't)
| Included | Not included |
|---|---|
| ✓ All text content on public pages | ✗ Pages behind a login |
| ✓ Service/product descriptions | ✗ PDF content (upload separately) |
| ✓ FAQ sections | ✗ Image content (only alt text is read) |
| ✓ Blog posts & articles | ✗ Dynamic content loaded by JavaScript |
| ✓ Contact and about pages | ✗ Pages blocked by robots.txt |
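If you are unsure whether robots.txt is blocking a page, you can check it yourself. The sketch below uses Python's standard `urllib.robotparser` module to test a URL against the text of a robots.txt file. The helper name `is_allowed`, the sample rules, and the `example.com` URLs are assumptions for illustration; the user-agent name ChatForge's crawler sends is not stated here, so the sketch checks the generic `*` rules.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.com/pricing"))
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/admin/users"))
```

To check your own site, paste the contents of your `/robots.txt` file in place of `SAMPLE_ROBOTS` and pass the URL of the page you expect to be scraped.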
Tips for best scraping results
- Make sure your website has real content. If your pages say "Coming soon" or have very thin text, the agent will have little to work with.
- Have a dedicated FAQ page. This is the single most valuable page for your agent. If you don't have one, create it.
- Include pricing. "Contact us for pricing" is the number one reason agents fail. Put real numbers on your site.
- Re-scrape when you update your site. The agent doesn't update automatically, so re-run the scrape any time your content changes significantly.
Adding specific pages manually
If the crawler misses an important page, you can add it manually by entering the full URL of that specific page. This is useful for deep-linked pages that aren't in your main navigation.
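A "full URL" here means one that includes the scheme (`https://`) and the domain, not just a path such as `/services/repair`. As a rough illustration, the Python sketch below checks that shape with the standard `urllib.parse` module. The function name `is_full_url` and the example addresses are hypothetical, and ChatForge's own input validation may differ.

```python
from urllib.parse import urlsplit


def is_full_url(candidate: str) -> bool:
    """Return True if candidate has an http(s) scheme and a domain."""
    parts = urlsplit(candidate.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical examples, used only for illustration.
print(is_full_url("https://example.com/services/emergency-repair"))
print(is_full_url("/services/emergency-repair"))
print(is_full_url("example.com/services/emergency-repair"))
```

Only the first example passes: the second is a path without a domain, and the third is missing the scheme.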