
# Website

Connect your agent with your website so that it can crawl your site pages and use them as knowledge.

Your agent can then link to specific URLs, walk users through troubleshooting, or recommend product pages.

Note: Automatic syncing of your website pages is available only on paid plans, upon request.

## Setting up the crawler

1. Go to **Integrations > Website** (or use Quick Start when creating an agent)
2. Enter your website URL (e.g., `https://www.yourcompany.com`)
3. Optionally configure path filters
4. Click **Connect** to start crawling

### Prerequisites

* A publicly accessible website
* A `robots.txt` that allows crawling (the crawler respects `robots.txt`)

## How it works

The crawler:

1. Starts at the URL you provide
2. Follows links to discover pages
3. Extracts text content from each page
4. Indexes the content for your agent to search
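
The four steps above can be sketched as a breadth-first crawl. This is an illustrative sketch only, not the product's actual implementation: the `crawl` function, the `LinkAndTextParser` class, the `fetch` callable, and the choice to stay on the starting host are all assumptions made for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkAndTextParser(HTMLParser):
    """Collects href values from <a> tags and text data from a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl from start_url; returns {url: extracted_text}.

    `fetch` is a hypothetical callable that maps a URL to an HTML string,
    or None when the page is unavailable.
    """
    site = urlparse(start_url).netloc  # assumption: stay on the starting host
    queue = deque([start_url])         # step 1: start at the URL you provide
    seen = {start_url}
    index = {}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        # Steps 3 and 4: extract the text and store it under the page URL.
        index[url] = " ".join(parser.text_parts)
        # Step 2: follow links, resolving relative hrefs and dropping #fragments.
        for href in parser.links:
            link, _ = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return index
```

In this sketch a page that no other page links to never enters the queue, and `max_pages` stops the crawl once the limit is reached.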

### Crawl limits

| Plan           | Max pages |
| -------------- | --------- |
| **Trial**      | 100 pages |
| **Paid plans** | 500 pages |

## Configuring path filters

Use include and exclude paths to control what gets crawled:

**Include paths** — Only crawl pages matching these paths

* Example: `/help`, `/docs`, `/support`

**Exclude paths** — Skip pages matching these paths

* Example: `/blog`, `/careers`, `/pricing`

This is useful for large websites where you only want your agent to learn from specific sections.
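
To make the effect of include and exclude paths concrete, here is a minimal sketch of one way such filters could be evaluated. The matching rules are assumptions: this page does not say whether paths are matched as prefixes or whether exclude rules take precedence, and `should_crawl` and `path_matches` are hypothetical names.

```python
from urllib.parse import urlparse


def path_matches(path, prefix):
    """True if `path` equals `prefix` or sits beneath it (segment-aware)."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def should_crawl(url, include_paths=(), exclude_paths=()):
    """Apply exclude rules first, then include rules (assumed precedence)."""
    path = urlparse(url).path or "/"
    if any(path_matches(path, p) for p in exclude_paths):
        return False
    if include_paths:
        # With include paths set, only matching pages are crawled.
        return any(path_matches(path, p) for p in include_paths)
    return True
```

Under these assumptions, `should_crawl("https://www.yourcompany.com/help/reset", include_paths=["/help", "/docs"])` is true, while any URL under `/blog` is skipped once `/blog` is listed in `exclude_paths`.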

## Sync frequency and updates

* The crawler periodically re-crawls your site to pick up changes
* You can manually trigger a re-crawl from the integration settings
* New pages linked from existing pages are discovered automatically

## Tips

**Use path filters on large sites.** If your site has thousands of pages, focus the crawler on help and documentation sections.

**Crawl your help center.** If your help center is on a subdomain (like `help.yourcompany.com`), enter that URL directly.

**Combine with other sources.** The website crawler is great for getting started quickly, but add help center articles, documents, and past tickets for comprehensive coverage.

**Check what got crawled.** After the crawl completes, browse the indexed pages in your Files tab to verify the right content was picked up.

## Troubleshooting

**Crawler not finding pages?**

* Make sure pages are publicly accessible (not behind a login)
* Check that `robots.txt` isn't blocking the crawler
* Verify that pages are linked from the starting URL (orphan pages won't be found)
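
To check the `robots.txt` point yourself, you can test a URL against your rules with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the crawler's real user agent is not given on this page, so `ExampleCrawler` is a placeholder, and `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="ExampleCrawler"):
    """Return True if the given robots.txt text permits fetching `url`.

    `user_agent` is a placeholder; substitute the crawler's actual name.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Pass the contents of your site's `robots.txt` as `robots_txt`. If the function returns false for a page you expect to be indexed, a `Disallow` rule is the likely cause.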

**Too many irrelevant pages crawled?**

* Add exclude paths for sections you don't want (blog, careers, etc.)
* Use include paths to restrict crawling to specific sections

