Docusaurus
Sync your Docusaurus documentation site into AI SmartTalk's knowledge base. Your AI will learn from every page in your sitemap—perfect for support bots that answer technical questions.
Overview
The Docusaurus integration enables you to:
- Import all pages from your sitemap automatically
- Keep docs synced when you publish updates
- Answer questions about your documentation conversationally
- Reduce support load by letting AI handle common questions
Fun fact: AI SmartTalk's own documentation uses Docusaurus, and this integration powers our support chatbot!
Prerequisites
Before you begin, ensure you have:
- An active AI SmartTalk account
- A Docusaurus site with a valid sitemap.xml
- A publicly accessible site (or authentication credentials you can provide)
Step-by-Step Setup
Step 1: Locate Your Sitemap
Docusaurus automatically generates a sitemap. Find it at:
https://your-docs-site.com/sitemap.xml
Verify it loads in your browser and contains your documentation pages.
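If you prefer to check the sitemap from a script rather than the browser, the following is a minimal sketch. It assumes Node.js 18 or later (for the built-in fetch); the function names extractSitemapUrls and checkSitemap are illustrative and not part of AI SmartTalk or Docusaurus.

```javascript
// Extract every <loc> value from sitemap XML.
// A regex is enough for a quick sanity check; use a real XML parser for anything more.
function extractSitemapUrls(xml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  for (const match of xml.matchAll(locPattern)) {
    urls.push(match[1]);
  }
  return urls;
}

// Fetch the sitemap and report how many page URLs it lists.
// Requires Node.js 18+ for the built-in fetch.
async function checkSitemap(sitemapUrl) {
  const response = await fetch(sitemapUrl);
  if (!response.ok) {
    throw new Error(`Sitemap request failed with HTTP ${response.status}`);
  }
  const urls = extractSitemapUrls(await response.text());
  console.log(`${urls.length} URLs found in ${sitemapUrl}`);
  return urls;
}

// Example: checkSitemap('https://your-docs-site.com/sitemap.xml');
```

A count of zero, or a count far below your number of pages, suggests the sitemap is incomplete and worth fixing before you connect the integration.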
Step 2: Add the Docusaurus Integration
- Log into your AI SmartTalk account
- Navigate to Settings → Integrations
- Find Docusaurus and click Connect
- Enter your sitemap URL
- Click Validate
Step 3: Configure Import Settings
After validation, configure your import:
| Setting | Description |
|---|---|
| Sitemap URL | Full URL to your sitemap.xml |
| Include patterns | Only sync pages matching patterns (optional) |
| Exclude patterns | Skip specific pages or sections (optional) |
Step 4: Start the Import
- Click Import Pages
- AI SmartTalk crawls each URL in your sitemap
- Content is extracted and added to your knowledge base
- Wait for the import to complete (progress shown)
Step 5: Verify the Import
- Go to Knowledge in AI SmartTalk
- Your documentation pages should appear
- Test your AI by asking questions about your docs
What Gets Synced
| Content | How It's Processed |
|---|---|
| Page title | Used as document identifier |
| Page content | Full text extracted from HTML |
| Headings | Preserved for structure |
| Code blocks | Included as-is |
| Tables | Converted to readable format |
| URLs | Page URL stored for reference |
Content Extraction
AI SmartTalk extracts the main content area and ignores:
- Navigation menus
- Sidebars
- Footers
- Scripts and styles
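To get a feel for what this kind of extraction does, here is a rough sketch. It is not AI SmartTalk's extractor: the function name extractMainText is hypothetical, the element list assumes sidebars are rendered as aside elements, and regex-based HTML handling breaks on nested elements of the same type.

```javascript
// Rough sketch: drop non-content elements, then strip the remaining tags.
// Regex-based HTML handling is fragile; a production extractor would use a DOM parser.
function extractMainText(html) {
  // Assumption: sidebars are <aside> elements and menus are <nav> elements.
  const ignoredElements = ['script', 'style', 'nav', 'aside', 'footer'];
  let cleaned = html;
  for (const tag of ignoredElements) {
    const block = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?<\\/${tag}>`, 'gi');
    cleaned = cleaned.replace(block, ' ');
  }
  return cleaned
    .replace(/<[^>]+>/g, ' ') // remove the remaining tags
    .replace(/\s+/g, ' ')     // collapse whitespace
    .trim();
}
```

Running a published page through a function like this is a quick way to see whether important text lives inside an element that an extractor is likely to discard.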
Sync Behavior
Manual Import
Click Import in the integration settings to:
- Fetch the latest sitemap
- Add new pages
- Update changed pages
- Remove deleted pages
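The add, update, and remove steps above amount to a diff between the previous import and the current sitemap. The sketch below illustrates that idea only; planSync is a hypothetical name, and it assumes each page carries a fingerprint (for example a lastmod date or a content hash) that changes when the page changes. How AI SmartTalk actually detects changes is not documented here.

```javascript
// Compare the previous import with the current sitemap and decide what to do.
// Both arguments are Maps of page URL -> fingerprint (hypothetical shape).
function planSync(previous, current) {
  const plan = { added: [], updated: [], removed: [] };
  for (const [url, fingerprint] of current) {
    if (!previous.has(url)) {
      plan.added.push(url);
    } else if (previous.get(url) !== fingerprint) {
      plan.updated.push(url);
    }
  }
  for (const url of previous.keys()) {
    if (!current.has(url)) {
      plan.removed.push(url);
    }
  }
  return plan;
}

const previous = new Map([
  ['/docs/intro', '2024-01-10'],
  ['/docs/api', '2024-01-10'],
  ['/docs/legacy', '2023-06-01'],
]);
const current = new Map([
  ['/docs/intro', '2024-01-10'],
  ['/docs/api', '2024-02-02'],
  ['/docs/webhooks', '2024-02-02'],
]);

console.log(planSync(previous, current));
// added: ['/docs/webhooks'], updated: ['/docs/api'], removed: ['/docs/legacy']
```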
Keeping Docs Fresh
For always-current documentation:
- Manual refresh: Click Import after publishing updates
- Scheduled sync: Use SmartFlow to automate imports
SmartFlow Scheduled Import
Workflow: Docusaurus Auto-Sync
Trigger: Scheduled (Daily at 3:00 AM)
Actions:
  - Sync Connector:
      Type: Docusaurus
      Sitemap: https://docs.example.com/sitemap.xml
URL Patterns
Include Patterns
Only sync specific sections:
| Pattern | Effect |
|---|---|
| /docs/api/* | Only API documentation |
| /docs/guides/* | Only guides section |
| /blog/* | Only blog posts |
Exclude Patterns
Skip certain pages:
| Pattern | Effect |
|---|---|
| /docs/internal/* | Skip internal docs |
| /changelog | Skip changelog page |
| */draft-* | Skip draft pages |
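To preview which pages a set of patterns would select before you import, you could model the matching locally. The sketch below rests on several assumptions that you should verify against the product: patterns are matched against the URL path, * matches any run of characters (including /), and an exclude match overrides an include match. The names globToRegExp and shouldSync are illustrative.

```javascript
// Convert a glob-style pattern to an anchored regular expression.
// Assumption: "*" matches any run of characters, including "/".
function globToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// A page is synced when it matches at least one include pattern
// (or no include patterns are set) and matches no exclude pattern.
function shouldSync(path, include = [], exclude = []) {
  const matches = (patterns) => patterns.some((p) => globToRegExp(p).test(path));
  const included = include.length === 0 || matches(include);
  return included && !matches(exclude);
}

console.log(shouldSync('/docs/api/users', ['/docs/api/*']));      // true
console.log(shouldSync('/docs/guides/draft-v2', [], ['*/draft-*'])); // false
```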
Use Cases
Technical Support Bot
Sync your product documentation:
- "How do I install the SDK?"
- "What are the API rate limits?"
- "Show me an example of authentication"
Developer Documentation
Sync API references and guides:
- "What parameters does the /users endpoint accept?"
- "How do I handle webhooks?"
- "What's the difference between v1 and v2 API?"
Internal Knowledge Base
Sync company wikis and procedures:
- "What's the process for requesting PTO?"
- "How do I set up my development environment?"
- "Where do I find the brand guidelines?"
Troubleshooting
Sitemap Issues
| Issue | Solution |
|---|---|
| "Invalid sitemap" | Verify URL returns valid XML |
| "No pages found" | Check sitemap contains <url> entries |
| "Access denied" | Ensure sitemap is publicly accessible |
Import Issues
| Issue | Solution |
|---|---|
| Pages missing | Check include/exclude patterns |
| Import stuck | Large sites take time; wait or import in batches |
| Old content | Re-import to fetch latest versions |
Content Quality
| Issue | Solution |
|---|---|
| Wrong content extracted | Report issue—may need custom extraction |
| Missing code blocks | Verify code is in standard <pre><code> tags |
| Garbled text | Check page encoding (UTF-8 recommended) |
Managing the Integration
| Action | How |
|---|---|
| Re-import all | Click Import in integration settings |
| Change sitemap | Update URL and re-import |
| Remove content | Disconnect integration or delete from Knowledge |
| Disconnect | Settings → Integrations → Docusaurus → Disconnect |
Best Practices
- Quality content: Well-written docs = better AI answers
- Clear structure: Use headings, lists, and tables
- Descriptive titles: Page titles help AI understand context
- Regular syncs: Keep AI updated with latest documentation
- Test thoroughly: Ask common questions to verify AI accuracy
Docusaurus Configuration Tips
Optimize for AI Extraction
In your docusaurus.config.js:
module.exports = {
  // The sitemap plugin ships with @docusaurus/preset-classic.
  // List it here only if you are not using that preset.
  plugins: ['@docusaurus/plugin-sitemap'],
  // Use a descriptive site title
  title: 'Your Product Docs',
  // Include metadata
  themeConfig: {
    metadata: [{
      name: 'description',
      content: 'Documentation for Your Product',
    }],
  },
};
Exclude Pages from Sitemap
To prevent certain pages from being synced:
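In the page's front matter:
---
title: Internal Page
sitemap:
  exclude: true
---
Confirm that your Docusaurus version honors this key by checking the generated sitemap.xml after a build. If the page still appears, exclude it with the sitemap plugin's ignorePatterns option or with an exclude pattern in AI SmartTalk.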
Related Documentation
- Integrations Overview
- Knowledge Base Management
- RSS Feed Integration — For blog/news content
- SmartFlow Scheduled Triggers — Automate imports