Docusaurus

Sync your Docusaurus documentation site into AI SmartTalk's knowledge base. Your AI learns from every page in your sitemap, which makes it a good fit for support bots that answer technical questions.


Overview

The Docusaurus integration enables you to:

  • Import all pages from your sitemap automatically
  • Keep docs synced when you publish updates
  • Answer questions about your documentation conversationally
  • Reduce support load by letting AI handle common questions

Fun fact: AI SmartTalk's own documentation uses Docusaurus, and this integration powers our support chatbot!


Prerequisites

Before you begin, ensure you have:

  • An active AI SmartTalk account
  • A Docusaurus site with a valid sitemap.xml
  • A publicly accessible site (or authentication details if it is protected)

Step-by-Step Setup

Step 1: Locate Your Sitemap

Docusaurus automatically generates a sitemap. Find it at:

https://your-docs-site.com/sitemap.xml

Verify it loads in your browser and contains your documentation pages.
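If you prefer to check the sitemap from a script rather than the browser, the following is a minimal sketch for Node.js 18 or later. The function names `extractSitemapUrls` and `checkSitemap` and the sample URLs are illustrative assumptions, not part of AI SmartTalk or Docusaurus.

```javascript
// Sketch: list the page URLs found in a sitemap (Node.js 18+ for global fetch).

// Pull every <loc> value out of sitemap XML. A regular expression is enough
// for a quick sanity check; use a real XML parser for anything more serious.
function extractSitemapUrls(xml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = locPattern.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Download a sitemap and return its URLs; throws on a non-2xx response.
async function checkSitemap(sitemapUrl) {
  const response = await fetch(sitemapUrl);
  if (!response.ok) {
    throw new Error(`Sitemap request failed with HTTP ${response.status}`);
  }
  return extractSitemapUrls(await response.text());
}

// Example with an inline sample instead of a network call.
const sampleXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://docs.example.com/docs/intro</loc></url>
  <url><loc>https://docs.example.com/docs/api/users</loc></url>
</urlset>`;
const sampleUrls = extractSitemapUrls(sampleXml);
console.log(sampleUrls.length); // 2
```

To run it against a live site, call `checkSitemap('https://your-docs-site.com/sitemap.xml')` and confirm that the returned list contains your documentation pages.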

Step 2: Add the Docusaurus Integration

  1. Log into your AI SmartTalk account
  2. Navigate to Settings → Integrations
  3. Find Docusaurus and click Connect
  4. Enter your sitemap URL
  5. Click Validate

Step 3: Configure Import Settings

After validation, configure your import:

| Setting | Description |
| --- | --- |
| Sitemap URL | Full URL to your sitemap.xml |
| Include patterns | Only sync pages matching patterns (optional) |
| Exclude patterns | Skip specific pages or sections (optional) |

Step 4: Start the Import

  1. Click Import Pages
  2. AI SmartTalk crawls each URL in your sitemap
  3. Content is extracted and added to your knowledge base
  4. Wait for the import to complete (progress shown)

Step 5: Verify the Import

  1. Go to Knowledge in AI SmartTalk
  2. Your documentation pages should appear
  3. Test your AI by asking questions about your docs

What Gets Synced

| Content | How It's Processed |
| --- | --- |
| Page title | Used as document identifier |
| Page content | Full text extracted from HTML |
| Headings | Preserved for structure |
| Code blocks | Included as-is |
| Tables | Converted to readable format |
| URLs | Page URL stored for reference |

Content Extraction

AI SmartTalk extracts the main content area and ignores:

  • Navigation menus
  • Sidebars
  • Footers
  • Scripts and styles
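To get a feel for what "main content only" means, here is a rough sketch that approximates this kind of extraction with regular expressions. It is not AI SmartTalk's extractor: the function names, the choice of tags, and the reliance on a `<main>` element are all assumptions, and it only handles well-formed HTML without nested `<nav>`, `<aside>`, or `<footer>` elements.

```javascript
// Sketch: approximate "main content only" extraction from an HTML string.

// Remove whole elements (opening tag, content, closing tag) by tag name.
function removeElements(html, tagNames) {
  let result = html;
  for (const tag of tagNames) {
    const pattern = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?<\\/${tag}>`, 'gi');
    result = result.replace(pattern, '');
  }
  return result;
}

function extractMainText(html) {
  const cleaned = removeElements(html, ['script', 'style', 'nav', 'aside', 'footer']);
  // Prefer the <main> element when present, otherwise use the whole page.
  const mainMatch = cleaned.match(/<main\b[^>]*>([\s\S]*?)<\/main>/i);
  const body = mainMatch ? mainMatch[1] : cleaned;
  // Drop the remaining tags and collapse whitespace.
  return body.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();
}

const samplePage = `
<html><head><style>body { margin: 0; }</style></head>
<body>
  <nav><a href="/">Home</a></nav>
  <main><h1>Install the SDK</h1><p>Run the installer.</p></main>
  <aside>On this page</aside>
  <footer>Copyright</footer>
  <script>console.log('tracking');</script>
</body></html>`;

console.log(extractMainText(samplePage)); // Install the SDK Run the installer.
```

Running a check like this on one of your own pages can help explain why a sidebar label or footer link does or does not show up in the AI's answers.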

Sync Behavior

Manual Import

Click Import in the integration settings to:

  • Fetch the latest sitemap
  • Add new pages
  • Update changed pages
  • Remove deleted pages
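The add, update, and remove steps amount to comparing the previous import with the current sitemap. The sketch below shows one way to compute that difference. It assumes each page has a fingerprint, such as a content hash or the sitemap's `<lastmod>` value. The function `planSync` and the sample data are hypothetical, and how AI SmartTalk actually detects changes is not documented here.

```javascript
// Sketch: classify pages for a re-import by comparing two snapshots.
// Each snapshot is a Map from page URL to a fingerprint string.

function planSync(previous, current) {
  const plan = { added: [], updated: [], removed: [] };
  for (const [url, fingerprint] of current) {
    if (!previous.has(url)) {
      plan.added.push(url); // new page in the sitemap
    } else if (previous.get(url) !== fingerprint) {
      plan.updated.push(url); // same URL, different content
    }
  }
  for (const url of previous.keys()) {
    if (!current.has(url)) {
      plan.removed.push(url); // page no longer in the sitemap
    }
  }
  return plan;
}

const previousSnapshot = new Map([
  ['/docs/intro', 'a1'],
  ['/docs/setup', 'b1'],
  ['/docs/legacy', 'c1'],
]);
const currentSnapshot = new Map([
  ['/docs/intro', 'a1'],
  ['/docs/setup', 'b2'],
  ['/docs/webhooks', 'd1'],
]);

const syncPlan = planSync(previousSnapshot, currentSnapshot);
console.log(syncPlan);
```

With the sample data, `/docs/webhooks` is added, `/docs/setup` is updated, and `/docs/legacy` is removed, while the unchanged `/docs/intro` needs no work.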

Keeping Docs Fresh

For always-current documentation:

  1. Manual refresh: Click Import after publishing updates
  2. Scheduled sync: Use SmartFlow to automate imports

SmartFlow Scheduled Import

Workflow: Docusaurus Auto-Sync
Trigger: Scheduled (Daily at 3:00 AM)
Actions:
  - Sync Connector:
      Type: Docusaurus
      Sitemap: https://docs.example.com/sitemap.xml

URL Patterns

Include Patterns

Only sync specific sections:

| Pattern | Effect |
| --- | --- |
| /docs/api/* | Only API documentation |
| /docs/guides/* | Only guides section |
| /blog/* | Only blog posts |

Exclude Patterns

Skip certain pages:

| Pattern | Effect |
| --- | --- |
| /docs/internal/* | Skip internal docs |
| /changelog | Skip changelog page |
| */draft-* | Skip draft pages |
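To preview which pages a set of patterns would select before you import, you can model the filter locally. The sketch below assumes that `*` matches any run of characters, that a pattern must match the whole URL path, that an empty include list means "include everything", and that exclude patterns win over include patterns. These matching rules are assumptions for illustration; the integration's exact semantics may differ.

```javascript
// Sketch: include/exclude filtering with "*" as a wildcard.

// Convert a pattern such as "/docs/api/*" into an anchored regular expression.
function patternToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// Decide whether a URL path would be synced under the assumed rules.
function shouldSync(path, includePatterns, excludePatterns) {
  const matches = (patterns) => patterns.some((p) => patternToRegExp(p).test(path));
  const included = includePatterns.length === 0 || matches(includePatterns);
  return included && !matches(excludePatterns);
}

const include = ['/docs/api/*', '/docs/guides/*'];
const exclude = ['/docs/internal/*', '/changelog', '*/draft-*'];

console.log(shouldSync('/docs/api/users', include, exclude)); // true
console.log(shouldSync('/docs/guides/draft-setup', include, exclude)); // false
console.log(shouldSync('/blog/release-notes', include, exclude)); // false
```

The second path is rejected by the `*/draft-*` exclude pattern even though it matches an include pattern, and the third is rejected because it matches no include pattern at all.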

Use Cases

Technical Support Bot

Sync your product documentation:

  • "How do I install the SDK?"
  • "What are the API rate limits?"
  • "Show me an example of authentication"

Developer Documentation

Sync API references and guides:

  • "What parameters does the /users endpoint accept?"
  • "How do I handle webhooks?"
  • "What's the difference between v1 and v2 API?"

Internal Knowledge Base

Sync company wikis and procedures:

  • "What's the process for requesting PTO?"
  • "How do I set up my development environment?"
  • "Where do I find the brand guidelines?"

Troubleshooting

Sitemap Issues

| Issue | Solution |
| --- | --- |
| "Invalid sitemap" | Verify URL returns valid XML |
| "No pages found" | Check sitemap contains <url> entries |
| "Access denied" | Ensure sitemap is publicly accessible |

Import Issues

| Issue | Solution |
| --- | --- |
| Pages missing | Check include/exclude patterns |
| Import stuck | Large sites take time; wait or import in batches |
| Old content | Re-import to fetch latest versions |

Content Quality

| Issue | Solution |
| --- | --- |
| Wrong content extracted | Report the issue; custom extraction may be needed |
| Missing code blocks | Verify code is in standard <pre><code> tags |
| Garbled text | Check page encoding (UTF-8 recommended) |
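Several of the checks in the troubleshooting tables can be scripted before you contact support. The sketch below is one possible way to do that. The functions `diagnoseSitemap` and `checkPage`, the HTTP status codes they inspect, and the mapping to the issue names are assumptions for illustration, not AI SmartTalk's actual validation logic.

```javascript
// Sketch: local checks that mirror the troubleshooting tables.

// Classify a sitemap response given its HTTP status and body text.
function diagnoseSitemap(status, body) {
  if (status === 401 || status === 403) {
    return 'Access denied';
  }
  const text = body.trim();
  const looksLikeXml = text.startsWith('<?xml') || text.startsWith('<urlset');
  if (status !== 200 || !looksLikeXml) {
    return 'Invalid sitemap'; // for example, an HTML error page
  }
  if (!/<url>[\s\S]*?<\/url>/.test(text)) {
    return 'No pages found';
  }
  return 'OK';
}

// Flag page-level problems that can hurt extraction quality.
function checkPage(html, contentType) {
  const warnings = [];
  // "Garbled text": look for a UTF-8 charset in the header or a <meta> tag.
  const declaresUtf8 =
    /charset=["']?utf-8/i.test(contentType) ||
    /<meta[^>]+charset=["']?utf-8/i.test(html);
  if (!declaresUtf8) {
    warnings.push('No UTF-8 charset declared');
  }
  // "Missing code blocks": look for <pre> elements without a nested <code>.
  const preBlocks = html.match(/<pre\b[^>]*>[\s\S]*?<\/pre>/gi) || [];
  if (preBlocks.some((block) => !/<code\b/i.test(block))) {
    warnings.push('Found <pre> without <code>');
  }
  return warnings;
}

console.log(diagnoseSitemap(200, '<!DOCTYPE html><html></html>')); // Invalid sitemap
console.log(diagnoseSitemap(200, '<urlset></urlset>')); // No pages found
console.log(checkPage('<pre>npm install</pre>', 'text/html; charset=ISO-8859-1').length); // 2
```

The first example covers a common case: a site that answers a missing sitemap.xml with an HTML page, which is not valid sitemap XML.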

Managing the Integration

| Action | How |
| --- | --- |
| Re-import all | Click Import in integration settings |
| Change sitemap | Update URL and re-import |
| Remove content | Disconnect integration or delete from Knowledge |
| Disconnect | Settings → Integrations → Docusaurus → Disconnect |

Best Practices

  1. Quality content: Well-written docs = better AI answers
  2. Clear structure: Use headings, lists, and tables
  3. Descriptive titles: Page titles help AI understand context
  4. Regular syncs: Keep AI updated with latest documentation
  5. Test thoroughly: Ask common questions to verify AI accuracy

Docusaurus Configuration Tips

Optimize for AI Extraction

In your docusaurus.config.js:

module.exports = {
  // The sitemap plugin is already part of @docusaurus/preset-classic;
  // list it here only if your site does not use that preset
  plugins: ['@docusaurus/plugin-sitemap'],

  // Use a descriptive site title
  title: 'Your Product Docs',

  // Include metadata
  themeConfig: {
    metadata: [{
      name: 'description',
      content: 'Documentation for Your Product'
    }],
  },
};

Exclude Pages from Sitemap

To prevent certain pages from being synced, add this to the page front matter:

---
title: Internal Page
sitemap:
  exclude: true
---
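Another option is to leave routes out of the sitemap at the site level with the sitemap plugin's `ignorePatterns` option. The sketch below assumes your site uses `@docusaurus/preset-classic`; the `url` value and the paths in the patterns are placeholders to adapt to your own site.

```javascript
// docusaurus.config.js (sketch, assuming @docusaurus/preset-classic)
const config = {
  title: 'Your Product Docs',
  url: 'https://docs.example.com', // placeholder site URL
  baseUrl: '/',
  presets: [
    [
      '@docusaurus/preset-classic',
      {
        sitemap: {
          // Routes matching these globs are left out of sitemap.xml.
          // The paths are placeholders; adjust them to your site.
          ignorePatterns: ['/docs/internal/**', '/changelog'],
        },
      },
    ],
  ],
};

module.exports = config;
```

After rebuilding the site, reload sitemap.xml and confirm the excluded routes are gone before you re-import.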
