Confluence
Overview
The Confluence Plugin for Omnata enables ingestion of Confluence data to Snowflake. It supports inbound syncs from Confluence Cloud with multiple data streams including pages, spaces, blog posts, comments, attachments, labels, and users.
Authentication
Atlassian Cloud, API Token
To set up authentication:
Create an API token
Use it with your email address (typically your Atlassian account email)
Configuration Fields
Confluence Domain Name: site-name.atlassian.net or confluence.my-custom-domain.com
User Email Address: The email address associated with your API token
API Token: Your generated Atlassian API token (stored securely)
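As a rough illustration of how these three fields fit together, the sketch below builds the HTTP Basic authorization header that Atlassian Cloud accepts for API tokens (email as username, token as password). The function names are hypothetical, and the /wiki/rest/api/space path is an assumption about a typical Confluence Cloud site; this is not necessarily how the plugin itself connects.

```python
import base64


def basic_auth_header(email: str, api_token: str) -> dict:
    """Build a Basic auth header from an Atlassian account email and API token."""
    raw = f"{email}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def spaces_url(domain: str) -> str:
    """Return a URL that can be used to smoke-test credentials.

    Assumption: the site exposes the Confluence REST API under /wiki,
    which is the usual layout for site-name.atlassian.net domains.
    """
    return f"https://{domain}/wiki/rest/api/space?limit=1"
```

Sending a GET request to the URL with this header and receiving a 200 response would suggest that the domain, email address, and token are consistent with each other.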
Inbound Syncs
The plugin supports the following streams for inbound syncs:
Pages
Sync Strategies: Full Refresh, Incremental
Primary Key: id
Cursor: lastModifiedAt (for incremental sync)
Features:
Confluence Query Language (CQL) filtering support
Full and incremental sync support
Supports both Confluence API v1 and v2 formats
Extracts page body, version history, timestamps, and author information
Spaces
Sync Strategies: Full Refresh
Primary Key: id
Features:
Syncs all spaces (global and personal)
Extracts space metadata, description, and homepage information
Blog Posts
Sync Strategies: Full Refresh, Incremental
Primary Key: id
Cursor: createdAt
Comments
Sync Strategies: Full Refresh
Primary Key: id
Features:
Syncs inline comments and footer comments from pages
Fetches nested comment replies
Captures comment metadata and author information
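To make the idea of nested replies concrete, here is a minimal sketch of flattening a comment tree into rows that each point at their parent. The input shape (dicts with id, body, and replies keys) is purely hypothetical and is not the plugin's actual record layout.

```python
def flatten_comments(comments, parent_id=None):
    """Flatten a nested comment tree into a list of rows, depth-first.

    Assumption: each comment is a dict with "id", an optional "body",
    and an optional "replies" list holding child comments.
    """
    rows = []
    for comment in comments:
        rows.append({
            "id": comment["id"],
            "parent_id": parent_id,
            "body": comment.get("body"),
        })
        # Recurse into replies, recording this comment as their parent.
        rows.extend(flatten_comments(comment.get("replies", []), parent_id=comment["id"]))
    return rows
```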
Attachments
Sync Strategies: Full Refresh
Primary Key: id
Features:
Syncs file attachments from pages
Includes file metadata and download links
Labels
Sync Strategies: Full Refresh
Primary Key: id
Features:
Syncs all labels in the Confluence instance
Includes label metadata and usage information
Users
Sync Strategies: Full Refresh
Primary Key: id
Features:
Syncs user profiles from Confluence
Captures user details and account information
Pages Content
Sync Strategies: Full Refresh
Primary Key: id
Features:
Syncs detailed page body content in storage format
Useful for full-text search and content analysis
Configuration Parameters
CQL Clause for Pages
Filter pages using Confluence Query Language (optional):
Parameter: cql_pages_clause
Examples:
Notes:
Do not include ORDER BY clauses - Omnata controls ordering for pagination
Leave empty to sync all pages (default behaviour)
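The sketch below shows what a few cql_pages_clause values might look like, along with a simple pre-flight check for the ORDER BY restriction. The space keys, label, and helper name are hypothetical placeholders, and the check is a naive pattern match rather than the plugin's own validation; it would also flag the words "order by" appearing inside a quoted search term.

```python
import re

# Hypothetical example clauses; space keys and labels are placeholders.
EXAMPLE_CLAUSES = [
    'space = "ENG"',
    'label = "public"',
    'space in ("ENG", "OPS") and title ~ "runbook"',
]

_ORDER_BY = re.compile(r"\border\s+by\b", re.IGNORECASE)


def check_cql_pages_clause(clause: str) -> str:
    """Return the trimmed clause, or raise if it contains ORDER BY.

    An empty string is allowed and means "sync all pages".
    """
    clause = clause.strip()
    if _ORDER_BY.search(clause):
        raise ValueError("cql_pages_clause must not contain ORDER BY")
    return clause
```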
Performance Notes
Initial Sync
The initial sync may take significant time depending on the volume of data:
Pages & Comments: Comment fetching may add overhead as each page may require individual API calls to fetch related comments
Attachments: Fetching attachment data for large files may increase sync duration
Consider increasing sync timeout settings for large instances
Incremental Syncs
Incremental syncs use modification timestamps and are more efficient:
Only fetches pages modified since the last sync
Uses the lastModifiedAt cursor field
Duplicate records on time boundaries are filtered client-side
Recommended for regular, frequent syncs
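One way to picture the boundary filtering described above: if the API is queried for records at or after the stored cursor, records stamped exactly at the cursor come back again and need to be dropped when they were already synced. The following is a minimal sketch under that assumption; the function name, the seen_ids_at_cursor set, and the ISO 8601 timestamp format are illustrative and not taken from the plugin.

```python
from datetime import datetime


def filter_boundary_duplicates(records, cursor, seen_ids_at_cursor):
    """Keep records newer than the cursor, plus unseen records exactly at it.

    Assumptions: each record has "id" and an ISO 8601 "lastModifiedAt"
    string with an explicit UTC offset; cursor is a timezone-aware datetime.
    """
    fresh = []
    for rec in records:
        modified = datetime.fromisoformat(rec["lastModifiedAt"])
        if modified > cursor:
            fresh.append(rec)
        elif modified == cursor and rec["id"] not in seen_ids_at_cursor:
            # Same timestamp as the cursor but not synced last time.
            fresh.append(rec)
    return fresh
```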
API Rate Limiting
The plugin implements retry logic and rate limiting to respect Confluence API limits:
Automatic retries with exponential backoff
Respects API rate limit headers
May require longer sync times during high load periods
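The retry behaviour listed above can be sketched roughly as follows. This is a generic illustration of exponential backoff that prefers a Retry-After header when one is present, not the plugin's actual implementation; the function names, the retry limits, and the assumption that Retry-After carries a number of seconds are all hypothetical.

```python
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before the next attempt (attempt counts from 0)."""
    if retry_after is not None:
        # Assumption: the header holds a number of seconds, not an HTTP date.
        return min(float(retry_after), cap)
    return min(cap, base * 2 ** attempt)


def call_with_retries(request, max_attempts=5, sleep=time.sleep):
    """Call request() until it succeeds or attempts run out.

    request is a hypothetical callable returning (status, headers, body).
    HTTP 429 and 5xx responses are treated as retryable.
    """
    for attempt in range(max_attempts):
        status, headers, body = request()
        if status != 429 and status < 500:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise RuntimeError("request failed after retries")
```

Passing sleep as a parameter keeps the sketch easy to exercise without real waiting.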
Troubleshooting
Connection Issues
Verify the following:
Confluence domain name is correct
API token is valid and not expired
User email address matches the API token owner
User has permissions to access the Confluence instance
Missing Data
If expected data is not appearing:
Check CQL clause syntax (if using filters)
Verify user permissions for affected content
Ensure sufficient sync timeout for large datasets
Review error logs for specific failure details