Migrate Datasources to Indexing

Overview

What Changed: Datasources → Indexing

Before (Nexus - Datasources):

Datasources were standalone entities that created datasets
Limited to specific data types and configurations
Separate entity management outside of toolkits
Fixed chunking and processing options

Now (Next - Indexing):

Indexing is integrated directly into Toolkits
Standardized tools available across all supported platforms
Flexible collection management and naming
Advanced search capabilities with stepback search
Better chunking and metadata handling

The Deduplication feature available in Datasources is deprecated and has no equivalent in the new Indexing system. Use the search and filtering capabilities of the new indexing tools instead.

Migration Process Overview

Step 1: Identify Your Current Datasources

In the Nexus environment, review your existing datasources to understand:

What type of data each datasource indexes (Jira, Confluence, Files, etc.)
The scope and filters used in each datasource
Which agents or pipelines use these datasources
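
One way to capture this review is a small inventory record per datasource. The sketch below is a planning aid only: the DatasourceRecord class, its field names, and the sample entries are hypothetical and are not part of the Nexus or Next products.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DatasourceRecord:
    """One row of a hand-maintained migration inventory (hypothetical structure)."""
    name: str
    source_type: str  # e.g. "Jira", "Confluence", "File", "Table", "Git"
    scope: str        # scope or filters noted during the review
    used_by: list[str] = field(default_factory=list)  # agents or pipelines


def group_by_type(records):
    """Group datasource names by type, so each group can be planned against one toolkit."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.source_type].append(record.name)
    return dict(grouped)


# Sample entries are made up for illustration.
inventory = [
    DatasourceRecord("support-issues", "Jira", "project = SUP", ["Support Agent"]),
    DatasourceRecord("eng-wiki", "Confluence", "space ENG", ["Docs Agent"]),
    DatasourceRecord("bug-backlog", "Jira", "project = BUG"),
]

print(group_by_type(inventory))
```

Grouping by type shows how many toolkits you will need before you create any of them in Next.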

Step 2: Set Up Prerequisites in Next Environment

Before migrating, ensure you have:

Credentials: Migrate or recreate your credentials in the Next environment
Vector Storage: Configure PgVector in Settings → AI Configuration
Embedding Model: Select an embedding model in AI Configuration

Step 3: Create Corresponding Toolkits

Create the appropriate toolkits in the Next environment based on your datasource types:

File/Table → Artifact Toolkit or SharePoint Toolkit
Jira → Jira Toolkit
Confluence → Confluence Toolkit
Git → GitHub Toolkit
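
If you track the migration in a script or spreadsheet export, the mapping above can be written as a lookup table. This is a minimal sketch; the names TOOLKIT_OPTIONS and toolkits_for are illustrative and do not correspond to any platform API.

```python
# Mapping copied from the list above; File/Table datasources have two possible targets.
TOOLKIT_OPTIONS = {
    "File": ["Artifact Toolkit", "SharePoint Toolkit"],
    "Table": ["Artifact Toolkit", "SharePoint Toolkit"],
    "Jira": ["Jira Toolkit"],
    "Confluence": ["Confluence Toolkit"],
    "Git": ["GitHub Toolkit"],
}


def toolkits_for(datasource_type: str) -> list[str]:
    """Return the candidate toolkits for a datasource type, or raise if it is unmapped."""
    try:
        return TOOLKIT_OPTIONS[datasource_type]
    except KeyError:
        raise ValueError(
            f"No toolkit mapping for datasource type: {datasource_type!r}"
        ) from None


print(toolkits_for("Jira"))
print(toolkits_for("File"))
```

Raising on an unmapped type makes gaps in the plan visible early instead of silently skipping a datasource.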

Step 4: Index Your Data

Use the Index Data tool in each toolkit to recreate your indexed content with improved capabilities.

Migration Examples by Datasource Type

File/Table Datasources → Artifacts or SharePoint Indexing

What you had in Nexus:

File datasources that indexed documents, spreadsheets, or text files
Table datasources for structured data or spreadsheets
Basic file content extraction and tabular data indexing

What to do in Next:

Option A: Use SharePoint Indexing

Upload files to SharePoint (if not already there):
- Upload your files (documents, Excel, CSV, etc.) to your SharePoint site/library
- Ensure files are accessible with your SharePoint credentials
Create SharePoint Credential:
- Create credential in Credentials → + Create → SharePoint
Create SharePoint Toolkit with your SharePoint credentials
Index SharePoint files containing your data:
- Use the Index Data tool and specify indexing parameters (e.g., collection suffix, file type filters)
Advanced Search Capabilities:
- Use Stepback Search Index for complex document queries
- Search across multiple SharePoint libraries with semantic understanding

Option B: Use Artifacts Indexing (for local files)

Upload files to Artifacts:
- Navigate to Artifacts → + Create bucket
- Upload your files to the artifact bucket
Create Artifact Toolkit:
- Navigate to Toolkits → + Create → Artifact
- Specify the bucket name containing your files
- Enable indexing tools: Index Data, Search Index, Stepback Search Index, Stepback Summary Index, List Collections, Remove Index
Index your artifact data:
- Use the Index Data tool and specify indexing parameters (e.g., collection suffix, chunking method, clean index option, file filters)
Advanced Search Capabilities:
- Use Stepback Search Index for complex document queries
- Search across multiple artifact buckets with semantic understanding

For complete step-by-step instructions, see:

Index SharePoint Data
Index Artifacts Data

Jira Datasources → Jira Indexing

What you had in Nexus:

Jira datasources that indexed issues, stories, comments, and attachments
Project-based or JQL-filtered content

What to do in Next:

Create Jira Credential:
- Use your existing Jira API token (or generate a new one if needed)
- Create credential in Credentials → + Create → Jira
Create Jira Toolkit:
- Navigate to Toolkits → + Create → Jira
- Configure with your Jira instance URL and credentials
- Enable indexing tools: Index Data, Search Index, Stepback Search Index, Stepback Summary Index, List Collections, Remove Index
Index Jira Data:
- Use the Index Data tool and specify indexing parameters (e.g., collection suffix, project keys or JQL queries)
Enhanced Search Capabilities:
- Use Stepback Search Index for complex queries
- Search across multiple projects with natural language
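
When the scope of a Nexus datasource was defined by project, it can help to write the equivalent JQL filter down before indexing. The helper below is a minimal sketch that only builds a JQL string. The function name and the project keys are hypothetical, and which Index Data parameter accepts the query is not covered here.

```python
def build_jql(project_keys, updated_within_days=None):
    """Build a JQL filter limited to the given projects and, optionally, recent updates."""
    if not project_keys:
        raise ValueError("At least one project key is required")
    # Quote each key so keys that collide with JQL reserved words stay valid.
    keys = ", ".join(f'"{key}"' for key in project_keys)
    clauses = [f"project in ({keys})"]
    if updated_within_days is not None:
        # Relative dates such as -90d are standard JQL syntax.
        clauses.append(f"updated >= -{int(updated_within_days)}d")
    return " AND ".join(clauses)


print(build_jql(["SUP", "BUG"], updated_within_days=90))
# project in ("SUP", "BUG") AND updated >= -90d
```

Running the resulting query in the Jira issue search first is a quick way to confirm that it returns the issues you expect to index.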

For complete step-by-step instructions, see: Index Jira Data

Confluence Datasources → Confluence Indexing

What you had in Nexus:

Confluence datasources that indexed pages, spaces, and attachments
Space-based or label-filtered content

What to do in Next:

Create Confluence Credential:
- Use your existing Confluence API token (or generate a new one if needed)
- Create credential in Credentials → + Create → Confluence
Create Confluence Toolkit:
- Navigate to Toolkits → + Create → Confluence
- Configure with your Confluence instance URL and credentials
- Enable indexing tools: Index Data, Search Index, Stepback Search Index, Stepback Summary Index, List Collections, Remove Index
Index Confluence Data:
- Use the Index Data tool and specify indexing parameters (e.g., collection suffix, space keys, content filters)
Advanced Documentation Search:
- Use Stepback Summary Index for documentation analysis
- Search across multiple spaces with semantic queries

For complete step-by-step instructions, see: Index Confluence Data

Git Datasources → Repository Indexing

What you had in Nexus:

Git datasources that indexed code repositories, documentation, and commit history
Branch-based or repository-wide content indexing
Basic code and documentation search

What to do in Next:

Create Repository Credential:
- Use your existing GitHub Personal Access Token (or generate a new one if needed)
- Create credential in Credentials → + Create → GitHub
Create Repository Toolkit:
- Navigate to Toolkits → + Create → GitHub
- Configure with your repository URL and credentials
- Enable indexing tools: Index Data, Search Index, Stepback Search Index, Stepback Summary Index, List Collections, Remove Index
Index Repository Data:
- Use the Index Data tool and specify indexing parameters (e.g., collection suffix, branch name, file type filters)
Enhanced Code and Documentation Search:
- Use Stepback Search Index for complex code analysis queries
- Search across multiple GitHub repositories with semantic understanding
- Find code patterns, documentation, and implementation examples

For complete step-by-step instructions, see: Index Repository Data

Key Improvements in Indexing

Better Search Capabilities

New search tools available:

Search Index: Basic semantic search across indexed content
Stepback Search Index: Advanced search that breaks down complex questions for better results
Stepback Summary Index: Search with automatic summarization of results
List Collections: View all available indexed collections
Remove Index: Clean up or refresh indexed data

Improved Organization

Collection Management:

Use meaningful collection suffixes to organize different types of content
Create multiple indexes for different scopes or time periods
Better naming conventions for easier discovery
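
A consistent naming rule is easier to keep when it is written down once. The sketch below normalizes a human-readable label into a short suffix. The function name is hypothetical, and both the lowercase-alphanumeric rule and the default length limit of 7 are assumptions for illustration; check what your environment actually accepts.

```python
import re


def make_collection_suffix(label: str, max_len: int = 7) -> str:
    """Normalize a label into a short lowercase alphanumeric suffix.

    max_len=7 is an assumed limit, not a documented one.
    """
    slug = re.sub(r"[^a-z0-9]", "", label.lower())
    if not slug:
        raise ValueError(f"Label {label!r} has no usable characters")
    return slug[:max_len]


print(make_collection_suffix("Eng Docs"))  # engdocs
print(make_collection_suffix("sup-24"))    # sup24
```

Because truncation can make two labels collide (for example, "Support 2024" and "Support 2025" both become "support" under this rule), compare the generated suffixes before you index.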

Enhanced Integration

Toolkit Integration:

Indexing tools are built into each toolkit
Consistent experience across all platforms
Direct integration with conversations and agents

Migration Checklist

Before You Start

☐ Review all existing datasources in the Nexus environment
☐ Document the scope and purpose of each datasource
☐ Identify which agents/pipelines use each datasource
☐ Plan your new collection naming strategy

Setting Up Next Environment

☐ Configure Vector Storage (PgVector) in AI Configuration
☐ Select Embedding Model in AI Configuration
☐ Recreate or migrate credentials for each data source
☐ Create appropriate toolkits for each data type

Migration Process

☐ Start with the most critical datasources
☐ Create indexes using the Index Data tool in each toolkit
☐ Test search functionality with sample queries
☐ Update agents and pipelines to use new toolkits
☐ Verify search results match expected content
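
The testing and verification items above can be made repeatable with a small list of sample queries and the terms you expect to see in their results. The following is a sketch under stated assumptions: the search argument stands in for however you query your index (for example, results copied from a Search Index run), and fake_search, the queries, and the expected terms are all made up.

```python
from typing import Callable, Iterable


def check_queries(
    search: Callable[[str], Iterable[str]],
    expectations: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Return, per query, the expected terms that appear in none of its results."""
    missing = {}
    for query, terms in expectations.items():
        combined = " ".join(search(query)).lower()
        absent = [term for term in terms if term.lower() not in combined]
        if absent:
            missing[query] = absent
    return missing


# Stand-in for a real search call; replace it with your own result source.
def fake_search(query: str) -> list[str]:
    corpus = {
        "password reset": ["How to reset a password in the admin console"],
        "release process": ["Release checklist for the mobile app"],
    }
    return corpus.get(query, [])


expectations = {
    "password reset": ["reset", "admin"],
    "release process": ["rollback"],
}

print(check_queries(fake_search, expectations))
# {'release process': ['rollback']}
```

An empty result means every expected term was found; any remaining entry points to content that may be missing from the new index or filtered out by its scope.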

Post-Migration

☐ Remove references to old datasources in agents/pipelines
☐ Train team members on new indexing tools
☐ Establish process for maintaining and updating indexes
☐ Consider creating multiple collections for better organization

Getting Help

If you encounter issues during migration:

Check Prerequisites: Ensure vector storage and embedding models are properly configured
Review Toolkit Configuration: Verify credentials and connection settings
Test with Small Scope: Start with a small subset of data to validate the process
Consult Documentation: Each indexing guide provides troubleshooting for specific platforms

For platform-specific guidance and detailed troubleshooting:

Indexing Overview - Complete guide to indexing system and capabilities
Indexing Tools - Detailed reference for all indexing tools
Index Artifacts Data - Index files and documents
Index Jira Data - Index Jira issues and project data
Index Confluence Data - Index Confluence pages and spaces
Index SharePoint Data - Index SharePoint files and sites
Index Repository Data - Index GitHub repositories
AI Configuration - Configure vector storage and embedding models
Create a Credential - Set up authentication for data sources
Toolkits Menu - General toolkit configuration guide
