Overview
What Changed: Datasources → Indexing
Before (Nexus - Datasources):
- Datasources were standalone entities that created datasets
- Limited to specific data types and configurations
- Separate entity management outside of toolkits
- Fixed chunking and processing options
After (Next - Indexing):
- Indexing is integrated directly into Toolkits
- Standardized tools available across all supported platforms
- Flexible collection management and naming
- Advanced search capabilities with stepback search
- Better chunking and metadata handling
Migration Process Overview
Step 1: Identify Your Current Datasources
In the Nexus environment, review your existing datasources to understand:
- What type of data each datasource indexes (Jira, Confluence, Files, etc.)
- The scope and filters used in each datasource
- Which agents or pipelines use these datasources
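The review above can be recorded in a small inventory before any migration work starts. The sketch below is illustrative only: the `DatasourceRecord` fields and the sample entries are hypothetical and do not reflect an export format from Nexus.

```python
from dataclasses import dataclass, field


@dataclass
class DatasourceRecord:
    """One row of a hand-maintained migration inventory (hypothetical structure)."""
    name: str
    source_type: str   # e.g. "Jira", "Confluence", "File", "Table", "Git"
    scope: str         # filters used, e.g. a JQL query or a space key
    used_by: list[str] = field(default_factory=list)  # agents or pipelines


def group_by_type(records):
    """Group datasource names by source type, preserving input order."""
    grouped = {}
    for record in records:
        grouped.setdefault(record.source_type, []).append(record.name)
    return grouped


# Sample entries are made up for illustration.
inventory = [
    DatasourceRecord("support-tickets", "Jira", "project = SUP", ["triage-agent"]),
    DatasourceRecord("team-wiki", "Confluence", "space = ENG", ["docs-agent"]),
    DatasourceRecord("backlog", "Jira", "project = DEV"),
]

for source_type, names in group_by_type(inventory).items():
    print(f"{source_type}: {', '.join(names)}")
```

Grouping by type shows how many datasources will land in each toolkit, and an empty `used_by` list flags datasources that may not need migrating at all.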
Step 2: Set Up Prerequisites in Next Environment
Before migrating, ensure you have:
- Credentials: Migrate or recreate your credentials in the Next environment
- Vector Storage: Configure PgVector in Settings → AI Configuration
- Embedding Model: Select an embedding model in AI Configuration
Step 3: Create Corresponding Toolkits
Create the appropriate toolkits in the Next environment based on your datasource types:
- File/Table → Artifact Toolkit or SharePoint Toolkit
- Jira → Jira Toolkit
- Confluence → Confluence Toolkit
- Git → GitHub Toolkit
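For planning purposes, the mapping above can be written down as a lookup table. This is a minimal sketch: the dictionary simply restates the list, while the `toolkits_for` helper and its error behavior are assumptions made for illustration.

```python
# Mapping restated from the list above; keys are Nexus datasource types.
TOOLKIT_FOR_DATASOURCE = {
    "File": ["Artifact Toolkit", "SharePoint Toolkit"],
    "Table": ["Artifact Toolkit", "SharePoint Toolkit"],
    "Jira": ["Jira Toolkit"],
    "Confluence": ["Confluence Toolkit"],
    "Git": ["GitHub Toolkit"],
}


def toolkits_for(datasource_type: str) -> list[str]:
    """Return candidate toolkits, or raise if the type has no listed mapping."""
    try:
        return TOOLKIT_FOR_DATASOURCE[datasource_type]
    except KeyError:
        raise ValueError(f"No toolkit mapping listed for {datasource_type!r}") from None


print(toolkits_for("Jira"))  # ['Jira Toolkit']
```

Raising on unknown types, rather than returning an empty list, makes datasources without a listed target visible early in planning.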
Step 4: Index Your Data
Use the Index Data tool in each toolkit to recreate your indexed content with improved capabilities.
Migration Examples by Datasource Type
File/Table Datasources → Artifacts or SharePoint Indexing
What you had in Nexus:
- File datasources that indexed documents, spreadsheets, or text files
- Table datasources for structured data or spreadsheets
- Basic file content extraction and tabular data indexing
How to migrate in Next (SharePoint):
- Upload files to SharePoint (if not already there):
  - Upload your files (documents, Excel, CSV, etc.) to your SharePoint site/library
  - Ensure files are accessible with your SharePoint credentials
- Create SharePoint Credential:
  - Create credential in Credentials → + Create → SharePoint
- Create SharePoint Toolkit with your SharePoint credentials
- Index SharePoint files containing your data:
  - Select the Index Data tool and specify indexing parameters (e.g., collection suffix, file type filters)
- Advanced Search Capabilities:
  - Use Stepback Search Index for complex document queries
  - Search across multiple SharePoint libraries with semantic understanding
How to migrate in Next (Artifacts):
- Upload files to Artifacts:
  - Navigate to Artifacts → + Create bucket
  - Upload your files to the artifact bucket
- Create Artifact Toolkit:
  - Navigate to Toolkits → + Create → Artifact
  - Specify the bucket name containing your files
  - Enable indexing tools: Index Data, Search Index, Stepback Search Index, Stepback Summary Index, List Collections, Remove Index
- Index your artifact data:
  - Use the Index Data tool and specify indexing parameters (e.g., collection suffix, chunking method, clean index option, file filters)
- Advanced Search Capabilities:
  - Use Stepback Search Index for complex document queries
  - Search across multiple artifact buckets with semantic understanding
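Before indexing a large bucket or library, it can help to preview which files a filter would select. The sketch below assumes shell-style glob patterns and uses only Python's standard `fnmatch` module; the toolkit's actual filter syntax may differ, and the `preview_selection` helper and file names are hypothetical.

```python
from fnmatch import fnmatchcase


def preview_selection(filenames, include=("*",), exclude=()):
    """Return the filenames that match any include pattern and no exclude pattern."""
    selected = []
    for name in filenames:
        included = any(fnmatchcase(name, pattern) for pattern in include)
        excluded = any(fnmatchcase(name, pattern) for pattern in exclude)
        if included and not excluded:
            selected.append(name)
    return selected


# Hypothetical bucket contents.
files = ["handbook.pdf", "prices.xlsx", "notes.txt", "draft_notes.txt", "logo.png"]

print(preview_selection(files, include=("*.pdf", "*.txt"), exclude=("draft_*",)))
# ['handbook.pdf', 'notes.txt']
```

Running a preview like this against a file listing is a cheap way to catch an overly broad filter before starting an indexing run.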
For complete step-by-step instructions, see: Index SharePoint Data and Index Artifacts Data
Jira Datasources → Jira Indexing
What you had in Nexus:
- Jira datasources that indexed issues, stories, comments, and attachments
- Project-based or JQL-filtered content
How to migrate in Next:
- Create Jira Credential:
  - Use your existing Jira API token (or generate a new one if needed)
  - Create credential in Credentials → + Create → Jira
- Create Jira Toolkit:
  - Navigate to Toolkits → + Create → Jira
  - Configure with your Jira instance URL and credentials
  - Enable indexing tools: Index Data, Search Index, Stepback Search Index, Stepback Summary Index, List Collections, Remove Index
- Index Jira Data:
  - Use the Index Data tool and specify indexing parameters (e.g., collection suffix, project keys or JQL queries)
- Enhanced Search Capabilities:
  - Use Stepback Search Index for complex queries
  - Search across multiple projects with natural language
For complete step-by-step instructions, see: Index Jira Data
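When an old datasource was scoped by project, an equivalent JQL query can be assembled from the project keys. The following is a minimal sketch: the `build_jql` helper and the project keys are hypothetical, and only standard JQL clauses (`project in (...)`, relative `updated` dates, `ORDER BY`) are used.

```python
def build_jql(project_keys, updated_within_days=None):
    """Build a JQL string limiting results to the given projects.

    Optionally restricts results to issues updated in the last N days.
    """
    if not project_keys:
        raise ValueError("At least one project key is required")
    quoted = ", ".join(f'"{key}"' for key in project_keys)
    clauses = [f"project in ({quoted})"]
    if updated_within_days is not None:
        clauses.append(f"updated >= -{updated_within_days}d")
    return " AND ".join(clauses) + " ORDER BY updated DESC"


# Hypothetical project keys.
print(build_jql(["SUP", "DEV"], updated_within_days=30))
# project in ("SUP", "DEV") AND updated >= -30d ORDER BY updated DESC
```

It is worth pasting the resulting query into Jira's own issue search first, to confirm it returns the same issues the Nexus datasource covered.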
Confluence Datasources → Confluence Indexing
What you had in Nexus:
- Confluence datasources that indexed pages, spaces, and attachments
- Space-based or label-filtered content
How to migrate in Next:
- Create Confluence Credential:
  - Use your existing Confluence API token (or generate a new one if needed)
  - Create credential in Credentials → + Create → Confluence
- Create Confluence Toolkit:
  - Navigate to Toolkits → + Create → Confluence
  - Configure with your Confluence instance and credentials
  - Enable indexing tools: Index Data, Search Index, Stepback Search Index, Stepback Summary Index, List Collections, Remove Index
- Index Confluence Data:
  - Use the Index Data tool and specify indexing parameters (e.g., collection suffix, space keys, content filters)
- Advanced Documentation Search:
  - Use Stepback Summary Index for documentation analysis
  - Search across multiple spaces with semantic queries
For complete step-by-step instructions, see: Index Confluence Data
Git Datasources → Repository Indexing
What you had in Nexus:
- Git datasources that indexed code repositories, documentation, and commit history
- Branch-based or repository-wide content indexing
- Basic code and documentation search
How to migrate in Next:
- Create Repository Credential:
  - Use your existing GitHub Personal Access Token (or generate a new one if needed)
  - Create credential in Credentials → + Create → GitHub
- Create Repository Toolkit:
  - Navigate to Toolkits → + Create → GitHub
  - Configure with your repository URL and credentials
  - Enable indexing tools: Index Data, Search Index, Stepback Search Index, Stepback Summary Index, List Collections, Remove Index
- Index Repository Data:
  - Use the Index Data tool and specify indexing parameters (e.g., collection suffix, branch name, file type filters)
- Enhanced Code and Documentation Search:
  - Use Stepback Search Index for complex code analysis queries
  - Search across multiple GitHub repositories with semantic understanding
  - Find code patterns, documentation, and implementation examples
For complete step-by-step instructions, see: Index Repository Data
Key Improvements in Indexing
Better Search Capabilities
New search tools available:
- Search Index: Basic semantic search across indexed content
- Stepback Search Index: Advanced search that breaks down complex questions for better results
- Stepback Summary Index: Search with automatic summarization of results
- List Collections: View all available indexed collections
- Remove Index: Clean up or refresh indexed data
Improved Organization
Collection Management:
- Use meaningful collection suffixes to organize different types of content
- Create multiple indexes for different scopes or time periods
- Better naming conventions for easier discovery
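A naming convention is easier to keep consistent when suffixes are derived from labels by a rule. The sketch below is one possible rule, not the platform's: the `make_suffix` helper, the allowed character set, and the `max_length` default are all assumptions, so check the Index Data tool for the actual constraints on collection suffixes.

```python
import re


def make_suffix(label: str, max_length: int = 8) -> str:
    """Reduce a label to lowercase letters and digits, truncated to max_length.

    The character set and the default length are placeholder assumptions.
    """
    cleaned = re.sub(r"[^a-z0-9]", "", label.lower())
    if not cleaned:
        raise ValueError(f"Label {label!r} has no usable characters")
    return cleaned[:max_length]


# Hypothetical labels describing source and scope.
labels = ["Jira SUP 2024", "Wiki ENG", "Repo-API"]
suffixes = {label: make_suffix(label) for label in labels}

# Truncation can make two labels collide, so check for duplicates.
if len(set(suffixes.values())) != len(suffixes):
    raise ValueError("Two labels produced the same suffix; rename one of them")

print(suffixes)
# {'Jira SUP 2024': 'jirasup2', 'Wiki ENG': 'wikieng', 'Repo-API': 'repoapi'}
```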
Enhanced Integration
Toolkit Integration:
- Indexing tools are built into each toolkit
- Consistent experience across all platforms
- Direct integration with conversations and agents
Migration Checklist
Before You Start
- ☐ Review all existing datasources in Nexus environment
- ☐ Document the scope and purpose of each datasource
- ☐ Identify which agents/pipelines use each datasource
- ☐ Plan your new collection naming strategy
Setting Up Next Environment
- ☐ Configure Vector Storage (PgVector) in AI Configuration
- ☐ Select Embedding Model in AI Configuration
- ☐ Recreate or migrate credentials for each data source
- ☐ Create appropriate toolkits for each data type
Migration Process
- ☐ Start with most critical datasources first
- ☐ Create indexes using the Index Data tool in each toolkit
- ☐ Test search functionality with sample queries
- ☐ Update agents and pipelines to use new toolkits
- ☐ Verify search results match expected content
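The testing and verification items above can be made repeatable with a small set of sample queries and the terms each one should surface. This is a sketch under stated assumptions: `check_queries` accepts any callable that takes a query string and returns a list of result texts, and `fake_search` is a stand-in with canned data, not a call into a real toolkit.

```python
def check_queries(search, expectations):
    """Run each query through `search` and report expected terms that are missing.

    Returns a dict mapping each failing query to its list of missing terms.
    """
    failures = {}
    for query, expected_terms in expectations.items():
        combined = " ".join(search(query)).lower()
        missing = [term for term in expected_terms if term.lower() not in combined]
        if missing:
            failures[query] = missing
    return failures


def fake_search(query):
    """Stand-in for a real search call; returns canned results for the demo."""
    canned = {
        "How do I reset a password?": ["Password reset steps are in SUP-101."],
        "What is the release process?": ["See the deployment runbook."],
    }
    return canned.get(query, [])


expectations = {
    "How do I reset a password?": ["SUP-101"],
    "What is the release process?": ["release checklist"],
}

print(check_queries(fake_search, expectations))
# {'What is the release process?': ['release checklist']}
```

An empty result means every sample query surfaced its expected terms. Keeping the same expectations file and rerunning it after each re-index gives a simple regression check.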
Post-Migration
- ☐ Remove references to old datasources in agents/pipelines
- ☐ Train team members on new indexing tools
- ☐ Establish process for maintaining and updating indexes
- ☐ Consider creating multiple collections for better organization
Getting Help
If you encounter issues during migration:
- Check Prerequisites: Ensure vector storage and embedding models are properly configured
- Review Toolkit Configuration: Verify credentials and connection settings
- Test with Small Scope: Start with a small subset of data to validate the process
- Consult Documentation: Each indexing guide provides troubleshooting for specific platforms
For platform-specific guidance and detailed troubleshooting:
- Indexing Overview - Complete guide to indexing system and capabilities
- Indexing Tools - Detailed reference for all indexing tools
- Index Artifacts Data - Index files and documents
- Index Jira Data - Index Jira issues and project data
- Index Confluence Data - Index Confluence pages and spaces
- Index SharePoint Data - Index SharePoint files and sites
- Index Repository Data - Index GitHub repositories
- AI Configuration - Configure vector storage and embedding models
- Create a Credential - Set up authentication for data sources
- Toolkits Menu - General toolkit configuration guide