Google Drive Knowledge Sources
Google Drive Knowledge Sources connect your organization's Google Workspace Shared Drives to Outermind so that AI agents can search the documents they contain. Documents are indexed automatically on a daily schedule (or on demand in Manual mode), allowing agents to find and cite information from policies, procedures, reports, and other organizational content stored in Google Drive.
Overview
Google Drive Knowledge Sources provide:
- Shared Drive indexing - Connect one or more Google Workspace Shared Drives to a single knowledge index
- Automatic synchronization - Documents are indexed on a daily schedule using efficient change tracking
- AI-powered summaries - Each document is automatically summarized to improve search relevance
- Google-native format support - Exports Google Docs, Sheets, and Slides to searchable text alongside traditional file formats
- Per-tenant OAuth connection - Secure read-only access via Google OAuth 2.0 consent flow
How It Works
1. Connect Google Account
Authenticate via OAuth to grant Outermind read-only access to Shared Drives
2. Create Knowledge Source
Select Shared Drives to index and configure sync settings
3. Initial Sync
System downloads, exports, and indexes all matching documents
4. Delta Sync
Subsequent syncs only process new or changed documents via change tracking
5. Agent Access
AI agents search the indexed documents through Grounded Knowledge Index tools
Prerequisites
Before configuring Google Drive Knowledge Sources, ensure you have:
- Google Workspace Account - A Google Workspace account with access to the Shared Drives you want to index
- Shared Drive Access - The Google account used to connect must have access to the target Shared Drives
- Grounded Knowledge Index - A GKI of type "Google Drive" must be provisioned (the wizard will create one if needed)
- Azure AI Search - A search provider must be configured. See Azure AI Search Setup
Setting Up a Google Drive Source
Step 1: Connect Your Google Account
- Go to Build > Knowledge > Knowledge Sources
- Click Add Source and select Google Drive
- Click Connect Google Account to launch the OAuth consent flow
- Sign in with a Google Workspace account and grant read-only access
- Once connected, the account email is displayed as confirmation
If you have already connected a Google account, you can select the existing connection or add a new one.
Step 2: Name and Select Index
Configure the basic settings:
| Setting | Description |
|---|---|
| Display Name | A friendly name for this knowledge source (e.g., "Engineering Docs") |
| Description | Optional description of what documents are included |
| Target Index | The Grounded Knowledge Index to populate (or create a new one) |
Step 3: Select Shared Drives
Browse and select the Shared Drives you want to index:
- The wizard displays all Shared Drives accessible to the connected account
- Click on a drive to select it
- You can add multiple Shared Drives to a single knowledge source
Step 4: Configure File Types
Choose which document types to include:
- Google-native formats - Google Docs, Google Sheets, Google Slides (exported to text)
- Uploaded files - Word, Excel, PowerPoint, PDF, text, Markdown, and HTML files
You can toggle Google-native format indexing on or off and select specific uploaded file types.
Step 5: Configure Sync Settings
| Setting | Description |
|---|---|
| Sync Mode | Manual (on-demand only) or Daily (automatic) |
| Max File Size | Uploaded files larger than 5 MB are skipped by default (Google-native files are exempt) |
| Generate Summaries | Enable AI summarization for better search relevance |
Step 6: Review and Create
Review your configuration and click Create Source to begin indexing.
Supported File Types
Google-Native Formats
Google-native files are exported to text via the Google Drive export API:
| Format | Virtual Extension | Export Method |
|---|---|---|
| Google Docs | gdoc | Exported as plain text |
| Google Sheets | gsheet | Exported as CSV (first sheet) |
| Google Slides | gslide | Exported as plain text (all slides + speaker notes) |
Google-native files are stored in Google's own format rather than as conventional files, so they have no file size to compare against the limit. They are always eligible for indexing regardless of the Max File Size setting.
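As an illustration of the table above, the export rules can be expressed as a lookup from Google Workspace MIME type to virtual extension and export format. This is a minimal sketch, not Outermind's implementation: the MIME type strings are the standard Google Drive identifiers, while the names `GOOGLE_NATIVE_EXPORTS` and `export_plan` are hypothetical.

```python
from typing import Optional, Tuple

# Keys are the standard Google Workspace MIME types.
# Values are (virtual extension, export MIME type) taken from the table above.
# The dictionary and function names are illustrative assumptions.
GOOGLE_NATIVE_EXPORTS = {
    "application/vnd.google-apps.document": ("gdoc", "text/plain"),
    "application/vnd.google-apps.spreadsheet": ("gsheet", "text/csv"),
    "application/vnd.google-apps.presentation": ("gslide", "text/plain"),
}


def export_plan(mime_type: str) -> Optional[Tuple[str, str]]:
    """Return (virtual_extension, export_mime_type), or None for non-native files."""
    return GOOGLE_NATIVE_EXPORTS.get(mime_type)


print(export_plan("application/vnd.google-apps.spreadsheet"))
print(export_plan("application/pdf"))
```

A file whose MIME type is not in the table falls through to the uploaded-file path, where it is downloaded rather than exported.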
Uploaded Files
| Format | Extension | Notes |
|---|---|---|
| Word | .docx, .doc | Full text extraction including headers and footers |
| Excel | .xlsx, .xls | Extracts content from all sheets |
| PowerPoint | .pptx | Extracts slide text and speaker notes |
| PDF | .pdf | Text-based PDFs only (scanned PDFs not supported) |
| Text | .txt | Plain text files |
| Markdown | .md | Markdown files |
| HTML | .html, .htm | Extracts body content, removes scripts and navigation |
Not supported: Scanned PDFs (require OCR), images, videos, Google Forms, Google Sites.
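The file type and size rules above can be combined into a single eligibility check. The following is a sketch under stated assumptions: "5 MB" is read as 5 × 1024 × 1024 bytes, and the function name `is_eligible` and its parameters are hypothetical rather than part of the product.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumption: "5 MB" is interpreted as 5 MiB.
MAX_FILE_SIZE_BYTES = 5 * 1024 * 1024

# Extensions from the Uploaded Files table.
UPLOADED_EXTENSIONS = {
    ".docx", ".doc", ".xlsx", ".xls", ".pptx",
    ".pdf", ".txt", ".md", ".html", ".htm",
}

# Google Docs, Sheets, and Slides (standard Google Workspace MIME types).
GOOGLE_NATIVE_MIME_TYPES = {
    "application/vnd.google-apps.document",
    "application/vnd.google-apps.spreadsheet",
    "application/vnd.google-apps.presentation",
}


def is_eligible(name: str, mime_type: str, size_bytes: Optional[int],
                max_bytes: int = MAX_FILE_SIZE_BYTES) -> bool:
    """Decide whether a file should be indexed (hypothetical helper)."""
    # Google-native files bypass the size limit entirely.
    if mime_type in GOOGLE_NATIVE_MIME_TYPES:
        return True
    # Uploaded files must have a supported extension (case-insensitive).
    if PurePosixPath(name).suffix.lower() not in UPLOADED_EXTENSIONS:
        return False
    # Uploaded files must report a size within the limit.
    return size_bytes is not None and size_bytes <= max_bytes
```

Note that unsupported Google types such as Forms are rejected here because they are neither in the native set nor carry a supported extension.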
Managing Google Drive Sources
Viewing Source Details
From Build > Knowledge > Knowledge Sources, click on a Google Drive source to view:
- Overview - Status, statistics, and recent sync activity
- Drives - Connected Shared Drives with individual sync status and document counts
- Documents - Paginated list of indexed documents with search and filters
- Sync History - Log of past sync operations
- Settings - Edit configuration or delete the source
Triggering a Manual Sync
- Open the source detail page
- Click Sync Now in the header
- Choose sync type:
  - Delta - Process only new and changed documents (faster)
  - Full - Re-process all documents (use when troubleshooting)
You can also sync specific Shared Drives individually from the Drives tab.
Pausing and Resuming Sync
To temporarily stop scheduled syncs:
- Open the source detail page
- Click Pause to stop scheduled syncs
- Click Resume to restart scheduled syncs
Pausing does not affect the indexed documents or delete any data.
Viewing Indexed Documents
The Documents tab shows all indexed documents with:
| Column | Description |
|---|---|
| File Name | Document name with link to Google Drive |
| Drive | Which Shared Drive the document is in |
| File Type | Document format (including Google-native types) |
| Author | Who created the document |
| Modified | When the document was last changed |
| Status | Index status (indexed, pending, failed) |
Use filters to find specific documents by drive, file type, or status. You can also filter to show only Google-native format documents.
Re-indexing a Document
If a document's index entry seems outdated or incorrect:
- Find the document in the Documents tab
- Click the document row to open details
- Click Reindex to queue the document for re-processing
Re-indexing will re-download or re-export the file, re-extract content, and regenerate the AI summary.
AI-Powered Document Summaries
When Generate Summaries is enabled, each document is processed by an AI model to generate:
| Field | Description |
|---|---|
| Summary | 2-3 sentence description of the document's content |
| Keywords | 5-10 relevant search terms |
| Category | Classification (policy, procedure, report, template, reference, presentation, spreadsheet, other) |
These AI-generated fields significantly improve search relevance, especially for long documents where full-text search may return poor matches.
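To make the shape of these fields concrete, here is a small sketch of a summary record with light normalization. The category list comes from the table above; the class name `DocumentSummary`, the helper `normalize_summary`, and the specific cleanup rules (lowercasing, deduplicating, capping keywords at 10, falling back to "other") are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Categories listed in the table above.
CATEGORIES = {
    "policy", "procedure", "report", "template",
    "reference", "presentation", "spreadsheet", "other",
}


@dataclass(frozen=True)
class DocumentSummary:
    summary: str                 # 2-3 sentence description
    keywords: Tuple[str, ...]    # relevant search terms
    category: str                # one of CATEGORIES


def normalize_summary(summary: str, keywords: Iterable[str],
                      category: str) -> DocumentSummary:
    """Clean model output before storing it (hypothetical helper)."""
    seen = set()
    cleaned = []
    for kw in keywords:
        key = kw.strip().lower()
        # Drop empty and duplicate keywords, preserving first-seen order.
        if key and key not in seen:
            seen.add(key)
            cleaned.append(key)
    cat = category.strip().lower()
    # Unknown categories fall back to "other" (an assumed policy).
    if cat not in CATEGORIES:
        cat = "other"
    return DocumentSummary(summary.strip(), tuple(cleaned[:10]), cat)
```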
Sync Modes
| Mode | When It Runs | Best For |
|---|---|---|
| Manual | On-demand only (click Sync Now) | Testing or rarely changing content |
| Daily | Once per day (2:30 AM UTC) | Standard document libraries with regular updates |
Start with Daily sync. Use Manual mode only for sources that rarely change or when you want full control over sync timing.
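To estimate when a Daily source will next run, the 2:30 AM UTC schedule can be computed from any timezone-aware timestamp. This is a sketch of the arithmetic only; `next_daily_sync` is a hypothetical helper, not an Outermind API.

```python
from datetime import datetime, time, timedelta, timezone

# Daily sync time from the table above.
DAILY_SYNC_TIME_UTC = time(2, 30)


def next_daily_sync(now: datetime) -> datetime:
    """Return the next 02:30 UTC strictly after `now`.

    `now` should be timezone-aware; naive datetimes would be
    interpreted as local time by astimezone().
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), DAILY_SYNC_TIME_UTC,
                                 tzinfo=timezone.utc)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    return candidate


print(next_daily_sync(datetime.now(timezone.utc)).isoformat())
```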
Delta Sync Architecture
Google Drive Knowledge Sources use the Google Drive changes API for efficient synchronization:
- Initial sync downloads and processes all matching documents across all selected Shared Drives
- Subsequent syncs use per-drive change tokens to detect only new, modified, or deleted files
- Per-drive tracking stores change tokens independently for each Shared Drive
- Content deduplication uses SHA-256 hashing to skip documents whose content has not actually changed
- Automatic retry handles transient failures with exponential backoff
This approach minimizes API calls and processing time while keeping your knowledge base current.
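The ideas above (per-drive change tokens, SHA-256 content deduplication, and exponential backoff) can be sketched in a few lines. This is an illustrative model under assumptions, not the actual sync engine: `DriveSyncState`, `needs_reindex`, and `backoff_delays` are hypothetical names, and the 1-second base delay and 60-second cap are made-up defaults.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DriveSyncState:
    """Sync state kept independently for each Shared Drive."""
    change_token: Optional[str] = None  # None means an initial sync is needed
    content_hashes: Dict[str, str] = field(default_factory=dict)  # file_id -> SHA-256 hex


def needs_reindex(state: DriveSyncState, file_id: str, content: bytes) -> bool:
    """Return False when the content hash matches the last indexed version."""
    digest = hashlib.sha256(content).hexdigest()
    if state.content_hashes.get(file_id) == digest:
        return False  # reported as changed, but the bytes are identical
    state.content_hashes[file_id] = digest
    return True


def backoff_delays(attempts: int, base_seconds: float = 1.0,
                   cap_seconds: float = 60.0) -> List[float]:
    """Delays before each retry: base * 2^i, capped (assumed defaults)."""
    return [min(cap_seconds, base_seconds * 2 ** i) for i in range(attempts)]


# One state object per drive, keyed by a placeholder drive ID.
states: Dict[str, DriveSyncState] = {"drive-a": DriveSyncState()}
print(needs_reindex(states["drive-a"], "file-1", b"hello"))
print(needs_reindex(states["drive-a"], "file-1", b"hello"))
print(backoff_delays(4))
```

Keeping a separate state object per drive means a failure or reset on one Shared Drive does not force a full re-sync of the others.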
Best Practices
Start Focused
- Begin with 1-2 high-value Shared Drives (e.g., Engineering Docs, Company Policies)
- Enable AI summaries to maximize search quality
- Use Daily sync to keep content current
- Expand to additional drives as you confirm value
Organize for Discoverability
- Ensure documents have meaningful titles (not "Untitled document")
- Store related documents in dedicated Shared Drives with clear names
- Use folders to organize within drives (paths are indexed)
- Keep uploaded file sizes reasonable (under 5 MB)
Content Quality
- Focus on authoritative, reference-quality documents
- Avoid indexing drafts, personal notes, or obsolete content
- Move outdated documents to excluded drives or archive folders
- Consider separate sources for different content types
Security Considerations
- Google Drive access uses per-tenant OAuth with read-only scope
- All indexed content is accessible to AI agents for your entire organization
- Do not index Shared Drives containing sensitive or restricted content
- The connected Google account determines which drives are visible
- Review indexed documents periodically to ensure appropriateness
Monitoring and Troubleshooting
Sync Activity
Monitor sync operations from Build > Knowledge > Scan Activity:
- View running, completed, and failed syncs
- See document counts (added, updated, deleted, failed)
- Review error messages for failed operations
Common Issues
Documents Not Being Indexed
- Verify the file type is in the configured filter
- Check the file size is under 5 MB (uploaded files only)
- Ensure the Shared Drive is selected and active
- Review the sync history for errors
Google Account Connection Expired
- Navigate to the source settings
- If the connection status shows "expired" or "revoked", click Reconnect
- Complete the OAuth flow again with the same Google account
- Connections may expire if the OAuth refresh token is revoked by a Google Workspace admin
Slow Sync Performance
- Large initial syncs may take time; subsequent delta syncs are faster
- Reduce the number of Shared Drives if processing is taking too long
- Google Drive API rate limits are applied at the project level (18,000 requests per 100 seconds)
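As a rough guide to what the quoted rate limit implies, 18,000 requests per 100 seconds averages out to 180 requests per second across the whole project. The sketch below computes the pacing each worker would need to stay within that budget; `min_request_interval` is a hypothetical helper, and it assumes requests are spread evenly across workers.

```python
def min_request_interval(max_requests: int = 18_000,
                         window_seconds: float = 100.0,
                         workers: int = 1) -> float:
    """Seconds each worker should wait between requests to share the budget evenly."""
    if max_requests <= 0 or workers <= 0:
        raise ValueError("max_requests and workers must be positive")
    return workers * window_seconds / max_requests


print(min_request_interval())           # pacing for a single worker
print(min_request_interval(workers=4))  # pacing per worker with four workers
```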
Search Results Not Finding Documents
- Verify the sync has completed successfully
- Check that the document has the "indexed" status
- Allow a few minutes for search index propagation
- Try searching with different terms from the document title or content
AI Summary Not Generated
- Confirm Generate Summaries is enabled in settings
- Very short documents may not have meaningful summaries
- Documents with extraction errors skip summarization
Agent Integration
AI agents access Google Drive documents through the Grounded Knowledge Index tool infrastructure. When an agent needs to search corporate documents:
- Agent uses the search_knowledge_index tool (or similar GKI tool)
- Query is executed against Azure AI Search
- Relevant documents are returned with:
  - Title and summary
  - Relevance score
  - Direct link to Google Drive document
- Agent cites sources in its response
No additional configuration is needed. Once documents are indexed, they're automatically available to all agents with access to the target GKI.
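To illustrate the last step of the flow, here is a sketch of how an agent might turn search hits into citations. The result shape is an assumption based on the fields listed above (title, summary, relevance score, link); `SearchHit`, `format_citations`, and the example URLs are hypothetical and do not describe the actual tool response.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SearchHit:
    """Assumed shape of one search result returned to the agent."""
    title: str
    summary: str
    score: float
    drive_url: str


def format_citations(hits: List[SearchHit], limit: int = 3) -> List[str]:
    """Rank hits by relevance and render numbered citations."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)[:limit]
    return [f"[{i}] {h.title} ({h.drive_url})"
            for i, h in enumerate(ranked, start=1)]


hits = [
    SearchHit("Expense Policy", "Rules for travel expenses.", 0.41,
              "https://example.com/doc-a"),  # placeholder URL
    SearchHit("Travel Handbook", "How to book travel.", 0.87,
              "https://example.com/doc-b"),  # placeholder URL
]
for line in format_citations(hits):
    print(line)
```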