Google Drive Knowledge Sources

Google Drive Knowledge Sources enable you to connect your organization's Google Workspace Shared Drives to Outermind, making documents searchable by AI agents. Documents are automatically indexed on a daily schedule, allowing agents to find and cite information from policies, procedures, reports, and other organizational content stored in Google Drive.

Overview

Google Drive Knowledge Sources provide:

  • Shared Drive indexing - Connect one or more Google Workspace Shared Drives to a single knowledge index
  • Automatic synchronization - Documents are indexed on a daily schedule using efficient change tracking
  • AI-powered summaries - Each document is automatically summarized to improve search relevance
  • Google-native format support - Exports Google Docs, Sheets, and Slides to searchable text alongside traditional file formats
  • Per-tenant OAuth connection - Secure read-only access via Google OAuth 2.0 consent flow

How It Works

1. Connect Google Account
Authenticate via OAuth to grant Outermind read-only access to Shared Drives

2. Create Knowledge Source
Select Shared Drives to index and configure sync settings

3. Initial Sync
System downloads, exports, and indexes all matching documents

4. Delta Sync
Subsequent syncs only process new or changed documents via change tracking

5. Agent Access
AI agents search the indexed documents through Grounded Knowledge Index tools

Prerequisites

Before configuring Google Drive Knowledge Sources, ensure you have:

  1. Google Workspace Account - A Google Workspace account with access to the Shared Drives you want to index
  2. Shared Drive Access - The Google account used to connect must have access to the target Shared Drives
  3. Grounded Knowledge Index - A GKI of type "Google Drive" must be provisioned (the wizard will create one if needed)
  4. Azure AI Search - A search provider must be configured. See Azure AI Search Setup

Setting Up a Google Drive Source

Step 1: Connect Your Google Account

  1. Go to Build > Knowledge > Knowledge Sources
  2. Click Add Source and select Google Drive
  3. Click Connect Google Account to launch the OAuth consent flow
  4. Sign in with a Google Workspace account and grant read-only access
  5. Once connected, the account email is displayed as confirmation

If you have already connected a Google account, you can select the existing connection or add a new one.
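
For readers curious what a read-only consent request looks like, the sketch below builds a Google OAuth 2.0 authorization URL. The endpoint and the `drive.readonly` scope are Google's published values. The function name, client ID, and redirect URI are placeholders, and the exact scope and parameters Outermind requests are an assumption, not something this page documents.

```python
from urllib.parse import urlencode

# Google's published OAuth 2.0 authorization endpoint and read-only Drive scope.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
DRIVE_READONLY_SCOPE = "https://www.googleapis.com/auth/drive.readonly"


def build_consent_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build an authorization URL that asks for read-only Drive access.

    Hypothetical helper: client_id and redirect_uri are placeholders.
    """
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",       # authorization-code flow
        "scope": DRIVE_READONLY_SCOPE,
        "access_type": "offline",      # ask for a refresh token for scheduled syncs
        "prompt": "consent",
        "state": state,                # opaque value echoed back to guard against CSRF
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_consent_url("example-client-id", "https://example.com/callback", "abc123"))
```

Requesting offline access is what allows later syncs to run without the user present, which is also why a revoked refresh token forces a reconnect.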

Step 2: Name and Select Index

Configure the basic settings:

Setting | Description
Display Name | A friendly name for this knowledge source (e.g., "Engineering Docs")
Description | Optional description of what documents are included
Target Index | The Grounded Knowledge Index to populate (or create a new one)

Step 3: Select Shared Drives

Browse and select the Shared Drives you want to index:

  1. The wizard displays all Shared Drives accessible to the connected account
  2. Click on a drive to select it
  3. You can add multiple Shared Drives to a single knowledge source

Step 4: Configure File Types

Choose which document types to include:

  • Google-native formats - Google Docs, Google Sheets, Google Slides (exported to text)
  • Uploaded files - Word, Excel, PowerPoint, PDF, text, Markdown, and HTML files

You can toggle Google-native format indexing on or off and select specific uploaded file types.

Step 5: Configure Sync Settings

Setting | Description
Sync Mode | Manual (on-demand only) or Daily (automatic)
Max File Size | Documents larger than 5 MB are skipped by default
Generate Summaries | Enable AI summarization for better search relevance

Step 6: Review and Create

Review your configuration and click Create Source to begin indexing.

Supported File Types

Google-Native Formats

Google-native files are exported to text via the Google Drive export API:

Format | Virtual Extension | Export Method
Google Docs | gdoc | Exported as plain text
Google Sheets | gsheet | Exported as CSV (first sheet)
Google Slides | gslide | Exported as plain text (all slides + speaker notes)

Google-native files do not report a file size because they are stored in Google's own format rather than as uploaded binaries. They are therefore always eligible for indexing, regardless of the max file size setting.
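
The export table above can be expressed as a small lookup. The MIME types are the ones Google Drive uses for native files and export targets. The `ExportRule` type and `export_rule_for` function are hypothetical names used only for illustration, and the mapping is inferred from the table rather than taken from Outermind's code.

```python
from typing import NamedTuple, Optional


class ExportRule(NamedTuple):
    virtual_extension: str   # extension shown for the document in the index
    export_mime_type: str    # format requested from the Drive export API


# Google-native MIME type -> how the file is exported (per the table above).
EXPORT_RULES = {
    "application/vnd.google-apps.document": ExportRule("gdoc", "text/plain"),
    "application/vnd.google-apps.spreadsheet": ExportRule("gsheet", "text/csv"),
    "application/vnd.google-apps.presentation": ExportRule("gslide", "text/plain"),
}


def export_rule_for(mime_type: str) -> Optional[ExportRule]:
    """Return the export rule for a Google-native file, or None if it has none."""
    return EXPORT_RULES.get(mime_type)
```

A file whose MIME type has no rule (for example an uploaded PDF) would be downloaded directly instead of exported.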

Uploaded Files

Format | Extension | Notes
Word | .docx, .doc | Full text extraction including headers and footers
Excel | .xlsx, .xls | Extracts content from all sheets
PowerPoint | .pptx | Extracts slide text and speaker notes
PDF | .pdf | Text-based PDFs only (scanned PDFs not supported)
Text | .txt | Plain text files
Markdown | .md | Markdown files
HTML | .html, .htm | Extracts body content, removes scripts and navigation

Not supported: Scanned PDFs (require OCR), images, videos, Google Forms, Google Sites.
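
Taken together, the file type and size rules amount to an eligibility check along these lines. This is a sketch under stated assumptions: `FileFilter` and `is_eligible` are hypothetical names, 5 MB is interpreted as 5 × 1024 × 1024 bytes, and the real filter may differ in detail.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

GOOGLE_NATIVE_PREFIX = "application/vnd.google-apps."
# Google-native types that can be exported to text (Docs, Sheets, Slides).
INDEXABLE_NATIVE_TYPES = frozenset({
    GOOGLE_NATIVE_PREFIX + "document",
    GOOGLE_NATIVE_PREFIX + "spreadsheet",
    GOOGLE_NATIVE_PREFIX + "presentation",
})


@dataclass(frozen=True)
class FileFilter:
    include_google_native: bool = True
    extensions: frozenset = frozenset({
        ".docx", ".doc", ".xlsx", ".xls", ".pptx",
        ".pdf", ".txt", ".md", ".html", ".htm",
    })
    max_file_size_bytes: int = 5 * 1024 * 1024  # assumed meaning of "5 MB"


def is_eligible(name: str, mime_type: str, size_bytes: Optional[int],
                file_filter: FileFilter) -> bool:
    """Decide whether a file should be indexed under the given filter."""
    if mime_type in INDEXABLE_NATIVE_TYPES:
        # Native files report no size, so the size limit does not apply.
        return file_filter.include_google_native
    if mime_type.startswith(GOOGLE_NATIVE_PREFIX):
        return False  # Forms, Sites, folders, and other native types
    extension = PurePosixPath(name).suffix.lower()
    if extension not in file_filter.extensions:
        return False  # images, videos, and any deselected file types
    return size_bytes is not None and size_bytes <= file_filter.max_file_size_bytes
```

Note that this check cannot detect scanned PDFs; those pass the filter and would only fail later, at text extraction.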

Managing Google Drive Sources

Viewing Source Details

From Build > Knowledge > Knowledge Sources, click on a Google Drive source to view:

  • Overview - Status, statistics, and recent sync activity
  • Drives - Connected Shared Drives with individual sync status and document counts
  • Documents - Paginated list of indexed documents with search and filters
  • Sync History - Log of past sync operations
  • Settings - Edit configuration or delete the source

Triggering a Manual Sync

  1. Open the source detail page
  2. Click Sync Now in the header
  3. Choose sync type:
    • Delta - Process only new and changed documents (faster)
    • Full - Re-process all documents (use when troubleshooting)

You can also sync specific Shared Drives individually from the Drives tab.

Pausing and Resuming Sync

To temporarily stop scheduled syncs:

  1. Open the source detail page
  2. Click Pause to stop scheduled syncs
  3. Click Resume to restart scheduled syncs

Pausing does not affect the indexed documents or delete any data.

Viewing Indexed Documents

The Documents tab shows all indexed documents with:

Column | Description
File Name | Document name with link to Google Drive
Drive | Which Shared Drive the document is in
File Type | Document format (including Google-native types)
Author | Who created the document
Modified | When the document was last changed
Status | Index status (indexed, pending, failed)

Use filters to find specific documents by drive, file type, or status. You can also filter to show only Google-native format documents.

Re-indexing a Document

If a document's index entry seems outdated or incorrect:

  1. Find the document in the Documents tab
  2. Click the document row to open details
  3. Click Reindex to queue the document for re-processing

Re-indexing will re-download or re-export the file, re-extract content, and regenerate the AI summary.

AI-Powered Document Summaries

When Generate Summaries is enabled, each document is processed by an AI model to generate:

Field | Description
Summary | 2-3 sentence description of the document's content
Keywords | 5-10 relevant search terms
Category | Classification (policy, procedure, report, template, reference, presentation, spreadsheet, other)
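
One way to picture the generated fields is as a small validated record. The sketch below is illustrative only: `DocumentSummary` and `normalize_summary` are hypothetical names, and the cleanup rules (lowercasing, de-duplicating keywords, falling back to "other") are assumptions about reasonable handling of model output, not documented behavior.

```python
from dataclasses import dataclass
from typing import List

# Categories listed in the table above.
ALLOWED_CATEGORIES = {
    "policy", "procedure", "report", "template",
    "reference", "presentation", "spreadsheet", "other",
}
MAX_KEYWORDS = 10


@dataclass
class DocumentSummary:
    summary: str
    keywords: List[str]
    category: str


def normalize_summary(raw: dict) -> DocumentSummary:
    """Clean up a raw model response into the three indexed fields."""
    category = str(raw.get("category", "")).strip().lower()
    if category not in ALLOWED_CATEGORIES:
        category = "other"  # unknown labels fall back to the catch-all

    keywords: List[str] = []
    for keyword in raw.get("keywords", []):
        cleaned = str(keyword).strip().lower()
        if cleaned and cleaned not in keywords:
            keywords.append(cleaned)  # keep first occurrence, preserve order

    return DocumentSummary(
        summary=str(raw.get("summary", "")).strip(),
        keywords=keywords[:MAX_KEYWORDS],
        category=category,
    )
```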

These AI-generated fields significantly improve search relevance, especially for long documents where full-text search may return poor matches.

Sync Modes

Mode | When It Runs | Best For
Manual | On-demand only (click Sync Now) | Testing or rarely changing content
Daily | Once per day (2:30 AM UTC) | Standard document libraries with regular updates

Start with Daily sync. Use Manual mode only for sources that rarely change or when you want full control over sync timing.
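
Because the Daily schedule is fixed in UTC, the local time of the next run depends on your time zone. The following sketch computes the next 2:30 AM UTC run from any timezone-aware timestamp. `next_daily_sync` is a hypothetical helper for planning purposes, not part of the product.

```python
from datetime import datetime, time, timedelta, timezone

DAILY_SYNC_TIME_UTC = time(2, 30)  # 2:30 AM UTC, per the table above


def next_daily_sync(now: datetime) -> datetime:
    """Return the next scheduled daily sync strictly after `now`, in UTC."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), DAILY_SYNC_TIME_UTC, tzinfo=timezone.utc)
    if candidate <= now_utc:
        candidate += timedelta(days=1)  # today's run has already passed
    return candidate
```

Calling `.astimezone()` on the result with your own zone shows when the sync lands locally, which helps when deciding whether late-day edits will be picked up overnight.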

Delta Sync Architecture

Google Drive Knowledge Sources use the Google Drive changes API for efficient synchronization:

  • Initial sync downloads and processes all matching documents across all selected Shared Drives
  • Subsequent syncs use per-drive change tokens to detect only new, modified, or deleted files
  • Per-drive tracking stores change tokens independently for each Shared Drive
  • Content deduplication uses SHA-256 hashing to skip documents whose content has not actually changed
  • Automatic retry handles transient failures with exponential backoff

This approach minimizes API calls and processing time while keeping your knowledge base current.
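
The pieces described above (per-drive change tokens, SHA-256 deduplication, and retry with exponential backoff) can be sketched together as follows. This is a simplified model, not Outermind's implementation: `DriveSyncState`, `with_backoff`, and `apply_changes` are hypothetical names, changes are plain dictionaries shaped like Drive change entries (`fileId`, `removed`), and `fetch_content` stands in for whatever downloads or exports a file.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class DriveSyncState:
    """Sync state kept independently for each Shared Drive."""
    change_token: str
    content_hashes: Dict[str, str] = field(default_factory=dict)  # file ID -> SHA-256


def with_backoff(operation: Callable, attempts: int = 4, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep):
    """Retry a transiently failing operation, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            sleep(base_delay * 2 ** attempt)


def apply_changes(state: DriveSyncState, changes: Iterable[dict], new_token: str,
                  fetch_content: Callable[[str], bytes]) -> List[str]:
    """Process one drive's change list and return file IDs that need re-indexing."""
    to_index: List[str] = []
    for change in changes:
        file_id = change["fileId"]
        if change.get("removed"):
            state.content_hashes.pop(file_id, None)  # deleted: forget its hash
            continue
        content = with_backoff(lambda: fetch_content(file_id))
        digest = hashlib.sha256(content).hexdigest()
        if state.content_hashes.get(file_id) != digest:
            state.content_hashes[file_id] = digest
            to_index.append(file_id)  # content really changed
    state.change_token = new_token  # advance only after the batch is processed
    return to_index
```

In this sketch, a file that appears in the change list but hashes to the same value (for example after a rename or permission change) is skipped, which is the saving that content deduplication is meant to provide.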

Best Practices

Start Focused

  1. Begin with 1-2 high-value Shared Drives (e.g., Engineering Docs, Company Policies)
  2. Enable AI summaries to maximize search quality
  3. Use Daily sync to keep content current
  4. Expand to additional drives as you confirm value

Organize for Discoverability

  1. Ensure documents have meaningful titles (not "Untitled document")
  2. Store related documents in dedicated Shared Drives with clear names
  3. Use folders to organize within drives (paths are indexed)
  4. Keep uploaded file sizes reasonable (under 5 MB)

Content Quality

  1. Focus on authoritative, reference-quality documents
  2. Avoid indexing drafts, personal notes, or obsolete content
  3. Move outdated documents to excluded drives or archive folders
  4. Consider separate sources for different content types

Security Considerations

  1. Google Drive access uses per-tenant OAuth with read-only scope
  2. All indexed content is accessible to AI agents for your entire organization
  3. Do not index Shared Drives containing sensitive or restricted content
  4. The connected Google account determines which drives are visible
  5. Review indexed documents periodically to ensure appropriateness

Monitoring and Troubleshooting

Sync Activity

Monitor sync operations from Build > Knowledge > Scan Activity:

  • View running, completed, and failed syncs
  • See document counts (added, updated, deleted, failed)
  • Review error messages for failed operations

Common Issues

Documents Not Being Indexed

  1. Verify the file type is in the configured filter
  2. Check that the file is no larger than 5 MB (applies to uploaded files only)
  3. Ensure the Shared Drive is selected and active
  4. Review the sync history for errors

Google Account Connection Expired

  1. Navigate to the source settings
  2. If the connection status shows "expired" or "revoked", click Reconnect
  3. Complete the OAuth flow again with the same Google account
  4. Connections may expire if the OAuth refresh token is revoked by a Google Workspace admin

Slow Sync Performance

  1. Large initial syncs may take time; subsequent delta syncs are faster
  2. Reduce the number of Shared Drives if processing is taking too long
  3. Google Drive API rate limits are applied at the project level (18,000 requests per 100 seconds)

Search Results Not Finding Documents

  1. Verify the sync has completed successfully
  2. Check that the document's status is "indexed"
  3. Allow a few minutes for search index propagation
  4. Try searching with different terms from the document title or content

AI Summary Not Generated

  1. Confirm Generate Summaries is enabled in settings
  2. Very short documents may not have meaningful summaries
  3. Documents with extraction errors skip summarization

Agent Integration

AI agents access Google Drive documents through the Grounded Knowledge Index tool infrastructure. When an agent needs to search corporate documents:

  1. Agent uses the search_knowledge_index tool (or similar GKI tool)
  2. Query is executed against Azure AI Search
  3. Relevant documents are returned with:
    • Title and summary
    • Relevance score
    • Direct link to Google Drive document
  4. Agent cites sources in its response

No additional configuration is needed. Once documents are indexed, they're automatically available to all agents with access to the target GKI.
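
To make the returned fields concrete, the sketch below models a search hit and turns a result set into a numbered citation list. The actual response shape of `search_knowledge_index` is not documented on this page, so `SearchHit`, its field names, and `format_citations` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SearchHit:
    """Assumed shape of one search result: title, summary, score, and Drive link."""
    title: str
    summary: str
    score: float
    url: str


def format_citations(hits: List[SearchHit], limit: int = 3) -> str:
    """Render the highest-scoring hits as a numbered source list."""
    ranked = sorted(hits, key=lambda hit: hit.score, reverse=True)[:limit]
    return "\n".join(
        f"[{number}] {hit.title} ({hit.url})"
        for number, hit in enumerate(ranked, start=1)
    )


if __name__ == "__main__":
    # Placeholder documents and links, for illustration only.
    print(format_citations([
        SearchHit("Leave Policy", "Rules for paid leave.", 0.72,
                  "https://drive.google.com/file/d/abc"),
        SearchHit("Expense Guide", "How to file expenses.", 0.91,
                  "https://drive.google.com/file/d/def"),
    ]))
```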