Execution Profiles
AICOS v2 introduces conversation-type-aware execution profiles that replace the one-size-fits-all execution model. Instead of running the same pipeline for every trigger, AICOS now automatically selects an optimized profile based on what kind of work it is doing, resulting in faster responses, lower costs, and better quality.
How Execution Profiles Work
When AICOS receives a trigger (a scheduled wake-up, a chat message, an inbound email, or an event notification), it resolves the trigger to one of six execution profiles. The profile controls every dimension of the pipeline:
- Resource budgets - Maximum tool calls, execution time, and token allocation
- Model selection - Which LLM to use (primary or chat-optimized)
- System prompt - How much context to include (full, standard, compact, or minimal)
- Tool subset - Which tools are available for this conversation type
- Phase configuration - Which pre-execution steps run (some are skipped for fast responses)
- Temperature - Tuned per phase for optimal output quality
- Streaming - Whether tokens are delivered in real time
Profile resolution happens once at the start of execution, before any LLM calls. The resolved profile drives all downstream parameters. If resolution fails for any reason, AICOS falls back to the full pipeline (v1 behavior), ensuring no disruption.
Trigger Arrives
|
v
Resolve Profile (from trigger type)
|
v
Configure Pipeline
|---> Resource budgets (max tools, max time, max tokens)
|---> Model selection (primary or chat)
|---> System prompt tier (full, standard, compact, minimal)
|---> Tool subset (all, email-focused, chat-focused, event-focused)
|---> Phase skipping (skip unnecessary pre-work for fast paths)
|---> Temperature and streaming settings
|
v
Execute via Core Loop
(same proven engine, different parameters)
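The resolution step in the diagram can be pictured as a simple lookup with a safe default. The sketch below is illustrative only: the trigger and profile names are the ones documented on this page, while `TRIGGER_TO_PROFILE`, `FALLBACK_PROFILE`, and `resolve_profile` are hypothetical names, not AICOS internals.

```python
# Illustrative sketch. Trigger and profile names come from this page;
# the dictionary and function names are assumptions made for the example.
FALLBACK_PROFILE = "long_processing"

TRIGGER_TO_PROFILE = {
    "scheduled_wakeup": "long_processing",
    "manual_trigger": "long_processing",
    "daily_wakeup": "long_processing",
    "wake_up": "long_processing",
    "goal_assigned": "long_processing",
    "goal_updated": "long_processing",
    "email_inbound": "email_research",
    "email_reply": "email_research",
    "bo_message": "quick_chat",
    "teams_message": "quick_chat",
    "slack_message": "quick_chat",
    "approval_received": "reactive_event",
    "reminder_triggered": "reactive_event",
    "sme_completed": "reactive_event",
    "scheduled_followup": "reactive_event",
    "pa_email_inbound": "process_email_inbound",
    "pa_email_reply": "process_email_reply",
}


def resolve_profile(trigger_type: str) -> str:
    """Return the profile for a trigger, defaulting to the full pipeline."""
    return TRIGGER_TO_PROFILE.get(trigger_type, FALLBACK_PROFILE)


print(resolve_profile("bo_message"))       # quick_chat
print(resolve_profile("unknown_trigger"))  # long_processing
```

Resolving once, up front, means every later step reads from a single resolved value instead of re-inspecting the trigger.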
The Six Profiles
Long Processing (long_processing)
The daily workhorse profile. Used for scheduled wake-ups, manual triggers, and goal assignments where AICOS needs to review all goals, plan projects, execute tasks, coordinate SMEs, and record knowledge.
| Parameter | Value |
|---|---|
| Max tool calls | 100 |
| Max execution time | 20 minutes |
| Max tokens per response | 16,384 |
| System prompt | Full (all 15 sections) |
| Tools available | All tools |
| Streaming | No |
| Model | Primary |
When it activates:
| Trigger | Description |
|---|---|
| scheduled_wakeup | Daily scheduled wake-up |
| manual_trigger | Administrator-initiated from dashboard |
| daily_wakeup | Legacy daily trigger |
| wake_up | Generic wake-up |
| goal_assigned | New goal assigned |
| goal_updated | Existing goal modified |
What happens: AICOS runs all phases, including knowledge bootstrapping, context assessment, reflection on prior sessions, delta calculation, goal decomposition, session planning, execution, communication, and scheduling. This profile benefits from prompt caching since the system prompt remains stable across many LLM calls within a single execution. Progressive compaction preserves recent tool results while summarizing older conversation turns to maximize the useful context window.
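Progressive compaction, as described above, keeps the newest turns intact and condenses the rest. The following is a minimal sketch of that idea under stated assumptions: the `Turn` type, the `compact` function, and the `keep_recent` threshold are hypothetical, and the truncation-based summary stands in for whatever summarization AICOS actually performs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    role: str     # e.g. "user", "assistant", "tool"
    content: str


def compact(turns: List[Turn], keep_recent: int = 4, summary_chars: int = 60) -> List[Turn]:
    """Keep the newest turns verbatim; fold older ones into one summary turn."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if len(turns) <= keep_recent:
        return list(turns)
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    # Stand-in summarizer: truncate each older turn. A real system would
    # more likely ask a model to write the summary.
    lines = [f"{t.role}: {t.content[:summary_chars]}" for t in older]
    summary = Turn("system", "Summary of earlier turns:\n" + "\n".join(lines))
    return [summary] + recent


history = [Turn("tool", f"result {i}") for i in range(10)]
compacted = compact(history, keep_recent=4)
print(len(compacted))            # 5
print(compacted[-1].content)     # result 9
```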
Email Research (email_research)
Mid-weight profile for processing inbound emails and email replies. Focuses on email-related tools and skips planning-heavy phases that are not relevant to email handling.
| Parameter | Value |
|---|---|
| Max tool calls | 30 |
| Max execution time | 10 minutes |
| Max tokens per response | 8,192 |
| System prompt | Standard (8-10 sections) |
| Tools available | Email and research tools |
| Streaming | No |
| Model | Primary |
When it activates:
| Trigger | Description |
|---|---|
| email_inbound | New inbound email received |
| email_reply | Reply to an existing email thread |
What happens: AICOS loads email context (sender, thread, body), retrieves relevant knowledge, and drafts a response. It skips reflection, delta calculation, and goal decomposition since these are not needed for email handling. Temperature is tuned for factual accuracy during research and allows slightly more creativity during synthesis. The tool subset includes email composition, knowledge retrieval, web search, and SME delegation, but excludes project management and goal tools.
Available tools in this profile:
- respond, send_to, forward_email
- retrieve_knowledge, record_knowledge, search_outermind_docs
- send_message_to_bo
- invoke_sme
- tool_search
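Restricting a profile to a tool subset can be thought of as filtering a registry against an allow-list. This sketch assumes a hypothetical in-memory `registry` of callables and a `tools_for_profile` helper; only the tool names are taken from the list above.

```python
from typing import Callable, Dict, FrozenSet

# Tool names from the email_research list; everything else is illustrative.
EMAIL_RESEARCH_TOOLS: FrozenSet[str] = frozenset({
    "respond", "send_to", "forward_email",
    "retrieve_knowledge", "record_knowledge", "search_outermind_docs",
    "send_message_to_bo", "invoke_sme", "tool_search",
})


def tools_for_profile(
    registry: Dict[str, Callable], allowed: FrozenSet[str]
) -> Dict[str, Callable]:
    """Return only the registry entries whose names are in the allow-list."""
    return {name: fn for name, fn in registry.items() if name in allowed}


# Hypothetical registry with stub implementations.
registry = {
    "respond": lambda text: f"sent: {text}",
    "update_project": lambda pid: f"updated {pid}",  # not in the email subset
    "retrieve_knowledge": lambda query: [],
}

available = tools_for_profile(registry, EMAIL_RESEARCH_TOOLS)
print(sorted(available))  # ['respond', 'retrieve_knowledge']
```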
Quick Chat (quick_chat)
The fast path, optimized for conversational latency. Business Owner messages, Teams mentions, and Slack mentions are routed here for sub-second first-token delivery.
| Parameter | Value |
|---|---|
| Max tool calls | 15 |
| Max execution time | 3 minutes |
| Max tokens per response | 4,096 |
| System prompt | Compact (4-5 sections) |
| Tools available | Chat-relevant tools (read-only + messaging) |
| Streaming | Yes (tokens arrive in real time) |
| Model | Chat (faster, lower-latency model if configured) |
When it activates:
| Trigger | Description |
|---|---|
| bo_message | Business Owner sends a chat message |
| teams_message | Message via Microsoft Teams |
| slack_message | Message via Slack |
What happens: AICOS skips all pre-phases (knowledge bootstrapping, reflection, delta calculation, goal decomposition) and jumps directly to the core execution loop with a compact system prompt. The compact prompt includes base instructions, key people, and a lightweight context summary, and omits phase instructions, operational context, and pre-fetched knowledge bundles. Tokens stream to the frontend via SignalR so the user sees the response building in real time. If the LLM responds without tool calls, the loop terminates immediately rather than continuing to iterate.
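The early-exit rule and the tool budget can be sketched together as one loop. This is a simplified model, not the actual core loop: `run_core_loop`, the reply dictionary shape, and `scripted_llm` are assumptions chosen for the example, and the default of 15 is the quick_chat limit from the table above.

```python
from typing import Callable, Dict, List


def run_core_loop(
    call_llm: Callable[[List[dict]], dict],
    run_tool: Callable[[str], str],
    max_tool_calls: int = 15,  # quick_chat default
) -> Dict[str, object]:
    """Loop until the model answers without tools or the budget runs out."""
    history: List[dict] = []
    used = 0
    while True:
        reply = call_llm(history)
        history.append(reply)
        requested = reply.get("tool_calls", [])
        if not requested:
            # No tool calls: treat the reply as final and stop iterating.
            return {"text": reply["text"], "tool_calls_used": used, "stopped": "answered"}
        for name in requested:
            if used >= max_tool_calls:
                return {"text": reply["text"], "tool_calls_used": used, "stopped": "budget"}
            history.append({"tool": name, "result": run_tool(name)})
            used += 1


def scripted_llm(history: List[dict]) -> dict:
    """Stand-in model: requests one tool on the first turn, then answers."""
    if not history:
        return {"text": "", "tool_calls": ["summarize_active_work"]}
    return {"text": "Two projects are on track.", "tool_calls": []}


result = run_core_loop(scripted_llm, run_tool=lambda name: f"{name}: ok")
print(result["stopped"], result["tool_calls_used"])  # answered 1
```

A greeting or simple status question that needs no tools therefore costs a single model call.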
Available tools in this profile:
- retrieve_knowledge, record_knowledge
- send_message_to_bo, respond
- search_outermind_docs
- summarize_active_work, review_project, review_goal
- tool_search
Configure a dedicated chat model in AICOS Settings for the best quick chat experience. A faster model (such as Claude Sonnet 4) delivers noticeably quicker responses for simple status questions and conversations, while the primary model (such as Claude Opus 4.5) handles deep planning during daily wake-ups.
Reactive Event (reactive_event)
Event-driven profile for handling completions, approvals, and reminders. Loads minimal context and focuses on processing the event and updating state.
| Parameter | Value |
|---|---|
| Max tool calls | 20 |
| Max execution time | 5 minutes |
| Max tokens per response | 8,192 |
| System prompt | Minimal (essential instructions only) |
| Tools available | Event-focused tools (status updates + communication) |
| Streaming | No |
| Model | Primary |
When it activates:
| Trigger | Description |
|---|---|
| approval_received | Business Owner approved or denied a request |
| reminder_triggered | A scheduled reminder fired |
| sme_completed | An SME agent completed its delegated task |
| scheduled_followup | A scheduled follow-up action triggered |
What happens: AICOS loads the event context, assesses what happened, updates the relevant project or task state, and optionally notifies the Business Owner. It does not need planning, reflection, or goal decomposition, since the event itself defines the scope of work. The tool subset includes state update tools (update project, update task, complete task), communication tools, and knowledge tools.
Available tools in this profile:
- update_project, update_project_tasks, complete_project_tasks
- update_goal
- record_knowledge, retrieve_knowledge
- send_message_to_bo, administer_approvals
- manage_my_settings (setting_area=scheduled_reminders)
- invoke_sme
- tool_search
Process Email Inbound (process_email_inbound)
PA-specific profile for processing new emails arriving in a Personal Assistant's shared mailbox. Lighter than AICOS's email_research profile since the PA focuses on reply, notify, and read actions rather than deep research.
| Parameter | Value |
|---|---|
| Max tool calls | 20 |
| Max execution time | 5 minutes |
| Max tokens per response | 8,192 |
| System prompt | Standard |
| Tools available | PA inbound email tools (reply, forward, calendar, mailbox search) |
| Streaming | No |
| Model | Primary |
When it activates:
| Trigger | Description |
|---|---|
| pa_email_inbound | New email received in a PA shared mailbox |
What happens: The PA loads the email context, determines if a reply is needed or if the employee should be notified, and takes appropriate action. It does not initiate autonomous project or task work from the email. The tool subset includes email composition, knowledge retrieval, employee inbox/calendar access, and mailbox search.
Available tools in this profile:
- respond, forward_email
- retrieve_knowledge, search_outermind_docs
- send_message_to_boss
- read_my_inbox, read_supervisor_calendar
- search_mailbox, list_email_attachments, extract_attachment_text
Process Email Reply (process_email_reply)
PA-specific profile for processing replies to emails previously sent by the PA. Slightly larger budget than inbound since replies may require updating tasks or recording knowledge.
| Parameter | Value |
|---|---|
| Max tool calls | 25 |
| Max execution time | 8 minutes |
| Max tokens per response | 8,192 |
| System prompt | Standard |
| Tools available | PA reply tools (inbound tools + task updates + knowledge recording) |
| Streaming | No |
| Model | Primary |
When it activates:
| Trigger | Description |
|---|---|
| pa_email_reply | Reply received to a PA-sent email |
What happens: The PA continues the email conversation based on the reply content. It may update tasks or record knowledge based on information in the reply. If the reply resolves an open question, the PA updates the relevant records. The tool subset extends inbound tools with write capabilities for tasks and knowledge.
Available tools in this profile:
- respond, forward_email
- retrieve_knowledge, search_outermind_docs, record_knowledge
- send_message_to_boss
- read_my_inbox, read_supervisor_calendar
- search_mailbox, list_email_attachments, extract_attachment_text
- complete_project_tasks, update_project_tasks
Profile Comparison
| Dimension | Long Processing | Email Research | Quick Chat | Process Email Inbound | Process Email Reply | Reactive Event |
|---|---|---|---|---|---|---|
| Typical duration | 5-20 minutes | 1-5 minutes | 2-15 seconds | 30 seconds - 3 minutes | 1-5 minutes | 30 seconds - 2 minutes |
| Tool calls | 50-100+ | 10-30 | 1-5 | 5-15 | 5-20 | 5-15 |
| Max tool calls | 100 | 30 | 15 | 20 | 25 | 20 |
| Max execution time | 20 minutes | 10 minutes | 3 minutes | 5 minutes | 8 minutes | 5 minutes |
| Max tokens/response | 16,384 | 8,192 | 4,096 | 8,192 | 8,192 | 8,192 |
| System prompt | Full | Standard | Compact | Standard | Standard | Minimal |
| Streaming | No | No | Yes | No | No | No |
| Model tier | Primary | Primary | Chat | Primary | Primary | Primary |
| Phase skipping | None | Reflection, delta, decomposition | All pre-phases | All except assess | All except assess | Reflection, delta, decomposition |
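For readers who prefer to see the comparison as data, the hard limits in the table can be captured in a small record type. The values below are copied from the table; the `ExecutionProfile` class and its field names are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionProfile:
    name: str
    max_tool_calls: int
    max_minutes: int
    max_tokens: int
    prompt_tier: str
    streaming: bool
    model_tier: str


# Values taken from the comparison table; the structure is illustrative.
PROFILES = {
    p.name: p
    for p in [
        ExecutionProfile("long_processing", 100, 20, 16_384, "full", False, "primary"),
        ExecutionProfile("email_research", 30, 10, 8_192, "standard", False, "primary"),
        ExecutionProfile("quick_chat", 15, 3, 4_096, "compact", True, "chat"),
        ExecutionProfile("process_email_inbound", 20, 5, 8_192, "standard", False, "primary"),
        ExecutionProfile("process_email_reply", 25, 8, 8_192, "standard", False, "primary"),
        ExecutionProfile("reactive_event", 20, 5, 8_192, "minimal", False, "primary"),
    ]
}

streaming_profiles = [name for name, p in PROFILES.items() if p.streaming]
print(streaming_profiles)  # ['quick_chat']
```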
Customizing Profile Budgets
Administrators can override the default budget limits for each profile through AICOS Settings. This allows you to fine-tune resource allocation based on your organization's usage patterns.
Available Budget Overrides
Each profile has two configurable budget parameters:
| Setting | Description | Default |
|---|---|---|
| Max Tool Calls | Maximum number of tool calls per execution | Varies by profile |
| Max Execution Time | Maximum wall-clock time before timeout | Varies by profile |
Budget overrides are stored per-tenant in Account Settings and applied during profile resolution.
Configuring Budget Overrides
- Navigate to Monitor > Dashboard > Boardroom
- Click AICOS Settings
- Scroll to the Execution Profile Budgets section
- Adjust the values for each profile as needed
- Click Save
Budget overrides apply to all executions of that profile type. If you reduce the quick chat tool limit from 15 to 5, all chat messages will be limited to 5 tool calls. Start with the defaults and adjust only if you observe specific issues in the Performance dashboard.
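Conceptually, an override replaces one default value and leaves the rest of the profile untouched. The sketch below assumes a hypothetical per-tenant dictionary of overrides and an `apply_overrides` helper; the defaults are the documented ones, but the storage format in Account Settings is not specified here and may differ.

```python
from typing import Dict

# Documented defaults; key names are assumptions for this example.
DEFAULT_BUDGETS: Dict[str, Dict[str, int]] = {
    "long_processing": {"max_tool_calls": 100, "max_minutes": 20},
    "email_research": {"max_tool_calls": 30, "max_minutes": 10},
    "quick_chat": {"max_tool_calls": 15, "max_minutes": 3},
    "reactive_event": {"max_tool_calls": 20, "max_minutes": 5},
    "process_email_inbound": {"max_tool_calls": 20, "max_minutes": 5},
    "process_email_reply": {"max_tool_calls": 25, "max_minutes": 8},
}


def apply_overrides(profile: str, tenant_overrides: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Merge a tenant's overrides for one profile onto that profile's defaults."""
    budget = dict(DEFAULT_BUDGETS[profile])
    for key, value in tenant_overrides.get(profile, {}).items():
        # Ignore unknown keys and non-positive values rather than failing.
        if key in budget and isinstance(value, int) and value > 0:
            budget[key] = value
    return budget


overrides = {"quick_chat": {"max_tool_calls": 5}}
print(apply_overrides("quick_chat", overrides))
# {'max_tool_calls': 5, 'max_minutes': 3}
print(apply_overrides("email_research", overrides))
# {'max_tool_calls': 30, 'max_minutes': 10}
```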
When to Adjust Budgets
| Scenario | Adjustment |
|---|---|
| Chat responses are too slow | Reduce quick chat max tool calls to 5-10 |
| Daily runs time out frequently | Increase long processing max time to 30 minutes |
| Email research is too shallow | Increase email research max tool calls to 50 |
| Costs are too high for daily runs | Reduce long processing max tool calls to 50 |
| Reactive events need more depth | Increase reactive event max tool calls to 30 |
Understanding Phase Skipping
Different profiles skip different pre-execution phases to optimize for their use case. Understanding which phases run helps explain why certain conversation types are faster than others.
Execution Phases
| Phase | Long Processing | Email Research | Quick Chat | Process Email Inbound | Process Email Reply | Reactive Event |
|---|---|---|---|---|---|---|
| Context Assessment | Yes | Yes | Yes | Yes | Yes | Yes |
| Knowledge Pre-fetch | Yes | Yes | Skipped | Skipped | Skipped | Skipped |
| Load Session State | Yes | Skipped | Skipped | Skipped | Skipped | Yes |
| Delta Calculation | Yes | Skipped | Skipped | Skipped | Skipped | Skipped |
| Reflection on Prior Sessions | Yes | Skipped | Skipped | Skipped | Skipped | Skipped |
| Goal Decomposition | Yes | Skipped | Skipped | Skipped | Skipped | Skipped |
| Session Planning | Yes | Skipped | Skipped | Skipped | Skipped | Skipped |
| Execute Work | Yes | Skipped | Skipped | Skipped | Skipped | Skipped |
| Communication | Yes | Skipped | Skipped | Skipped | Skipped | Skipped |
| State Save | Yes | Skipped | Skipped | Skipped | Skipped | Yes |
| Schedule Next Run | Yes | Skipped | Skipped | Skipped | Skipped | Skipped |
The quick chat and PA email profiles achieve their speed by skipping most pre-phases, going directly from context assessment to the core execution loop. Only long_processing runs the full pipeline.
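The phase table can be read as an ordered pipeline plus a per-profile set of enabled phases. The sketch below encodes the table as written; the snake_case phase identifiers and the `phases_to_run` helper are hypothetical renderings of the row names, not AICOS configuration keys.

```python
from typing import Dict, List, Set

# Row order from the Execution Phases table; identifiers are illustrative.
PHASES: List[str] = [
    "context_assessment", "knowledge_prefetch", "load_session_state",
    "delta_calculation", "reflection", "goal_decomposition",
    "session_planning", "execute_work", "communication",
    "state_save", "schedule_next_run",
]

ENABLED: Dict[str, Set[str]] = {
    "long_processing": set(PHASES),
    "email_research": {"context_assessment", "knowledge_prefetch"},
    "quick_chat": {"context_assessment"},
    "process_email_inbound": {"context_assessment"},
    "process_email_reply": {"context_assessment"},
    "reactive_event": {"context_assessment", "load_session_state", "state_save"},
}


def phases_to_run(profile: str) -> List[str]:
    """Return the enabled phases in pipeline order; unknown profiles run everything."""
    enabled = ENABLED.get(profile, set(PHASES))
    return [phase for phase in PHASES if phase in enabled]


print(phases_to_run("reactive_event"))
# ['context_assessment', 'load_session_state', 'state_save']
print(len(phases_to_run("long_processing")))  # 11
```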
Fallback Behavior
Execution profiles are designed with backwards compatibility in mind:
- Unknown trigger types automatically map to long_processing (the full pipeline)
- If profile resolution fails, the executor falls back to v1 behavior with all default settings
- If a configured chat model is unavailable or becomes inactive, the system falls back to the primary model
This ensures that AICOS never fails due to profile configuration issues. The worst case is that a chat message runs through the full pipeline (slower but still functional).
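The chat-model fallback amounts to a guarded choice between two configured models. The sketch below is an assumption-laden illustration: `ModelConfig`, its `active` flag, and `select_model` are hypothetical names, and the model labels are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfig:
    name: str
    active: bool = True


def select_model(model_tier: str, primary: ModelConfig, chat: Optional[ModelConfig]) -> ModelConfig:
    """Use the chat model only when the profile asks for it and it is usable."""
    if model_tier == "chat" and chat is not None and chat.active:
        return chat
    # Missing or inactive chat model: fall back to the primary model.
    return primary


primary = ModelConfig("primary-model")
chat = ModelConfig("chat-model", active=False)

print(select_model("chat", primary, chat).name)                        # primary-model
print(select_model("chat", primary, ModelConfig("chat-model")).name)   # chat-model
```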
Troubleshooting
Chat Responses Are Still Slow
- Verify that a chat model is configured in AICOS Settings (see Settings & Customization)
- Check the Performance dashboard for the quick chat p95 latency metric
- Ensure the chat model is active and has available API quota
- If using Claude Opus as the chat model, consider switching to Claude Sonnet for faster responses
Profile Resolution Issues
If AICOS appears to be using the wrong profile for a trigger type:
- Check the execution logs in Manage > Data & Logs for the profileCategory field
- Compare the profileTrigger value against the expected trigger-to-profile mapping above
- Unknown trigger types default to long_processing, which is safe but may be slower than expected
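If you export execution logs, the comparison described above can be automated. This sketch assumes each log entry is available as a dictionary carrying the `profileTrigger` and `profileCategory` fields; the export format, the `find_profile_mismatches` helper, and the sample entries are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

FALLBACK_PROFILE = "long_processing"  # documented default for unknown triggers


def find_profile_mismatches(
    entries: Iterable[Dict[str, str]], expected: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (trigger, logged profile, expected profile) for each mismatch."""
    mismatches = []
    for entry in entries:
        trigger = entry.get("profileTrigger", "")
        logged = entry.get("profileCategory", "")
        wanted = expected.get(trigger, FALLBACK_PROFILE)
        if logged != wanted:
            mismatches.append((trigger, logged, wanted))
    return mismatches


# Hypothetical sample entries and a partial expected mapping.
sample_logs = [
    {"profileTrigger": "bo_message", "profileCategory": "quick_chat"},
    {"profileTrigger": "email_inbound", "profileCategory": "long_processing"},
    {"profileTrigger": "custom_hook", "profileCategory": "long_processing"},
]
expected = {"bo_message": "quick_chat", "email_inbound": "email_research"}

print(find_profile_mismatches(sample_logs, expected))
# [('email_inbound', 'long_processing', 'email_research')]
```

An unknown trigger such as `custom_hook` is not flagged, because routing it to long_processing is the documented behavior.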
Budget Exceeded Errors
If executions are timing out or hitting tool call limits:
- Review the Performance dashboard to see average tool usage by profile
- Increase the relevant budget in AICOS Settings
- Consider whether the trigger is being routed to the right profile
End-User Reasoning Escalation
For chat-style work, end users can opt their next message into the heaviest reasoning model (Opus tier) by starting it with one of the recognized reasoning-request phrases or including an inline marker. This is useful when a user wants a deeper, more careful answer for a specific question without changing any tenant-level configuration. The escalation respects your tenant's subscription tier, so a tier without access to the reasoning model continues to run on the model the tenant was already entitled to.
The recognized triggers are:
- Phrases at the start of the message (or right after a sentence ending): "think hard", "think carefully", "think step by step", "think this through", "think about it deeply", "be thorough", "deep dive", "in depth" or "in-depth", "extended thinking", "use reasoning", "really think", "no shortcuts".
- Inline markers that work anywhere in the message: #think, /think, **think**, *think*.
The phrase check is anchored, so a casual mid-sentence "I'll think hard about it later" does not trigger an upgrade; only a leading "Think hard about Q3 positioning" does. Markers are unanchored so users who do not want to craft a sentence can opt in with a single token. Communicate these phrases and markers to your end users so they know how to ask for a deeper response when one is warranted. Each escalation is recorded in the execution logs with the matched substring for audit review.
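To make the anchoring rule concrete, the sketch below approximates the detection with a regular expression for phrases and a plain substring check for markers. It is an illustration of the documented behavior, not the actual matcher: `detect_escalation`, `model_for_message`, and the tier flag are hypothetical, and the real implementation may treat edge cases differently.

```python
import re
from typing import Optional

PHRASES = [
    "think hard", "think carefully", "think step by step", "think this through",
    "think about it deeply", "be thorough", "deep dive", "in depth", "in-depth",
    "extended thinking", "use reasoning", "really think", "no shortcuts",
]
# Longer marker first so "**think**" is reported instead of its substring "*think*".
MARKERS = ["**think**", "*think*", "#think", "/think"]

# Anchored: a phrase counts only at the start or right after ".", "!" or "?".
_PHRASE_RE = re.compile(
    r"(?:^\s*|[.!?]\s+)(" + "|".join(re.escape(p) for p in PHRASES) + r")\b",
    re.IGNORECASE,
)


def detect_escalation(message: str) -> Optional[str]:
    """Return the matched phrase or marker, or None if the user did not opt in."""
    match = _PHRASE_RE.search(message)
    if match:
        return match.group(1)
    lowered = message.lower()
    for marker in MARKERS:  # unanchored: a marker may appear anywhere
        if marker in lowered:
            return marker
    return None


def model_for_message(message: str, current: str, reasoning: str, tier_allows: bool) -> str:
    """Escalate only when the tenant tier includes the reasoning model."""
    if tier_allows and detect_escalation(message) is not None:
        return reasoning
    return current


print(detect_escalation("Think hard about Q3 positioning"))  # Think hard
print(detect_escalation("I'll think hard about it later"))   # None
print(detect_escalation("Summarize the roadmap #think"))     # #think
```

Returning the matched substring mirrors the audit entry described above, where each escalation is logged with the text that triggered it.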
Related Topics
- AICOS Overview - Introduction to AICOS
- Execution Cycle - How AICOS operates daily
- Settings & Customization - Model selection and budget configuration
- Performance Dashboard - Monitor profile metrics and costs