Tools: What the AI Can Do

The AI doesn't just chat—it can take real actions. Use tools to browse the web, read documents, generate images, manage your calendar, and more. The AI automatically decides which tools to use based on what you ask for.

Caiioo uses a read/write access model: Free users get read-only access to most tools, while Pro unlocks full write access across the board.

Everyone Gets These Tools (Free)

Web Browsing (Read-Only)

The AI can navigate your browser, read pages, take screenshots, search Google, and extract content. Perfect for research and data gathering.

What you can ask:

"Read this page and summarize it"
"Take a screenshot of this"
"Find all the prices on this page"
"Search Google for the best camping tents"

Google Workspace (Read-Only)

Search and read your Gmail, Google Drive, Docs, Sheets, and Calendar — no setup beyond connecting your Google account. Gmail searches understand natural filters—sender, subject, label, category, age, attachments, and unread state—so you don't have to know Gmail's search syntax.
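As a rough illustration of what "natural filters" can map to, the sketch below turns a structured filter into a Gmail search string. The operators (from:, subject:, label:, category:, newer_than:, has:attachment, is:unread) are Gmail's own search operators; the MailFilter class and to_gmail_query function are hypothetical names made up for this example and are not Caiioo's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MailFilter:
    """Hypothetical structured form of a natural-language mail request."""
    sender: Optional[str] = None
    subject: Optional[str] = None
    label: Optional[str] = None
    category: Optional[str] = None
    newer_than_days: Optional[int] = None
    has_attachment: bool = False
    unread: bool = False


def to_gmail_query(f: MailFilter) -> str:
    """Translate the filter into Gmail search-operator syntax."""
    parts = []
    if f.sender:
        parts.append(f"from:{f.sender}")
    if f.subject:
        parts.append(f'subject:"{f.subject}"')
    if f.label:
        parts.append(f"label:{f.label}")
    if f.category:
        parts.append(f"category:{f.category}")
    if f.newer_than_days is not None:
        parts.append(f"newer_than:{f.newer_than_days}d")
    if f.has_attachment:
        parts.append("has:attachment")
    if f.unread:
        parts.append("is:unread")
    return " ".join(parts)


# "Find unread emails from Bob in the last week that have attachments"
query = to_gmail_query(
    MailFilter(sender="bob", newer_than_days=7, has_attachment=True, unread=True)
)
print(query)  # from:bob newer_than:7d has:attachment is:unread
```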

What you can ask:

"Search my Drive for the Q4 report"
"Find unread emails from Bob in the last week that have attachments"
"Read my latest emails"
"What's on my calendar today?"

Sundial Agenda (Read-Only)

View your calendar events and reminders, check availability, and find free time slots. Works with Google Calendar on all platforms.

Web Search

AI-powered search with citations. Ask questions and get sourced answers instead of hunting through search results yourself.

Slate Editor

Real-time AI collaboration for code and documents. See Slate for details.

Calculator

Quick math. The AI can do arithmetic, trigonometry, statistics, and more without using external tools.

API Integration (Read-Only)

Make GET requests to external REST APIs. Useful for fetching data from services we don't have built-in support for yet.

SQL Database

Create and query local SQLite databases. Useful for analyzing CSV data, building lightweight dashboards, or prototyping data workflows.
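A minimal sketch of the CSV-analysis workflow, using Python's standard sqlite3 and csv modules. The table name, columns, and sample data are invented for illustration; the tool's own interface may look different.

```python
import csv
import io
import sqlite3

# Invented sample data standing in for an uploaded CSV file.
CSV_TEXT = """region,amount
north,120.5
south,80.0
north,30.0
"""

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Parse the CSV and load it with a parameterized insert.
rows = [(r["region"], float(r["amount"])) for r in csv.DictReader(io.StringIO(CSV_TEXT))]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Aggregate per region, the kind of query a lightweight dashboard would run.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # [('north', 150.5), ('south', 80.0)]
conn.close()
```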

Sub-Agents

Delegate parts of a complex task to parallel agents so they run independently and report back. Useful when you want research, analysis, and drafting to happen at the same time. Sub-agent results render inline in the main chat.
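The fan-out-and-report-back pattern can be sketched with a thread pool. This is only an analogy under stated assumptions: run_sub_agent is a hypothetical stand-in that returns a canned string, where a real sub-agent would call a model.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sub_agent(task: str) -> str:
    """Hypothetical stand-in for a sub-agent; a real one would call a model."""
    return f"{task}: done"


tasks = ["research", "analysis", "drafting"]

# Each task runs independently; map() returns results in task order.
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    results = list(pool.map(run_sub_agent, tasks))

print(results)  # ['research: done', 'analysis: done', 'drafting: done']
```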

Hearing

Ask follow-up questions about any audio attachment. The assistant can re-listen to a recording with a targeted question — "which words were mispronounced?", "what's the tone of voice?" — on the turn you attach it or any later one, via an audio-capable helper model. See Voice for the recording workflow.

Ask User

Pause an AI run mid-execution and surface a decision dialog. The AI presents up to 4 options (approve, approve with notes, reject, reject with notes) and waits for your input. Your notes flow back as plain-English guidance that overrides the plan, and the agent continues in place.

Self Checker

Rate and verify every assistant turn. Click the ⚖ button next to any response to open a verdict card. Choose from LLM-powered judgment and deterministic checks (exact match, contains substring, regex pattern, number range, arithmetic). The judge can also write a small Python program to re-verify things no simple check can, such as recomputing a claimed statistic from the source data. That program runs locally in the same no-network sandbox as everything else, in both the browser extension and the desktop apps, and a check that can't run cleanly reports as failed rather than silently passing. The verdict computes in the background, so it completes even if you close the panel. Results show inline, and the cost rolls into your thread total. You can pick which provider and model act as the judge; leave the model blank and Caiioo picks one that suits the provider you chose.
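To make the deterministic checks concrete, here is a minimal sketch of what each one might look like. All function names are hypothetical, and the exact semantics (whitespace handling, tolerance, supported operators) are assumptions rather than Caiioo's actual rules. The arithmetic check follows the same principle described above: anything that cannot be evaluated cleanly counts as a failure.

```python
import ast
import operator
import re

# Only these operators are evaluated; anything else fails the check.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def exact_match(answer: str, expected: str) -> bool:
    return answer.strip() == expected.strip()


def contains(answer: str, needle: str) -> bool:
    return needle in answer


def matches_pattern(answer: str, pattern: str) -> bool:
    return re.search(pattern, answer) is not None


def in_range(answer: str, low: float, high: float) -> bool:
    try:
        value = float(answer)
    except ValueError:
        return False  # not a number, so the check fails
    return low <= value <= high


def _eval(node):
    """Evaluate a tiny arithmetic AST without using eval()."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported expression")


def arithmetic_equals(expression: str, claimed: float, tol: float = 1e-9) -> bool:
    try:
        value = _eval(ast.parse(expression, mode="eval").body)
    except (ValueError, SyntaxError, ZeroDivisionError):
        return False  # a check that cannot run cleanly reports as failed
    return abs(value - claimed) <= tol


print(arithmetic_equals("12 * 7 + 1", 85))  # True
print(arithmetic_equals("1 / 0", 0))        # False
```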

Instant Tool Chooser

On-device semantic tool selection. The AI picks the right tool in ~10ms without calling a model. Enabled by default on every tier. Switch between "Instant Tool Chooser" and "Quick Tasks LLM" in Settings > Tools.
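The general idea of semantic selection without a model call can be shown with a toy similarity search. This sketch swaps in bag-of-words vectors and cosine similarity as a stand-in for whatever on-device embedding Caiioo actually uses; the tool names and descriptions are invented.

```python
import math
from collections import Counter

# Invented tool descriptions; a real chooser would use learned embeddings.
TOOLS = {
    "calculator": "math arithmetic calculate sum statistics",
    "web_search": "search web find sources citations",
    "calendar": "calendar events meeting schedule today",
}


def embed(text: str) -> Counter:
    """Toy embedding: word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def choose_tool(request: str) -> str:
    """Return the tool whose description is most similar to the request."""
    query = embed(request)
    return max(TOOLS, key=lambda name: cosine(query, embed(TOOLS[name])))


print(choose_tool("what is on my calendar today"))  # calendar
```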

Pro Tier Tools ($9/month)

Pro unlocks full write access to tools that are read-only on Free, plus additional capabilities.

When you're running a local chat model (for example, through Ollama), AI-powered tools that would send your data to a remote AI provider ask for your approval first. See Privacy & Data → Remote AI Providers.

Full Web Automation

Everything in read-only browsing, plus: click links, fill forms, type text, interact with page elements, and execute JavaScript. Perfect for form-filling, data entry, and browser automation.

What you can ask:

"Fill out this form with my info"
"Click on the Reviews tab and read what people say"
"Log into this site and download my invoice"

Full Google Workspace

Create, edit, and manage Google Docs, Sheets, Slides, Gmail drafts, Drive files, and Calendar events.

Google Sheets actions include: paste CSV/TSV/HTML, split text to columns, trim whitespace, remove duplicates, apply and clear toolbar filters, move rows and columns, insert and delete cell ranges, protect ranges, define and update named ranges, attach developer metadata, and apply conditional formatting.

Google Docs supports native comments (add, reply, resolve, delete), multi-tab documents, and smart-chip recognition (people, links, equations, page breaks, dates).

Google Drive: Create folders, move files, manage sharing permissions
Gmail: Draft and send emails, download attachments
Google Docs & Sheets: Create and edit documents, write formulas, format cells
Google Slides: Create presentations, add text/images/tables, edit layouts
Google Calendar: Create events and reminders, schedule meetings across every calendar in your account (team, family, and personal calendars)

Full Sundial Agenda

Create events and reminders, schedule meetings, and manage your calendar across providers.

Full API Integration

POST, PUT, PATCH, and DELETE requests to any REST endpoint — not just GET.
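The read/write split for API requests can be pictured as a method allow-list per tier. The sketch below is an assumption about how such a gate could work, not Caiioo's code: build_request, the tier strings, and the example URL are all hypothetical. It only constructs a request object and never sends it.

```python
from typing import Optional
from urllib.request import Request

READ_ONLY_METHODS = {"GET"}
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def build_request(tier: str, method: str, url: str, body: Optional[bytes] = None) -> Request:
    """Build an HTTP request if the tier permits the method (hypothetical gate)."""
    method = method.upper()
    allowed = READ_ONLY_METHODS | (WRITE_METHODS if tier == "pro" else set())
    if method not in allowed:
        raise PermissionError(f"{method} requires Pro")
    return Request(url, data=body, method=method)


req = build_request("free", "get", "https://api.example.com/items")
print(req.get_method())  # GET

try:
    build_request("free", "POST", "https://api.example.com/items", b"{}")
except PermissionError as err:
    print(err)  # POST requires Pro
```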

Document Ingestion

Upload and analyze PDFs, Word docs (DOCX), Excel spreadsheets (XLSX), and images with OCR. Higher-quality extraction on complex documents is available through OCR models accessed via your OpenRouter key.

Image Generation

Create images from text descriptions. Available models include FLUX.2 (Flex, Klein, Max, Pro), Gemini (2.5 Flash, 3 Pro, 3.1 Flash), GPT-5 Image, Seedream 4.5, and Riverflow v2. Perfect for illustrations, mockups, or visualizations.

What you can create:

A product mockup for a new design
An illustration for a blog post
A texture for a 3D project
Variations on an existing image

Video Generation

Generate videos from text descriptions. Available models accessed via OpenRouter: Google Veo 3.1, OpenAI Sora 2 Pro, and ByteDance Seedance. Valid durations and resolutions vary per model. Videos save as thread attachments.

What you can generate:

Product demo videos
Animated explainers
Scene transitions for edits
Storyboard sequences

Music Generation

Generate original music from text descriptions via Google Lyria 3 Pro Preview (accessed through OpenRouter). Creates royalty-free tracks that save as inline audio attachments in your thread.

What you can generate:

Background music for videos
Ambient soundscapes
Musical themes for projects
Instrumental versions of descriptions

Seeing-Eye Dog

Vision fallback for text-only models. If your chosen model doesn't support images (such as a local Ollama model, DeepSeek V4 Pro, or Kimi K2.6), attach images anyway. They are routed through an inexpensive vision model that generates captions, and the caption text is sent to your main model. The default is Gemini 3.5 Flash Lite via OpenRouter. Auto-captioning happens at message-build time with per-attachment caching. Use the vision({action: "inspect"}) tool for targeted follow-up. Configure in Settings > Tools > Vision Fallback Model.
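Caption-then-forward with per-attachment caching can be sketched as follows. Everything here is assumed for illustration: caption_image is a stub in place of a real vision model, and keying the cache by a content hash is one plausible choice, not a documented detail.

```python
import hashlib

_caption_cache = {}


def caption_image(image_bytes: bytes) -> str:
    """Stub captioner; a real one would call a vision model."""
    return f"image of {len(image_bytes)} bytes"


def cached_caption(image_bytes: bytes, captioner=caption_image) -> str:
    """Caption each distinct attachment once, keyed by its content hash."""
    key = hashlib.sha256(image_bytes).hexdigest()
    if key not in _caption_cache:
        _caption_cache[key] = captioner(image_bytes)
    return _caption_cache[key]


def build_message(text: str, attachments: list) -> str:
    """Replace image attachments with text captions for a text-only model."""
    captions = [f"[Image {i}: {cached_caption(a)}]" for i, a in enumerate(attachments, 1)]
    return "\n".join([text, *captions])


print(build_message("Describe this", [b"abc"]))
# Describe this
# [Image 1: image of 3 bytes]
```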

Workspace Files

Sandboxed read, write, edit, and search inside a folder you point at. Cannot escape that folder or hit the network. Auto-parses .docx, .xlsx, .pptx, and PDF. Perfect for working with local project files without uploading them to the cloud. Configure in Settings > Tools > Workspace Files.

Adding a folder: on the desktop apps, type an absolute folder path; in the browser extension, an "Add folder" action opens your browser's own folder picker and remembers the folder you choose. A folder that's later deleted, renamed, or on a disconnected drive is simply skipped, so the rest of your workspace keeps working.
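The "cannot escape that folder" guarantee is commonly enforced by resolving every requested path and checking that it still sits under the workspace root. The sketch below shows that general technique with Python's pathlib (Path.is_relative_to needs Python 3.9 or later); resolve_inside is a hypothetical helper, not the tool's actual code.

```python
import tempfile
from pathlib import Path


def resolve_inside(root: str, requested: str) -> Path:
    """Resolve a requested path and reject anything outside the root folder."""
    root_path = Path(root).resolve()
    candidate = (root_path / requested).resolve()  # collapses ".." and symlinks
    if not candidate.is_relative_to(root_path):
        raise PermissionError(f"{requested} is outside the workspace")
    return candidate


with tempfile.TemporaryDirectory() as root:
    print(resolve_inside(root, "notes/todo.txt").name)  # todo.txt
    try:
        resolve_inside(root, "../outside.txt")
    except PermissionError as err:
        print("blocked:", err)  # blocked: ../outside.txt is outside the workspace
```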

Test Runner

Run a list of prompts with graders — substring match, pattern (regex), expected tool calls, or second-model 1-10 scoring. Each prompt runs in its own fresh conversation, through the same pipeline as your actual messages — so results reflect what the app really does. A floating panel shows the live run (suite name, per-test pass/fail, progress) with a cancel button, and results are saved as the run progresses, so exports and reports survive an app restart. Export results as a CSV pass/fail report.
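A minimal sketch of a prompt suite with substring and regex graders, ending in a CSV pass/fail report. The suite format, the fake_app stand-in, and the column names are assumptions for illustration; the real Test Runner also supports expected-tool-call and second-model graders, which are omitted here.

```python
import csv
import io
import re


def fake_app(prompt: str) -> str:
    """Stand-in for the real pipeline; it just echoes the prompt."""
    return f"Echo: {prompt}"


SUITE = [
    {"prompt": "hello world", "grader": "substring", "expect": "hello"},
    {"prompt": "order 1234", "grader": "regex", "expect": r"\d{4}"},
    {"prompt": "goodbye", "grader": "substring", "expect": "hello"},
]


def grade(output: str, grader: str, expect: str) -> bool:
    if grader == "substring":
        return expect in output
    if grader == "regex":
        return re.search(expect, output) is not None
    raise ValueError(f"unknown grader: {grader}")


def run_suite(suite, app) -> str:
    """Run each prompt independently and return a CSV pass/fail report."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["prompt", "grader", "result"])
    for case in suite:
        passed = grade(app(case["prompt"]), case["grader"], case["expect"])
        writer.writerow([case["prompt"], case["grader"], "pass" if passed else "fail"])
    return buffer.getvalue()


report = run_suite(SUITE, fake_app)
print(report)
```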

XLSX Cell-Level Tracked Changes

Slate spreadsheets support DOCX-style redlining via propose_change({editMode: 'xlsx_cell'}). Changes are anchored by cellRef and sheet name, rendered inline as <del>old</del><ins>new</ins>, with a toolbar for next, previous, accept, and reject. AI and user edits merge cell by cell, with user edits winning on conflict.
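The cell-by-cell merge and the inline rendering can be sketched in a few lines. The data shapes here are assumptions: changes are modeled as dictionaries keyed by (sheet name, cellRef), and merge and render_change are hypothetical helpers rather than Slate's API.

```python
from html import escape


def render_change(old: str, new: str) -> str:
    """Render one tracked change as inline redline markup."""
    return f"<del>{escape(old)}</del><ins>{escape(new)}</ins>"


def merge(ai_changes: dict, user_changes: dict) -> dict:
    """Merge per cell; keys are (sheet, cellRef) and user edits win on conflict."""
    merged = dict(ai_changes)
    merged.update(user_changes)
    return merged


ai = {("Budget", "B2"): "1200", ("Budget", "C2"): "Q4"}
user = {("Budget", "B2"): "1250"}

merged = merge(ai, user)
print(merged[("Budget", "B2")])                         # 1250
print(render_change("1000", merged[("Budget", "B2")]))  # <del>1000</del><ins>1250</ins>
```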

Physics & Structural Analysis

Calculate projectile motion, collisions, energy, momentum, force, impulse, velocity-to-target, beam loading, column buckling, and material properties.
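Two of the listed calculations, written out with their standard textbook formulas: ideal projectile range on level ground (no air resistance), R = v² sin(2θ) / g, and Euler's critical buckling load, P = π² E I / (K L)². The function names and sample inputs are illustrative, and the tool's own formulas and assumptions may differ.

```python
import math

G = 9.81  # standard gravity in m/s^2


def projectile_range(speed: float, angle_deg: float, g: float = G) -> float:
    """Horizontal range in metres on level ground, ignoring air resistance."""
    return speed ** 2 * math.sin(2 * math.radians(angle_deg)) / g


def euler_buckling_load(E: float, I: float, L: float, K: float = 1.0) -> float:
    """Critical load in newtons; E in Pa, I in m^4, L in m, K = effective length factor."""
    return math.pi ** 2 * E * I / (K * L) ** 2


# A 20 m/s launch at 45 degrees.
print(round(projectile_range(20, 45), 2))  # 40.77

# A 2 m pinned-pinned steel column (E = 200 GPa, I = 1e-6 m^4).
print(round(euler_buckling_load(200e9, 1e-6, 2.0)))  # 493480
```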

Private Sync

Sync your settings and conversations across devices via encrypted cloud backup. See Private Sync for details.

MCP Servers

Connect to remote MCP servers over HTTP/SSE, or run desktop tools (local MCP via the desktop app) on your own machine. Build custom tool integrations or connect to third-party services.

Meeting Recall

Retrieve details from recent video calls — transcripts, action items, and summaries so you can pull meeting context into any conversation.

Messaging Gateway

Answer and send messages across WhatsApp, Telegram, and more, with the agent responding for you. See Messaging Gateway.

More Pro Features

Unlimited custom modes — Create your own AI personalities with custom prompts and variables
Custom profile variables — Personalize AI behavior across all modes
Per-action instructions — Customize how each tool action behaves
Caiioo Benchmarks — Compare how models perform with quality evaluations and throughput tests
Priority support — Submit support tickets directly from the app

Experimental Tools

Toggle experimental tools via an on-device switch. These rotate as features mature into Free or Pro tiers. Available options include GitHub integration, Slack, advanced spatial reasoning, test automation, and more.

Enable or Disable Tools

Go to Settings > Tools to see what's available and toggle tools on or off. Some modes come with specific tools pre-configured.

Tools that don't run on your current device appear grayed out with a note saying which platforms they work on, and a tool that just needs the desktop app running says so. That way you can discover what's available elsewhere instead of never knowing it exists. If you ask the assistant for one by name, it tells you which platforms the tool works on instead of claiming it doesn't exist.
