Tools: What the AI Can Do

The AI doesn't just chat—it can take real actions. Use tools to browse the web, read documents, generate images, manage your calendar, and more. The AI automatically decides which tools to use based on what you ask for.

Caiioo uses a read/write access model: Free users get read-only access to most tools, while Pro unlocks full write access across the board.

Everyone Gets These Tools (Free)

Web Browsing (Read-Only)

The AI can navigate your browser, read pages, take screenshots, search Google, and extract content. Perfect for research and data gathering.

What you can ask:

  • "Read this page and summarize it"
  • "Take a screenshot of this"
  • "Find all the prices on this page"
  • "Search Google for the best camping tents"

Google Workspace (Read-Only)

Search and read your Gmail, Google Drive, Docs, Sheets, and Calendar — no setup beyond connecting your Google account. Gmail searches understand natural filters—sender, subject, label, category, age, attachments, and unread state—so you don't have to know Gmail's search syntax.

What you can ask:

  • "Search my Drive for the Q4 report"
  • "Find unread emails from Bob in the last week that have attachments"
  • "Read my latest emails"
  • "What's on my calendar today?"
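
To illustrate what the natural filters save you from typing, here is a minimal Python sketch that turns a structured filter into a Gmail search string. The operators it emits (from:, subject:, label:, category:, newer_than:, has:attachment, is:unread) are standard Gmail search operators. The function name build_gmail_query and its parameters are hypothetical, and this is not a description of how Caiioo builds queries internally.

```python
def build_gmail_query(sender=None, subject=None, label=None, category=None,
                      newer_than_days=None, has_attachment=False, unread=False):
    """Build a Gmail search string from optional filters (hypothetical helper)."""
    parts = []
    if sender:
        parts.append(f"from:{sender}")
    if subject:
        # Quote the subject so multi-word phrases stay together.
        parts.append(f'subject:"{subject}"')
    if label:
        parts.append(f"label:{label}")
    if category:
        parts.append(f"category:{category}")
    if newer_than_days is not None:
        parts.append(f"newer_than:{newer_than_days}d")
    if has_attachment:
        parts.append("has:attachment")
    if unread:
        parts.append("is:unread")
    return " ".join(parts)


# "Find unread emails from Bob in the last week that have attachments"
query = build_gmail_query(sender="bob", newer_than_days=7,
                          has_attachment=True, unread=True)
print(query)  # from:bob newer_than:7d has:attachment is:unread
```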

Sundial Agenda (Read-Only)

View your calendar events and reminders, check availability, and find free time slots. Works with Google Calendar on all platforms.

Web Search

AI-powered search with citations. Ask questions and get sourced answers instead of hunting through search results yourself.

Slate Editor

Real-time AI collaboration for code and documents. See Slate for details.

Calculator

Quick math. The AI can do arithmetic, trigonometry, statistics, and more without using external tools.

API Integration (Read-Only)

Make GET requests to external REST APIs. Useful for fetching data from services we don't have built-in support for yet.

SQL Database

Create and query local SQLite databases. Useful for analyzing CSV data, building lightweight dashboards, or prototyping data workflows.
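
As a rough idea of the CSV analysis this tool is suited to, the following Python sketch loads a small CSV into an in-memory SQLite database and aggregates it. The table name, column names, and sample data are made up for illustration; the tool's own interface may look different.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would come from a file.
CSV_TEXT = """region,amount
north,120.5
south,80.0
north,30.0
"""

def load_csv(conn, csv_text):
    """Create a sales table and load the CSV rows into it."""
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    rows = [(r["region"], float(r["amount"]))
            for r in csv.DictReader(io.StringIO(csv_text))]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

def totals_by_region(conn):
    """Return (region, total) pairs sorted by region name."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region")
    return cur.fetchall()

conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
load_csv(conn, CSV_TEXT)
print(totals_by_region(conn))  # [('north', 150.5), ('south', 80.0)]
```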

Sub-Agents

Delegate parts of a complex task to parallel agents so they run independently and report back. Useful when you want research, analysis, and drafting to happen at the same time. Sub-agent results render inline in the main chat.
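
The fan-out and report-back pattern described here can be sketched in Python with a thread pool. This is only an analogy: run_subagent is a hypothetical stand-in that returns a fixed string, whereas a real sub-agent would call a model and use tools.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task):
    """Stand-in for a sub-agent; a real one would call a model."""
    return f"{task}: done"

def delegate(tasks):
    """Run every task in parallel and return results in the original order."""
    if not tasks:
        return []
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        # map() yields results in input order, even if tasks finish out of order.
        return list(pool.map(run_subagent, tasks))

print(delegate(["research", "analysis", "drafting"]))
# ['research: done', 'analysis: done', 'drafting: done']
```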

Ask User

Pause an AI run mid-execution and surface a decision dialog. The AI presents up to 4 options (approve, approve with notes, reject, reject with notes) and waits for your input. Your notes flow back as plain-English guidance that overrides the plan, and the agent continues in place.

Self Checker

Rate and verify every assistant turn. Click the ⚖ button next to any response to open a verdict card. Choose from LLM-powered judgment plus deterministic checks (exact match, contains substring, regex pattern, number range, arithmetic). Results show inline. Cost rolls into your thread total.
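
To make the deterministic checks concrete, here is a minimal Python sketch of four of them. The function names and the exact matching rules (for example, trimming whitespace before an exact match, or testing the first number found in the text) are assumptions for illustration. The arithmetic check is left out because this guide does not say how it is defined.

```python
import re

def exact_match(output, expected):
    """Pass if the output equals the expected text, ignoring outer whitespace."""
    return output.strip() == expected.strip()

def contains(output, substring):
    """Pass if the substring appears anywhere in the output."""
    return substring in output

def matches_pattern(output, pattern):
    """Pass if the regex pattern matches somewhere in the output."""
    return re.search(pattern, output) is not None

def in_number_range(output, low, high):
    """Pass if the first number found in the output lies in [low, high]."""
    found = re.search(r"-?\d+(?:\.\d+)?", output)
    return found is not None and low <= float(found.group()) <= high

print(in_number_range("About 3.7 km", 3, 4))  # True
```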

Instant Tool Chooser

On-device semantic tool selection. The AI picks the right tool in ~10ms without calling a model. Enabled by default on every tier. To change it, switch between "Instant Tool Chooser" and "Quick Tasks LLM" in Settings > Tools.
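
One way to picture model-free tool selection is similarity matching between the request and a short description of each tool. The Python sketch below uses plain word counts and cosine similarity as a toy stand-in for real embeddings. The tool names, descriptions, and scoring are all assumptions, not the actual Instant Tool Chooser implementation.

```python
import math
from collections import Counter

# Hypothetical tool descriptions; the real catalog is not documented here.
TOOLS = {
    "calculator": "math arithmetic trigonometry statistics compute",
    "web_search": "search web question sources citations",
    "sql_database": "sqlite query csv table data",
}

def vectorize(text):
    """Toy embedding: word counts. A real system would use a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def choose_tool(request):
    """Return the tool whose description is most similar to the request."""
    req = vectorize(request)
    return max(TOOLS, key=lambda name: cosine(req, vectorize(TOOLS[name])))

print(choose_tool("query this csv data"))  # sql_database
```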

Pro Tier Tools ($9/month)

Pro unlocks full write access to tools that are read-only on Free, plus additional capabilities.

When you're running a local chat model (like Ollama), AI-powered tools that would send your data to a remote AI provider ask for your approval first. See Privacy & Data > Remote AI Providers.

Full Web Automation

Everything in read-only browsing, plus: click links, fill forms, type text, interact with page elements, and execute JavaScript. Perfect for form-filling, data entry, and browser automation.

What you can ask:

  • "Fill out this form with my info"
  • "Click on the Reviews tab and read what people say"
  • "Log into this site and download my invoice"

Full Google Workspace

Create, edit, and manage Google Docs, Sheets, Slides, Gmail drafts, Drive files, and Calendar events.

Google Sheets gained new actions: paste CSV/TSV/HTML, split text to columns, trim whitespace, remove duplicates, apply and clear toolbar filters, move rows and columns, insert and delete cell ranges, protect ranges, define and update named ranges, attach developer metadata, and apply conditional formatting.

Google Docs gained native comments (add, reply, resolve, delete), multi-tab document support, and smart-chip recognition (people, links, equations, page breaks, dates).

  • Google Drive: Create folders, move files, manage sharing permissions
  • Gmail: Draft and send emails, download attachments
  • Google Docs & Sheets: Create and edit documents, write formulas, format cells
  • Google Slides: Create presentations, add text/images/tables, edit layouts
  • Google Calendar: Create events and reminders, schedule meetings across every calendar in your account (team, family, and personal calendars)

Full Sundial Agenda

Create events and reminders, schedule meetings, and manage your calendar across providers.

Full API Integration

POST, PUT, PATCH, and DELETE requests to any REST endpoint — not just GET.
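
The split between read-only and full API access comes down to which HTTP methods are allowed. Here is a minimal Python sketch of that rule using the standard library's urllib.request. The build_request helper, the tier strings, and the api.example.com URL are placeholders for illustration; the request is built but never sent.

```python
import urllib.request

READ_METHODS = {"GET"}
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}

def build_request(tier, method, url, body=None):
    """Build (but do not send) a request, enforcing the Free/Pro method split."""
    method = method.upper()
    if method not in READ_METHODS | WRITE_METHODS:
        raise ValueError(f"unsupported method: {method}")
    if method in WRITE_METHODS and tier != "pro":
        raise PermissionError(f"{method} requires Pro")
    return urllib.request.Request(url, data=body, method=method)

req = build_request("pro", "patch", "https://api.example.com/items/1",
                    b'{"done": true}')
print(req.get_method())  # PATCH
```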

Document Ingestion

Upload and analyze PDFs, Word docs (DOCX), Excel spreadsheets (XLSX), and images with OCR. Higher-quality extraction on complex documents is available through OCR models accessed via your OpenRouter key.

Image Generation

Create images from text descriptions. Available models include FLUX.2 (Flex, Klein, Max, Pro), Gemini (2.5 Flash, 3 Pro, 3.1 Flash), GPT-5 Image, Seedream 4.5, and Riverflow v2. Perfect for illustrations, mockups, or visualizations.

What you can create:

  • A product mockup for a new design
  • An illustration for a blog post
  • A texture for a 3D project
  • Variations on an existing image

Video Generation

Generate videos from text descriptions. Available models accessed via OpenRouter: Google Veo 3.1, OpenAI Sora 2 Pro, and ByteDance Seedance. Valid durations and resolutions vary per model. Videos save as thread attachments.

What you can generate:

  • Product demo videos
  • Animated explainers
  • Scene transitions for edits
  • Storyboard sequences

Music Generation

Generate original music from text descriptions via Google Lyria 3 Pro Preview (accessed through OpenRouter). Creates royalty-free tracks that save as inline audio attachments in your thread.

What you can generate:

  • Background music for videos
  • Ambient soundscapes
  • Musical themes for projects
  • Instrumental tracks based on a text description

Seeing-Eye Dog

Vision fallback for text-only models. If your chosen model doesn't support images (for example local Ollama, DeepSeek V4 Pro, or Kimi K2.6), you can attach images anyway. They are routed through a low-cost vision model that generates captions, and the caption text is sent to your main model. The default vision model is Gemini 3.1 Flash Lite via OpenRouter. Captions are generated automatically when the message is built and are cached per attachment. Use the vision({action: "inspect"}) tool for targeted follow-up. Configure in Settings > Tools > Vision Fallback Model.
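
The caption-then-forward flow with per-attachment caching can be sketched in Python as follows. Everything here is illustrative: fake_vision_model stands in for the real vision model call, and keying the cache on a SHA-256 hash of the image bytes is an assumption about how per-attachment caching might work.

```python
import hashlib

_caption_cache = {}

def fake_vision_model(image_bytes):
    """Stand-in for the vision model call; returns a placeholder caption."""
    return f"image of {len(image_bytes)} bytes"

def caption_for(image_bytes, describe=fake_vision_model):
    """Caption an attachment once, then reuse the cached caption."""
    key = hashlib.sha256(image_bytes).hexdigest()
    if key not in _caption_cache:
        _caption_cache[key] = describe(image_bytes)
    return _caption_cache[key]

def build_message(text, attachments, supports_images):
    """Replace images with captions when the main model is text-only."""
    if supports_images:
        return {"text": text, "images": list(attachments)}
    captions = [f"[Image: {caption_for(a)}]" for a in attachments]
    return {"text": "\n".join([text, *captions]), "images": []}

print(build_message("What is this?", [b"abc"], supports_images=False)["text"])
```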

Workspace Files

Sandboxed read, write, edit, and search inside a folder you point at. Cannot escape that folder or hit the network. Auto-parses .docx, .xlsx, .pptx, and PDF. Perfect for working with local project files without uploading them to the cloud. Configure in Settings > Tools > Workspace Files.
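
A common way to keep file access inside one folder is to resolve every requested path and reject anything that lands outside the root. The Python sketch below shows that check with pathlib. The helper name resolve_in_workspace is hypothetical, and this is a general technique rather than a description of how Workspace Files enforces its sandbox.

```python
import tempfile
from pathlib import Path

def resolve_in_workspace(workspace, relative_path):
    """Return the absolute path if it stays inside the workspace, else raise."""
    root = Path(workspace).resolve()
    target = (root / relative_path).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    if target != root and root not in target.parents:
        raise PermissionError(f"{relative_path} is outside the workspace")
    return target

workspace = tempfile.mkdtemp()  # throwaway folder standing in for your project
print(resolve_in_workspace(workspace, "notes/todo.txt"))
```

A path such as "../secrets.txt" or an absolute path like "/etc/passwd" resolves outside the root and raises PermissionError.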

Test Runner

Run a list of prompts with graders — substring match, pattern (regex), expected tool calls, or second-model 1-10 scoring. Each prompt runs in its own fresh conversation. Export results as a CSV pass/fail report.
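
As a rough picture of a prompt suite with graders and a CSV report, consider the Python sketch below. The case format, the grader dictionaries, and the ask callback are hypothetical. Only the substring and pattern graders are shown, since expected tool calls and second-model scoring depend on details this guide does not cover.

```python
import csv
import io
import re

def grade(output, grader):
    """Apply one grader dict: {'type': 'substring' | 'pattern', 'value': str}."""
    if grader["type"] == "substring":
        return grader["value"] in output
    if grader["type"] == "pattern":
        return re.search(grader["value"], output) is not None
    raise ValueError(f"unknown grader type: {grader['type']}")

def run_suite(cases, ask):
    """Run each prompt through `ask` and return a CSV pass/fail report."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["prompt", "result"])
    for case in cases:
        output = ask(case["prompt"])  # each call stands in for a fresh conversation
        passed = grade(output, case["grader"])
        writer.writerow([case["prompt"], "pass" if passed else "fail"])
    return buffer.getvalue()

def fake_model(prompt):
    """Stand-in for a model call so the sketch runs offline."""
    return "4" if "2+2" in prompt else "green"

cases = [
    {"prompt": "What is 2+2?", "grader": {"type": "substring", "value": "4"}},
    {"prompt": "Name a color", "grader": {"type": "pattern", "value": r"^(red|blue)$"}},
]
print(run_suite(cases, fake_model))
```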

XLSX Cell-Level Tracked Changes

Slate spreadsheets now support DOCX-style redlining via propose_change({editMode: 'xlsx_cell'}). Changes are anchored by cellRef and sheet name, rendered inline as <del>old</del><ins>new</ins>, with a toolbar for next, previous, accept, and reject. AI and user edits merge cell by cell, with user edits winning on conflict.
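
The merge rule and the inline rendering can be illustrated with a short Python sketch. The data shapes are assumptions: cells are keyed by (sheet name, cellRef) tuples and values are plain strings. The function names are hypothetical and do not reflect Slate's internals.

```python
import html

def render_change(old, new):
    """Render a tracked change in the inline redline form described above."""
    return f"<del>{html.escape(old)}</del><ins>{html.escape(new)}</ins>"

def merge_cells(base, ai_edits, user_edits):
    """Merge edits keyed by (sheet, cellRef); user edits win on conflict."""
    merged = dict(base)
    merged.update(ai_edits)
    merged.update(user_edits)  # applied last, so the user's value overrides the AI's
    return merged

base = {("Sheet1", "A1"): "10", ("Sheet1", "B2"): "x"}
ai_edits = {("Sheet1", "A1"): "12", ("Sheet1", "B2"): "y"}
user_edits = {("Sheet1", "A1"): "11"}

merged = merge_cells(base, ai_edits, user_edits)
print(merged[("Sheet1", "A1")])  # 11
print(render_change(base[("Sheet1", "A1")], merged[("Sheet1", "A1")]))
# <del>10</del><ins>11</ins>
```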

Physics & Structural Analysis

Calculate projectile motion, collisions, energy, momentum, force, impulse, velocity-to-target, beam loading, column buckling, and material properties.
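
Two of the listed calculations, written out in Python as textbook formulas: the level-ground projectile range R = v^2 * sin(2*theta) / g (no air resistance) and the Euler critical buckling load P = pi^2 * E * I / (K * L)^2 for an ideal slender column. The function names are illustrative, and the tool may use different models or inputs.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (common approximation)

def projectile_range(speed, angle_deg, g=G):
    """Horizontal range on level ground, ignoring air resistance."""
    return speed ** 2 * math.sin(2 * math.radians(angle_deg)) / g

def euler_buckling_load(e_modulus, inertia, length, k=1.0):
    """Critical load of an ideal slender column: pi^2 * E * I / (K * L)^2."""
    return math.pi ** 2 * e_modulus * inertia / (k * length) ** 2

# 20 m/s launch at 45 degrees
print(round(projectile_range(20, 45), 2))  # 40.77
```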

Private Sync

Sync your settings and conversations across devices via encrypted cloud backup. See Private Sync for details.

MCP Servers

Connect to remote MCP servers over HTTP/SSE, or run desktop tools (local MCP via the desktop app) on your own machine. Build custom tool integrations or connect to third-party services.

Meeting Recall

Retrieve details from recent video calls — transcripts, action items, and summaries so you can pull meeting context into any conversation.

Messaging Gateway

Answer and send messages across WhatsApp, Telegram, and more, with the agent responding on your behalf. See Messaging Gateway.

More Pro Features

  • Unlimited custom modes — Create your own AI personalities with custom prompts and variables
  • Custom profile variables — Personalize AI behavior across all modes
  • Per-action instructions — Customize how each tool action behaves
  • Caiioo Benchmarks — Compare how models perform with quality evaluations and throughput tests
  • Priority support — Submit support tickets directly from the app

Experimental Tools

Toggle experimental tools via an on-device switch. These rotate as features mature into Free or Pro tiers. Available options include GitHub integration, Slack, advanced spatial reasoning, test automation, and more.

Enable or Disable Tools

Go to Settings > Tools to see what's available and toggle tools on or off. Some modes come with specific tools pre-configured.

This guide is maintained by the Caiioo team using Slate, our built-in editor.