Version 0.9.724 (June 13, 2026)
A fast follow-up to 0.9.723 fixing a launch freeze for members whose subscription had lapsed.
Bug Fixes
- Common: Fixed a freeze on launch after a subscription lapsed: If your Pro subscription expired or was cancelled, the app could get stuck repeating the free-tier downgrade while it loaded — freezing the interface before it finished opening. The downgrade now completes cleanly in a single pass, so the app starts normally and keeps every free-tier feature.
Version 0.9.723 (June 12, 2026)
A targeted follow-up to 0.9.722 with a new BYOK voice provider, a fresh clinician decision-support app, in-app editing for personal Hub apps, the ability to use Caiioo from any browser through your own private relay, a clearer Settings panel, durable mode editing with automatic forking, a substantial second pass through the document round-trip pipeline, a streaming-reliability pass across every AI provider, and a security-hardening sweep.
New Features
- Common: Cartesia is now a BYOK voice provider: Cartesia Sonic for text-to-speech and Cartesia Ink for speech-to-text are now in the per-mode voice picker, the first-use TTS / STT dialogs, and the live-captions path. Add your Cartesia API key in Settings → Voice and pick Cartesia wherever a voice provider can be chosen.
- Common: Edit any forked Hub app inside Caiioo: Settings → Tools, Modes & Apps gains a per-primitive editor for personal apps (forked Hub apps and apps you've saved). Cards, views, workflows, skills, modes, and variables all get dedicated editors with safe defaults, so a malformed primitive in a fork can no longer break the editor. The agent can also snapshot a useful session into a draft personal app via the new save_session_as_app tool: say "save this as an app" (or similar) and it persists into your personal-apps store for you to refine.
- Common: Edit any mode's system prompt; Caiioo forks it for you: You can now edit the system prompt (or the entire definition) of any mode, including built-in and Hub-installed ones. The first edit automatically forks the mode's app into a personal copy, so your changes stick across mode switches and sync to your other devices like any personal app. A "Forked from … — your copy" banner shows whenever your copy is active, with a one-click Reset that restores the original. Existing custom modes migrate into this system automatically.
- Common: Caiioo for Medicine (beta): A new Community Hub-installable app mirroring the Caiioo for Legal pattern — a clinician decision-support factory that bundles differential-diagnosis, drug-interaction, and SOAP-note skills together with their reference materials. One-click install, forkable like any Hub app.
- Common: Use Caiioo from any browser through your own private relay: When you're signed in and your desktop relay is running, opening caiioo.ai in a browser without the extension now serves the same sidepanel UI through your private relay — so you can use Caiioo from a Chromebook, a public computer, or a tablet while every tool call still routes through your own machine.
- Common: Settings panel reorganized: The advanced settings layout is now seven user-intent categories — Account, Personalization & Privacy, AI Setup, Tools / Modes & Apps, Data & Sync, Connectivity, and Help — instead of the historical five. User Profile and Credentials Vault move into a dedicated Account category, LAN Relay / API Access / Messaging Gateway group together as Connectivity (the common thread is inbound/outbound network surfaces), Voice moves into AI Setup (it's AI behavior, not a tool), and Backup / Private Sync / Data Management form their own Data & Sync category. The simple-mode variant collapses the same controls into six categories with power-user controls hidden entirely.
Improvements
- Common: Document round-trip, second fidelity pass: Another sweep through Slate / DOCX / PDF / RTF / Google Docs closed a long list of round-trip findings. Highlights: generated tracked changes now emit Word-valid change ids and flatten nested change markers (so Word stops complaining about "corrupt" tracked changes on open); DOCX comments are deduplicated by id instead of double-counted; DOCX → Markdown export escapes literal "|" characters in table cells so the table doesn't dissolve; the RTF parser now consumes embedded binary segments (\binN) correctly so stray bytes no longer desync the rest of the document, skips \uc Unicode fallbacks properly, decodes \'hh hex via Windows-1252, preserves tracked revisions, and renders image placeholders; PDF ToUnicode CMaps decode surrogate pairs and bfrange array / multi-unit forms (non-BMP characters and emoji extract cleanly); track-changes prefix detection handles indented and nested list markers plus ")" numbering; and the fidelity matrix now surfaces comment loss for docx → md/html/rtf and pdf → docx so you can see in advance what won't round-trip. RTF export also strips pending tracked deletions instead of inlining them as literal text.
- Common: Google Sheets formulas survive Slate sync: Editing a synced Google Sheet from Slate no longer rewrites formulas as their evaluated values. Tracked-change resolution in Slate is also corrected so accept / reject commits the right text when content already matches the target.
- Common: Google Docs sync ignores table-cell paragraphs: The sync scanner previously walked into table cells and produced wrong document offsets. Cells are now skipped before paragraph scanning, so edits land at the correct position in documents that contain tables.
- Common: Pseudonymizer protects machine-generated IDs end-to-end: Slate (and any tool that emits machine-generated identifiers) can now wrap them in a <!--no-pseudonymize--> marker that survives the entire pipeline: the pseudonymizer skips them, and the markers are stripped before they reach the screen. This closes the case where a tool-call's internal id got pseudonymized on one turn and dropped from the substitution map on the next, breaking follow-up tool calls that referenced it.
- Common: Self Checker now runs the judge on the server: Clicking the ⚖ button moves the verdict computation onto the relay instead of running it in the sidepanel, so the result completes even if you close the panel, returns faster, and stays consistent across devices. The judge call is also routed through the same thread-level pseudonymizer as the original turn, so a verdict on a pseudonymized message never leaks the real PII to the judge model.
- Common: Skill / Mode badges in the Settings inventory: Each installed skill in Settings → Tools, Modes & Apps now shows which mode(s) it belongs to, so you can see at a glance where the agent will actually reach for it.
- Common: Free-tier users on the provisioned key no longer 401 on web search: Search tools were resolving the OpenRouter key via a code path that bypassed the credential resolver, so users on the provisioned key (no BYOK) intermittently got 401s. The resolver now runs at every call site.
- Common: UI-context provider is noticeably snappier: The extension UI-context provider gained result caching, lazy mode loading, alias resolution, and tighter exclusions, so the model picker and mode picker don't stall on large workspaces.
- Common: Voice playback-speed slider now works for ElevenLabs and Cartesia: The speed slider had no effect with ElevenLabs or Cartesia — speech always played at normal rate regardless of the setting. Both providers now generate speech at your chosen speed. (ElevenLabs supports 0.7–1.2×, Cartesia 0.6–1.5×; the slider clamps to each provider's range. Resemble and Gemini don't offer a speed control and are unchanged.)
- Common: STT model loading is visible in the composer: When the on-device speech model is downloading or warming up, the composer now shows real download percentage, and concurrent load requests are serialized so two near-simultaneous "speak" actions don't kick off duplicate downloads.
- Common: Composer keeps the send button on-screen on narrow widths: The send button could previously clip out of the composer when the panel was very narrow. It now stays anchored regardless of width.
- Common: Remote sidepanel reconnects cleanly: For the new "Caiioo through a browser via your private relay" capability, the relay now addresses replies back to the remote browser correctly and primes it with an initial state snapshot on connect, so reconnects come up in the same state you left off.
- Common: UI-asset handlers echo request IDs: Responses now echo the originating request id back to the caller, fixing a class of stuck-spinner bugs where two asset fetches raced and the second result was discarded.
- Common: Caiioo for Legal is now just "for Legal": The "Caiioo for" prefix on the Legal app name was redundant once "for Medicine" landed without it. Existing installs continue to work via the legacy id.
- Common: Streaming reliability pass across every AI provider: A deep pass over how replies stream in from every provider. Reply fragments that arrived split across network packets could be dropped silently — breaking multi-turn extended thinking with Anthropic models, reasoning continuity with Gemini and GPT-5, and occasionally losing answer text outright with local MLX models — and non-English text or emoji could arrive corrupted into � characters in saved answers. Both are fixed everywhere. The Stop button now also cancels an in-flight Perplexity search instead of letting it finish (and bill) in the background, stopping a run now reaches any sub-agents still working, and images attached when chatting with Mistral models no longer get mangled in transit.
- Common: Skills now have proper names: Every skill carries an explicit display name, shown in Settings, the composer's skill picker, and to the agent itself — so skills whose prompts open with similar wording no longer collapse into indistinguishable rows. Typing "/" in the composer matches by name first, and publishing to the Hub now requires a name on every skill.
- Common: Hub-app modes keep their full configuration: Modes installed from the Community Hub (for Legal, for Medicine, …) could silently fall back to the general mode deep in the agent core, dropping their variables and tool configuration mid-run. They now resolve correctly everywhere, and a mode whose definition can't be resolved falls back to the standard Caiioo prompt instead of an empty one.
- Common: Provisioned-key (Caiioo-issued) account fixes: Re-issuing your Caiioo-provided AI key no longer wipes purchased credits — the remaining balance carries over to the new key. Revoking a key now actually revokes it with the provider before reporting success. Accounts with unlimited balances now display "Unlimited" instead of a number and no longer trigger automatic credit purchases. And a brief sign-in hiccup while restoring your key on a new device now retries instead of failing.
- Common: Free-tier model picks stay current: The free-tier model list no longer offers free models the provider has retired, which previously produced immediate errors when selected.
- Common: Video generation checks each model's real capabilities first: The video tool now reads each model's live capability sheet — supported aspect ratios, durations, resolutions, and whether it accepts reference frames — directly from the provider and validates your request before submitting, so an unsupported combination fails instantly with a clear message instead of after a long wait. Animating from reference images now sends them in a format every model accepts.
- Common: Pseudonymizer coverage — thread titles and helper calls: With the Pseudonymizer on, auto-generated conversation titles and the small internal AI helper calls now route through the same protection as your messages — on both the client and the relay — so a thread title can no longer carry a real name to the model. Sub-agent answers also display the real values on screen now instead of their substitute names.
- Common: GitHub sync handles non-English content and simultaneous edits: Files synced from GitHub containing accented or non-Latin characters no longer arrive garbled, and pushing a change to a file that moved on GitHub since your last sync now surfaces a conflict instead of silently overwriting the newer copy.
- Common: Transient server errors no longer sign you out: A temporary server error during the daily background sign-in refresh could clear your session and log you out. Only a genuine credential rejection signs you out now; anything transient keeps your session under a grace window.
- Common: Remote browser client boots cleanly: Opening caiioo.ai from a plain browser (through your private relay) no longer shows a long wall of connection errors while signing in, and after an update the UI always loads fresh instead of a stale cached copy.
- macOS, iOS, Android: Cold-start model prewarm: On-device voice and STT models now prewarm during app boot instead of lazy-loading on first use, hiding the multi-second first-call latency. The first tap of the mic now feels close to instant.
- iOS: App Store §3.1.1 sales-routing compliance: All external-payment surfaces (Stripe checkout buttons, "manage billing" links pointing off-app) are gated behind an iOS check, so the iOS client only ever offers in-app StoreKit purchases for digital subscriptions, matching Apple's anti-steering policy.
- iOS, macOS: Stale Safari extension registrations no longer freeze the relay path: When Apple's WebKit leaves multiple Safari extension registrations stale across app updates, Caiioo now evicts the superseded duplicates instead of freezing them dormant, so the relay-backed Safari extension keeps working without a manual reinstall.
- Android: Stale media permissions cleaned up: READ_MEDIA_IMAGES is gone from the manifest, and the legacy permissions implicitly added by the LiteRT GPU library are stripped, so the app's runtime permission prompt is now minimal, closer to what users actually consented to.
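The corrupted-character symptom in the streaming reliability item above is what typically happens when a multi-byte UTF-8 character is split across two network chunks and each chunk is decoded on its own. The sketch below is not Caiioo's code. It is a minimal Python illustration, with hypothetical function names, of per-chunk decoding versus an incremental decoder that carries incomplete bytes forward to the next chunk.

```python
import codecs


def decode_chunks_naively(chunks):
    """Decode every chunk independently; a character split across
    two chunks turns into U+FFFD replacement characters."""
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)


def decode_chunks_incrementally(chunks):
    """Use an incremental decoder that buffers an incomplete byte
    sequence until the rest of it arrives in a later chunk."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    parts = [decoder.decode(chunk) for chunk in chunks]
    # Flush: anything still buffered at end of stream is truly malformed.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


# Illustrative input: the two-byte "é" is cut in half by the chunk boundary.
payload = "café 🙂".encode("utf-8")
chunks = [payload[:4], payload[4:]]
```

With this input, the incremental version returns the original text, while the naive version contains replacement characters where the split character used to be.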
Security
- Common: Hardening sweep across imports, rendering, and logging: Crafted Word documents and conversation-import archives can no longer exhaust memory through decompression tricks (strict size caps and safer document parsing); a maliciously structured ChatGPT export can no longer hang the importer; three cross-site scripting risks in the document (Slate) page are closed; the server now refuses to fetch model-suggested URLs that point at internal or private network addresses; filter rules are rejected if their pattern could lock up the matcher; and a payment-webhook debug log no longer records secrets.
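As an illustration of the "refuses to fetch URLs that point at internal or private network addresses" check, here is a minimal Python sketch using the standard ipaddress module. The function name is hypothetical and this is not Caiioo's implementation. It assumes the caller has already resolved the hostname and passes in the resulting IP address string, because a check against the hostname text alone can be bypassed through DNS.

```python
import ipaddress


def is_internal_address(address: str) -> bool:
    """Return True when a resolved IP address should not be fetched
    on behalf of a model-suggested URL (hypothetical policy)."""
    ip = ipaddress.ip_address(address)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )
```

A real guard would also need to re-check the address after every redirect, since a public URL can redirect to an internal one.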
Bug Fixes
- Common: Voice provider API key appears in the right place: When a cloud provider was used only for speech-to-text (e.g. Cartesia Ink as your dictation engine), its API key field was stranded down in the Text-to-Speech (output) section instead of next to the speech-to-text picker — and stayed visible even after you switched your TTS voice to a different provider. Each provider's key field now renders under the selector that actually uses it (input vs. output), driven by a single voice-provider definition.
- Common: Auto-read now works with Cartesia and Gemini voices: The "read replies aloud" toggle silently never activated when your TTS voice was Cartesia or Google Gemini — both were mis-classified as on-device models awaiting a download. Auto-read now correctly turns on once the provider's API key (and voice, where required) is set.
- Common: Cartesia dictation works everywhere: Cartesia Ink as your speech-to-text engine is now correctly recognized as ready once its API key is set (instead of being treated like a local model awaiting download) — across the extension, native apps, and relay-backed setups (macOS, or the web client through your private relay), where the relay previously tried to load "cartesia" as an on-device model.
- Common: Gemini voice starts playing sooner: Google Gemini TTS reads replies aloud sentence-by-sentence now, so playback starts after the first sentence instead of waiting for the entire reply to be synthesized — matching how ElevenLabs, Cartesia, Resemble, and Kokoro feel. (Gemini's API generates a whole utterance at once, so the chunking is done on our side; the speed slider now applies to Gemini too.)
- Common: Voice playback errors are now shown, not silent: When reading a reply aloud fails (missing/invalid API key, an incompatible Resemble voice/model, a provider rejection, etc.), the reason now appears as an on-screen message instead of failing silently with only a console log. The messages are actionable (e.g. "The selected Resemble.ai voice doesn't support the 'chatterbox-turbo' model — choose a Chatterbox-compatible voice…").
- Common: Clearer Resemble.ai voice/model error: When a selected Resemble voice doesn't support the chosen model, the error now says exactly that and how to fix it, instead of surfacing a raw internal error payload.
- Common: HEIC/HEIF photos from modern iPhones convert again: Attaching a recent iPhone photo (HEIC/HEIF) failed to convert — it errored with "format not supported" and fell back to a plain file attachment instead of a viewable image. The built-in image converter was years out of date and couldn't read photos from current iPhones (full-resolution and HDR shots in particular). It's been replaced with an up-to-date decoder, so HEIC/HEIF attachments turn into JPEGs and display inline again.
- Common: Video and music tools always advertise the live model list: The video and music tools were only ever showing the LLM their three bundled fallback IDs — the per-turn background warm-up that fetched the live OpenRouter video/music registries inside the relay subprocess was racy and routinely missed the deadline before the tool's description was sent off. The cached registry state now primes the video and music model services on subprocess boot, so the tool always shows the current line-up (Kling, Hailuo, Wan, etc.), and a user-selected model resolves without a per-turn network round-trip.
- Common: Test Runner multi-model benchmarks now return meaningful scores: Running a "compare these models" benchmark with runLLMJudge = true used to silently produce zero scores and an input-order ranking, because the suite-level path checked only per-test evaluation settings and ignored the run-wide flag. The gate now honors either signal, and when the judge is on but no rubric was supplied a default rubric (factuality + completeness + clarity + helpfulness) is applied. The test_runner tool's get_result and export_transcript actions also no longer reject with "runId and testId are required" when only one of those was missing; the error now names the actual missing field.
- Common: Voice playback could be completely silent on iOS and Safari: Reading replies aloud sometimes produced no sound at all until the app restarted: the audio engine started in a suspended state and was never woken up. Playback now reliably produces sound.
- Common: Daily reminders fire at the time you set: A repeating reminder created for, say, 9:00 AM could drift and fire at the moment you created it each day instead. Recurrences are now anchored to the scheduled start time.
- macOS, Desktop: Scheduled-task notifications actually arrive: When a scheduled task finished and tried to notify you, the macOS and Windows/Linux desktop apps silently dropped the notification. It now appears as a normal system notification.
- Common: Sign-up failures are no longer silent: If the verification email can't be sent during sign-up, you now get a clear error right away instead of a sign-up that appears to succeed but never delivers the email.
- Common: The composer's "+" attach menu reappears on narrow panels: The earlier fix that kept the send button on-screen at very narrow widths inadvertently clipped the attach menu to nothing. Both now fit.
- Common: Claude model names work again for BYOK Anthropic users: Selecting certain Claude models with your own Anthropic key produced a "model not found" error because of an outdated internal model-name mapping. Model names now pass through to Anthropic as-is.
- Common: A failed step inside a multi-step app workflow now stops the workflow: A nested workflow that failed was reported to its parent as a success, so the workflow's error-handling branch never ran. Failures now propagate correctly.
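The daily-reminder fix above comes down to which timestamp the recurrence is anchored to. The following Python sketch, with a hypothetical function name and naive datetimes (no time-zone or daylight-saving handling), shows one way to compute the next occurrence from the scheduled start time so that the creation time never enters the calculation.

```python
from datetime import datetime, timedelta


def next_daily_occurrence(scheduled_start: datetime, now: datetime) -> datetime:
    """Return the first daily occurrence strictly after `now`,
    always at the time of day given by `scheduled_start`."""
    if now < scheduled_start:
        return scheduled_start
    # Whole days elapsed since the anchor; the next slot is one day past that.
    days_elapsed = (now - scheduled_start).days
    return scheduled_start + timedelta(days=days_elapsed + 1)


# Illustrative values: reminder scheduled for 9:00 AM, checked mid-afternoon.
start = datetime(2026, 6, 1, 9, 0)
checked_at = datetime(2026, 6, 3, 14, 37)
upcoming = next_daily_occurrence(start, checked_at)
```

Because every occurrence is derived from the anchor plus a whole number of days, repeated evaluation cannot drift toward the moment the reminder was created.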
Version 0.9.722 (May 22, 2026)
This is the first release where three long-awaited capabilities — the Pseudonymizer, the Community Hub, and the Messaging Gateway — are available to everyone. All three have been hardened in the tester program for months; the highlights below cover the publicly visible launch as well as the new features and fixes that landed alongside.
New Features
- Common: The Pseudonymizer launches for everyone: Caiioo's on-device personal-data filter detects names, emails, phone numbers, addresses, IDs, organizations, cities, and other sensitive values in what you type and swaps them for realistic fakes before any of it reaches the model, then reverses the swap locally on the way back, so the conversation on your screen still shows the real values. The model never sees the real values; tool calls run on the real values after restoration. Turn it on with the new shield icon next to the send button: gray for off, blue for the Personal Data Filter, teal for PHI / Limited Data Set, emerald for PHI / Safe Harbor. While on, a thin ring colored to match the active mode wraps the composer as an ambient reminder, and a "🛡 pseudonymized — N substitutions" chip appears under each message that had values swapped; click it to see exactly which real → fake pairs were used, with category labels. Includes a strict PHI / Safe Harbor submode (HIPAA §164.514(b)(2): also strips dates beyond year, full geographic addresses, and ages over 89) for cases where you need a de-identified dataset you can share without a Data Use Agreement. (The feature was previously called "Anonymizer." We renamed it to "Pseudonymizer" because that's the technically correct word: GDPR Art. 4(5) defines "pseudonymisation" as processing that can be undone with separately held information, while anonymization (GDPR Recital 26) means a one-way, irreversible transformation, and this one is reversible by design: your screen still shows real names.)
- Common: The Community Hub launches: A new one-click marketplace for installing the tools, modes, MCP servers, and complete apps that extend Caiioo. Browse it from inside the extension or from caiioo.ai/hub. The launch catalog includes Slack (sign in once with the Caiioo Slack app — no manual app config or developer mode), Notion, Linear, GitHub, Atlassian, PandaDoc, Microsoft 365, Legal Data Hunter (18M+ case-law, legislation, and doctrine documents across 110+ countries), and 18 additional self-registering MCP servers that authorize themselves through their own provider's sign-in — no copy-paste of secrets or developer-mode setup required from you. Each package shows a preview of exactly what it installs — tools, modes, skills, MCP servers, and variables — before you click Install.
- Common: Tool Network Access — explicit consent when a tool would leave your machine: When you pick a local chat provider (Ollama, MLX) you're making an intentional privacy choice — your conversation stays on the device. Tools that route through a remote provider (image generation, music, video, Perplexity web search) used to cross that line silently. Caiioo now stops on the first attempt, renders an Approve / Cancel card inline that names the tool and the provider it would send to, and remembers your decision per provider. Revoke at any time from Settings → Personalization & Privacy → Tool Network Access.
- Common: The Messaging Gateway launches (Pro): Caiioo can now answer messages on the platforms your customers and contacts actually use — WhatsApp Business, Messenger, Telegram, iMessage, Signal, Viber, and Beeper — through a single configurable gateway in Settings → Messaging. Each channel uses its provider's standard bot / business credentials, entered once per service. For every conversation you pick how the agent shows up: Auto (agent answers everyone; anything you type in the same thread is treated as coaching that shapes the next reply), Direct (agent still auto-answers, but anything you type goes through to the caller as you), or Takeover (agent steps back entirely and you take over). Image, audio, and video attachments flow through to the model in both directions, so you can ask "what's in this photo the customer just sent?" and answer in the same channel. Slack lives in the Community Hub as its own MCP package — install it from there rather than the Messaging Gateway.
- Common: My Day (beta): A second Hub-installable app: a one-click morning briefing that pulls today's calendar, unread emails, and recent documents into a single dashboard. The composer button runs the brief; the agent renders into the same dashboard view every time, with follow-up skills for expanding any item, drafting a reply, or scheduling deep-work blocks.
- Common: Storybook Builder (beta): Another Hub-installable app: pick a style, audience, age range, and a freeform brief; the agent designs a cast, draws reference portraits to keep characters consistent, generates each page with prose and illustration, derives a cover from the best page, and renders the finished book into a readable scroll-layout viewer. Forkable like any Hub app, so you can customize the master prompt or swap the style options.
- Common: Free plan now includes the desktop apps: The macOS and Windows / Linux desktop apps used to require Pro. Free users can now sign in to the desktop apps on every platform — the platform itself is free, and Pro still differentiates on Pro-only capabilities (image generation, remote MCP servers, scheduled tasks, etc.).
- Common: Pro Mobile retired — one $9 Pro tier across web, desktop, and mobile: The $2.99 Pro Mobile in-app purchase (iOS and Android) is gone. The single $9/month Pro tier now unlocks every Caiioo client on every platform — Chrome, Edge, macOS, Windows, Linux, iOS, Android — with no separate mobile SKU and no cross-rail entitlement gymnastics. Existing Pro Mobile subscribers are grandfathered to Pro at no extra cost and keep every capability they had. New mobile installs see only the $9 Pro tier on the in-app paywall, matching the website.
- Common: Caiioo for Legal (beta): A new one-click install from the Community Hub activates fifteen cross-practice legal skills — contract markup, redline drafting, transactional drafting, memos & opinions, advocacy, batch playbook review, and more — backed by two comprehensive negotiation playbooks (buy-side and sell-side) with Preferred / Fall-back / Walk-away tiers across dozens of common clauses. Variables stay scoped to the thread you're working in, so each matter keeps its own client, counterparty, jurisdiction, and posture without bleeding into the next.
- Common: Self Checker — judge any answer with the new ⚖ button: Every assistant turn now has a ⚖ button in the action bar. Click it to score that answer against your request — the judge sees the full turn (your prompt, every tool call's inputs and outputs, attached images, and the assistant's reply), authors deterministic checks (exact match, contains, regex, number range, arithmetic), runs them, and renders a verdict card inline. Pick any provider you have a key for; the judge's LLM cost rolls into the conversation's running total so there are no hidden charges.
- Common: Test Runner — try Caiioo on your own list of examples (Pro): Hand Caiioo a list of prompts and a way to grade each answer — substrings that should appear, a pattern the response should match, specific tools that should get used, or a second model that scores the answer 1-10 across criteria you define. Caiioo runs each prompt in its own fresh conversation, captures the assistant's reply, what tools it called, how long it took, and how much it cost, then renders a pass/fail report you can export as a CSV. Useful for spot-checking that a new mode, model, or installed Hub app still behaves the way you expect.
- Common: Hub apps can ship ready-made reference materials: A Community Hub package can now bundle its own Slate templates, PDFs, and other reference files. Installing Caiioo for Legal, for example, drops the two negotiation playbooks straight into your library so the agent can mark up your contracts against them on the very first turn.
- Common: My Apps — fork any Hub app and edit it as your own: A new My Apps panel in Settings → Tools, Modes & Apps lets you fork any Community Hub app into a personal copy and edit any of its primitives — tools, skills, modes, cards, views, workflows, template attachments, and variables — with per-primitive editors. The agent can also snapshot a useful conversation into a draft personal app for you to refine.
- Common: Hub apps now sync across devices: Install a Community Hub package on your Mac and it shows up on your iPhone, and vice versa. Personal apps (apps you've forked or saved as your own) sync too. Per-package vector clocks mean no install ever overwrites a newer install from another device.
- Common: Hub Settings now lists every app primitive in its own section: Tools, Modes & Apps grew four new sections — Cards, Views, Workflows, and Template Attachments — alongside the existing Tool Configuration and Agent Modes. Each is a read-only inventory of what your installed apps actually shipped, with source attribution so you can see which Hub package brought in which skill, mode, or card.
- Common: Skills are now visible to the model: Previously skills were UI-only — clicking a skill chip pasted text into your message but the model itself had no awareness of them. Each mode now injects its available skills (name, description, prompt body) into the system prompt, so phrases like "use your contract analysis skill on this PDF" actually work.
- Common: Sub-agent cards render inline in the main chat: When a sub-agent emits a card (a Self Checker verdict, a generated chart, a structured result view), the card is reparented up to the parent conversation and rendered inline next to the sub-agent's text result — same as how sub-agent attachments already work.
- Common: Google Sheets — 14 new actions: Paste data (CSV / TSV / HTML), split text to columns, trim whitespace, remove duplicates, apply or clear toolbar filters with criteria and sort rules, move rows or columns, insert and delete cell ranges, protect ranges with editor permissions, define and update named ranges, attach developer metadata, fine-tune conditional formatting, and use the modern ColorStyle palette — all without leaving the chat.
- Common: Google Docs gets real comments, multi-tab support, and smart-chip awareness: Add, reply to, resolve, and delete native Docs comments that show up in the Docs UI for everyone on the document. Multi-tab documents now work correctly — the agent reads from and writes to the right tab instead of mashing every tab into one position space. Smart chips (people, links, equations, page breaks, date chips) are now recognized so search and edit operations land on the right character. Concurrent edits now fail loudly with a clear error instead of silently clobbering each other.
- macOS: Voice playback (Kokoro TTS) starts within a second: On-device voice was failing silently on macOS because the model wouldn't load inside the WebView. Voice now runs through the desktop app's helper process and streams sentence-by-sentence, so you hear the first sentence within about a second of clicking play, even on a busy machine.
- Web: Hub packages install via a caiioo:// link: Clicking Install on caiioo.ai now routes directly to whichever client you actually have (Chrome extension or native app) instead of firing both at once. If you have both installed, you'll see a picker. The Community Hub install modal also breaks open each package's payload (tools, modes, skills, MCP servers, and variables) so you can see exactly what gets installed before you click Install.
- Desktop: Linux AppImage registers the caiioo:// URL scheme: Linux users on the AppImage build can now install Hub apps from caiioo.ai with one click: the URL scheme registers on first launch without any system package install.
Improvements
- Common: Slate stability and round-trip fidelity overhaul: Closed roughly 40 individual bugs across the rich-text editor, the diff engine, version history, and Word import/export. Tracked changes no longer corrupt when an AI proposal lands on a document you've edited since; version-history snapshots are now true frozen copies; revisions resolve correctly when the content matches the target instead of stalling; AI proposals merge with existing redlines instead of overwriting other authors' edits; accepting or rejecting changes in a Word file persists to storage; rich-text exports handle emoji and other supplementary-plane characters; and the diff engine no longer confuses deletions and modifications when their text matches.
- Common: Slate — accept or reject every tracked change inside a selection: New ✓ Sel / ✗ Sel buttons in the Slate review toolbar mirror Word's "select a paragraph, accept all changes inside" behavior — highlight a region and one click resolves every tracked change that overlaps it. The diff engine also coalesces adjacent edits separated only by whitespace or punctuation into a single accept-or-reject unit, so reviewing an AI rewrite isn't a hundred individual clicks.
- Common: Word document import / export fidelity: A roughly 110-bug pass through the .docx parser fixed character formatting that ignored explicit "off" toggles, broken style inheritance, missing theme-color resolution, lost paragraph indentation, mishandled superscript / subscript / hidden text, wrong list numbering after headings, missing tab and line-break separators in extracted text, embedded images that weren't being extracted at all, and round-trip loss of embedded image references on export. Markdown export now uses CommonMark-correct list indentation.
- Common: PDF round-trip and rendering audit: Following an audit, the PDF pipeline now preserves inline images, hex strings, and letter spacing on round-trip; resolves inherited page resources correctly; preserves transparency masks, decode, intent, and interpolation flags on image replacement; rescues special characters (Euro symbol, smart quotes, trademark) that Windows fonts can't natively encode; correctly handles emoji and other supplementary-plane characters in PDF-embedded fonts; and surfaces OCR errors instead of swallowing them. The Slate PDF viewer also opens noticeably faster on multi-page PDFs by rendering pages lazily, and it no longer fetches a large PDF repeatedly when the file is first opened.
- Common: PDF export preserves unencodable characters instead of failing: Special characters the chosen font can't encode are now passed through unchanged from the source PDF instead of aborting the entire export.
- Common: Pseudonymizer accuracy and coverage upgrades: The personal-data detector model was retrained with a locale-aware pipeline and now handles Chinese and a wider range of non-Latin scripts substantially better. Name spans now extend correctly across script boundaries — middle initials, leading honorifics, and contiguous Chinese, Japanese, Korean, and Arabic runs — so half-name leaks are closed. A new safety-net second pass catches misses before they're sent. Fragmented same-label spans are coalesced before substitution. Multi-language city and company-name detection is now in production. The calculator tool's numeric output is no longer mis-classified as a name.
- Common: Pseudonymizer hallucination inspector: Scans the assistant's reply for fake-shaped names that aren't in your session's substitution map — a hit means the model probably made up a name. Pairs with the existing leak inspector to give a complete view of what the model said about identity.
- Common: Pseudonymizer multilingual fakes: City names, company names, and personal names now generate locale-appropriate substitutes — a Spanish prompt gets Spanish-looking fakes, a Japanese prompt gets Japanese-looking fakes, and so on.
- Common: Pseudonymizer 30+ smaller correctness fixes: A multi-round audit cleared dozens of low, medium, and high-severity findings — script-coverage gaps, debug-log noise, leaks across sessions, URL trimming bugs, policy edge cases, audit-log privacy, restore robustness, Unicode handling in the user dictionary, structural-PII leaks under self-only mode, and more.
- Common: Telegram messages render with proper formatting: Messages sent through the Telegram bridge are now formatted using Telegram's native bold, italic, code, and link styling instead of showing raw asterisks, backticks, and broken "text (url)" syntax. Inbound images from Telegram also flow correctly to vision-capable models, and pre-formatted messages from the agent aren't re-formatted by the bridge.
- Common: Calendar sync covers every calendar in your account: Background sync iterated only your primary Google calendar; team and family calendars were silently absent. Every visible calendar is now synced. Event pagination is properly followed (so events past the first page no longer go missing), event timezones are preserved instead of being normalized to UTC, all-day events anchor correctly, and the Apple Calendar handling now routes Apple Reminders too.
- Common: Calendar / Agenda tool correctness: Closed a cluster of 19 bugs across calendar create / update / delete / list — most importantly, all-day event dates are now derived in your local timezone instead of UTC, so an event you set for Friday no longer lands on Thursday in eastern timezones.
- Common: Gmail tool reliability sweep: Fixed five bugs that were silently producing wrong-account results, broken reply threading, mangled "Doe, John" style recipient names, and hidden authentication failures. Replies now thread correctly in both Gmail and external mail clients (Outlook, Apple Mail, Thunderbird). Draft updates preserve the original conversation thread.
- Common: Gmail search by sender, category, age, attachment, and unread state: Asking the agent to find "unread emails from Bob from the last week with attachments" used to depend on the model remembering Gmail's exact search-operator syntax — and it would re-issue the same logical query with different wording until something worked. The Gmail tool now exposes first-class filter parameters (from, subject, label, hasAttachment, isUnread, isImportant, category, newerThan / olderThan like 7d / 1m / 1y, and after / before dates), so the agent picks the right filter on the first try.
- Common: Google Drive tool reliability sweep: Twelve fixes including refusing to read binary files as text, supporting shared-drive folder paths, surfacing pagination for large folder listings, mapping Slides to PPTX exports, removing the broken "owner" role from share options, and fixing destination-folder filtering on moves. The Drive transfer cache now expires public links after 24 hours instead of leaving them permanently public if cleanup fails.
- Common: Google Drive — full folder paths, shared drives, export, owner transfer, link discovery: Drive operations now accept human-readable folder paths (Engineering/Specs/Q3) instead of only opaque folder IDs, walk into Shared Drives as first-class destinations, export Docs / Sheets / Slides to specific MIME types (PDF, DOCX, XLSX, PPTX), transfer file ownership between users, and surface anyone-with-link and public links so the agent can answer "what's the shareable URL for this file?" without you copying it out manually.
- Common: Google Slides text edits land in the right place: The Slides tool used a 999999 magic number for "end of text," which the API rejected. It now looks up the actual text length and constructs proper ranges so partial-index edits (e.g. "style from character 5 onward") work as intended.
- Common: Google Sheets — 30+ smaller correctness fixes: Range parsing handles quoted sheet names with inner punctuation, unbounded references (A:A, 1:10), and columns past Z; image uploads write a real =IMAGE() formula instead of erasing the cell; HTML import decodes named, decimal, and hex entities including astral-plane characters; charts no longer crash the sheet info reader; search reports absolute column letters and surfaces per-sheet errors. Sheet diffs now emit both adds and deletes correctly.
- Common: Variables dialog now works for any Hub app: The "open variables" composer button used to be hardcoded for one specific app. It's now driven from the Hub manifest, so any app that ships a variables dialog gets its own button and label.
- Common: Personal apps surface across every reader: Personal apps (apps you've forked or saved) now contribute their skills, modes, MCP servers, tools, views, and variables through every place the agent reads them — not just the composer. They're a first-class app now, identical to Hub-installed apps.
- Common: Hub uninstall actually removes everything: Uninstalling a Hub app now also removes the package's modes, composer buttons, skills, template attachments, mode-variable patches, the cached system-disable list, and (if you were on it) the package's active mode. No more orphan modes lingering in the picker after uninstall.
- Common: Hub install warns about overlaps: When you install a package whose skills, modes, MCPs, tools, or views would duplicate something you already have, a toast surfaces the conflicts at install time so you can decide what to do, and per-row duplication indicators stay visible in the Hub install list and in your Settings inventory.
- Common: Per-thread variable overlays for matter-scoped work: Mode variables used to be sticky across every conversation, which is wrong for matter-scoped work (each legal matter, each client engagement, each project is a different context). Each thread can now carry its own variable overlay so the agent works on Matter A in one thread and Matter B in the next without mixing them up.
- Common: Hub install no longer pops a tab cascade for every required sign-in: Installing a multi-provider Hub app used to fire one sign-in tab per provider in sequence. Installs now complete fast and prompt for each remaining sign-in on demand, one at a time, instead of stacking tabs.
- Common: Cloud sync robustness: Team-sync key derivation now uses your organization ID and passphrase instead of your personal email, so every member of the same team derives the same key and can actually decrypt each other's items (this was previously broken). Sync timers, vector clocks, and manifest locking were also tightened to prevent overlapping syncs from corrupting state.
- Common: "Use Caiioo's Account" button works for re-issued provisioned keys: The Settings button used to do nothing if your provisioned OpenRouter row had been deactivated by a previous switch to BYOK or a decrypt failure. It now mints a fresh row when the server says you don't have one, so the button always restores a working key.
- Common: OpenRouter key field stays in sync with Settings: The API key input now refreshes when the parent component pushes a new value (e.g. after clicking "Use Caiioo's Account") instead of holding onto the value it had when the page first rendered.
- Common: Composer buttons reload when mode settings change: Composer action buttons (Variables, etc.) now refresh immediately when a mode's settings change, instead of needing a chat reload.
- Common: Agent tools see your latest edits before they read a Slate: If you're still typing when an AI tool fires (Slate update, revision, tracked-changes resolve), the tool now waits for in-flight editor content to flush to storage so it operates on what you can actually see — not a stale snapshot from a second ago.
- Common: Inline cards size themselves to their content: The Self Checker verdict card and other inline cards now grow to fit their actual content instead of reserving a fixed slot, and pick up the parent app's theme (light or dark) instead of forcing a white background.
- Common: Floating action buttons default to the top-left corner: The floating ⚖ and 🛡 buttons (and any custom floating buttons) now land in the top-left of the composer by default instead of obstructing the send button on the right. You can still drag them anywhere.
- Common: User profile and mode-variable updates reject unknown fields: The agent could previously invent variable names like _clientName that silently went nowhere. Updates now require the key to exist in the schema and return a clear error otherwise.
- Common: Onboarding welcome screen simplified: First-run now shows three clear options — Free, Subscribe to Pro, or Bring Your Own API key with a trial — instead of the previous longer onboarding form. The first time you open the Composer, Settings, or Slate, a short interactive tour runs to point out the relevant controls.
- Common: Sign-in is one step: Caiioo used to require both a verified identity AND an active license check before letting you in. That redundant license probe is gone — once you've signed in, you're in. License state still gates Pro-only capabilities (image generation, etc.); it just no longer gates opening the app.
- Common: Settings search reveals advanced sections: Searching for a setting that lives under an "advanced" twist-down used to silently return no results because the section was collapsed. The search bar now reveals matching collapsed sections, and a new "Collapse all" button reverses it in one click.
- Common: Friendlier provider catalog: Newly released OpenRouter free-tier models are now prioritized in the model picker, and deprecated models are soft-removed instead of cluttering the list.
- Common: Host-language detection on every native platform: The macOS, iOS, Android, and Tauri shells now detect your OS interface language at launch and pass it through to the sidepanel, so first-run translations land in the right language without you having to set it manually. About 10,000 additional translation strings landed across 23 non-English locales.
- macOS: Single-instance enforcement: A second copy of the macOS app can no longer launch from a caiioo:// link when one is already running, even when macOS Launch Services has two registered copies (typically a leftover from a prior install).
- macOS: MCP servers installed via Homebrew now launch: macOS 15+ blocks notarized apps from running binaries that carry the "provenance" attribute, which Homebrew adds to everything it installs. Caiioo now auto-clears that attribute on permission-denied spawn, so local MCP servers (filesystem, memory, fetch, etc.) installed via Homebrew start working without any manual cleanup.
- macOS, iOS: Hub install via caiioo:// is reliable on cold launch: Tapping a caiioo:// install link on a freshly launched app could race the WebView's first load and silently drop the install intent. The intent is now persisted across the cold-launch race so the install completes once the app is ready.
- iOS: Apple's new "write-only" calendar permission is respected: iOS 17 introduced a third Calendar / Reminders permission — "write-only" — that lets Caiioo create events and reminders without seeing your existing ones. Caiioo previously treated this as "denied"; it now uses the write-only access correctly, so users who grant only that level can still ask the agent to schedule things.
- iOS: Clearer message when an in-app purchase can't be verified: Failed purchase verifications now surface a specific reason instead of a generic error, and the transaction is no longer silently finished, satisfying Apple's StoreKit guidance.
- Android: Saving a file no longer freezes the app: Writing a large file through the Android save-file picker used to run on the UI thread and could freeze the app for several seconds on slow storage. Writes now happen off the UI thread.
- Android: Streaming network responses deliver headers before chunks: Long-running streaming calls now deliver headers to the caller before any body chunks arrive, fixing a category of intermittent stream failures.
- Android: On-device transcription faster and more memory-efficient: The Whisper / Moonshine audio capture path now uses a primitive float buffer instead of a boxed list, reducing both memory and CPU. Loading, unloading, transcribing, and clearing the model cache also serialize through a shared lock now, so the app no longer occasionally crashes if you switch models mid-transcription.
- Desktop: Reliability and security audit of the Windows / Linux shell: A full pass through the Tauri shell — capability scope narrowed to the trusted sidepanel only (so an arbitrary visited page can't invoke privileged commands), HTML-escaping hardened on the caiioo:// Hub-install bridge, a graceful "Node.js not found" message instead of a silent crash, and a handful of robustness fixes.
- Desktop, macOS: On-device speech and voice model downloads now work: The desktop apps' WebView Content Security Policy was blocking downloads from huggingface.co, so the Kokoro voice and Whisper speech-to-text models couldn't load on first use. The policy now allows huggingface.co.
- Extension: Settings → Tools, Modes & Apps renamed and reorganized: The old "Tools & Capabilities" category is now "Tools, Modes & Apps" with a new "Connectivity" category split out from Advanced.
- Web: Pricing copy refined across 23 languages: An editorial pass on the website's Free / Pro feature bullets brought the non-English locales into line with the latest English source. Legal document "last updated" dates were refreshed.
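The all-day date fix described in the Calendar / Agenda item above can be illustrated with a small sketch. This is not Caiioo's implementation: the function names are hypothetical, and a fixed UTC+9 offset stands in for a user's eastern-hemisphere timezone. It shows why deriving the calendar date after converting local midnight to UTC moves a Friday event to Thursday.

```python
from datetime import date, datetime, timedelta, timezone

# Assumption for illustration: the user's local zone is a fixed UTC+9 offset.
LOCAL_TZ = timezone(timedelta(hours=9))


def all_day_date_via_utc(local_midnight: datetime) -> date:
    """Behavior described as the bug: convert to UTC first, then take the date."""
    return local_midnight.astimezone(timezone.utc).date()


def all_day_date_local(local_midnight: datetime) -> date:
    """Behavior described as the fix: take the date in the user's own timezone."""
    return local_midnight.date()


# The user picks Friday, June 12, 2026; an all-day event starts at local midnight.
picked = datetime(2026, 6, 12, 0, 0, tzinfo=LOCAL_TZ)

utc_date = all_day_date_via_utc(picked)   # local midnight is 15:00 UTC on the previous day
local_date = all_day_date_local(picked)   # stays on the day the user picked
```

Any zone ahead of UTC shows the same shift, because its local midnight falls on the previous UTC day.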
Security
- Common: Sign-in hardening: Multiple findings closed from an audit of the sign-in and credential-vault pipeline. Google ID tokens are now only accepted if they were issued for Caiioo's own client IDs (closing a hole where any Google OAuth client's token could sign someone in). Apple sign-in now validates the nonce returned by Apple against the one Caiioo generated. Google and Apple sign-in refuse to auto-link an external identity onto an unverified-email account (closing a "shadow signup" takeover). The refresh-token endpoint now rejects revoked tokens, so a logged-out token can't be exchanged for a fresh one. The OAuth-token issuance endpoint now requires a live bearer token. The OAuth callback page itself was hardened against several script-injection vectors.
- Common: Per-profile isolation for all settings: The settings layer was sharing a single bucket across multiple sign-ins on the same install in certain code paths, which could leak a credential entered under one account into another account's view. Every settings read and write is now strictly per-profile, legacy shared buckets are wiped on first launch after upgrade, and cloud sync refuses to upload or download those legacy buckets.
- Extension: Local-bridge auto-discovery is now opt-in: The Chrome extension previously connected unconditionally to any Caiioo desktop app it found on localhost. It now only does so when you've explicitly turned on local-bridge access in Settings, so a desktop app installed by another user on a shared machine can't be silently bridged to your extension session.
- macOS, iOS: Hardened caiioo:// install bridge against injection: The hand-rolled string escaping on the JavaScript that processes caiioo://hub/install/<id> URLs only escaped single quotes — a maliciously crafted package ID could break out of the string literal and run arbitrary script in the WebView. Replaced with full JSON escaping on both platforms.
- iOS, macOS, Extension: Google sign-in flows now use the verified Caiioo OAuth app: Connecting Google for Private Sync, Calendar, Gmail, Drive, and the other Workspace tools used to show the "This app isn't verified" warning on iOS, macOS, and the Chrome / Edge extension because those platforms were still authenticating against an older, unverified Google Cloud project. Every platform now uses Caiioo's verified Google project end-to-end, so you see the proper Caiioo branding and the verified-app green check on the Google consent screen instead of the warning. One-time re-login: existing users on iOS and macOS will be signed out automatically and asked to sign in again the first time they open the app after this update — the previous sign-in tokens were issued by the older Google project and cannot be carried over.
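The caiioo:// install-bridge item above describes a class of bug that a short sketch can make concrete. The code below is an illustration only, not the shipped bridge: both helper names and the sample package ID are hypothetical. It contrasts escaping only single quotes with serializing the value as JSON before embedding it in a script string.

```python
import json


def naive_js_literal(value: str) -> str:
    """Escapes only single quotes, as the item describes the old behavior."""
    return "'" + value.replace("'", "\\'") + "'"


def json_js_literal(value: str) -> str:
    """Serializes the value as a JSON string, which also escapes backslashes,
    double quotes, and control characters."""
    return json.dumps(value)


# Hypothetical crafted package ID: a backslash placed directly before a quote.
payload = "x\\';alert(1)//"

naive = naive_js_literal(payload)
# The inserted backslash is itself escaped by the attacker's backslash, so the
# quote that follows closes the literal early and the rest would run as script.

safe = json_js_literal(payload)
# The backslash is doubled and the value sits inside double quotes, so the
# single quote is ordinary data and the literal ends only at the final quote.
```

JSON serialization is the approach the note names. If the resulting script were embedded in HTML, additional escaping for sequences such as a closing script tag would still be needed.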
Bug Fixes
- macOS: "Browser not connected" warning clears when Safari connects: Connecting the Safari extension to the Mac app left a stale warning saying browser control was unavailable — and the warning only named Chrome, even though Safari, Edge, and Firefox all work. The app now counts every connected browser (Safari connects over a different channel than the others, which is why it was being missed), so the warning clears the moment any browser connects, and its wording no longer singles out Chrome.
- Common: Newer image models you pick are actually usable: The image-model picker showed every model your account can generate with — including newly released ones like Recraft — but choosing one could fail with "Unknown image model" because the generator was checking your selection against a stale built-in list instead of the live one. The generator now refreshes the live model list before deciding, so anything visible in the picker works. The default model also changed from FLUX.2 [pro] to the faster, cheaper FLUX.2 [flex], which is a better all-round default.
- Common: Tracked changes inside Word documents now accept and reject correctly: Clicking accept or reject on a tracked change inside a .docx-backed Slate used the document position as the change ID, which never matched the real stored change ID — so the change visually flipped state but never persisted. Fixed.
- Common: Slate handles corrupt template installs: Hub-installed Slate templates that shipped without the proper wrapper used to hang the viewer on "Loading artifact" with no recovery. Templates now self-heal on load and the viewer surfaces a clear error if a template is genuinely corrupt.
- Common: Self Checker no longer narrates the verdict twice: When the Self Checker rendered its verdict card, the agent was also describing the scores in reply text — two copies of the same verdict on the same screen. Suppressed the prose recap when the card renders.
- Common: Self Checker reads image attachments correctly: The judge was decoding image attachments as text and feeding the resulting garbage into the evaluation. Image attachments are now forwarded to the judge as actual images, so any turn whose correctness depends on what's in an image scores correctly.
- Common: Self Checker shows only the latest verdict: Each ⚖ click was appending a fresh verdict card without removing the previous one, so the chat ended up with a stack of duplicates. Each turn now shows only the most recent verdict, with an × to clear it.
- Common: Reminders sync dialog renders Outlook events correctly: The internal "microsoft_calendar" source value was leaking into the platform-detect path in the reminders modal. The modal now falls back to its inference path so events render correctly regardless of where they came from.
- Common: Large Gmail attachments flow through reliably: Large Gmail attachments (a 25 MB file is roughly 33 MB after base64 encoding) were exceeding the inter-process message ceiling and silently dropping. The runtime now falls back to a temp file for oversized payloads so attachments make it through.
- Common: PandaDoc MCP install works again: PandaDoc moved their MCP endpoint to a new path. Caiioo's catalog entry now points at the new endpoint.
- Common: Workflow render steps no longer mis-resolve string outputs as attachments: Forwarding a string output from one workflow step into a View step used to fail because every string was treated as an attachment ID. Strings forwarded by reference are now distinguished from literal attachment IDs.
- Common: Subscription-status check no longer hangs in browser-only environments: The agenda tool was probing the desktop bridge on every cache-miss in environments that don't have one, throwing a "Failed to fetch" error every time. The probe is now gated on whether a desktop bridge is actually reachable.
- Common: Background process for the agent now exits cleanly: A change in a previous release left the agent's background process still running after it finished its work, blocking the things that run after a turn — most visibly, the automatic conversation title. The process now exits cleanly so title generation and other post-turn steps fire as expected.
- Common: Lab and admin-only modes hidden from the mode picker: Two paths were still leaking the internal "lab" mode (and any other mode marked admin-only) into the user-facing mode picker. Fixed.
- Common: Duplicate Hub installs deduplicated on save: A race in the install pipeline could write two entries for the same package. New saves dedupe by package slug, and existing duplicates are healed on next read.
- Common: Custom mode IDs no longer collide with Hub installs: The storage key is now authoritative for a custom mode's ID, eliminating a class of collisions when a Hub-installed mode landed on a key that already had a user-edited copy.
- Common: Sub-agent attachments and cards reach the main chat: A sub-agent that generated an image, a chart, a Self Checker verdict, or any other attachment used to drop the result against an invisible sub-agent thread — and the parent conversation would sometimes invent a URL to fill the gap. Sub-agent attachments and cards now reparent to the main thread correctly across the agent runner's mid-run cleanup, so what the sub-agent produced shows up where you sent the request.
- Common: PDF reading on lazy-rendered pages: The "view original text" pop-up for a PDF chunk now works on pages that hadn't been rendered yet when the chunk was created.
- macOS: Slack, Notion, Linear and other Hub OAuth sign-ins now complete in the native app: Hub MCP servers that use HTTPS-only OAuth (Slack, Notion, Linear, etc.) couldn't redirect back to the local relay because their providers reject http:// callbacks. The macOS app now routes those flows through the Caiioo cloud relay's /oauth/callback, which then bounces the authorization code back to the local relay over the existing per-user channel — so sign-in for these providers from inside the native app now lands the same way it does in the extension. Includes a one-time migration that fixes existing installs whose stored profile wasn't yet wired to the local relay's identity provider.
- Common: "Ask the user" doesn't hang in sub-agent runs: An agent that called ask_user from inside a sub-process was hitting a shared in-memory singleton that wasn't reachable across processes, so the question never surfaced and the run stalled until you cancelled it. The collaboration controller is now per-thread, so the question shows up in the chat the way it does in the main agent loop.
- Common: Native macOS Calendar / Notes / Reminders helpers refreshed: Updated the helper binaries against current macOS SDKs to clear an issue where reminders sync intermittently saw the wrong items in 0.9.721.
- Extension: Self Checker verdict cards now render inside the extension: Inline scripts inside card templates were blocked by the extension's content-security policy. Cards now route through the extension's sandbox page so they render correctly.
- Common: Text shows up when you turn an SVG drawing into an image: Rendering an SVG — a logo, diagram, or chart — to an image used to drop all of its text (wordmarks, labels, captions) in the Chrome extension and the macOS app, leaving blank gaps where the words should be; only the mobile apps rendered them. Caiioo now ships fallback fonts (sans-serif, serif, and monospace, including bold), so text appears exactly as drawn — including symbols like &, ™, and accented characters — and any typeface the drawing asks for that isn't available falls back to a clean sans-serif instead of vanishing.
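The size figure in the large Gmail attachments item above follows from base64 turning every 3 input bytes into 4 output characters. The sketch below works through that arithmetic. The needs_temp_file helper and its ceiling parameter are hypothetical, since the notes do not state the actual inter-process limit.

```python
import base64

MIB = 1024 * 1024


def base64_encoded_size(raw_bytes: int) -> int:
    """Length of padded base64 output: 4 characters per 3-byte group, rounded up."""
    return (raw_bytes + 2) // 3 * 4


def needs_temp_file(raw_bytes: int, message_ceiling_bytes: int) -> bool:
    """Hypothetical check: spill to a temp file when the encoded payload exceeds the ceiling."""
    return base64_encoded_size(raw_bytes) > message_ceiling_bytes


# Sanity check of the formula against the standard library on a small input.
assert base64_encoded_size(10) == len(base64.b64encode(bytes(10)))

encoded_bytes = base64_encoded_size(25 * MIB)
encoded_mib = encoded_bytes / MIB  # roughly a third larger than the original 25
```

The 4/3 growth applies before any message framing, so a ceiling check has to be made on the encoded size, not the file size.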
Version 0.9.721 (May 15, 2026)
New Features
- Common: Pricing collapsed to Free and Pro, with a $5/mo OpenRouter credit bonus for Pro: We retired the separate Platform and Pro Mobile tiers and folded every Platform-only feature (Apple Calendar / Reminders / Notes, encrypted private relay, API access preview, native voice dictation, browser UI) into Pro. The website, account page, and in-extension pricing panel are now a clean two-card Free / Pro layout. As part of the change, every paid Pro month (Stripe or Apple) now also tops up your provisioned OpenRouter key with $5 of credit on top of your existing balance. Existing Platform subscribers are grandfathered into Pro at no extra cost and keep every capability they had. The OpenRouter credit-purchase floor was also lowered: end users no longer see the legacy $0.80 minimum-fee surcharge, just the 5.5% pass-through fee.
- Common: Slack is now a one-click Community Hub install: The standalone Slack integration is gone; Slack now installs from the Community Hub like Notion, Linear, GitHub, etc. Click Install on the Slack package in the Hub, sign in once with the Caiioo Slack app, and the official Slack MCP server is wired up immediately — no manual app config, no developer mode. Slash commands like /caiioo and /caiioo-help work from any channel where the bot is invited.
- Common: Workspace Files tool: A new sandboxed file tool lets the agent read, write, edit, and search files inside a workspace folder you point it at. The agent cannot escape that folder, and cannot reach the network through this tool. Reads auto-parse Office formats (docx/xlsx/pptx) and PDFs. Configure the folder in Settings → Tools → Workspace Files; the agent picks it up immediately.
- Common: PHI Safe Harbor submode for the Pseudonymizer: The PHI mode of the Pseudonymizer now offers two submodes — Limited Data Set (the existing 16-category strip, default) and Safe Harbor (the stricter HIPAA §164.514(b)(2) standard that also removes specific dates beyond year, full geographic addresses, and ages over 89). Pick the submode from the Pseudonymizer settings or the new in-chat toggle. Safe Harbor is the right choice when you need a de-identified dataset you can share without a Data Use Agreement.
- Common: Pseudonymizer is now generally available: The Pseudonymizer — which detects names, emails, addresses, IDs, and other sensitive values in what you type and swaps them for realistic fakes before any of it reaches the model — graduates out of the tester program with this release and is available on every plan, including Free. Turn it on in Settings → Privacy → Pseudonymizer, or with the new in-chat shield toggle. The model never sees your real values; substitutions are unmasked locally before tool calls run, so the output stays accurate.
- Common: Pseudonymizer in-chat controls: The Pseudonymizer no longer hides in Settings. A shield icon next to the send button (gray = off, blue = Personal Data Filter, teal = PHI/Limited Data Set, emerald = PHI/Safe Harbor) shows the active mode at a glance and opens a popover with the same mode picker you'd find in Settings. When the Pseudonymizer is on, a 1-pixel ring colored to match the mode wraps the composer as an ambient reminder. After each turn that had substitutions, a small "🛡 pseudonymized — N substitutions" chip appears under your message; click it to see exactly which real → fake pairs were swapped, with category labels.
- Common: Connect a browser on another device to your computer: A new pairing flow lets you safely use Caiioo in a browser on your phone, tablet, or another laptop, with everything still running on your main computer. In Settings → Connections → Caiioo Bridge → "Pair a device", generate a 6-character code that's good for 5 minutes and single-use. Open the same URL on the other device, enter the code, and that device stays paired from then on. Brute-force guesses are rate-limited to 10 attempts per IP per minute, and a server restart on your main computer re-pairs every device cleanly.
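The pairing rules in the last item above (a 6-character code, valid for 5 minutes, single-use, with guesses limited to 10 per IP per minute) can be sketched as a small in-memory registry. This is an assumption-laden illustration, not Caiioo's bridge code: the class name, method names, and code alphabet are all hypothetical.

```python
import secrets
import string
import time
from collections import defaultdict, deque

CODE_LENGTH = 6
CODE_TTL_SECONDS = 5 * 60
MAX_ATTEMPTS_PER_MINUTE = 10
ALPHABET = string.ascii_uppercase + string.digits  # assumption: the real alphabet is not documented


class PairingRegistry:
    """Hypothetical in-memory store of pairing codes and per-IP attempt windows."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._codes = {}                      # code -> expiry timestamp
        self._attempts = defaultdict(deque)   # ip -> timestamps of recent attempts

    def issue_code(self) -> str:
        code = "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
        self._codes[code] = self._clock() + CODE_TTL_SECONDS
        return code

    def redeem(self, ip: str, code: str) -> bool:
        now = self._clock()
        window = self._attempts[ip]
        # Drop attempts older than one minute, then enforce the per-IP limit.
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) >= MAX_ATTEMPTS_PER_MINUTE:
            return False
        window.append(now)
        # pop() makes the code single-use: it is removed on its first lookup.
        expiry = self._codes.pop(code, None)
        return expiry is not None and now < expiry
```

Because all state lives in memory in this sketch, a restart discards outstanding codes. How paired devices persist across restarts is outside what it models.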
Improvements
- Common: Agenda items color-stripe by source calendar: Each event card in the Reminders / Agenda window now carries a colored stripe matching the calendar it came from, so you can tell at a glance whether an item is from your work, personal, or shared calendar. Google Calendar today; Apple and Microsoft follow once their data feeds expose a calendar color.
- Common: AI can read a Google Sheet without naming the exact cells first: The Google Sheets read tool used to refuse to run unless the agent specified an exact cell range like Sheet1!A1:D200. The model now can ask for a whole sheet, or every sheet in a workbook, and the tool figures out the layout itself. A built-in 1000-row ceiling keeps a giant spreadsheet from blowing past the model's context window — if a sheet gets capped, the response tells the model so it can ask for a smaller range or a higher row count next time.
- Common: Installing an OAuth tool from the Community Hub now opens sign-in for you: Adding a Hub package that needs sign-in (Notion, Linear, etc.) used to leave the card stuck in an "OAuth required" state with a "Sign in" button you had to click separately. The install now opens the sign-in tab for you automatically — your click on Install is treated as consent — and an "Opening sign-in for X…" line explains why a tab popped. The manual Sign in button stays as a backup.
- Common: Settings → Connections shows the real network address for other devices: The "Connect from other devices on your network" panel was showing 127.0.0.1:3847 (the loopback address that only works on this same computer) on the macOS app, so users were copy-pasting an address that couldn't actually be reached from another device. The panel now shows the real LAN IP and .local network name from the bridge itself. The address is also hidden when the new "Allow access from other devices" toggle is off, so you don't see an address that wouldn't work anyway.
- Common: Web search captures Google's AI Overview reliably: The web-browsing tool's Google scraper sometimes returned an empty AI Overview block, or pulled in noisy right-rail "Sources" text and inline scripts. We re-anchored extraction to Google's stable section markers and now strip script content and the right-rail before returning, so the agent sees only the readable answer.
- Common: Pseudonymizer protects company names too: The Pseudonymizer now treats organization names as protected entities, swapping in realistic fakes that preserve legal suffix style (Inc., LLC, GmbH) and "partners-vs-brand" structure. City and small-region names ("Springfield", "St. Albans") are also detected as a first-class category so they can be swapped or stripped per your active mode.
- Common: Pseudonymizer adds Spanish, French, German, and other multilingual coverage for cities and regions: The personal-data detector model was retrained with hand-authored multilingual data for city names across 23 languages, so users with non-English-language prompts now get the same level of protection as English speakers.
- Common: Pseudonymizer model download shows progress on iOS: The first-time PHI / Personal Data detector download could appear stuck on iOS because nothing told you it was still working. There's now a heartbeat progress log so you can see the download isn't frozen.
- Common: Subagents can search the model catalog by name: To keep the sub-agent tool description in budget, the catalog embedded in it is now the top 10 highest-value models per provider. When a sub-agent needs a model outside that list, it can call the new search_models action with a name fragment (e.g. "haiku") and get back exact catalog IDs, pricing, and capability flags — so the model can pick a concrete ID even for lesser-used choices.
- Common: Custom MCP servers show a friendly name in the credentials vault: When you signed into a custom MCP server, the credentials vault was labeling it with the raw connection URL, which was hard to scan. The vault now shows the friendly server name you gave it, falling back to the URL only when no name is set.
- Common: Community Hub now adds Legal Data Hunter: A new Community Hub package wires up Legal Data Hunter — 18M+ case-law, legislation, and doctrine documents across 110+ countries — as a one-click MCP install.
- Web: SOC 2 Type I badge on the trust page: Caiioo passed its SOC 2 Type I audit. The trust page on caiioo.ai now carries the AICPA SOC 2 badge alongside the existing security disclosures.
- iOS: Hide the keyboard toolbar that iOS adds to text fields: The prev/next arrows and Done button iOS attaches above the keyboard for web text fields ate noticeable vertical space in the composer. The bar is now hidden (using only Apple-public APIs, so this stays App Store safe), and stays hidden after page reloads.
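The 1000-row ceiling on the Google Sheets read tool described above amounts to capping the result and telling the caller it was capped. A minimal Python sketch of that idea follows; `read_rows`, `max_rows`, and the response keys are hypothetical names, not the tool's real interface.

```python
DEFAULT_ROW_CAP = 1000  # the ceiling named in the release note


def read_rows(rows, max_rows=DEFAULT_ROW_CAP):
    """Return at most max_rows rows plus an explicit truncation signal."""
    total = len(rows)
    capped = total > max_rows
    result = {"rows": rows[:max_rows], "total_rows": total, "truncated": capped}
    if capped:
        # The note is what lets the model retry with a narrower request.
        result["note"] = (
            f"Returned the first {max_rows} of {total} rows; "
            "request a smaller range or a higher max_rows."
        )
    return result
```

Reporting `total_rows` alongside the flag lets the caller decide whether to raise the cap or narrow the range.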
Security
- Common: The local bridge stays on your computer by default; LAN access is opt-in: The local bridge that powers stdio MCPs and the desktop-only tools used to listen on every network interface, and on a public Wi-Fi network another device could request a session token and then call the privileged "run a shell command" endpoints. The bridge now only listens on your own computer by default. To use it from another device, turn on "Allow access from other devices on your network" in Settings → Connections, then pair the device with the new code flow (see above). Even when LAN access is on, session tokens are only handed out to requests coming from your own machine, so a stranger on the same Wi-Fi cannot get a foothold.
- Common: Pseudonymizer refuses to leak your real values to Perplexity: If the Pseudonymizer is on and the agent tries to run a Perplexity search whose query contains a pseudonymized identifier (a name, email, address it already swapped), Caiioo now blocks the search and surfaces a clear explanation instead of un-swapping the value and sending it to Perplexity. Perplexity is a third-party LLM service — the whole point of the Pseudonymizer is that values like that never reach an LLM service. To run the search, turn the Pseudonymizer off for that turn, or rephrase to avoid the protected value.
- Common: Spreadsheet parser swapped to a maintained library (GHSA-4r6h-8v6p-xvw6): The library Caiioo used to parse .xlsx attachments had an open prototype-pollution advisory and is no longer maintained. We replaced it with the well-maintained exceljs library. Spreadsheet uploads and the file-workspace's .xlsx reads return the same content as before.
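The Perplexity guard in this section can be pictured as a check that runs before any third-party search: if the query mentions a value the Pseudonymizer already swapped, refuse rather than unmask. The Python sketch below assumes a simple real-to-fake mapping and case-insensitive substring matching; `guard_external_query` and `PseudonymizedQueryError` are hypothetical names.

```python
class PseudonymizedQueryError(Exception):
    """Raised when a query would require unmasking a protected value."""


def guard_external_query(query, substitutions):
    """substitutions maps real value -> fake value for the current session.

    The model only ever sees fakes, so a fake appearing in the query means
    the search would need the real value to be meaningful. Block instead.
    """
    lowered = query.lower()
    hits = [fake for fake in substitutions.values() if fake.lower() in lowered]
    if hits:
        # Only the count is reported; real values never appear in the message.
        raise PseudonymizedQueryError(
            f"Search blocked: query references {len(hits)} pseudonymized value(s). "
            "Turn the Pseudonymizer off for this turn or rephrase."
        )
    return query
```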
Bug Fixes
- Common: Browser stuck on "Loading Caiioo… 0/0" after a server restart: When the host server didn't have an end-to-end-encrypted session with a connecting browser yet (because the server just restarted, or it's the browser's first connection), the browser was ignoring the server's request to start the key exchange — so encrypted messages never decrypted, and the UI got stuck loading. The browser now answers the key-exchange request correctly and the sidepanel loads.
- Common: Caiioo kept asking for Google permissions you'd already granted: When you granted a Google permission mid-conversation (from a sibling tool, the settings panel, or a sync from another device), the agent's in-memory copy of your account didn't notice, and it kept asking again for the same permission — sometimes several times in one run. Caiioo now re-reads your account once before complaining about a missing permission, so within-run grants are honored immediately.
- Common: 7 languages were being told the AI is "English": The internal map from locale code to language name covered 17 of the 24 supported languages. Users with their UI set to Arabic, Hebrew, Hindi, Bengali, Urdu, Turkish, or Dutch were getting a system prompt that said "respond in English" instead of their actual UI language. The map is now complete, and a test prevents this from regressing.
- Common: Reminders sync dialog showed raw text codes instead of translated labels: The Caiioo card in the Reminders / Agenda sync dialog was rendering literal placeholders like reminders.sync.CaiiooName instead of the translated text, in every language. Fixed.
- Common: Google sign-in stops ping-ponging when you grant an extra permission mid-conversation: When the agent needed a Google scope it didn't have yet (Drive, Sheets, etc.) and prompted you to grant it from a sidebar tab, the new sign-in could return the agent to a state where it asked for the SAME scope again. The OAuth flow now correctly merges the newly granted scope into your existing connection instead of overwriting it, so one approval is one approval.
- Common: Sub-agent attachments now show up in the main chat: When a sub-agent's tool created an image, file, or other attachment, it used to be stored against the sub-agent's invisible thread and the main conversation showed nothing — sometimes the parent model would invent a URL to fill the gap. Sub-agent attachments are now linked back to the parent thread automatically and render inline like any other tool result.
- Common: Pseudonymizer detector pass-2 catches names the first pass missed: The Pseudonymizer now runs a quick second detector pass that re-checks the message for any real values that should have been masked but weren't, before sending. Belt and suspenders for protected categories the model is most likely to miss.
- Common: Pseudonymizer no longer puts your real name in the My Identifiers placeholder: The Settings → Pseudonymizer → My Identifiers field was showing your account's real name as the example placeholder, which both looked like a leak and confused setup. Replaced with a generic placeholder.
- Common: Pseudonymizer pill text is readable in dark theme: The "🛡 pseudonymized" pill under user messages was using a light-mode text color in dark theme, making it nearly invisible. Fixed.
- Extension: Sidepanel "Get current location" dead-end fixed: The sidepanel's location request was returning "Permission denied" before the browser could even ask you, because the Chrome manifest was missing the location permission entirely. Permission added; the location request now reaches the browser prompt as expected. Existing users will see a one-time permission request on update.
- iOS: Subscribe page only sells Pro after the tier merge: The iOS in-app paywall briefly still showed the retired Pro Mobile and Platform tiers, which could leave users on a now-unsupported plan. The paywall now sells exactly the same Pro tier shown on the website. Existing Pro Mobile / Platform subscribers continue to be honored as Pro at no extra cost.
- iOS: Rare crash during navigation while a page was still loading: iOS could crash when a page navigation was cancelled mid-load (for example, tapping a link before the previous page finished). The fix routes every error path through the same already-guarded helper, so a cancelled load never tries to deliver a result on a closed page.
- Web: Sign in works inside in-app browsers (Slack / X / LinkedIn / Instagram) and on iOS Safari: The popup-style Google sign-in failed when caiioo.ai was opened from a link inside another app, because in-app browsers either block the popup or strip its connection back to the original page. On mobile and in-app browsers the site now uses a full-page redirect through your own browser session instead of a popup, so sign-in completes and returns you to the page you started on. Desktop popup sign-in is unchanged.
- Web: Community Hub polish on tablets and long names: The navigation bar on the website now switches to the hamburger menu at tablet widths (up to 1024px) instead of overflowing into the logo. Hub package modal titles no longer run under the close button. The "Coming Soon" pill wraps cleanly on narrow widths. Tool icons fall back to an emoji, then to a known logo from the company's website, then to a first-letter avatar — instead of letting a long internal slug (like "customerio") overflow out of the icon box.
- Web: Community Hub now shows the right author on each package: Every package in the Hub used to read "by caiioo" regardless of who actually built it. Authors now reflect the real maintainer — "Model Context Protocol", "oraios", or the vendor brand — and only fall back to "Caiioo" for tools and modes we built ourselves. Cards for integrations that aren't fully wired up yet are hidden from the Hub until they work, so you don't see installable cards that immediately error.
- Web: Sitemap stops triggering "page with redirect" warnings: Search Console was flagging every page on caiioo.ai as a redirect because the sitemap listed URLs without the trailing slash that the live site uses. The sitemap now matches the canonical URLs and includes per-language alternates, so search engines stop seeing the whole site as redirected.
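The Google sign-in fix above hinges on merging a new grant into the stored connection rather than replacing it. A minimal Python sketch of that merge follows; the field names (`scopes`, `access_token`, `refresh_token`) follow common OAuth usage and are assumptions about the stored shape, as is the function name `merge_grant`.

```python
def merge_grant(existing, new_grant):
    """Merge a new OAuth grant into a stored connection without losing scopes."""
    merged = dict(existing)
    # Take fresh values from the new grant, but never overwrite with None
    # (for example, a grant response that omits the refresh token).
    for key, value in new_grant.items():
        if key != "scopes" and value is not None:
            merged[key] = value
    # Union the scopes instead of replacing them: one approval stays approved.
    merged["scopes"] = sorted(
        set(existing.get("scopes", [])) | set(new_grant.get("scopes", []))
    )
    return merged
```

Overwriting instead of merging is what would make a previously granted scope appear missing on the next tool call.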
Version 0.9.720 (May 1, 2026)
Brand
- Common: PebbleFlow is now Caiioo: The product has been renamed to Caiioo — the same animal, the same app you've been using. You'll see the new name throughout the sidepanel, settings, the website, the macOS host app menu, and the Safari host app. Existing data, accounts, sign-ins, and sync are unchanged — only the display name moves. The marketing site lives at pebbleflow.ai for now and forwards-compatibly displays the new brand; canonical caiioo.ai routing follows in a later release.
New Features
- Common: Simple Mode is on by default for new users: First-run installs now land in Simple Mode — a calmer composer that hides per-message token and cost detail, the lossless-compression dropdown, and tab-context indicators. The model picker is still visible because choosing a model is a first-class action. Existing users keep whatever Simple Mode setting they had. A new eye-icon toggle in the composer reveals or hides the full detail in one tap.
- Common: Account deletion: You can now delete your Caiioo account from the website's Account page (also linked from the iOS app, per Apple's account-deletion requirement). Deletion removes your profile, sign-in credentials, AI credit balance, and the encrypted copy of your provisioned OpenRouter key; minimal compliance audit logs without account ID are retained as required by law. A new privacy-policy section spells out exactly what is removed and what is retained.
- iOS: Native on-device text-to-speech (Kokoro 82M): Kokoro voice synthesis now runs in the iOS host process via OnnxRuntime instead of inside the WKWebView, mirroring the e5-embeddings architecture from 0.9.719. The model gets the host process's increased-memory entitlement budget instead of competing with the sidepanel UI for WebKit's per-process cap, so on-device TTS no longer crashes the WebView under load on iPhone 13/14.
Improvements
- Common: Image generator works with slow streaming models: The image generation tool now opts into streaming for OpenRouter image models, fixing "Network error: Unable to connect to OpenRouter" on slow models like gpt-5.4-image-2 (~167 s end-to-end). The previous buffered path waited on ~1.8 MB of keepalive padding before the actual JSON arrived and the connect-layer timeout fired first.
- Common: Free-tier model selection picks a real model dynamically: The previous openrouter/free meta-router routed inside OpenRouter with no awareness of which downstream models supported tools or vision — Android users sending an image plus a tool call would hit "No endpoints found that support tool use". Caiioo now routes itself: onboarding, settings, and the upgrade-modal "use free models" CTA all pick a real free model that supports tools (and image input where available), and auto-swap on rate-limit or capability errors. Existing users on openrouter/free are migrated lazily on next launch.
- Common: Personal Intuition finds the right context on long messages: The retrieval query was sliced to the last 500 chars of your message and embedded as a single vector. On long pasted-then-asked messages or multi-topic turns, that either truncated before the actual question or blurred everything into a topic-flat centroid that scored badly. Caiioo now extracts a salient query — keeps the verbatim head and appends a deduped bag of content-bearing tokens drawn from up to the next 3000 chars — so retrieval stays on-topic even when your prompt is long.
- Common: Settings → "Minimal settings" toggle stops flipping its own title: The toggle alternated its label between "Minimal settings" (on) and "Advanced settings" (off), so the off state read as if checking the box would move you to advanced — the opposite of what happens. Title now stays constant; the description prefixes "On — " / "Off — " to make the current state unambiguous.
- Common: Voice-model download dialog handles indeterminate progress: The model-info card during download could overflow in the narrow sidepanel; layout is now anchored. When the server doesn't return a Content-Length, the progress block renders an indeterminate spinner instead of a frozen 0% bar.
- Common: Simple Mode shows a compact stat strip instead of hiding everything: Simple Mode previously hid the entire thread-stats summary. It now shows a slimmed strip — context-window ring, compact total tokens (e.g. "1k"), cost, and remaining credit — and the eye-toggle in the composer expands to full detail in one tap.
- Common: Subagent costs roll up correctly: The per-thread cost tally was missing subagent Perplexity / web search spend, helper costs, image/video/music generation costs, PDF OCR costs, and voice costs. All categories now roll up into the parent's sub_agent_cost line.
- Common: Settings → Tools selector overrides hidden default-off tools: Choosing "Always" or "Auto" on a tool that ships disabled by default was being silently ignored. Your dynamic-tool-config choice now overrides the default.
- Common: Local sidecar renamed to "Desktop app" in user-facing copy: All user-visible references to "PebbleFlow Relay" — the sidecar that backs local-stdio MCP servers and Desktop-only tools — now read "Desktop app", which is what users actually install.
- Common: Privacy copy tightened: Dropped redundant "telemetry" wording across legal policies, store listings, the website privacy page, and the in-app guide. "No analytics" already covers it; no factual change to what the apps do or do not collect.
- Common: Provider account view labels balance source: The provider-account panel now states which key each balance is reporting against (your BYOK key vs the Caiioo-provisioned key), so credits and remaining balance can no longer be misattributed at a glance.
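The salient-query extraction described for Personal Intuition in this section (verbatim head plus a deduped bag of content-bearing tokens from up to the next 3000 characters) might look roughly like the Python sketch below. The head length of 500, the stopword list, and the minimum token length are assumptions made for illustration; the release note only specifies the 3000-character window.

```python
import re

HEAD_CHARS = 500    # assumed head length; not stated in the release note
TAIL_WINDOW = 3000  # "up to the next 3000 chars"
STOPWORDS = frozenset({"the", "and", "for", "that", "this", "with", "are", "was"})


def salient_query(message, head_chars=HEAD_CHARS, window=TAIL_WINDOW):
    """Keep the head verbatim, then append unseen content tokens from the tail."""
    head = message[:head_chars]
    tail = message[head_chars:head_chars + window]
    seen = set(re.findall(r"[a-z0-9]+", head.lower()))
    bag = []
    for token in re.findall(r"[a-z0-9]+", tail.lower()):
        if len(token) < 3 or token in STOPWORDS or token in seen:
            continue  # skip short words, stopwords, and repeats
        seen.add(token)
        bag.append(token)
    return head if not bag else head + " " + " ".join(bag)
```

Deduplicating against the head as well as the tail keeps the appended bag short, so the embedded query is not dominated by repeated terms.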
Bug Fixes
- Common: Apple Sign-In after the brand transition: Apple Sign-In was failing for new sign-ins because the OAuth Services ID still pointed at the old com.sixcailloux.PebbleFlow.web. Switched to com.sixcailloux.Caiioo.web so "Sign in with Apple" works again on iOS, macOS, and the web account page.
- Common: "Sign in" button on OAuth-required MCP servers: Installing a remote OAuth-required MCP server (e.g. Notion) writes the server to storage even when the initial connect fails because OAuth is required, but no client gets registered. The "Sign in" button then sent a refresh request that returned a raw "MCP server <id> not found" error instead of opening the OAuth dialog. Refresh now falls back to a fresh addServer call when the client isn't registered, so the OAuth/DCR shape is surfaced and the sign-in dialog actually opens.
- Common: Bug reports submitted from native apps now record the correct app version: iOS, macOS, and Android shells now inject the installed app version into the WebView at boot, so submitted bug reports identify which build they came from instead of leaving the field blank or echoing the bundled sidepanel version.
- Android: Launch crash after the rebrand: An over-eager PebbleFlow → Caiioo rename had renamed the Application class and JNI symbols on the Kotlin side without renaming the corresponding native exports, so the app crashed at startup unable to resolve symbols. Reverted the renames; the app launches cleanly again.
- Common: Google Workspace re-authorization loop: Users were stuck re-authorizing Google Workspace every ~hour because their stored OAuth connection had no refresh_token. Six connected fixes guarantee a refresh token on every grant and preserve it through cloud sync, so Google connections survive the 1-hour access-token TTL without bouncing through the consent screen.
- Common: Recovering from "User not found" on OpenRouter: OpenRouter returns HTTP 401 "User not found" when a provisioned sub-key's underlying user record is gone but the key entity still exists. Caiioo previously surfaced this as a dead extension that only logout+login fixed. The provider now self-heals by swapping in a fresh provisioned key and retrying the request once.
- Common: Google Docs insert_component returned misleading errors: Inserting a component after a previously inserted table surfaced "Document not found" because the inserted table never got bound to its componentName. Tables now get a named range in the same insert phase, the position resolver respects non-default tabId, and app-level errors stop being misclassified as 404s.
- Common: Newly released OpenRouter models lost ZDR routing: When a model wasn't yet in Caiioo's intelligence database, the synthetic fallback record marked it as not-ZDR-capable, even when ZDR-only providers actually supported it. ZDR routing now uses the same provider-list fallback as the regular path.
- Common: API /v1/runs ignored attachments on the very first call: When the API endpoint kicked off an agent against a fresh thread, the attachment list on the user message was lost because the empty-thread branch pushed only text. Attachments now flow through correctly.
- Common: Native apps could read stale settings right after a model change: On memory-pressured Android, switching the model picker and immediately sending a message could let the agent read the previous model from disk because the 500 ms debounced flush hadn't fired yet. The send path now forces a state flush before spawning the agent.
- Common: Tester-bug triage (PF-260429 / PF-260430): Five fixes — managed-key (free-tier) users can now ingest documents through PDF OCR without typing their own key; macOS pins the Node sidecar's timezone to the host so dates resolve correctly under sandbox; scheduled tasks now persist on iOS/macOS/Tauri shells (the WebView storage stub was silently dropping writes); the configuration tool surfaces one-time and manual schedules as first-class options; and tool callsites recover from the OpenRouter 401 self-heal the same way the agent runner does.
- Common: Upgrade modal stops mixing tier subscription with credits/BYOK: "Add Credits" actually opened the subscription portal — the label lied. The modal now focuses on tier subscription only; iOS additionally hides any credit-purchase surface per Apple §3.1.1.
- Web: Delete Account section moved to the bottom of the account page: The destructive Delete Account block used to render inline between your identity card and the subscription/billing UI. It now lives at the very bottom of the page, after the FAQ. iOS deep-link behavior into the delete-only view is unchanged.
- iOS: Manage Plan now opens Apple's subscription sheet: Manage Plan on iOS previously fell through to the web account page for free, trial, Stripe, and unknown-source users, where Google OAuth in WKWebView would fail. Tapping Manage Plan on iOS now always opens Apple's StoreKit showManageSubscriptions sheet.
- iOS: Delete-account web view collapses to a delete-only page: When the iOS Delete Account button opens the website, the page now hides every billing surface (subscription, plans, credits, Stripe portal, FAQ) and shows only the deletion section, satisfying §3.1.1.
- iOS: ITMS-90208 framework-version validation fix: The onnxruntime framework's Info.plist is now patched at archive time to match the host app's deployment target, so App Store Connect stops rejecting builds with "framework does not support the minimum OS version specified in the Info.plist".
- Android: External links open in a Custom Tab so Google OAuth works: Tapping links like "Open pebbleflow.ai/account" used to open inside the Android WebView, where Google blocks OAuth with disallowed_useragent (Error 403). External links now route through the native bridge into a Chrome Custom Tab, which Google trusts.
- Android: Copy buttons under messages actually copy: The copy icon used navigator.clipboard.writeText directly, which silently no-ops in the Android WebView when user activation expires across the async boundary. Copy now routes through the native clipboard bridge.
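The OpenRouter "User not found" self-heal described in this section (swap in a fresh provisioned key, retry once) follows a common retry-once pattern. The Python sketch below illustrates it under assumptions: `ProviderAuthError`, `send`, and `provision_key` are hypothetical stand-ins, not Caiioo or OpenRouter APIs.

```python
class ProviderAuthError(Exception):
    """Hypothetical error type carrying the HTTP status and provider message."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def call_with_self_heal(send, provision_key, key):
    """Call send(key); on 401 'User not found', retry once with a fresh key."""
    try:
        return send(key)
    except ProviderAuthError as err:
        # Only this specific failure is recoverable; re-raise everything else.
        if not (err.status == 401 and "User not found" in err.message):
            raise
    fresh_key = provision_key()
    # Single retry: if this call fails too, the error propagates to the caller.
    return send(fresh_key)
```

Limiting the recovery to one retry avoids looping on a provisioning problem that a new key cannot fix.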
Version 0.9.719 (April 25, 2026)
Security
- Common: protobufjs CVE-2026-41242 patched (CVSS 9.8): Pinned protobufjs >= 7.5.5 (resolved to 8.0.1) to close a critical arbitrary-code-execution hole in Root.fromJSON. Caiioo doesn't import protobufjs directly — it's a transitive of onnxruntime-web — and risk-in-practice was low because we only feed bundled models, not user-supplied schemas. Patched anyway because the fix is trivial.
- Common: @xmldom/xmldom — 4 high-severity CVEs patched: Pinned @xmldom/xmldom >= 0.9.10. XML parsing is used in DOCX/XLSX redline pipelines and various extension code paths.
- Server: undici — 3 high-severity CVEs patched: Pinned undici >= 7.24.0 in cloud/relay. Affects the relay server's HTTP client only.
- Desktop: rustls-webpki + quinn-proto — 2 high-severity CVEs patched: Bumped these Tauri Rust dependencies. Affects the Windows + Linux desktop builds that go through the Tauri shell.
New Features
- Common: Video generation (Pro): Generate or animate short videos with Google Veo 3.1, OpenAI Sora 2 Pro, and ByteDance Seedance via OpenRouter. The tool picks valid durations and resolutions per model, polls until the job completes, and saves the result as a thread attachment.
- Common: Music generation (Pro): Generate songs and instrumental clips with Google's Lyria 3 Pro Preview via OpenRouter. Output is saved as an audio attachment that plays inline.
- Common: Dynamic video model catalog: The video generator fetches the current list of video-capable OpenRouter models at runtime, so new providers and models appear without a Caiioo update. A bundled snapshot keeps things working offline.
- Common: Dynamic music model catalog: Same for music generation — the tool picks up new music models as OpenRouter publishes them, with offline fallback.
- Common: Custom OAuth at Pro: Bring-Your-Own-Auth — the Google Workspace wizard, Microsoft 365 wizard, and the generic "Add Custom Provider" flow — is now visible to all Pro, Platform, Teams, and Enterprise users in Settings → Custom OAuth. Previously the tab and add buttons were hidden behind tester-only flags, so paying users couldn't reach BYOA setup.
- Common: Physics + Structural Analysis (Pro): The physics simulation tool (projectile motion, collisions, kinetic/potential energy, momentum, force, impulse, velocity-to-target) and the structural analysis tool (beam loading, column buckling, material properties) are now available at Pro alongside the other creative and utility tools.
- Common: Seeing-Eye Dog — vision fallback for text-only LLMs: Text-only models like DeepSeek V4 Pro, Kimi K2.6, MiMo V2.5 Pro, and local Ollama models can now handle image attachments by routing them through a configured cheap vision model (default: Gemini 3.1 Flash Lite). Auto-captioning fires at message-build time and caches per attachment so subsequent turns don't repay; a dedicated vision({action: "inspect"}) tool gives the model targeted follow-up access. Settings → Tools → Vision Fallback Model chooses the helper.
- Common: XLSX cell-level tracked changes: Spreadsheet artifacts now support the same redlining UX as DOCX. AI proposals via propose_change(editMode: 'xlsx_cell') produce cell-located tracked changes anchored by cellRef + sheetName; user typed edits in track-changes mode bake cell-level diffs; cells with pending changes render <del>old</del><ins>new</ins> inline; the existing toolbar's next/prev/accept/reject works on cell changes; concurrent AI + user edits merge cell-by-cell with user-wins on same-cell conflicts.
- Common: Cost tracking for video and music generators: Generated videos (via OpenRouter /api/v1/videos) and music (via chat-completions) now roll their cost into thread totals just like image generation, with new video_gen_cost and music_gen_cost breakdown rows in the sidepanel cost dropdown.
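The XLSX merge rule above (concurrent AI and user edits merge cell by cell, with the user winning same-cell conflicts) can be illustrated with a small Python sketch. The keying by `(sheet_name, cell_ref)` mirrors the sheetName and cellRef anchors named in the note; the function name and return shape are assumptions.

```python
def merge_cell_edits(ai_edits, user_edits):
    """Merge two edit maps keyed by (sheet_name, cell_ref) -> proposed value.

    Returns the merged map and the list of cells where both sides edited
    the same cell with different values (the user's value is kept).
    """
    merged = dict(ai_edits)
    conflicts = []
    for cell, value in user_edits.items():
        if cell in ai_edits and ai_edits[cell] != value:
            conflicts.append(cell)
        merged[cell] = value  # user wins on same-cell conflicts
    return merged, conflicts
```

Edits to different cells never conflict under this rule, which is what makes the merge safe to apply without prompting in the common case.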
Improvements
- Common: GPT-5 series stability: OpenAI's gpt-5, gpt-5.1, gpt-5.4, gpt-5.4-pro, and gpt-5.3-codex no longer produce silent empty responses on tool-heavy agentic conversations. A function-tool schema interaction was causing OpenAI's backend to terminate streams without producing any output; Caiioo now serializes tools in the shape these models require.
- Common: Better long-conversation behavior on GPT-5.4+: Multi-turn conversations with gpt-5.4, gpt-5.4-pro, and gpt-5.3-codex no longer early-stop on long tool-calling sequences — the phase marker OpenAI uses to distinguish intermediate commentary from final answers is now preserved across turns.
- Common: GPT-5.x reasoning visible in the thinking panel: gpt-5, gpt-5.1, gpt-5.4, gpt-5.4-pro, and gpt-5.3-codex now stream their reasoning summary into the in-chat thinking panel as they think, matching how Gemini already behaves. Previously the thinking panel stayed empty for these models even though OpenRouter was streaming reasoning text.
- Common: Transparent recovery from transient upstream outages: When OpenRouter routes a request to an upstream that returns a transient 5xx error before any content streams, Caiioo quietly retries on a different upstream instead of surfacing an empty reply.
- Common: Clearer errors when the model stream fails: Provider-side crashes, content-filter rejections, and other mid-stream failures now surface with a specific error message instead of the conversation just "stopping" with no explanation.
- Common: Generated videos and music render inline: Generated videos and music now appear in the conversation like generated images — a video or audio player surfaces above the final answer with a small badge showing which model produced it, instead of being tucked inside the collapsed tool section.
- Common: Default-model picker for video and music tools: Settings → Tools now includes a Default Model dropdown for the video and music generators, mirroring the existing image-generator picker. The agent uses your selected model by default; you can still override per request by naming a different model.
- Common: Personal Intuition indexed-status display: The Personalization settings panel now shows a live "Indexed: N threads · M chunks · K tokens · last indexed Xm ago" line so you can confirm the memory indexer is doing its job. The last backfill summary also stays visible after the run completes instead of vanishing.
- iOS: Native on-device embeddings (faster, less memory): Multilingual-e5-small inference now runs in the iOS host process via Apple's onnxruntime-objc instead of inside the WKWebView. This solves a per-process memory cap that was killing the WebView ~7 seconds after model load (the post-login crash). After the first encode of the tool catalog, subsequent tool-selection calls hit a per-text in-memory cache and complete in ~10 ms instead of ~5 s. The shared-pipeline refactor also stops Personal Intuition and on-device tool selection from each loading their own ~115 MB copy of the model.
- Common: Personal Intuition + on-device tool selection share one e5 pipeline: Both features previously instantiated their own multilingual-e5-small loader (~115 MB each, ~230 MB total in the renderer). They now delegate to a single shared pipeline that dedupes concurrent loads, halving cold-start cost.
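The shared e5 pipeline above "dedupes concurrent loads", meaning two callers that ask for the model at the same time share one load. One way to sketch that idea in Python is to cache the in-flight load task; `SharedPipeline` and its `loader` argument are illustrative names, and the real implementation is not described in these notes.

```python
import asyncio


class SharedPipeline:
    """Hands every caller the same model instance, loading it at most once."""

    def __init__(self, loader):
        self._loader = loader  # async callable that performs the expensive load
        self._task = None

    async def get(self):
        if self._task is None:
            # The first caller starts the load; later callers await the same task.
            self._task = asyncio.create_task(self._loader())
        return await self._task
```

Caching the task rather than the finished model is what covers the concurrent case: a second caller arriving mid-load awaits the existing task instead of starting another load. This sketch does not retry a failed load; a failed task would keep raising for every caller.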
Bug Fixes
- Common: Browser-session cookies no longer leak into API calls: If you'd visited openrouter.ai in a browser tab, the extension was unintentionally attaching your OpenRouter browser session (Clerk / Stripe / analytics cookies) to every API call. API calls are now strictly Bearer-authenticated.
- Common: GPT-5.4 responses no longer appear twice: Fixed a bug where gpt-5.4, gpt-5.4-pro, and gpt-5.3-codex responses showed the same text back-to-back — the final-answer payload was being accumulated alongside the token stream that had already rendered it.
- Common: Model picker refreshes after reload: The model list is now invalidated on version upgrades and honors a short in-memory TTL, so newly released OpenRouter models appear after a reload instead of being hidden behind a stale cache. Long-lived service workers no longer hold onto a frozen catalog.
- Common: Generated-media short links open the player: When a model writes a short link like [Listen](audio-…) after generating audio/image/video, clicking it now opens the slate player instead of failing as a broken external URL. Same fix applies to image and video links.
- Common: Generated media as markdown image embeds rendered broken: When the model wrote a video or audio attachment as ![Video 1](video-…) instead of [Video 1](video-…), it rendered as a broken <img> instead of opening the player. The leading ! is now stripped so the link opens the slate viewer.
- Extension: Sidepanel "Location permission denied" dead-end: Fixed a regression where every sidepanel location request returned "Location permission denied. Please allow location access in browser settings." regardless of what the user clicked. An attempt to route through the offscreen document hit PERMISSION_DENIED instantly because the offscreen context can't show a permission prompt and the manifest doesn't declare geolocation. The sidepanel now uses the original content-script-then-IP-geolocation chain that worked before.
- Common: Personal Intuition full rebuild left stale indexed counts: Fixed full-rebuild backfill skipping the per-thread index update, so Settings → Personalization showed inflated pre-rebuild totals after a rebuild. Threads scanned by full-rebuild but not previously indexed are also no longer invisible to retrieval.
- iOS: Post-login WKWebView crash loop: Three converged fixes for the post-login crash. (1) Native e5 embeddings now run chunked in batches of 8 with the ORT memory arena set to shrink between runs — peak working set stays under ~100 MB instead of spiking to ~3 GB and tripping iOS's per-process memory kill. (2) The on-device retriever now warms during requestIdleCallback instead of inline at boot, so it no longer competes with license sync, identity restore, UI render, and cloud-sync init. (3) The on-device retriever now re-provisions on identity / tier change, so fresh installs no longer silently fall back to cloud helpers because tier was undefined at first registration.
- iOS: Bogus "path traversal blocked" 403s: Fixed LocalFileSchemeHandler mis-flagging every 404 as a path-traversal attempt because NSString.standardizingPath only resolves /var → /private/var for files that exist on disk. Legitimate paths to non-existent resources (the iOS bundle excludes *.wasm, plus chrome-extension API paths like api/active-tab-context) returned 403 instead of 404, breaking transformers.js' wasm pre-fetch fallback. Replaced with a string-based .. / NUL check matching the Android handler.
- Common: PDF embedded images broke text-only models: Sending a PDF with embedded images to a text-only OpenRouter model (DeepSeek V4 Pro, Kimi K2.6, etc.) was hitting "No endpoints found that support image input" — the warning the new Seeing-Eye Dog routing was supposed to eliminate. The PDF delivery path now respects the same per-model supportsVision flag that image-block delivery already honored, so text-only models receive a text-only PDF and the agent can spawn a vision subagent if needed.
- Common: Ad-blocker level toggle didn't take effect: Toggling the ad-blocker level (Off / Standard / Aggressive) only changed the persisted value while the live DNR rules + static ruleset stayed in effect until the service worker happened to restart — so after toggling to Off, sites like ads.google.com remained blocked. The storage listener now watches the globalSharedSettings bucket where the setting is actually written, and the legacy adBlockerEnabled flag routes through the same bucket.
- Extension: Oversized images were dropped silently: When an image attachment exceeded the API cap, the extension service worker had no compress impl registered (only the server's sharp-backed one was) — so the defense in pushImageBlock caught the throw and dropped the image entirely. The SW now probes natural dimensions via createImageBitmap and routes the encode through the existing offscreen document, so extension-context callers get the compressed image instead of a dropped placeholder.
- Common: Generated images and screenshots rejected by providers: Anthropic caps base64 images at 5 MB; other providers have similar limits. Generated images from FLUX / Gemini / Seedream and large screenshots were being passed through at full size, producing 4xx errors that aborted the entire agent turn. Three layers of defense: image-generator compresses output before storage; a new pushImageBlock helper routes every image_url emission through compress-or-drop; and screenshot rebuild + live screenshot injection both go through the same helper. Conservative 4 MB cap that works on every provider.
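The string-based path check mentioned in the "path traversal blocked" fix above can be illustrated with a short sketch. The actual iOS and Android handlers are native code and are not shown in these notes, so the TypeScript below and the name `isTraversalAttempt` are assumptions for illustration only. The point is that the decision depends only on the request string, never on whether the file exists on disk.

```typescript
// Hypothetical helper: flags a request path as a traversal attempt
// purely from its text, without touching the filesystem.
function isTraversalAttempt(path: string): boolean {
  // A NUL byte is never legitimate in a resource path.
  if (path.indexOf("\0") !== -1) return true;
  // Reject any path segment that is exactly "..".
  return path.split("/").some((segment) => segment === "..");
}
```

Under this approach a missing but legitimate path such as api/active-tab-context is not flagged, so the handler is free to answer 404 rather than 403.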
Version 0.9.718 (April 22, 2026)
New Features
- Common: Personal Intuition — associative cross-thread memory: Caiioo quietly remembers what you've talked about before — across every conversation, not just this one. Relevant memories come to mind as you chat; your agent might use one directly, or just let it color the response. Like how a scent can remind you of a place. Find it under Personalization → Personal Intuition; off by default. Includes Full-rebuild and Incremental backfill so you can index your existing threads.
- Common: Personal Intuition backup & restore: You can snapshot the full memory corpus to a JSON file under Backup & Restore → Personal Intuition, and restore it later if something goes wrong or you want to move it to another device.
- Common: Instant Tool Chooser default for every tier: The semantic tool chooser now runs locally on every device, on every tier — free included. Picks the right tools for each turn in ~10 ms, fully on your device. The picker UI in Settings → Tools clarifies the choice as "Instant Tool Chooser" vs "Quick Tasks LLM" (which uses whichever model you've marked with the lightning bolt in the model picker).
- iOS / Android: Instant Tool Chooser in the native apps: The same on-device tool chooser is now provisioned inside the iOS and Android apps' WebView, not just the browser extension and desktop.
Improvements
- Common: Slate sandbox allows HTTPS script CDNs: HTML slate artifacts can now load common libraries like Chart.js or D3 from reputable HTTPS CDNs (jsdelivr, unpkg, etc.). Data fetching is still locked to same-origin, so use bind_data to pipe attachment data into a slate.
- Common: Clearer Personal Intuition progress: Backfill now shows per-thread progress in the settings panel (e.g. "47/75 threads · 2,134 chunks") while it runs, instead of going silent until completion.
- Common: Faster tool selection after memory indexing: Indexing a large memory corpus no longer evicts the tool-catalog cache in the on-device retriever — the next tool call stays warm rather than paying a re-encode cost.
Bug Fixes
- Common: Concurrent sub-agents hit false tier-upgrade errors: Fixed a race where two sub-agents running in parallel could overwrite each other's active-thread context, causing Pro-gated actions (update_slate, etc.) to be rejected for the wrong sub-agent. Tier enforcement is now scoped to the specific thread of the dispatching call.
- Common: Slate CSV data bindings broke forEach: Fixed CSV-bound data arriving in the sandbox as a non-iterable object — data.forEach(...) inside sandbox JavaScript now works as expected.
Version 0.9.717 (April 17, 2026)
New Features
- Common: Unified Pro Tier Across All Apps: Pro is now a single $9/mo subscription that unlocks every Caiioo app — Chrome, Desktop, and Mobile — instead of separate app-tier buckets. Platform ($14/mo) is repositioned as "Pro + infrastructure" (local server, API access preview, Messaging).
- Common: Pro Mobile Tier: A mobile-only Pro plan is available for $2.99/mo, sold directly in the App Store and Play Store. The in-app purchase itself is the entitlement — no separate license needed.
- Common: Ask-User Tool for Human-in-the-Loop: The agent can now pause mid-run and surface a four-way decision dialog (approve, approve with notes, reject, reject with notes). Your notes flow back to the model as plain-English guidance that overrides the proposed plan — no new cycle fires, the agent continues in place.
- Common: In-App Guide Search: Search the user guide directly from the Document menu. Results deep-link to caiioo.ai/guide, preserving the existing redirect flow.
- iOS: Monthly/Yearly Paywall Toggle: The iOS Subscribe sheet now lets you switch between Monthly and Yearly billing before purchase.
- Desktop: Auto-Updates on Windows and Linux: The Tauri desktop app now ships with the updater enabled, so Windows and Linux builds can receive updates in place instead of requiring a manual reinstall.
- Web: Tabbed Search Across Guide and Blog: The marketing site's guide and blog layouts now include a search bar with tabbed scope switching — title matches rank above body matches, and the active tab auto-switches to where the results are.
- Web: Dedicated Linux Install Page: The install page routes Linux users to /install/linux, which lays out AppImage, .deb, and .rpm choices with per-distro commands.
- Web: Stable Download Routes: /download/macos, /download/linux, and /download/windows are now stable, edge-worker-backed URLs that always resolve to the latest release — no website rebuild required per release.
Improvements
- Common: In-App Support Ticketing: Support requests now route through the in-app ticket flow as the primary path, with clearer credit-error messaging pointing you to the right place when something goes wrong.
- Common: Smarter Google OAuth Re-Auth: Google tools now verify the scopes actually granted by Google and trigger a just-in-time re-authorization prompt when a 403 indicates a missing scope, instead of silently failing.
- Common: Full Drive Scope for Google Writes: Write actions against Google Drive now request the full drive scope so edits to documents you didn't create succeed instead of hitting permission errors.
- Common: Image Generator Model Descriptions: The image generation tool surfaces a description for each available model so it's easier to pick the right one for the task.
- Common: More Accurate Slate Artifact Messages: When a tool creates or updates a Slate artifact, the model no longer claims the artifact is "displayed in the editor" — the wording now reflects what actually happens.
- Common: Better Dynamic Tool Selection from Tabs: The tab-context hint now emits tool IDs, so the dynamic tool selector can actually apply context-based selection rules that depend on which tools are available.
- Common: Voice Costs Attributed to Threads: Streaming TTS (ElevenLabs, Resemble) and STT (ElevenLabs Scribe) usage now rolls up as voice cost on the conversation where it happened, matching the existing one-shot synthesis path.
- Extension: Login Subtitle Emphasizes Local Agents: Small copy update on the login screen.
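The scope verification described under "Smarter Google OAuth Re-Auth" above can be sketched as a set difference between the scopes a tool needs and the scopes actually granted. This is a rough illustration under stated assumptions: `missingScopes` and `needsReauth` are hypothetical names, and the only external fact relied on is that OAuth 2.0 responses report granted scopes as a space-delimited string.

```typescript
// Returns the required scopes that are absent from a space-delimited
// scope string (the format OAuth 2.0 token responses use).
function missingScopes(required: string[], grantedScope: string): string[] {
  const granted = new Set(grantedScope.split(/\s+/).filter(Boolean));
  return required.filter((scope) => !granted.has(scope));
}

// Hypothetical decision helper: prompt for re-authorization only when a
// 403 coincides with at least one scope that was never granted.
function needsReauth(
  status: number,
  required: string[],
  grantedScope: string
): boolean {
  return status === 403 && missingScopes(required, grantedScope).length > 0;
}
```

Checking the granted list rather than the requested list matters because a user can deselect individual scopes on the consent screen.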
Bug Fixes
- Desktop: Linux File Picker Restored: Fixed the file picker failing on Linux in the Tauri desktop app.
- Desktop: Version Reporting Wrong: Fixed the Tauri desktop app reporting an incorrect version string in the UI and telemetry.
- Desktop: Quieter Linux Audio and Clearer Errors: Suppressed PipeWire log noise on Linux, surfaced OpenRouter errors instead of swallowing them, restored the API toggle read path, and fixed an incorrect parity banner.
- Common: Gmail Body Extraction Lost Links: Reversed the Gmail extraction precedence to try link-preserving markdown first, then fall back to plain text only when the markdown output is empty — emails no longer lose links in the common case.
- Common: Public API Settings Not Persisting: Fixed the Public API toggle and access token failing to save on some setups, and ensured these settings remain strictly local rather than syncing to the cloud.
- Common: Archive and Rename Felt Laggy: Delete, archive, unarchive, and rename now update the UI optimistically so the list reflects the change immediately instead of waiting for a server round-trip.
- Common: Archive Deletes Didn't Sync: Fixed archive deletions being applied directly to storage, bypassing the protocol path — they now propagate across clients like every other thread action.
- Common: Agenda Showed False "Missing Scopes": Fixed the reminders/agenda widget reading a stale tool-context profile right after you granted Google Calendar scopes, so it flagged missing scopes while tools worked fine. The UI now reads the fresh profile, and a refresh auto-runs the incremental-scopes consent flow sequentially across any connected Google accounts that still need it — no need to open the sync dialog to find "grant access".
- Common: Google Tool Calls Rejected with "Unknown name exclusiveMinimum": Fixed Gemini rejecting tool calls whose parameter schemas used numeric exclusiveMinimum / exclusiveMaximum bounds — these are now stripped before sending to Google's restricted OpenAPI subset.
- Common: Remote MCP Servers Without DCR Failed to Connect: Fixed adding remote MCP servers that don't support Dynamic Client Registration (e.g. Slack): the client now respects RFC 9728 protected-resource-metadata, follows authorization_servers to the AS host when it differs from the resource host, and stops fabricating a /register endpoint that doesn't exist.
- Server: Tester-Tier Users Rejected by Gated Endpoints: Fixed organizations, hub, admin, and provisioned-key endpoints rejecting users whose effective tier (from licenses) was higher than the stale tier snapshotted into their auth token. Gated endpoints now evaluate the effective tier consistently with /api/auth/me.
- Extension: Identity-Sync Ping-Pong Loop: Fixed conflicting identities between the extension and relay server looping forever on IDENTITY_SYNC. The extension now acks once against the same conflicting remote, warns that you must sign out on one side to resolve, and resets on disconnect / match / remote adoption.
- Common: Gemini Flash TTS Couldn't Be Selected or Configured: Fixed Gemini Flash TTS (voice) being unusable end-to-end — selecting it reverted on settings close, and the API key field appeared to not accept typing. The provider is now a first-class TTS option: selection persists, the API key + voice picker save correctly, and synthesis uses an SSE streaming endpoint so audio playback can begin before the full utterance finishes generating.
- iOS: New-Thread UI Didn't Switch: Fixed the iOS app failing to switch the active view to a newly created thread, and fixed a related modelName ReferenceError that could crash the composer.
- iOS: Startup Crash on Optional Storage Hook: Fixed an iOS crash when storage.primeSettingsCache wasn't available during startup — the call is now guarded.
- Common: Tier-Audit Discrepancies After Pricing Reshape: Fixed six pricing and feature inconsistencies surfaced by the tier audit, so entitlement gates, upgrade prompts, and feature flags now agree across the extension, website, and mobile apps.
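The exclusiveMinimum / exclusiveMaximum fix listed above amounts to removing two keywords from a tool's parameter schema before it is sent. A minimal sketch follows, assuming a plain JSON schema object; `stripUnsupported` and the `Json` type are hypothetical names. As a simplification, it removes those keys at every depth, so a property that happens to be named exclusiveMinimum would also be dropped.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Keywords the notes say are rejected by the restricted schema subset.
const UNSUPPORTED_KEYWORDS = new Set(["exclusiveMinimum", "exclusiveMaximum"]);

// Returns a deep copy of the schema with the unsupported keywords removed.
function stripUnsupported(node: Json): Json {
  if (Array.isArray(node)) return node.map(stripUnsupported);
  if (node !== null && typeof node === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, value] of Object.entries(node)) {
      if (!UNSUPPORTED_KEYWORDS.has(key)) out[key] = stripUnsupported(value);
    }
    return out;
  }
  return node; // primitives pass through unchanged
}
```

Because the function copies rather than mutates, the original schema stays available for providers that do accept the exclusive bounds.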
iOS App Store Compliance
- iOS: External-Signup CTAs Hidden in Onboarding: Onboarding no longer shows calls-to-action that point users off-device for account creation or paid upgrades.
- iOS: Credit Purchase UI Hidden: Credit purchase surfaces are hidden on iOS per Apple's §3.1.1 rules; subscriptions route through StoreKit instead.
- iOS: Support and Credit-Error Wording Updated: The /support page and credit-error messaging have been reworded on iOS to meet Apple's guidelines on external purchase references.
Version 0.9.716 (April 12, 2026)
New Features
- Common: Gemini 3.1 Flash TTS: New voice-output option powered by Google's Gemini 3.1 Flash TTS. 30 prebuilt voices, 70+ languages, and inline audio tags like [whispers] and [sighs] for expressive synthesis. Bring your own Gemini API key; get one free at aistudio.google.com. Audio is watermarked with SynthID.
- Common: Voice Cost in Conversation Total: Text-to-speech and speech-to-text API usage (Gemini, ElevenLabs, Resemble) now rolls into the per-thread cost summary alongside chat, image-gen, helper, and OCR costs — so the number you see is the real number.
- Common: SQL Database Tool: Attach SQLite databases to conversations and query them with SQL. Results can be rendered as live dashboards in Slate artifacts.
- Common: Calendar Invitations and Google Meet: Creating calendar events now sends invitation emails to attendees and can automatically add a Google Meet video link.
- Common: Per-Tool Provider Routing: Route individual tools (image generation, search, PDF OCR, etc.) to specific providers via Settings, instead of everything going through the default model.
- Common: Conversation Sharing: Share conversations as Markdown or plain text.
- Common: 14-Day Platform Trial: Start a 14-day Platform tier trial from the extension UI or the website account page. Pro subscribers are also eligible.
- Common: ZDR Quick-Filter in Model Picker: New button in the model selector dropdown to quickly filter for zero-data-retention models.
- Common: Simplified Settings: Settings now opens in a streamlined Simple view by default, showing just the essentials. Switch to Advanced mode anytime to see everything.
- Common: Free Google Gemini Access: Bring your own Google AI Studio key to use Gemini 2.5 Flash for free — Google's free tier is now available to all users, no paid plan needed.
- Common: Baseten Self-Hosted Models: Connect your own Baseten deployments to use self-hosted models directly in Caiioo.
- iOS: WhisperKit On-Device Speech-to-Text: Dictation now runs fully on-device via WhisperKit — audio never leaves the phone.
- Android: On-Device Whisper STT: Speech-to-text via whisper.cpp runs fully on-device on Android with microphone permission flow.
- Android: GPU-Accelerated Moonshine STT: On GPU-capable Android devices, dictation can run on Moonshine with sub-second inference. English only; whisper.cpp remains the multilingual fallback.
- Web: Sign in with Apple on Account & Hub Pages: Apple Sign-In is now available on the website account and hub pages.
Improvements
- Common: Better PDF Export: Exporting a PDF from Slate now produces a true PDF file instead of opening the print dialog, so you get a clean document every time.
- Common: Google Slides Full Text: Requesting slide text without specifying a page now returns text from all slides in the presentation.
- Common: Smarter Web Search: Google search results are now extracted with better structure, capturing titles, snippets, and links more reliably after recent Google layout changes.
- Common: Kokoro Text-to-Speech on All Platforms: The Kokoro voice option now appears on iOS, macOS, and Android — previously it was hidden on native apps.
- Common: Cleaner Credit Balance Display: Account balance for prepaid and bring-your-own-key setups now shows your actual balance without confusing "Limit" framing.
- Common: Higher-Fidelity PDF to Word Conversion: PDF→DOCX export now uses a dedicated Document view in Slate, producing Word files that more faithfully preserve layout and structure from the source PDF.
- Web: Quieter Account and Auth Pages: Google Analytics and the cookie consent banner are suppressed on account and auth pages for a cleaner sign-in flow.
Bug Fixes
- Common: Settings Search Didn't Navigate: Fixed the settings search dropdown and deep links failing to jump to several sections (API Access, Private Sync, Backup & Restore, Voice, and others) — the target category stayed collapsed so nothing scrolled into view.
- Common: OAuth Prompts in Sub-Agents: Fixed OAuth authorization and tier-upgrade prompts being silently dropped when triggered from a sub-agent, which broke connect flows mid-run.
- Common: Ollama Model Not Saved After Onboarding: Fixed the Ollama model you selected during onboarding not persisting afterward.
- Common: UI Chunks Failed to Load During Rate Limiting: Fixed static UI assets being rate-limited alongside API requests, causing blank screens or missing panels until refresh.
- Common: Composer-Preprocessed Attachment Metadata Lost: Fixed attachment metadata being dropped on the server for documents preprocessed in the composer before being sent to the model.
- Common: Free Time Counted Attended Events as Busy: Fixed the agenda free-time calculation marking you busy during events you only attended — only events on your own calendar are now counted.
- Common: Google Search Results Broken: Fixed web search returning empty results after Google changed their search page layout.
- Desktop: Windows/Linux Login Out of Sync with Other Platforms: Fixed the Tauri desktop login flow diverging from Chrome/macOS/iOS — including incorrect tier detection and being unable to log back in after signing out.
- Desktop: Public API Blocked the Desktop UI: Fixed the desktop app's own UI being blocked by API authentication when the Public API setting was enabled.
- Common: Google Docs OAuth Errors Hidden: Fixed Google Docs operations silently failing when authentication expired instead of showing a clear error.
- Common: Settings Panel Crash: Fixed the settings panel crashing when restoring a provisioned API key.
- Common: Google Drive Sync Duplicating Documents: Fixed Google Docs being duplicated on every sync run instead of updating in place.
- Common: Google Drive Sync Inline Code Lost: Fixed inline code formatting being lost when syncing documents to Google Docs.
- Common: Gemini Thinking Extraction: Fixed thinking/reasoning content not being captured correctly from Gemini model responses.
- Common: Profile Switch Stale Data: Fixed switching between profiles sometimes showing conversations from the previous profile until a manual refresh.
- Common: Task Scheduler "Method Not Found" Error: Fixed scheduled tasks throwing errors every 60 seconds in certain setups.
- Common: Text Insertion Newlines in Google Docs: Fixed literal \n appearing in text insertions instead of actual newlines.
- Common: Google Account Hint on Tools: Fixed Google tools sometimes using the wrong Google account for API calls.
- Common: Settings and Thread Import Round-Trip: Fixed settings and thread exports not importing back correctly.
- Common: Sub-Agent Token Usage: Fixed sub-agent token usage not being counted in the parent conversation.
- Common: Messaging Bridge Reply During Retry: Fixed messaging bridge replies failing when the assistant retried a response.
- Common: Slate Redline Positioning: Fixed redline deletions landing at the wrong position when markdown was present.
- Common: Browser Connection Survives Sleep/Wake: Fixed "No browser connected" errors after the computer sleeps or sits idle for long periods.
- Common: Scheduled Tasks Survive Service Worker Sleep: Fixed recurring tasks stopping after long idle periods.
- macOS: Safari Google Search Failures: Fixed Google search not working in the Safari extension on macOS.
- iOS: Mobile App Access: Fixed the iOS app incorrectly requiring a paid plan — the mobile app is now available to all users.
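The "Text Insertion Newlines in Google Docs" fix above concerns a two-character escape sequence (a backslash followed by n) reaching the document where a real line break was intended. One way to normalize such text is sketched below; `unescapeNewlines` is a hypothetical name, and the notes do not say where in the pipeline the actual fix was applied.

```typescript
// Converts the literal two-character sequence backslash + "n" into a
// real newline character. Text without that sequence is left unchanged.
function unescapeNewlines(text: string): string {
  return text.replace(/\\n/g, "\n");
}
```

This simple form does not distinguish an intentionally escaped backslash followed by n, so it suits only inputs where a literal backslash-n is never meaningful.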
Version 0.9.715 (April 3, 2026)
New Features
- Common: Hub Delivery System: First-party tool definitions, mode configs, and MCP catalog are now prepared to be delivered from the cloud hub instead of being hardcoded in the bundle, with local caching and offline fallback in preparation for community launch.
- Common: Single-Tab Mode for Browser Tools: New per-mode browser tab policy reuses a single tab for all navigation in a conversation, preventing tab accumulation during messaging-heavy workflows like WhatsApp.
- Common: API Integration Binary Uploads: The api_integration tool now supports binary file uploads and direct local file uploads via presigned URLs, instead of always JSON-stringifying request bodies.
- Common: Credential Management via Tool: Save, list, and delete API credentials programmatically through the api_integration tool, previously only possible through the Settings UI.
- macOS: Audio Input Device Selection: Choose which microphone to use for voice input — lists all available audio devices and persists your selection.
- Android: OAuth Recovery on Low Memory: Android OAuth flows now survive activity recreation when the system kills the app for memory, recovering the auth session seamlessly.
- Common: 3-Layer Mobile CSS Architecture: New mobile-responsive CSS system with platform, layout, and component layers, ensuring consistent UI across phone, tablet, and desktop viewports.
Improvements
- Extension: Incremental OAuth via Tab: Incremental scope requests now open a full browser tab instead of a popup, fixing hangs on accounts that require interactive consent.
- Common: MCP Tool Schema Quality: All MCP tools now pass a quality lint test verifying complete schemas, proper descriptions, and consistent parameter definitions.
- Extension: Auto-Close Browser Tabs on Stop: Browser tabs opened by the assistant are now closed when a run is stopped, and OAuth Safari windows are closed on macOS.
- Android: Toolchain Upgrade: Android build upgraded to AGP 9.1, Kotlin 2.2.10, and Gradle 9.3.1.
Bug Fixes
- Common: Google Docs Table Positioning: Fixed mutations landing in wrong positions when tables precede the target text in Google Docs.
- Common: Page Content Footer Stripping: Fixed page content extraction incorrectly stripping footer elements from captured pages.
- Common: Web Browsing Pagination: Fixed pagination broken for page 2+ when no explicit maxLength was specified.
- Common: Content Script Fallback: Fixed page content extraction failing silently by falling back to executeScript when the content script is unavailable.
- Common: PDF Rendering for URL-Ingested Documents: Fixed PDF native rendering failing for documents ingested via URL.
- Common: Gemini Tool Call Parsing: Fixed double-quoted keys in LLM tool call arguments causing parse failures with Gemini models.
- Common: Amazon Nova Compatibility: Fixed tool_call messages rejected by Amazon Nova due to blank text fields.
- Common: Agent Run Not Terminating: Fixed agent runs not calling endRun reliably in service worker context, leaving browser tabs and state uncleaned.
- macOS: Voice Audio Resume: Fixed voice audio not resuming after macOS app sleep or dock minimize.
- macOS: Screenshots Not Appearing: Fixed macOS screenshots not appearing in the assistant UI due to incorrect context detection.
- macOS: Location Not Working: Fixed location broken in Safari and native macOS app by adding CoreLocation bridge and fixing IP fallback.
- macOS: Location Entitlement Missing: Fixed missing location entitlement and increased native location timeout.
- iOS: File Picker Broken: Fixed file picker not opening in iOS WKWebView by adding the runOpenPanel delegate.
- Extension: OAuth Scope Over-Granting: Removed include_granted_scopes from web popup and post-login OAuth paths to prevent unintended scope inflation.
- Extension: OAuth Consent Screen Skipped: Fixed OAuth scope elevation skipping the consent screen during incremental scope requests.
- Desktop: Console Window Visible on Windows: Fixed console window appearing when launching the Tauri sidecar on Windows.
- Desktop: Windows Build Broken: Fixed node binary path resolution and flattened node_modules for Windows Tauri builds.
- Server: Duplicate Webhook Processing: Fixed webhooks being processed multiple times when multiple clients were connected.
- Server: Promotion Codes on Credit Purchases: Disabled promotion codes for credit purchases in Stripe checkout.
Version 0.9.714 (March 29, 2026)
New Features
- Common: Redesigned Onboarding: New onboarding screen presents 6 clear AI access paths, making it easier for new users to understand their options for connecting to AI providers.
- Common: PDF from Markdown: New documents created from scratch can now be exported as PDF, converting markdown content to formatted PDF output.
- Common: Unified Google OAuth: Google account connection now uses a single verified OAuth app with all scopes, replacing the previous multi-app setup. Simpler connection flow with fewer prompts.
- Common: Save to Slate Meta-Parameter: Tools can now output directly to a slate document via the _save_to_slate parameter, capturing structured results without copy-paste.
- Common: Slate Anchor Links: Chat messages with slate references now include clickable anchor links that scroll to the relevant section within the document.
- Common: Sandbox Data Bridge: HTML artifact sandboxes can now dynamically access slate data via a postMessage bridge, enabling live dashboards and interactive visualizations.
- Common: Tier Badges: Settings, tools, and automation features now display tier badges showing which access level is required, giving expanded-access users visibility into gating.
- macOS: Native Speech-to-Text (WhisperKit): On-device speech recognition via WhisperKit and CoreML, replacing browser-based STT with a faster, private, native implementation.
- macOS: Sparkle Auto-Update: Production-ready automatic updates for the macOS app and Safari extension via Sparkle, with appcast feed and delta updates.
- macOS: Native Tab Context: AppleScript-based tab context extraction provides the macOS app with awareness of frontmost browser tabs.
- macOS: Voice Hotkey Streaming: Voice hotkey now shows a real-time streaming overlay during recording with automatic WhisperKit model loading.
- Safari: Native Messaging: Safari extension now communicates with the macOS app via native messaging instead of HTTP polling, improving responsiveness and reliability.
- iOS/Android: Native Save Dialogs: JSON export now uses native save dialogs (NSSavePanel on macOS, SAF on Android) instead of broken blob downloads.
- Web: Teams Landing Page: New /teams page with team and enterprise messaging for prospective customers.
- Web: Blog Section: New blog section on the Caiioo website.
- Web: Free Trial Activation: Account page now supports free trial activation with aligned onboarding flow.
- Common: Shared Drive Support: Google Picker and all Drive API calls now support shared drives, enabling access to team-shared documents.
Improvements
- Common: Adaptive Greeting: Adaptive greeting now riffs on custom welcome messages instead of ignoring them, preserving the user's personalized tone.
- Common: OAuth Callback Cleanup: OAuth callback tabs now auto-close with a countdown timer instead of staying open.
- Common: OAuth Guidance: When a tool needs an account connection, the error message now includes actionable guidance on which account to connect.
- Common: Drive Sync Force-Rewrite: Drive Sync workflow now supports a force-rewrite option for re-syncing all files.
- macOS: Unified Debug Log Viewer: Debug log view now aggregates server, Swift, and UI logs into a single chronological stream.
- macOS: Bucketed Settings Storage: Server storage adapter refactored to bucketed architecture with robust migration, persistent markers, multi-profile atomicity, and first-writer-wins conflict resolution.
- macOS: Notarized Safari Extension: Safari extension is now signed with Developer ID and notarization-compatible entitlements, removing the "Allow Unsigned Extensions" requirement for distributed builds.
- iOS: Default to Apple Reminders: iOS and macOS now default to Apple Reminders and fetch them in the sync view automatically.
- Common: OAuth Scope Elevation: Incremental scope requests no longer display all 22 scopes — only the newly requested scopes are shown, with cleaner account selection prompts.
- Common: Google Account Auto-Retry: When a document returns 404, other connected Google accounts are automatically tried before surfacing an error.
- Common: Settings Deep-Link Navigation: All settings sections are now properly registered for search and deep-link navigation.
Bug Fixes
- Common: Tool Misrouting: Fixed tools being misrouted when models use colon-separated tool:action format in their responses.
- Common: Table Cell Formatting: Preserved formatting and links in table cells and cleared inherited list numbering that leaked between cells.
- Common: Slate Highlight Scroll: Fixed highlight scrolling in TipTap editor for background tabs, preview mode, and native views using reliable ProseMirror-based positioning.
- Common: Code Block Text Search: Fixed fallback to text search when structural block mapping fails for code blocks in slate documents.
- macOS: Settings Lost on Mode Switch: Global settings (API keys, connections) are no longer cleared when switching modes on macOS/iOS — an explicit allowlist now controls which settings reset.
- macOS: Storage Purge Parity: Full storage purge now clears all state fields consistently, matching chrome.storage.local.clear() behavior. Profile deletion and retention cleanup also route through buckets correctly.
- macOS: Settings Concurrency: Settings reads in applyModeDefaults now acquire the lock first, preventing race conditions. Storage stats read from buckets instead of stale flat cache.
- macOS: Migration Robustness: Storage migration retries on failure, uses persistent completion markers, and cleans up stale pre-migration data automatically.
- macOS: MCP Server Zombie Processes: MCP server child processes are now properly terminated when the macOS app quits.
- macOS: Calendar/Reminders in Agenda: Fixed Apple Calendar events and Reminders not appearing in the agenda modal.
- macOS: Google OAuth Flow: Fixed Google login failing in macOS WKWebView by routing OAuth through the relay server web flow.
- macOS: Sign-Out Navigation: Fixed sign-out not returning to the login screen on macOS.
- macOS: Remote Access Auth: Fixed 401 error on set-remote-access endpoint by removing incorrect auth requirement.
- Server: WhatsApp Reply Delivery: Fixed WhatsApp replies not posting back to WhatsApp and only staying in the PF chat thread.
- Server: WhatsApp Relay-Back Timeout: Extended WhatsApp relay-back timeout from 120s to 10 minutes and fixed resolution on generation errors.
- Server: Private Sync Reauth: Fixed private sync reauthentication flow, server-side OAuth guard, and sign-out cleanup on macOS.
- Server: Local Folder Sync Auth: Fixed 401 auth error on local/network folder sync operations.
- Server: Google Session Re-Auth: Unlock dialog now triggers re-authentication on Google session expiry instead of showing a dead-end error.
- Safari: Extension Signing: Fixed Safari extension appearing as unsigned by using release entitlements without get-task-allow for Developer ID signed builds.
- Common: OAuth Account Mismatch: Fixed OAuth creating mismatched connections by properly creating new connections instead of overwriting existing ones with different accounts.
- Common: OAuth Scope Inflation: Fixed mismatched connections inheriting the original account's scopes, causing unintended scope over-granting.
- Common: Thread List Wipe: Fixed STATE_UPDATE timeout/retry using wrong field name, which could wipe the thread list.
- Common: Tier Badge Accuracy: Fixed tier badges incorrectly showing "Tester" on free-tier features.
- Common: Settings Panel Overflow: Fixed flex overflow in settings panel card rows causing layout issues.
- Server: Private Sync API Key Loss: Fixed private sync losing API keys when syncing between devices.
- Server: Messaging Bridge: Fixed archived chats, stop button, and mode selection in the messaging bridge.
- macOS: Custom Mode Settings Lost: Fixed custom mode settings being silently lost on macOS/iOS due to a no-op save path.
- macOS: OpenRouter Headers Blocked: Fixed relay CORS blocking OpenRouter attribution headers on macOS/iOS.
- macOS: Tab Context Switching: Fixed tab context not updating when switching between browser tabs.
Version 0.9.713 (March 28, 2026)
New Features
- Common: Sub-Agents: Full sub-agent system with persistent named agents, conversation history, parallel execution (committee pattern), sequential and interjection modes, abort cascade, and dedicated UI rendering with chronological status tracking.
- Common: MCP Tool Approval: Registered MCP tools now appear in the tool approval system, giving users visibility and control over which MCP tools agents can invoke.
- macOS: Sidepanel Push Mode: Sidepanel now supports push mode alongside overlay, with 425px default width and docked width persistence across sessions.
- Server: WhatsApp Rich Messages: WhatsApp connection now supports location sharing, contacts, reactions, and sticker messages in addition to text and media.
- Server: API Key Encryption at Rest: API keys stored in relay D1 storage are now encrypted at rest. Server-side OAuth keys are deleted after being saved locally.
Improvements
- Common: Model Alias Display: Sub-agent tool call arguments now show annotated model aliases for easier identification.
- Common: BYOA Connection Priority: Expanded-scope OAuth requests now prefer BYOA alternate connections over Basic connections, reducing unnecessary re-auth prompts.
- Common: Thought Signature Preservation: Thought/reasoning signatures are now preserved correctly for both OpenAI Responses API and Gemini multi-turn tool calling flows.
- macOS: Debug Log Propagation: Debug logging toggle now propagates to the Node.js server subprocess.
- Common: Bengali Language Support: Platform and website now support Bengali, plus trademark disclaimer and media post updates on the website.
Bug Fixes
- Common: Safari/WKWebView Streaming: Polyfilled ReadableStream async iteration for kokoro-js, fixing TTS streaming failures in Safari and WKWebView.
- Common: Custom Mode Creation: Fixed stale React closure causing custom mode creation to fail on first attempt.
- macOS: Agent Storage Mutations Lost: Fixed agent subprocess storage mutations (mode creation, settings, skills) being silently lost on macOS due to a missing persistence bridge.
- macOS: Clipboard Copy: Fixed clipboard copy failing silently in macOS WKWebView.
- macOS: WASM/WebGPU in WKWebView: Forced WASM backend for TTS/STT and local ONNX paths in macOS native app — CDN cross-origin imports and WebGPU JSEP module imports fail in localhost WKWebView.
- macOS: Transport Request Collisions: Fixed requestId collision breaking tool approval on macOS, plus improved MCP display names.
- macOS: WhatsApp Auto-Reply: Fixed end-to-end wiring for WhatsApp auto-reply on macOS relay server, including thread visibility.
- iOS: Voice Dictation Stuck: Fixed voice dictation stuck in recording state when stopRecording cleanup was bypassed.
- Safari: Browser Commands Timeout: Fixed Safari browser commands timing out due to hanging WebExtension APIs after service worker suspension.
- Safari: Heartbeat Stale Check: Skip heartbeat stale check for Safari HTTP-polling browsers to prevent false disconnects.
- Server: Private Relay Stability: Fixed Durable Object hibernation, idle timeout (1006), reconnect delays, stale socket handling, and zombie readyState issues. Eliminated split-brain between serverState and ctx.state.
- Server: Webhook Broadcast: Webhooks now broadcast to all server sockets after DO hibernation wake, working around zombie readyState.
- Server: MV3 Messaging Race: Fixed a lazy-init race condition in the messaging bridge on webhook arrival that could crash the MV3 service worker.
- Server: Webhook Verify CORS: Routed webhook verify test through server to avoid CORS rejection in browser.
- Extension: Logout Cleanup: Sign out now correctly clears local relay server session on macOS, detecting localhost context.
- Common: Security — Thread Isolation: State broadcasts now filter threads by current profile, preventing cross-profile data leakage.
Version 0.9.712 (March 26, 2026)
New Features
- Common: WhatsApp Connection Diagnostics: Test Connection for WhatsApp now checks webhook subscription status and app secret validity in addition to API token, catching silent delivery failures from Meta.
- macOS: Native OAuth Flow: macOS app now uses ASWebAuthenticationSession for OAuth instead of browser redirects, with automatic upgrade of stale connections lacking refresh tokens.
Improvements
- Common: DOCX Search/Replace Robustness: Search and replace in DOCX documents now handles double spaces, non-breaking spaces, smart quotes, numeric entities, and case mismatches. Includes case-insensitive fallback when exact match fails and warnings when multiple instances are replaced.
- Common: DOCX List Numbering: DOCX export now generates proper Word list numbering (w:numPr) instead of literal bullet characters, so Word recognizes numbered and bulleted lists correctly.
- Common: Agent Startup Performance: Eliminated 30-50s agent startup delays caused by ghost MCP servers, HTTP readiness timeouts, and redundant model fetches. Model cache is now passed to agent subprocesses.
- Server: Security Hardening: Command injection fix in URL handler, timing-safe token comparison, Content-Security-Policy header on relay server, rate limiting on session endpoint, CORS restricted to known origins, and parallel E2E broadcast encryption.
- Server: Webhook Signature Verification: Webhook payloads now use base64-encoded raw body to prevent JSON round-trip corruption that invalidated HMAC signatures.
- Server: OAuth Token Persistence: Fixed split-brain between server state objects that caused OAuth connections (Google Drive, etc.) to be lost after app restart.
- macOS: Performance & Stability: Menu-driven tray polling (was unconditional 3s timer), debounced window state saves, non-blocking logging, WKWebView crash recovery with exponential backoff, and memory leak fixes for script message handlers.
- macOS: Centralized Logging: Relay server output and app lifecycle events now log to ~/Library/Logs/caiioo/ with 10MB auto-rotation, replacing silently dropped output.
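The DOCX search/replace fallbacks listed above (whitespace, quote, and entity folding, then a case-insensitive retry, then a multi-match warning) might be sketched as follows. This is only an illustration under assumptions: normalizeForSearch, countOccurrences, and findForReplace are hypothetical names, and the order of the normalization steps is a guess rather than Caiioo's actual pipeline.

```typescript
// Hypothetical sketch: function names and normalization order are assumptions,
// not the shipped implementation.
interface SearchOutcome {
  count: number;            // occurrences that would be replaced
  caseInsensitive: boolean; // true when only the fallback pass matched
  warning?: string;         // set when more than one occurrence matched
}

// Fold the text variations named in the release note into one canonical form.
function normalizeForSearch(text: string): string {
  return text
    .replace(/&#(\d+);/g, (_m: string, code: string) => String.fromCharCode(Number(code))) // numeric entities
    .replace(/\u00a0/g, " ")          // non-breaking spaces
    .replace(/[\u2018\u2019]/g, "'")  // smart single quotes
    .replace(/[\u201c\u201d]/g, '"')  // smart double quotes
    .replace(/ {2,}/g, " ");          // runs of spaces
}

// Count non-overlapping occurrences of needle in haystack.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let from = haystack.indexOf(needle);
  while (from !== -1) {
    count++;
    from = haystack.indexOf(needle, from + needle.length);
  }
  return count;
}

function findForReplace(documentText: string, search: string): SearchOutcome {
  const hay = normalizeForSearch(documentText);
  const needle = normalizeForSearch(search);
  let count = countOccurrences(hay, needle);
  let caseInsensitive = false;
  if (count === 0) {
    // Exact match failed: retry ignoring case.
    count = countOccurrences(hay.toLowerCase(), needle.toLowerCase());
    caseInsensitive = count > 0;
  }
  const outcome: SearchOutcome = { count, caseInsensitive };
  if (count > 1) {
    outcome.warning = `${count} instances matched; all would be replaced`;
  }
  return outcome;
}
```

Normalizing both the document text and the search string before comparing keeps the two sides symmetric, so a search typed with straight quotes can still find text stored with smart quotes.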
Bug Fixes
- Common: File Picker Instructions: pick_files tool now returns clear instructions for the user instead of a dead UI action signal that silently failed.
- Server: Active Tab Poll Spam: Active tab polling in relay mode no longer generates endless timeout errors when no browser extension is connected. Uses exponential backoff (30s to 5min) on consecutive failures.
- Server: CORS for Native App: Fixed 127.0.0.1 origin blocked by CORS whitelist, breaking Google OAuth from macOS native app's WKWebView.
- Server: Orphaned Agent Processes: Agent subprocesses are now properly cleaned up on server shutdown and uncaught exceptions.
- macOS: Browser Orchestrator Log Spam: Fixed "Unhandled message type" warnings for browser protocol messages in extension connections.
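The exponential backoff mentioned in the Active Tab Poll Spam fix could look roughly like the sketch below. Only the 30-second floor and 5-minute ceiling come from the release note; the doubling factor and the name backoffDelayMs are assumptions made for illustration.

```typescript
// Hypothetical sketch: the doubling factor and function name are assumptions.
// Only the 30s and 5min bounds are taken from the release note.
const MIN_BACKOFF_MS = 30000;   // 30 seconds
const MAX_BACKOFF_MS = 300000;  // 5 minutes

// Extra delay to wait before the next poll, given how many polls
// have failed in a row. Zero failures means no backoff at all.
function backoffDelayMs(consecutiveFailures: number): number {
  if (consecutiveFailures <= 0) return 0;
  const doubled = MIN_BACKOFF_MS * Math.pow(2, consecutiveFailures - 1);
  return Math.min(doubled, MAX_BACKOFF_MS);
}
```

Under these assumptions the delays grow as 30s, 60s, 120s, 240s, and then stay at 5 minutes until a poll succeeds and the failure count resets.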
Version 0.9.711 (March 24, 2026)
New Features
- Common: Auto-Save Settings: Settings fields now persist as you edit with a 500ms debounce for text fields and immediate save for toggles/selects, preventing data loss on crash or navigation.
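The auto-save behavior above (a 500ms debounce for text fields, an immediate save for toggles and selects) can be sketched as follows. The AutoSaver class, its Scheduler hook, and the field kinds are hypothetical names introduced for this example; only the 500ms figure and the text versus toggle/select split come from the release note.

```typescript
// Hypothetical sketch: class, type, and parameter names are assumptions.
type FieldKind = "text" | "toggle" | "select";
type Cancel = () => void;
type Scheduler = (fn: () => void, delayMs: number) => Cancel;
type SaveFn = (key: string, value: unknown) => void;

// Default scheduler backed by real timers; injectable so it can be swapped out.
const timerScheduler: Scheduler = (fn, delayMs) => {
  const id = setTimeout(fn, delayMs);
  return () => clearTimeout(id);
};

class AutoSaver {
  private pending: { [key: string]: Cancel } = {};
  private save: SaveFn;
  private schedule: Scheduler;
  private textDelayMs: number;

  constructor(save: SaveFn, schedule: Scheduler = timerScheduler, textDelayMs: number = 500) {
    this.save = save;
    this.schedule = schedule;
    this.textDelayMs = textDelayMs;
  }

  onEdit(key: string, value: unknown, kind: FieldKind): void {
    // Any newer edit to the same field supersedes a save that is still waiting.
    const cancel = this.pending[key];
    if (cancel) {
      cancel();
      delete this.pending[key];
    }
    if (kind !== "text") {
      this.save(key, value); // toggles and selects persist immediately
      return;
    }
    // Text fields wait for a pause in typing before persisting.
    this.pending[key] = this.schedule(() => {
      delete this.pending[key];
      this.save(key, value);
    }, this.textDelayMs);
  }
}
```

A caller would construct one saver per settings panel, for example new AutoSaver((key, value) => storage.set(key, value)), where storage stands in for whatever persistence layer is in use.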
Improvements
- Common: DOCX Numbering Preservation: Tab characters in numbered DOCX sections (e.g., legal documents with "3.2.3 → Heading") are now preserved through the edit round-trip instead of being silently dropped.
- Common: Smarter Tool Selection: The dynamic tool selector now handles factual queries (business hours, prices, addresses) that need web verification, instead of only matching strict trigger words.
- Common: Agent Language Changes: Language changes initiated by the agent now apply immediately instead of requiring a manual settings toggle.
- Common: Multi-Account OAuth Fallback: When the primary Google account lacks required scopes, connected secondary accounts are checked before prompting re-auth. Fixes uncaught errors when secondary account tokens expire.
- Server: Mode Settings Validation: Server now validates tier permissions when saving settings, preventing free-tier users from persisting pro-tier settings.
- Common: Security Hardening: DOMPurify sanitization on reasoning preview output, restricted CORS to known origins, HTTPS-first IP geolocation lookup, and tightened web_accessible_resources.
Bug Fixes
- Common: Google Search Result URLs: Browser Google search now returns structured links with actual URLs. Previously, the agent needed 5 tool calls to extract URLs because AgentMarkdown lost link hrefs from Google's complex nested HTML — now links are extracted directly from the DOM and included in the first search result.
- Common: Private Sync Infinite Loop: Fixed sync loop caused by manifest backup file writes continuously triggering re-sync. Empty settings objects now propagate correctly across devices.
- Common: Messaging Bridge Updates: Inbound messages, agent replies, and relay-back responses now broadcast to the UI immediately instead of appearing only after the agent finishes responding.
- Extension: Private Relay Reconnect: Fixed relay staying disconnected after MV3 service worker restart by falling back to stored enabled state. Also fixed self-replacement race during enable() and identity change reconnect.
- Extension: Private Relay Race Conditions: Hardened enable/disable races, added unhandled rejection guards on messaging bridge callbacks, and fixed timer leak on settings panel unmount.
Version 0.9.710 (March 22, 2026)
New Features
- Common: Zero-Day Risk (ZDR) Enforcement: Live ZDR data sync with an enforcement toggle for OpenRouter — view provider count breakdowns and block models flagged with zero-day risks.
- Common: Tool Call Healing: Weak models that emit XML tool calls inside reasoning blocks are now automatically detected and healed, improving reliability across non-frontier models.
- Common: Unified Token Lifecycle: Symmetric OAuth token refresh across all platforms (extension, server, macOS, iOS) via a unified TokenLifecycleManager, with proactive refresh before expiry.
- iOS: Composer Icon Scaling: Composer action icons now scale 1.18x on iOS for better tap targets, with a dedicated native-ios body class for platform-specific styling.
Improvements
- Common: Slate Proposed Changes: Navigation between proposed changes, hover-based accept/reject, and counter updates now work correctly in the Slate editor.
- Common: Slate Selector Dropdown: The slate type selector now uses an inline dropdown instead of a portaled popover, fixing z-index and positioning issues.
- Common: Propose Change Full Replacement: propose_change now allows full document replacement for small documents (500 characters or fewer) instead of requiring partial edits.
- Common: Voice Send Waits for Transcription: Pressing Send/Enter while recording now waits for the transcription to complete before sending, preventing empty or partial messages.
- Common: Private Sync Setup Flow: Passphrase dialog now auto-shows after OAuth during private sync setup, streamlining the onboarding flow.
- Common: BYOA Registry Refresh: BYOA provider registry now re-initializes after private sync downloads new credentials, ensuring imported connections are immediately usable.
- Common: Terminology Consistency: "Cloud Sync" renamed to "Private Sync" in user guide content with a terminology guard test to prevent regression.
- Server: Private Relay Stability: Compatibility date updated, Durable Object crash handling improved, and MV3 reconnect logic hardened for reliable long-lived connections.
- Server: Private Sync Settings Excluded: Private relay settings are no longer included in cloud sync payloads, preventing cross-device relay config conflicts.
Bug Fixes
- Common: SVG Image Handling: SVG files are no longer sent as image_url to providers, which caused HTTP 400 errors on models that don't support SVG format.
- Common: Private Sync Race Conditions: Fixed messaging relay failures caused by race conditions during private sync initialization. Sync loop and quarantine logic hardened.
- Common: Private Sync Data Deletion: Cloud data deletion and audit now work correctly after disabling private sync.
- Common: Model Matching Contamination: Fixed cross-family model matching in the auto-adjust system that could incorrectly map models across provider families.
- Common: Dynamic Tool Selector: Hardened the dynamic tool selector prompt to prevent agent-like reasoning that could cause tool selection failures.
- Common: Validation Error Messages: Required field validation now returns focused error messages instead of generic failures.
- Extension: Private Relay WebSocket Drops: Fixed WebSocket connections dropping after ~60 seconds due to MV3 service worker suspension. Keepalive and reconnect logic improved.
- Extension: Google OAuth Incremental Scopes: Fixed OAuth scope expansion using launchWebAuthFlow in extension context instead of failing silently.
- Extension: Relay Toggle Settings Reload: Suppressed unnecessary settings reload when toggling the relay, preventing UI flicker and diagnostic noise.
- Server: Durable Object Hibernation: Fixed WebSocket close handling after Durable Object hibernation that could leave connections in a stale state.
- Server: Messaging Credential Persistence: Messaging credentials now persist immediately on change instead of waiting for the next save cycle.
- iOS: Stale WebSocket Callbacks: WebSocket handlers are now cleared on close, preventing stale callbacks from firing after reconnection on iOS.
- Desktop: NSIS Installer Upload: Windows NSIS installer now uploads correctly via API to draft releases.
Version 0.9.709 (March 21, 2026)
New Features
- Common: Slack Messaging Adapter: New Slack Events API adapter for the messaging bridge with bot signature verification, media download support, and settings UI configuration.
- Common: Slate File Roundtrip: Open and save local files directly in Slate with filesystem handle persistence and hash-based dirty tracking. New "Load File" button in the slate selector dialog preprocesses documents and opens them with the correct editor.
- Common: Slate File Menu Restructure: File type menu reorganized into Documents, Renderable (HTML/Vega/Mermaid), and Code & Data categories with auto-detection on rename.
- Common: Remote Browser App: Remote browser clients can now fetch sidepanel assets via the private relay, enabling browser-based access without the extension installed.
- Common: Device Identity in Private Relay: Relay clients now show a human-readable device name. When displaced by another device, the UI shows which device replaced you.
- Common: Composer Mic Chip Redesign: Microphone button redesigned as a visible chip/pill with clear recording state indicator, 44px minimum touch targets for mobile, and voice overlay rendered over the input area.
- Common: User Guide Link: User guide now linked from the Documentation & Legal settings section.
- iOS: BYOA Setup Gate: BYOA wizard on mobile now shows a message directing users to set up Private Apps on desktop, since the 8-step browser flow is unworkable on mobile.
- Server: Login Challenge: Email-based 6-digit MFA verification for login, with consent tracking and session revocation.
Improvements
- Common: Private Sync Settings-First: Settings now sync before conversations, ensuring tokens and config are available immediately while bulk thread sync continues in the background.
- Common: WebSocket Request Correlation: WebSocket and private relay transports now properly await server responses instead of resolving immediately, fixing 40+ UI operations on macOS sidepanel and relay clients (Drive audit, Ollama test, MCP operations, etc.).
- Server: Webhook Signature Verification: Webhook signature verification moved to route-level app secret for cleaner architecture.
Bug Fixes
- Common: WhatsApp Response Delivery: Fixed agent responses being silently swallowed instead of sent back to WhatsApp. The messaging relay-back path now logs diagnostic details when delivery fails, making future issues immediately diagnosable.
- Common: Thinking Block Signatures Across Models: Fixed "Invalid signature in thinking block" errors when switching from a non-Claude model to Claude with thinking mode. Reasoning format detection now defaults to 'unknown' instead of misidentifying as Anthropic format.
- Common: Slate Editability for New Documents: New DOCX, PDF, RTF, and XLSX slates now correctly open in their editors instead of rendering as static HTML.
- Common: Slate Export Fallback: New documents without original binary data now export gracefully — XLSX falls back to CSV, RTF to HTML conversion, PDF to browser print.
- Common: Sandbox Preview Rendering: Fixed blank HTML and Vega previews caused by sandbox origin mismatch in postMessage targeting.
- Common: Track Changes Column Offsets: Deletions in tracked changes now correctly map HTML-to-text offsets, fixing mispositioned changes in TipTap.
- Common: Private Sync Auth Recovery: Expired or revoked OAuth tokens during sync now auto-trigger re-authorization instead of requiring manual reconnect.
- Common: Voice Streaming Draft Indicator: Voice transcript overlay now shows "Preview" status with reduced opacity to signal the text is interim until recording stops.
- Common: MCP Tools on Native Platforms: Local MCP server tools are now properly registered on macOS, Windows, and Linux — previously the handler started the MCP process but never fetched or registered its tools.
- Common: Slate Context Chip Update: Active tab context chip now refreshes from storage when a slate tab's title changes after loading from Drive or GitHub. Dark mode title input text is no longer unreadable.
- Extension: Chrome Stub Polyfill: Chrome browser tabs at localhost with window.chrome but no extension APIs now get proper stub injection instead of being skipped.
- Extension: Relay Client Bundle Crash: Fixed import.meta.env crash in relay-client IIFE bundle by defining build-time environment variables.
- Extension: Remote App Login: Remote browser app login cleaned up with Google OAuth as the primary method and email/password as a collapsed secondary option.
Version 0.9.708 (March 19, 2026)
Improvements
- Common: Generating Indicator Coordination: Per-message typing dots now coordinate with the thread-level fallback indicator, eliminating duplicate bouncing dots during generation. Dots now persist during reasoning and tool execution phases, only hiding once the final answer starts streaming.
- Common: Streaming Cursor: A blinking cursor now appears at the end of streaming text, providing a clear visual indicator that the response is still being generated.
- Common: Reasoning Preview Formatting: Collapsed reasoning blocks now render inline bold and italic formatting instead of raw markdown syntax.
- Common: Slate Document Listing: list_slates now correctly shows content size for DOCX, PDF, and XLSX files that were loaded via lightweight thread queries. Listings also include workflow guidance for editing DOCX and XLSX documents.
- Common: Private Sync Settings Section: Private Sync settings are now in their own dedicated section for clearer organization, separated from general settings.
- Extension: Connections Settings Restructured: The Private Relay section is reorganized into two clear subsections — Private Relay (remote access toggle, always visible) and Caiioo Bridge (local MCP servers and system tools). The remote access toggle no longer requires the Bridge to be running.
- Desktop: Tauri Unified Binary: Windows/Linux desktop app consolidated from compile-time variants into a single binary with runtime tier gating — sidepanel visibility determined by subscription tier at runtime.
Bug Fixes
- Extension: Private Relay Without Bridge: Private relay now connects independently without the Caiioo Bridge app. Previously, authentication was only initialized by the Bridge connection, leaving extension-only users (including Linux/Windows) unable to receive messaging webhooks or use remote access.
- Extension: Google OAuth Routing: Fixed OAuth popup failures on Chrome by skipping getAuthToken when unavailable and falling back to BYOA credentials. COOP popup resilience prevents blank windows on restrictive sites.
- Common: ElevenLabs Streaming STT: Fixed voice transcription dropping or failing when ElevenLabs streaming encounters connection interruptions. Native app voice input now falls back gracefully.
- iOS: External Links in WKWebView: Links that should open in Safari now correctly open externally instead of loading inside the app's WebView.
- Common: Ad-Blocker Orphaned Rules: Fixed dynamic DNR ad-blocking rules persisting after service worker restarts. disableAllBlocking now directly queries and removes orphaned rules even when the blocker instance is null.
- Common: License Sync on Profile Switch: Switching profiles now syncs the license from the server, ensuring tier-gated tools refresh immediately instead of requiring a restart.
- Common: Token Refresh Before Provisioning: Expired OAuth tokens are now refreshed before provisioning API keys, preventing silent failures. Fresh servers can now adopt existing identity connections.
- Web: Mobile Menu on iOS Safari: The hamburger menu on the marketing website is no longer transparent on iOS Safari.
- Common: React-18 Batching Race: Thread-level generating indicator added as a fallback for cases where React-18 state batching prevented per-message typing dots from appearing.
Version 0.9.707 (March 17, 2026)
Improvements
- BYOA Reconfigure Pre-Fill: Reconfiguring an existing Google or Microsoft Private App now pre-fills the client secret and tenant ID, so you don't have to re-enter them.
- Model List Sorting: Models within each provider group now sort by release date (most recent first), ensuring consistent ordering across platforms.
Bug Fixes
- Private Sync Auth Recovery: OAuth token failures (expired sessions, revoked tokens, Chrome profile tokens) during private sync initialization now surface correctly to the UI instead of silently entering an error state.
- Slate Tracked Changes Reliability: 10 trust-critical fixes for tracked changes — revision merge race condition, diff base persistence across all update paths, DOCX export formatting preservation, HTML tag stripping for TipTap matching, accept/reject-all ordering, and convergence fixes for entity escaping and tag regex matching.
- Slate Data Integrity: Fixed pasted images writing to a dead local cache, cloud source metadata not persisting, and BroadcastChannel fallback for live AI updates in relay/mobile mode.
- Slate Security: Fixed XSS injection in image viewer and replaced wildcard postMessage origins with scoped extension URLs.
Version 0.9.706 (March 16, 2026)
New Features
- Provider Error Banners: When an AI provider returns an error (402 payment required, missing API key, rate limit), an actionable banner now appears with clear instructions instead of a generic failure message.
Improvements
- DOCX Cross-Span Editing: The propose_change tool now correctly handles search/replace operations that span across multiple formatting runs (e.g., partially bold text). Intent is passed through so the agent can make contextual edits.
- PDF Save Performance: Native PDF save is significantly faster: redundant parsing eliminated, import modules cached, and unnecessary operator cleaning skipped.
- Mode Name in Messaging: The messaging settings mode selector now displays the mode's branding name instead of the internal ID.
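To illustrate what a search that spans multiple formatting runs involves (the DOCX Cross-Span Editing item above), here is a minimal sketch that maps a match in the concatenated text back to per-run offsets. The Run and RunSlice shapes and the locateAcrossRuns name are hypothetical; the release note does not describe how propose_change does this internally.

```typescript
// Hypothetical sketch: data shapes and function name are assumptions.
interface Run {
  text: string;
  bold: boolean;
}

interface RunSlice {
  runIndex: number; // which run the slice belongs to
  start: number;    // start offset within that run (inclusive)
  end: number;      // end offset within that run (exclusive)
}

// Find the first occurrence of `search` in the runs' combined text and
// return the portion of each run that the match covers.
function locateAcrossRuns(runs: Run[], search: string): RunSlice[] {
  if (search.length === 0) return [];
  const full = runs.map((run) => run.text).join("");
  const matchStart = full.indexOf(search);
  if (matchStart === -1) return [];
  const matchEnd = matchStart + search.length;

  const slices: RunSlice[] = [];
  let offset = 0;
  for (let i = 0; i < runs.length; i++) {
    const runStart = offset;
    const runEnd = offset + runs[i].text.length;
    // Intersect the match range with this run's range.
    const start = Math.max(matchStart, runStart);
    const end = Math.min(matchEnd, runEnd);
    if (start < end) {
      slices.push({ runIndex: i, start: start - runStart, end: end - runStart });
    }
    offset = runEnd;
  }
  return slices;
}
```

For runs "The ", "quick" (bold), and " brown fox", searching for "quick brown" yields one slice covering all of the bold run and one covering the first six characters of the run after it, so a replacement can be applied without flattening the formatting of either run.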
Bug Fixes
- Google OAuth Token Refresh: Fixed "Unauthorized" errors when refreshing Google OAuth tokens that could block Calendar, Gmail, and Drive access.
- Provisioned Key Error Message: Users with provisioned API keys no longer see a misleading "Add funds" error — the message now explains the actual issue and next steps.
- Concurrent Thread Race Condition: Fixed a race where switching threads during generation could cause GENERATION_COMPLETE/ERROR events to apply to the wrong thread.
- Revision History Restore Button: The restore button in Slate's revision history now correctly appears when the current content has diverged from the selected revision.
- Double OAuth Popup: New Chrome extension users no longer see two OAuth popups when signing in for the first time.
Version 0.9.705 (March 15, 2026)
New Features
- Settings Category Grouping: 19 settings sections are now organized into 5 collapsible categories for progressive disclosure. Agent Modes section shows cross-cutting indicator with navigation links to affected sections (Tool Config, Model Config, Appearance). API Provider section opens by default with model capability warning.
Bug Fixes
- Agenda Sync Duplicate Calendars: Google Calendar accounts no longer appear twice in the Sync tab when both a login (identity) connection and a Private Connection exist for the same email. Connections are now deduplicated before rendering.
- i18n Settings Categories: Settings category labels are now translated across all supported languages.
Version 0.9.704 (March 14, 2026)
New Features
- Microsoft 365 Integration Foundation: Microsoft scope routing, Graph API client, and account service with BYOA support. Scope hierarchy definitions and tool-scope registry extensions for Microsoft provider. Microsoft added as a relay-proxied provider.
- Rich Inline Rendering (RIR): New codec architecture for rendering rich content inline in chat messages. Includes HTML, Markdown, and DOCX codecs with an extensible codec interface.
- PDF Structure Tree & Layout Analysis: Tagged PDF structure tree parser for semantic document understanding. Heuristic paragraph grouping for untagged PDFs provides fallback layout analysis. Image replacement/insertion and content overflow detection.
- Messaging Mode Selection: Messaging bridge settings now include a default mode selector, letting the agent respond in the right personality when handling inbound messages.
Improvements
- Provider Rate Limit Retry: All LLM providers now automatically retry on 429 (rate limit) and 529 (overloaded) responses with exponential backoff, instead of immediately failing. Up to 3 retries with jitter.
- Google OAuth Scope Superset Matching: Write scopes now satisfy read scope requirements (e.g., drive covers drive.readonly). Prevents unnecessary re-authorization when a broader scope is already granted.
- BYOA Full-Scope Initial Auth: When a Private Connection is needed, the initial auth flow now requests all necessary scopes upfront instead of prompting twice (once for basic, once for expanded).
- Google 404-to-Expanded-Access Promotion: When a BYOA connection gets a 404 on a Google Workspace file, the error is promoted to an expanded access prompt instead of a dead-end error message.
- Messaging Credentials Private Sync: Messaging bridge credentials now sync across devices via E2E encrypted private sync instead of being device-specific. Sync manifest bumped to v10.
- OAuth Wait Extended Timeout: BYOA and expanded access OAuth flows now get a 5-minute timeout (up from 2 minutes), giving users enough time to complete Google Cloud Console steps.
- Sparkle Framework Signing: macOS distribution builds now properly sign Sparkle framework nested binaries in inside-out order, fixing notarization failures.
- Tauri Node Binary Bundling: Tauri configs now bundle the Node.js binary as a resource for Windows/Linux sidecar execution.
- Messaging Settings Search: Messaging bridge section is now discoverable via the settings search bar with keywords like "whatsapp", "telegram", "webhook".
- Settings Relay Forwarding: Settings saves are now forwarded to the relay server via WebSocket bridge, ensuring messaging credentials and other config changes reach the server's state file.
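The scope superset rule in the list above (a granted write scope satisfying a read-only requirement) can be sketched as a small lookup. The IMPLIED_SCOPES table and both function names are hypothetical; only the drive / drive.readonly pair is taken from the release note, and the short scope names mirror how the note writes them rather than full scope URLs.

```typescript
// Hypothetical sketch: the table and function names are assumptions.
// Only the drive -> drive.readonly relationship comes from the release note.
const IMPLIED_SCOPES: { [scope: string]: string[] } = {
  "drive": ["drive.readonly"],
};

// True when `required` is granted directly or implied by a broader granted scope.
function isScopeSatisfied(required: string, granted: string[]): boolean {
  for (const scope of granted) {
    if (scope === required) return true;
    const implied = Object.prototype.hasOwnProperty.call(IMPLIED_SCOPES, scope)
      ? IMPLIED_SCOPES[scope]
      : [];
    if (implied.indexOf(required) !== -1) return true;
  }
  return false;
}

// Scopes that still need to be requested; an empty result means no re-auth prompt.
function missingScopes(required: string[], granted: string[]): string[] {
  return required.filter((scope) => !isScopeSatisfied(scope, granted));
}
```

The relationship is deliberately one-directional: a connection holding only drive.readonly does not satisfy a requirement for drive, so write access still triggers a prompt.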
Bug Fixes
- Slate Currency vs Math: Dollar amounts like $9/mo in markdown tables are no longer misinterpreted as LaTeX inline math delimiters. The math regex now respects escaped currency dollars.
- Conversation Page ToastProvider: Pop-out conversation tabs were missing ToastProvider, causing toast-dependent features to silently fail. Provider tree now matches sidepanel.
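The currency versus math fix above hinges on treating an escaped dollar sign as literal text rather than a math delimiter. The sketch below shows that idea with a small hand-written scanner instead of the regex the release note mentions; findInlineMath is a hypothetical name and the rules are simplified assumptions, not Slate's actual parser.

```typescript
// Hypothetical sketch: a simplified scanner, not the regex used in the product.
// Returns the contents of each $...$ inline math span found in the text.
function findInlineMath(text: string): string[] {
  const spans: string[] = [];
  let open = -1; // index of the currently open "$", or -1 when none is open
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === "\\") {
      i++; // skip the escaped character, so "\$" never acts as a delimiter
      continue;
    }
    if (ch === "\n") {
      open = -1; // inline math is assumed not to cross line breaks
      continue;
    }
    if (ch !== "$") continue;
    if (open < 0) {
      open = i;
    } else {
      if (i > open + 1) spans.push(text.slice(open + 1, i));
      open = -1;
    }
  }
  return spans;
}
```

With escaped amounts such as \$9/mo and \$19/mo in a table row, the scanner finds no math spans, while an unescaped pair of dollar signs on the same line is still read as one span. That second case is the misinterpretation the release note describes.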
Version 0.9.703 (March 14, 2026)
New Features
- Google Picker Integration: Full Google Drive file picker with popup/iframe hybrid, multi-view filters (Docs, Sheets, Slides, PDFs), and drive reference chips in the composer. Files picked through the Google Picker are automatically granted drive.file access.
- Google Sheets Native Sync: Open Google Sheets in Slate with full cell-level round-trip editing. The Sheets codec converts spreadsheet grid data to TipTap HTML tables and back, with conflict detection via Drive modifiedTime. Supports reading, batch cell updates, and structural changes (add/delete sheets, merge cells).
- Google Slides Read-Only View: Google Slides presentations render in Slate with slide thumbnails and extracted text content for searchability.
- Gmail Batch Fetch & Markdown Conversion: Gmail tool now fetches message metadata in batches (up to 100 at a time) instead of one-by-one, with a 5-minute label name cache and automatic HTML-to-markdown body conversion for cleaner LLM consumption.
- Selection Overlay: "Add to prompt" button appears when selecting text on any webpage, letting you quickly add selected content to the conversation.
- Relay Overlay Manager: Agent overlay commands now route through the browser extension, enabling agent-driven UI overlays on the active webpage.
- Inline Quick-Tasks Model Picker: Quick tasks model selection is now embedded directly in the model picker dropdown instead of a separate menu.
- Slate DOCX Export from Google Drive: Google Docs loaded in Slate now export as DOCX (via readFileContentAsBinary), activating the full TipTap visual editor with tracked changes instead of opening as plain markdown.
- Private Connection Setup Wizard: Improved BYOA wizard with a dedicated "Add Test User" step matching the current Google Cloud Console flow. Consent screen substeps now follow Google's actual 4-step accordion. Email instructions are personalized when the user's identity is known.
- Private Connection Promotion Banner: When the agent needs expanded Google scopes (Gmail, Calendar, etc.) and no Private Connection exists, an animated banner appears with a one-click path to the setup wizard. The agent stays paused while the user completes setup and automatically resumes when the connection is saved.
- Export Fidelity Warnings: Before exporting a document to a different format (e.g., PDF to DOCX), a fidelity dialog warns about potential formatting losses with severity-categorized items (info, warning, critical).
- PDF Annotation Extraction: PDF text annotations and markup are now extracted and preserved during document processing and reconstruction.
- Reminders Calendar Sync: Reminders modal now supports calendar sync with expanded access gating.
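The Gmail batching and label-cache behaviour described in this list (batches of up to 100, a 5-minute label name cache) can be pictured with the small sketch below. The chunk helper and TtlCache class are hypothetical names introduced for illustration; the actual Gmail tool may be structured quite differently.

```typescript
// Split a list of message IDs into batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Minimal time-to-live cache; the clock is injectable so expiry is easy to test.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(private ttlMs: number, private now: () => number = () => Date.now()) {}

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (this.now() >= hit.expiresAt) {
      this.entries.delete(key); // entry is stale, drop it
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A caller would construct the cache as `new TtlCache<string>(5 * 60 * 1000)` for label names and pass each batch from `chunk(ids, 100)` to a single metadata request.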
Improvements
- Google Docs Sync Fidelity: Improved formatting span extraction with proper tag matching (handles self-closing tags, mismatched nesting) and conflict detection for concurrent edits.
- Private Sync Account Mismatch Detection: The sync settings account dropdown now detects when the configured sync email doesn't match any connected Google account and shows a "not connected" indicator with the option to switch.
- Teams/Enterprise Effective Tier: License validation now uses effectiveTier from team/enterprise delegation, ensuring correct feature access when a user's tier is inherited from a team plan.
- "Private App" renamed to "Private Connection": All user-facing references now use "Private Connection" instead of "Private App", which is clearer for non-developers who don't associate OAuth integrations with "apps".
- Google OAuth Scope Alignment (drive.file): All elevated Google Drive scopes (drive, drive.readonly, documents.readonly, spreadsheets.readonly, presentations.readonly) are now replaced with the narrower drive.file scope for non-BYOA users, matching Google's verified app requirements. BYOA users retain full scopes.
- Gmail Permissions Visible in Scope Selector: Gmail read, compose, and modify scopes are now visible to all users in the Google Permissions editor under "Highly sensitive permissions" instead of being hidden behind BYOA-only. These scopes are approved on the consent screen and requested via JIT when Gmail tools need them.
- Google Picker Consent Recovery: When Google permissions are revoked externally (e.g., from Google Account settings), the app now auto-detects stale scopes, invalidates them, re-authorizes, and retries — instead of silently failing with 403 errors.
- Agent Google Drive Guidance: Tool error messages and empty-result hints now instruct the agent to tell the user to click the + button in the composer and select Google Drive, instead of suggesting unavailable tool actions.
- Slate Drive Picker Simplified: Slate's "Load from Google Drive" now loads files directly, skipping the sidepanel-style "Work with in Drive" vs "Add to conversation" choice screen.
- Official Google Drive Logo: Drive chips and the attachment menu now use the official Google Drive logo instead of the generic green triangle.
- Unified Model Picker: Helper model selection falls through to the default model, with a single consolidated model dropdown.
- DOCX Non-Text Element Passthrough: DOCX reconstruction preserves non-text elements (images, charts, embedded objects) that aren't part of the text editing flow.
- PDF Multi-Segment Line Editing: PDF WYSIWYG editor handles lines split across multiple text segments.
- Auth Rate Limit Increased: Auth endpoint rate limit raised from 10/min to 30/min to accommodate rapid OAuth token exchanges during BYOA setup.
- OAuth Token Endpoint Reclassified: OAuth token exchange moved from auth rate limit bucket to general, preventing throttling during multi-scope authorization flows.
Bug Fixes
- iOS Modal Stacking: Fixed crash when presenting file pickers or slate overlays while another modal (e.g., browser overlay) was already showing. Pickers and overlays now present on the topmost view controller.
- Slate Line Range Validation: get_slate_content now returns a clear error when startLine is beyond the document length instead of silently returning empty content.
- Server Path Traversal: Storage keys are now validated against directory escapes, preventing path traversal attacks on the sync storage endpoint.
- Ad Blocker Regex Cap: DNR converter now caps regex rules at Chrome's 1000-rule limit and drops large bounded quantifiers that exceed RE2's memory limit, preventing extension install failures.
- OAuth Timeout on Expanded Access: Fixed 120-second timeout when the agent needs expanded Google scopes (Gmail, Calendar). The pending OAuth wait now extends to 5 minutes for BYOA setup, and completing the connection in settings automatically resumes the paused agent.
- Chat UI Freezes: Fixed overlapping async intervals and silent broadcast errors that could freeze the chat interface.
- Cloud Sync Download Batch: Fixed TypeScript null-check errors in the cloud sync download batch handler.
- Service Worker Dynamic Imports: Converted dynamic imports to static imports for Chrome MV3 service worker compatibility.
- BYOA-Only Scope Enforcement: convert_to_google_doc and convert_to_google_sheet actions now correctly require only drive.file instead of broad editor write scopes.
- Drive Chip Format: Three-segment format for drive reference chips with proper BYOA-only scope gating.
- Google Picker CORS: Picker now hosted on the private relay to resolve cross-origin issues with the popup scope flow.
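The storage-key validation mentioned under Server Path Traversal could look something like the sketch below. It is an assumption-laden illustration, not the server's code: isSafeStorageKey is a hypothetical helper, and it assumes keys are forward-slash separated relative paths that have already been URL-decoded.

```typescript
// Hypothetical validator: accepts only relative, forward-slash separated keys
// whose segments cannot climb out of the storage root.
// Assumes the key has already been URL-decoded by the caller.
function isSafeStorageKey(key: string): boolean {
  // Reject empty keys, NUL bytes, and backslashes (Windows-style separators).
  if (key.length === 0 || key.includes("\0") || key.includes("\\")) return false;
  // Reject absolute paths.
  if (key.startsWith("/")) return false;
  // Reject empty, current-directory, and parent-directory segments.
  return key.split("/").every((seg) => seg !== "" && seg !== "." && seg !== "..");
}
```

A stricter variant would also resolve the key against the storage root and confirm the result stays under that root; the segment check above is the simpler, dependency-free form.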
Version 0.9.701 (March 6, 2025)
Improvements
- Ollama Context Window Indicator: Context window usage circle now renders for Ollama models by querying the /api/show endpoint for num_ctx, instead of relying on OpenRouter's model list which doesn't include local models.
- Ollama Tool Calling: Fixed dynamic tool selection for Ollama. It now passes tool_choice through to the API (supported since Ollama v0.5.1), disables think mode when tools are active to prevent corrupted JSON, and falls back to a lightweight non-streaming helper model when the primary model doesn't support tool calling.
- OpenRouter Reasoning Cache: Fixed reasoning text being cleared prematurely by the streaming provider. Cache lifecycle is now owned by agent-runner at iteration boundaries, consistent with Anthropic and Google providers.
- PDF WYSIWYG Reliability: Global text alignment handles PDF.js splitting operators across multiple TextItems. Block editing disabled on unmatched spans (annotation/header text) to prevent export errors. CMap 2-byte decoding no longer corrupts Word-generated PDFs with 1-byte ASCII fonts. Operator matching bounds check prevents out-of-range indices on empty text items.
- Desktop Platform Detection: Dynamic tool selector uses localFolderSync capability instead of caiiooReminders for desktop detection, and appleScript/appleNotes instead of Calendar/Reminders for macOS detection (Calendar/Reminders are cross-Apple via EventKit).
Version 0.9.700 (March 5, 2025)
New Features
- File Manager: Full file management system with nested folders, user tags, starring, and a dedicated File Manager UI. Organize attachments into folder hierarchies, filter by type (screenshots, AI images, photos), search, sort, and view files in grid or list mode. Thread auto-tagging links files to conversations. Private sync support via manifest v9.
- Local Folder Sync: Bidirectional sync between local filesystem folders and the File Manager. Mutation-driven resync automatically propagates deletes, bulk deletes, and moves to disk with a 2-second debounce. Subfolder move detection relocates files on disk when reorganized in Caiioo. Navigating into any subfolder within a sync tree triggers auto-resync.
- Cross-Filetype Tracked Changes: Unified accept/reject workflow across DOCX, PDF, and Markdown. DOCX edits are now recorded as tracked changes with the same diff engine used by Markdown and code slates. PDF tracked changes integrate with the visual editor. resolve_tracked_changes action lets the agent programmatically accept or reject revisions.
- PDF WYSIWYG Editing: Direct text editing on rendered PDF pages with document-matched styling. Text layer links edits to source content-stream operators via textItemIndex for surgical reconstruction on export. Includes plainText baseline for change detection.
- DOCX Rendering Fidelity: Rich DOCX preview with paragraph alignment, hyperlinks, font color/size, line spacing, indentation, and table cell shading. Code view shows markdown conversion for token-efficient LLM consumption.
- Universal Messaging (internal testing only): Send and receive messages through WhatsApp, Telegram, and Slack with local-first privacy. The agent can compose and send messages, list conversations, and reply to threads — all routed through your own device.
- Google Meet Integration: Multi-action meeting recall tool with list_meetings (discover recent meetings with date range filtering) and get_meeting (fetch transcripts). Threaded through the full OAuth/JIT/scope-approval pipeline with credentials vault support.
- Gmail Send Email: Direct email sending via Gmail API, gated behind high-risk tool approval. Per-action risk level overrides allow send_email to require explicit confirmation while draft actions stay at medium risk.
- Microsoft BYOA: Bring Your Own App support for Microsoft 365. Register Azure AD app registrations for Microsoft OAuth connections. Multi-provider BYOA registry (Google + Microsoft) with tenant ID support and a 3-step Azure Portal setup wizard.
- Poe Provider: New LLM provider integration for Poe, with model listing, provider-grouped model selector, vision/tools/reasoning capability detection, and pricing display.
- What's New Dialog: API-served content pipeline delivers release notes and user guide pages. Async what's-new dialog shows version highlights on update.
- Private Sync Account Selector: Choose which Google account to use for private sync, instead of defaulting to the primary account.
- Native TipTap Diff Marks: Replaced the separate marked.js rendering overlay for AI revision previews with native ProseMirror marks (DiffAdded/DiffRemoved). Revisions now render as inline tracked changes with consistent typography, plus chunk-level accept/reject targeting.
- Mermaid Diagrams: Live Mermaid diagram rendering in code blocks within Slate's TipTap preview mode.
- Fuzzy Section Search in Help Tool: Help tool now supports fuzzy matching when searching for specific sections within documentation pages.
Improvements
- File Manager UX Polish: Folder sync icon indicators, delete confirmation dialogs, shift-click range selection, toolbar wraps instead of overflowing at narrow widths, and fixed tag persistence across reloads.
- Responsive Composer Layout: Three-zone flex layout prevents the new-thread button from being pushed off-screen at narrow widths. Mode and model selectors use staggered text-hiding breakpoints. Custom agent and AI chip icons replace generic defaults.
- Settings UX: Reordered sections for better flow (Personalization → Credentials → Tool Approvals → Agent Modes → Tool Configuration → Skills Library → Tool Servers → Document Processing → Voice).
- Revision State Integrity: Fixed corruption when deleting large content blocks during active AI revisions — sourceContent immutability, chunk relocation threshold raised to 0.7, stale chunks auto-rejected, and version history integrity checks.
- Agent Loop Reliability: Fixed abort controller race condition where old runs could delete new run's controller. Cancel signal now propagates into queued tools. Orphaned running states broadcast errors instead of leaving blank messages.
- Interjection Handling: Unified getActiveBranchMessages traversal on server and UI to skip interjections consistently. Fixed response disappearing after follow-up when streaming parentId pointed at the interjection instead of the user message.
- macOS Native Stability: Fixed restart race condition with process epoch tracking, URLSession leak on reconnect, and LineBuffer data race between pipe handler and stop.
- Google Sheets Fix: create_table now writes column header names instead of leaving the first row blank.
- Middleware Cleanup: Extracted requireAuth middleware from 10 inline auth checks across apple-routes, mcp-routes, and attachment-routes.
- Image Type Filtering: File Manager distinguishes screenshots, AI-generated images, and user photos with separate filter options and distinct icons.
- Mode/Model Picker Parity: Equalized font weight and icon stroke between mode picker and model picker.
- Beta Tier Gating: Credentials vault gates beta features behind tier checks.
- CIDFont Re-Encoding: PDF reconstruction supports CIDFont encoding with ToUnicode CMap tables and identity fallback for characters not in the map.
- PDF Export Renamed: "Download" renamed to "Export" across the PDF workflow for clarity.
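The abort-controller race noted under Agent Loop Reliability (an old run deleting a newer run's controller) is commonly avoided by checking ownership before cleanup. The sketch below shows that general pattern only; RunRegistry and its methods are hypothetical names, and nothing here is taken from the actual agent loop.

```typescript
// Tracks at most one active run per thread. Cleanup is ownership-checked so a
// finishing old run can never remove the controller of a newer run.
class RunRegistry {
  private active = new Map<string, AbortController>();

  // Start a new run, cancelling any run still registered for this thread.
  start(threadId: string): AbortController {
    this.active.get(threadId)?.abort();
    const controller = new AbortController();
    this.active.set(threadId, controller);
    return controller;
  }

  // Called when a run ends; only the current owner may clear the entry.
  finish(threadId: string, controller: AbortController): void {
    if (this.active.get(threadId) === controller) {
      this.active.delete(threadId);
    }
  }

  isActive(threadId: string): boolean {
    return this.active.has(threadId);
  }
}
```

The identity comparison in finish is the key step: an unconditional delete keyed only by thread ID is what would let a late-finishing old run remove the new run's controller.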
Bug Fixes
- PDF Text Disappearing: Fixed PDF view text vanishing on blur and garbled content when exporting from WYSIWYG editing.
- PDF Tracked Changes Routing: Prevented PDF tracked changes from routing through the unified diff engine, which corrupted PDF-specific operator data.
- PDF Export Integrity: PDF download now fails loud with diagnostics instead of silently falling back to markdown export.
- Markdown Table Line Breaks: <br> tags in markdown tables now render as actual line breaks instead of literal text.
- DOCX Tracked Deletions Preserved: Fixed DOCX tracked deletions being stripped when toggling between visual and code view.
- Tracked Changes on Stored Markdown: Fixed propose_change on markdown slates showing no redlines when loaded from storage.
- Reject Reverts Content: resolve_tracked_changes with reject now properly reverts content for markdown and code slates.
- Slate Save Loop: Fixed infinite save loop triggered when track changes was active.
- Messaging Stability: Fixed routing, reply threading, profile handling, queue resilience, webhook signatures, sender sanitization, and payload limits.
Version 0.9.662 (February 27, 2025)
New Features
- Bring Your Own App (BYOA): Use your own Google OAuth credentials instead of Caiioo's built-in app. Multi-app support lets teams share a single OAuth client across the organization while keeping individual API keys. BYOA connections auto-detect stale tokens and prompt reauth.
- Wikilink Navigation: Obsidian-style double-bracket wikilinks between Slate documents for building interconnected knowledge bases. Click a wikilink to navigate between Slates instantly.
- Skills Menu: Tab-to-use skill insertion, dynamic height dropdown, inline editing with variable chips, and a shared SkillForm component for full CRUD from the composer.
- DOCX Download: Download Slate markdown and text documents as DOCX files with tracked changes preserved.
- Table Auto-Populate: insert_component now auto-fills table cells from structured data, eliminating empty placeholder tables.
- Gemini 3.1 Flash: Added as an image generator model option alongside existing Gemini, FLUX, and Seedream models.
- Google Drive Save for All Accounts: Save to Google Drive is now available for any connected Google account, not just the primary one.
- Unified Diff Engine: Complete rewrite of Slate's change-tracking diff system (phases 0–7) with TOCTOU safety, caching, per-change accept/reject widgets, and mobile tap-to-toggle support.
Improvements
- Adaptive Private Sync: Sync polling rate adjusts dynamically based on activity. Download-only lock optimization reduces contention.
- OAuth Email Capture: Google OAuth always includes identity scopes so the user's email is captured on every connection, fixing blank emails on some accounts.
- Security Hardening: Hardened OAuth PKCE flows, agent subprocess boundaries, and per-account brute-force lockout for repeated failed authentication attempts.
- Browser Chip Layout: Browser selector chips are smaller and wrap instead of overflowing the container.
- Registry-Driven Image Providers: Replaced hardcoded image provider list with a dynamic registry, making it trivial to add new models.
- Content i18n Pipeline: Unified hash-tracked content sync for all documentation pages with incremental translation support.
Bug Fixes
- Math Expressions in Preview: Fixed math expressions being corrupted when switching from TipTap preview to code view.
- Wikilinks in Preview: Fixed wikilink syntax being stripped when toggling between TipTap preview and code view.
- Thread Messages Disappearing: Fixed thread messages being stripped by STATE_PATCH or vanishing when switching models mid-conversation.
- Interjection Orphans: Fixed orphaned parentIds after interjection message filtering causing render errors.
- Private Sync Infinite Loop: Resolved infinite sync loop caused by concurrent collection item clock conflicts and thread_data_ prefix pollution.
- BYOA Edge Cases: Fixed BYOA config not restoring on clear, async save races, extension using dead localhost popup instead of launchWebAuthFlow, and stale issuedByClientId precedence.
- Private Sync Scope Recovery: Fixed private sync unlock failing silently when Google Drive scopes were expired or missing.
- Duplicate Skills: Prevented duplicate skills from appearing after private sync merges.
- Tracked Changes Routing: Wired get_tracked_changes and get_comments into SlateTool action routing so the agent can read revision state.
- DOCX Download Source: Fixed downloadAsDocx using stale artifact contents instead of live editor state.
- Smart Model Name Abbreviation: Fixed long model names overflowing the compact picker with intelligent abbreviation.
- Markdown Link Rendering: Fixed chrome-extension: and Caiioo: URLs being blocked in rendered markdown links.
- Thread Creation Speed: Halved thread creation time by eliminating redundant settings cascade lookups.
- Chrome Tab Title: Restored dynamic Chrome tab title based on the active conversation thread.
- Adaptive Greeting Tokens: Bumped max_tokens for adaptive greetings to prevent truncation with reasoning models.
Version 0.9.661 (February 25, 2025)
New Features
- Agent Interjection: Guide the AI mid-run by typing a message and pressing Enter while it's working. The agent sees your guidance at its next decision point and adjusts course — no need to cancel and restart. An amber-badged send button appears alongside the stop button when you have text to send.
- API Tool Management: The API integration tool now supports save_tool, list_tools, and delete_tool as first-class actions, making it easy to manage agent-discovered APIs directly from conversations. Free users see GET and list_tools; Pro users get the full set.
- Live Token Counter: Token and cost usage now updates in real time during streaming, displayed in each message footer. Specialized costs (image generation, web search) fold in incrementally as each tool completes.
- Auto-Expand Tool Cards: Tool cards automatically expand when they start receiving streaming arguments or need approval, and auto-collapse on successful completion. Errors and denied tools stay expanded for review.
- Monaco Diff Editor: Slate code revisions now use Monaco's native inline diff editor with per-change accept/reject buttons, replacing the previous custom decoration system.
Improvements
- Private Sync Auth Recovery: Private sync errors from expired or revoked Google tokens now show actionable "Reconnect Google" and "Grant Permissions" buttons instead of a passive error message. Reconnection requests the correct Drive permissions and pre-selects the right Google account.
- Private Sync Multi-Account: Private sync now authenticates against the Google account matching the config email instead of falling back to whichever account was connected first. Fixes syncing to the wrong Drive when multiple Google accounts are connected.
- Brand Refresh: New river blue and rock grey color palette derived from the Caiioo logo, replacing the previous purple theme across the entire app and website.
- OAuth Security: Google, GitHub, and Slack OAuth client secrets removed from the extension package. Token exchange now routes through a secure Cloudflare relay proxy, eliminating secrets from client code.
- Faster Startup: Attachment content (images, extracted text) is now loaded on demand instead of at boot, significantly reducing initial load time for conversations with many attachments.
- Faster Streaming: Context window usage now streams via the fast broadcast path instead of round-tripping through storage, giving real-time updates without async I/O per agent loop.
- Private Sync Speed: Google Drive file ID cache is persisted across restarts, eliminating a full file listing API call on first sync cycle.
- Unified Streaming Render Path: Streaming and final content now flow through a single render path, eliminating the flash when generation completes and preserving interjection messages.
- Cross-Model Switching: Switching between AI providers mid-thread (e.g., Gemini to Claude) no longer causes "Invalid signature in thinking block" errors. Foreign reasoning artifacts are automatically flattened to narrative text, preserving context without incompatible cryptographic signatures.
- Disabled Tool Guidance: When the AI tries to use a disabled tool, it now receives actionable suggestions (similar enabled tools, how to enable) instead of a generic "not found" error.
- Responsive Composer Layout: Composer toolbar uses staggered progressive disclosure — mode selector, model selector, settings, agenda, and help icons appear as space allows instead of being clipped.
- Settings Color Coding: Settings sections now use a 3-color system — accent tint for customization sections, alternating neutrals for the rest — for clearer visual grouping.
- Onboarding Errors: Free key provisioning now shows descriptive error messages below action buttons instead of silently failing.
- AI Provider Settings: Renamed "API & Provider" section to "AI Provider" for clarity.
- Extended Free Trial: Pro trial period extended from 7 days to 14 days, giving new users more time to explore Pro features.
- Optimistic Branch Switching: Switching between message branches and reloading messages now updates the UI instantly instead of waiting for storage round-trips.
- MCP Tool Configuration: Schema auto-correct and improved MCP tool configuration for more reliable tool setup.
Bug Fixes
- Per-Message Costs: Individual message costs now cumulatively match the thread header total. Helper, OCR, image gen, and search costs are attributed to individual messages instead of only the thread summary.
- Interjection Rendering: Interjection messages now render as compact amber annotations inline in the assistant timeline instead of being lost when generation completes.
- Loop Detection: Fixed false-positive repetition detection on citation-heavy responses where URLs naturally repeat 3+ times.
- Slate Diff Alignment: Fixed redline/greenline misalignment in preview mode for multi-line chunks where the overlay loop advanced by only 1 line regardless of chunk span.
- Slate Deletion Positions: Fixed double-counting of insertion shift causing deletions to render after their paired insertions instead of before.
- Private Sync Fixes: Fixed JIT scope request not firing across code-split chunks and infinite sync loop after uploads.
- Onboarding Flow: OAuth flow now completes onboarding properly, hyphenated mode IDs are supported in trigger detection, and 0 days remaining no longer coerces to null.
- Password Reset: Resetting password via email link now verifies the email address. Added resend-verification endpoint for users with expired verification emails.
- License Tier: Users with admin-granted tier but no license row are no longer stuck on the upgrade gate.
- Settings Scroll: Clicking the brain icon for a learned page now scrolls to the correct settings section instead of stopping short due to lazy-loaded sections above.
- Monaco Disposal: Fixed diff editor model disposal order to prevent dangling references and Monaco showing through TipTap preview.
- Walkthrough Targeting: Walkthrough steps now skip elements hidden by responsive container queries instead of spotlighting a 16x16-pixel area at the top-left corner.
- Kokoro TTS: Fixed text-to-speech broken in the extension — ONNX runtime detection failed in offscreen documents, causing "no available backend found" errors.
- Custom API Tool Save: Fixed saving agent-discovered APIs as custom tools crashing in the extension due to forbidden dynamic import() in Service Worker context.
- Boot Performance: Fixed session migration running on every settings access (20+ times at boot), causing unnecessary disk writes and slower startup.
- Duplicate Storage Broadcasts: Suppressed double storage change notifications in relay/server mode that could cause UI flicker.
- Cairn Texture Recovery: Fixed texture generation failing when the in-memory world cache was evicted between load and render, and fixed renderer showing black for attachment-backed textures.
- Security Patches: Updated fast-xml-parser (DoS via DOCTYPE entity expansion) and tar (hardlink escape vulnerability).
- MV3 Compliance: Replaced CDN-loaded vega-embed with local vendor bundles, ensuring all chart rendering code passes Chrome Web Store remote code policy.
- Vega-Lite Charts: AI now generates Vega-Lite visualizations directly in Slate instead of CDN-loaded HTML, enabling interactive charts without remote script dependencies.
- Google Drive Save: Fixed 403 errors when saving to Google Drive from Slate caused by stale OAuth scope checks in non-agent Drive handlers.
- New User Onboarding: Fixed onboarding screen being skipped entirely for new users when an API key was auto-provisioned during login.
- OpenRouter OAuth Popup: Fixed "Connecting..." spinner stuck indefinitely when popup blockers silently killed the OAuth window.
- Trial Banner Copy: Trial banner no longer claims models are gated by tier — updated to accurately describe Pro features (write access, image generation, custom modes).
- DOCX Review Toolbar: Fixed toolbar flickering when showing document review controls by deferring scroll until initial visibility.
- Add to Prompt Overlay: Fixed overlay logo blocked by Content Security Policy and background opacity not matching theme.
- Message Edit/Reload State: Fixed in-memory state going stale after editing a message, reloading a response, or switching branches.
- Server Login Hang: Fixed WebSocket storage operations failing during the server login flow. Auth is now decoupled so the local session works immediately while server identity resolves in the background.
- WebSocket Connection Hang: Fixed a race condition where AUTH_STATE: ready arriving during initial WebSocket connection could orphan the auth promise, preventing the connection from ever completing.
Version 0.9.655 (February 19, 2025)
New Features
- Slack Integration: Connect your Slack workspace to enable AI-powered Slack tools with full OAuth V2 authorization
- API Tool Credentials: Agent-discovered APIs can now be saved as custom MCP tools with stored credentials, enabling reuse across conversations
- Saved API Badge: Agent-created API tools display a "Saved API" badge in settings for easy identification
- Conversation Import: Import conversations from ChatGPT, Claude, Gemini, Perplexity, and Grok — upload your export file and Caiioo converts it with full message history, attachments, and metadata
Improvements
- Learned APIs Nesting: Learned API tools are now grouped under a collapsible section in settings for cleaner organization
- Caiioo Branding: Built-in tools now display the Caiioo icon for visual consistency
- Browser-Aware Links: URLs opened by the agent now launch in the browser you've selected in settings
- Slate Cloud Load: Loading a Slate document from Google Drive now triggers a JIT OAuth prompt instead of failing with a cryptic error
Bug Fixes
- DOCX Editing: Fixed paragraph run regex matching across nested spans, causing content corruption when editing formatted DOCX text
- Search/Replace Safety: Restored structural HTML pattern guard lost during refactoring — prevents edits from corrupting XLSX/DOCX internal markup
- Tool Approval Loop: Fixed race condition causing repeated approval prompts for the same tool within a single agent run
- Stale Identity: Fixed stale user identity not clearing on 401 during license sync, with logout escape on returning-user onboarding
Version 0.9.651 (February 17, 2025)
New Features
- Minimum-Privilege Permissions: Google Workspace tools now request only the exact scopes needed for each action instead of broad access. Reading emails requests read-only; creating events requests only calendar write access.
- Enhanced Permission Dialog: The just-in-time authorization dialog shows exactly what permissions are being requested, why they're needed, and whether the operation is read-only or read-write
- Permission Selector: When manually connecting a Google account, you can choose which services (Calendar, Gmail, Drive, Docs, Sheets, Slides) and access levels (read-only vs read-write) to grant upfront. Edit Permissions button on existing connections.
- Image Results: Generated images and screenshots now render prominently between collapsed process steps and the final answer, eliminating the need to expand tool calls to see visual output
Improvements
- Private Sync Safety: Concurrent edits now download for merge instead of uploading, lock refresh verifies ownership, and manifest merge prevents data loss from premature trash operations
- P2P Sync Removed: Eliminated P2P real-time sync, removing 100+MB of wasteful WebSocket traffic and improving extension performance
- Settings Performance: Settings panel memoization and collapse context isolation for faster rendering
- DOCX Rendering: Improved paragraph rendering, numbering, and style support in the document parser
Bug Fixes
- Google Docs Scope: Fixed "insufficient authentication scopes" error when indexing Google Docs; index_document needed write scope for named ranges
- Google Docs Read: Fixed read_document and get_document_info failing when only Docs scopes were granted; these actions also need Drive read access
- Calendar Copy: Fixed copy_event action missing from the scope registry, preventing JIT permission requests
- API Key Persistence: Fixed BYOK API key being silently lost when saveCustomMode crashed the settings save path during Pro trial activation
- Settings Reload Race: Fixed unsaved API key being wiped when collapsing/expanding settings sections triggered a storage reload cycle
- Chrome Identity Cancellation: Fixed user cancellation during incremental scope requests not being distinguished from errors, and auth method now correctly transitions after obtaining a web OAuth refresh token
- Private Sync Overwrite: Fixed private sync overwriting recent turns when concurrent edits occurred
- Viewport Screenshots: Stopped surfacing viewport screenshots above final response text
- Agent Coordination: Fixed 7 issues across turn coordination, perception, and state management
- Track Changes: Fixed track changes mode broken after slate decomposition due to dual-state variable desync
- File Manager: Delete and download actions now available in all modes, not just orphaned files
- MCP Tool Names: Fixed tools using internal server IDs instead of human-readable names
- Agenda Tool: Fixed sync card always showing missing permissions, wired into Google OAuth connection system
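The scope fixes above all depend on a per-action scope registry that the just-in-time dialog consults. The following is a minimal sketch of that idea, not Caiioo's actual table: the registry shape, the missingScopes helper, and the scope assigned to each action are assumptions made for illustration. Only the action names and the Google scope URLs are real.

```typescript
// Hypothetical per-action scope registry. The action names come from the
// release notes; which scopes each one maps to is an illustrative assumption.
const DOCS_READONLY = "https://www.googleapis.com/auth/documents.readonly";
const DOCS_READWRITE = "https://www.googleapis.com/auth/documents";
const DRIVE_READONLY = "https://www.googleapis.com/auth/drive.readonly";
const CALENDAR_EVENTS = "https://www.googleapis.com/auth/calendar.events";

const SCOPE_REGISTRY: Record<string, string[]> = {
  read_document: [DOCS_READONLY, DRIVE_READONLY],
  get_document_info: [DOCS_READONLY, DRIVE_READONLY],
  index_document: [DOCS_READWRITE], // named ranges require write access
  copy_event: [CALENDAR_EVENTS],
};

// Returns the scopes that still have to be requested before `action` can run.
// Matching is exact: this sketch does not treat a broader granted scope as
// covering a narrower required one.
function missingScopes(action: string, granted: string[]): string[] {
  const required = SCOPE_REGISTRY[action];
  if (required === undefined) {
    // An unregistered action cannot trigger a JIT request, so fail loudly.
    throw new Error(`No scopes registered for action: ${action}`);
  }
  return required.filter((scope) => !granted.includes(scope));
}
```

An action that is absent from the registry throws instead of silently passing, which is the failure mode the copy_event fix describes.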
Version 0.9.642 (February 16, 2025)
Improvements
- New Document Creation: New DOCX, XLSX, PDF, and RTF files created from the "New File" dropdown are now fully editable with tracked changes support and proper
- Gemini Stability: Added reasoning repetition detection to automatically break Gemini thought loops and prevent leaked thinking text from appearing in chat
- Image Handling: DNG raw photo previews now respect EXIF orientation, so uploaded images display correctly without distortion
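One simple way to detect a reasoning loop is to check whether the model keeps emitting the same sentence at the tail of its thinking text. The sketch below illustrates that heuristic only. The function name, the sentence splitting, and the threshold of three repeats are assumptions, and the detector Caiioo ships may work differently.

```typescript
// Hypothetical loop detector: true when the last sentence of `text` has been
// repeated at least `minRepeats` times in a row at the end of the text.
function hasRepetitionLoop(text: string, minRepeats: number = 3): boolean {
  const sentences = text
    .split(/[.!?\n]+/)
    .map((s) => s.trim().toLowerCase())
    .filter((s) => s.length > 0);
  if (sentences.length < minRepeats) return false;

  const last = sentences[sentences.length - 1];
  let run = 0;
  // Count how many trailing sentences are identical to the last one.
  for (let i = sentences.length - 1; i >= 0 && sentences[i] === last; i--) {
    run++;
  }
  return run >= minRepeats;
}
```

A caller would run this on the accumulated reasoning text as chunks stream in and abort the stream once it returns true.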
Bug Fixes
- New Document Editability: Fixed blank documents created via "New File" being rendered as static previews instead of editable rich text editors
- API Key Detection: Switching to your own API key (BYOK) now takes effect immediately without requiring a page refresh
- Screenshot Stability: Fixed attachment ID handling to prevent screenshot data from being lost or corrupted
- Tab Grouping: Fixed crash when browser had non-normal windows (e.g., devtools, popups)
- Console Noise: Removed unnecessary llms.txt probing that was spamming 404 errors
- Tool Timeouts: Removed fragile keepalive workarounds that could interfere with long-running tool calls like 4k image generation with Gemini
- Prompt Caching: Fixed multimodal content (images/screenshots) being dropped during prompt cache optimization
Internal
- Significant codebase quality improvements across error handling, type safety, and architectural layering
- Modularized core document parser into focused, maintainable modules
Version 0.9.641 (February 14, 2025) - Happy Valentine's Day!
Improvements
- Model Selector Tier Medals: Replaced hardcoded recommendation badges with 5 API-driven tier medals (Best for Caiioo, Quality, Reliability, Speed, Value) showing gold/silver/bronze rankings directly from benchmark data
- Skills System: Skills now have a Restore Defaults button to re-add built-in skills
- Prompt Caching: Further optimized prompt caching for cost savings
Bug Fixes
- Browser Compatibility: OAuth login and auto-connection now work in Vivaldi and other non-Chrome browsers
- Cost Tracking: Fixed race condition where cancelling a run could clobber cost and usage data; cost/usage data is now preserved on cancelled and errored runs
- Private Sync: Rewrote purgeAllStorage to perform a full "nuclear" clear, fixed a sync listener race, and consolidated the private sync UI
- i18n: Added missing translation initialization to conversation.html popup page
- Security: Patched dependency vulnerabilities (qs, @casl/ability, axios, markdown-it)
- Benchmarks Page: Migrated benchmarks page backend to Cloudflare auth
Version 0.9.65 (February 13, 2025)
Improvements
- MiniMax M2.5 Benchmarks: Full benchmark suite completed — tool accuracy 91%, trustworthiness 95% (rank #2), composite rank #5 with gold value tier
- Model Intelligence API: Increased default response limit from 100 to 500, ensuring all models with earned badges are visible to clients
Bug Fixes
- Missing Model Badges: Fixed 35 models with earned tier badges (gold/silver/bronze) not appearing in the extension model list — including Claude Opus 4.6 (gold composite), Claude Sonnet 4.5 (silver composite), and Claude Haiku 4.5 (bronze composite). Root cause: API defaulted to returning only 100 models sorted by trustworthiness, but tiers were computed from all 367 models. Models outside the top 100 by trustworthiness had their badges silently dropped.
Version 0.9.64 (February 12, 2025)
Improvements
- Ollama Provider Parity: Full feature parity with OpenRouter — abort signal support, error handling, reasoning details, resolved model ID, and think-tag processing via streaming mixin
- AbortSignal Propagation: Subprocess SIGTERM/SIGINT now cancels in-flight LLM API calls
- Build-Time Schema Validation: Settings schema validation at build time catches missing SETTING_METADATA entries
- Platform Capabilities Caching: Cached for performance instead of recomputed on every access
- Website i18n: Trust page and pricing refactor translations synced across all 22 locales
Bug Fixes
- Website Authentication: Replaced legacy Supabase auth with direct Cloudflare Worker API calls for Google sign-in, email login, signup, and password reset
- Stripe Checkout Locale: Fixed "Invalid locale" error on checkout and portal by mapping navigator.language (e.g. en-US) to Stripe-supported locales with fallback to auto
- Password Minimum Length: Synced 12-character minimum across server signup, password reset, and website reset page
- Private Sync Profile Dedup: Login now detects and removes duplicate profiles caused by earlier sync bugs
- State Manager Init Race: Fixed initialization race condition in state-manager
- Agent Subprocess Cleanup: Zombie subprocess cleanup on agent termination
- Tab Group/Storage API Guards: Proper guards for tab group and storage APIs across platforms
- Geolocation Error Handling: Graceful handling of geolocation permission errors
- OAuth Refresh Locking: Prevents concurrent OAuth token refresh attempts
- Settings Save Mutex: Concurrent settings saves no longer clobber each other
- Agent JSON Parse Isolation: Malformed agent output no longer crashes the parser
- Thread Search Race: Fixed race condition in thread search results
- Streaming Version Staleness: Checks for stale version during streaming responses
- Content Script Timeouts: Added timeouts for content script message passing
- Context Pruning for Multimodal: Improved context pruning when multimodal content is present
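The Stripe Checkout Locale fix above maps a browser locale to one Stripe accepts. This sketch shows one way to do that mapping: try the full tag, then the bare language, then fall back to auto. The function name is hypothetical, and the locale set is a deliberately partial subset of Stripe's list, included only to make the example runnable.

```typescript
// Partial, illustrative subset of Stripe's supported Checkout locales.
const STRIPE_LOCALES = new Set<string>([
  "da", "de", "en", "en-GB", "es", "es-419", "fr", "fr-CA",
  "it", "ja", "nl", "pt", "pt-BR", "zh", "zh-HK", "zh-TW",
]);

// Maps a browser locale such as navigator.language to a Stripe locale.
function toStripeLocale(browserLocale: string | undefined): string {
  if (!browserLocale) return "auto";

  // Normalize "pt_br" / "PT-br" style tags to "pt-BR".
  const parts = browserLocale.replace("_", "-").split("-");
  const language = (parts[0] ?? "").toLowerCase();
  const region = parts[1];
  const full = region ? `${language}-${region.toUpperCase()}` : language;

  if (STRIPE_LOCALES.has(full)) return full; // exact match, e.g. "fr-CA"
  if (STRIPE_LOCALES.has(language)) return language; // "en-US" -> "en"
  return "auto"; // let Stripe pick from the browser settings
}
```

Falling back to auto rather than passing the raw tag through avoids the "Invalid locale" error for tags Stripe does not list.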
Version 0.9.63 (February 12, 2025)
Bug Fixes
- Private Sync Mode Variables: Fixed legacy settings migration running on every sync cycle, silently overwriting recent local edits (e.g., mode variable changes) with stale Drive data
Version 0.9.62 (February 12, 2025)
New Features
OpenRouter OAuth One-Click Setup
- PKCE Key Creation: New one-click OpenRouter OAuth flow lets users create and link an API key without leaving Caiioo
- Redesigned Onboarding: Streamlined onboarding and upgrade flows with OpenRouter OAuth integration
- Privacy Warning: Free models onboarding option now displays a clear privacy/training data warning
Granular Private Sync
- Per-Item Sync: MCP servers, tool approvals, profiles, skills, modes, overrides, and reminders now sync at the individual item level instead of overwriting entire collections
- Per-Key Settings Sync: Settings sync granularly per key, preventing remote overwrites of unrelated local changes
- Deduplicated Reads: In-flight WebSocket storage reads are deduplicated to reduce unnecessary network traffic
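Per-key sync can be pictured as a merge in which each setting carries its own modification time, so a remote change to one key cannot overwrite a newer local change to another. The sketch below assumes a simple last-writer-wins rule per key. The SettingEntry shape and the mergeSettingsPerKey name are hypothetical, and the actual sync may resolve conflicts differently.

```typescript
// Hypothetical shape: every setting stores its value and last-modified time.
interface SettingEntry {
  value: unknown;
  updatedAt: number; // milliseconds since epoch
}
type SettingsMap = Record<string, SettingEntry>;

// Merges remote settings into local ones key by key. A remote entry wins only
// if the key is new locally or the remote copy is strictly newer.
function mergeSettingsPerKey(local: SettingsMap, remote: SettingsMap): SettingsMap {
  const merged: SettingsMap = { ...local };
  for (const [key, theirs] of Object.entries(remote)) {
    const mine = merged[key];
    if (mine === undefined || theirs.updatedAt > mine.updatedAt) {
      merged[key] = theirs;
    }
  }
  return merged;
}
```

With a single monolithic settings blob, the older remote copy of one key would travel with the newer remote copy of another and overwrite the local edit. Merging per key keeps both recent changes.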
Improvements
- Mode Settings UX: Overrides now auto-save, and Restore Defaults correctly restores deleted mode variables
- Responsive Composer: Skills button collapses to icon-only at narrow widths; removed layout spacer from Skills section
- Support Tickets: Build version and datetime are now included automatically
- Slate Search: New search_document action added to the Slate tool
Bug Fixes
- Google Slides/Sheets OAuth: Now throws proper OAuthAuthorizationRequiredError instead of returning setup text, enabling just-in-time authorization
- Configuration Tool UI Refresh: Agent mutations via the configuration tool now broadcast STORAGE_CHANGED so the UI updates immediately
- Tool Approval Modal: Fixed React hooks ordering violation (useMemo above early return)
- Localhost HTTPS: API integration tool now handles self-signed certificates for local HTTPS servers
- Mode Variables Persistence: Fixed i18n getter properties not materializing before storage, causing variables to vanish on reload
- Slate Thread Safety: Resolved race condition in Slate tools that caused duplicate slates during parallel AI turns
- Slate DOCX: Fixed deletion visibility and baseline corruption in tracked changes
- Private Sync Stability: Eliminated bouncing and vanishing edits from sync conflicts
Version 0.9.61 (February 10, 2025)
Improvements
Slate Editor i18n
- Translated Toolbar & Menus: All Slate editor toolbar buttons, context menus, and dialog strings are now fully translated
Bug Fixes
- OAuth Fetch Timeouts: All OAuth token exchange and refresh requests now have a 15-second timeout, preventing infinite hangs on network stalls
- GitHub Private Email: Fixed GitHub connection failing when the user's profile email is private (now fetched from /user/emails API)
- OAuth Connection Dialog: Generalized OAuth connection dialog and fixed multiple token/connection bugs
- Password Length Consistency: Synchronized 12-character minimum password requirement across all signup and reset surfaces
Version 0.9.6 (February 8, 2025)
New Features
Internationalization (20+ Languages)
- Full i18n Support: Caiioo is now available in 20+ languages including English, Spanish, French, German, Japanese, Korean, Chinese, Arabic, Hebrew, Hindi, and more
- RTL Language Support: Full right-to-left layout for Arabic, Hebrew, and Urdu
- UI Language Setting: Choose your preferred language in Settings — all UI elements, tool labels, and status messages are translated
Improvements
Settings Panel Performance
- Lazy-Loaded Sections: Settings panel sections now load on-demand, reducing initial render time
- Modular Storage: Settings are stored in granular per-section keys instead of a single monolithic blob, improving read/write performance
- Typed Getters: Internal settings access uses strongly-typed getters with change granularity tracking
Slate Track Changes Reliability
- 17 Revision Manager Fixes: Comprehensive hardening of the track changes system across diff computation, acceptance, rejection, and persistence
- Plain Text Diffing: Redline changes now diff plain text instead of raw markdown, producing cleaner and more accurate change highlights
- Persistent User Changes: User-made tracked changes now survive page refresh
- Race Condition Fix: Force-bake tracked changes before save to prevent data loss
Bug Fixes
- Service Worker Crashes: Replaced 112+ dynamic import() calls with static imports to prevent Chrome service worker crashes
- Mode Welcome Messages: Fixed language, provisioned key detection, and persistence issues in mode welcome messages
- Google Tool Account Selection: Account picker now dynamically reflects actually connected accounts
- DOCX Nested Lists: Fixed display markers and export corruption for nested list round-trips
- LaTeX Math Rendering: Fixed currency dollar escaping breaking LaTeX math expressions starting with numbers
- Tab Group Creation: Deferred lazy tab group creation until the web browsing tool is actually used
- Agenda OAuth: Re-throw OAuth errors so just-in-time authorization triggers correctly
- Settings Persistence: Added missing metadata entries for 6 settings that silently failed to save
Version 0.9.5 (February 6, 2025)
Security Hardening
- Content Script Origin Validation: Messages from web pages to the extension are now restricted to caiioo.ai origins only, with strict same-origin checks preventing cross-origin message injection
- CSP Tightened: Removed development-only localhost script sources from the extension pages Content Security Policy
- Auth Response Scoping: Extension auth responses are now sent to the specific page origin instead of broadcasting to all frames
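An origin check of the kind described above is usually applied to the origin field of an incoming message event before the payload is read. The helper below is a sketch under stated assumptions. The name isTrustedOrigin is hypothetical, and whether subdomains of caiioo.ai are accepted is a guess that the notes do not confirm.

```typescript
// Hypothetical origin check: accepts only HTTPS origins on caiioo.ai or
// (as an assumption) its subdomains.
function isTrustedOrigin(origin: string): boolean {
  let url: URL;
  try {
    url = new URL(origin);
  } catch {
    return false; // opaque origins such as "null" are not valid URLs
  }
  if (url.protocol !== "https:") return false;

  // Compare whole host labels so "evilcaiioo.ai" and "caiioo.ai.evil.com"
  // are both rejected.
  return url.hostname === "caiioo.ai" || url.hostname.endsWith(".caiioo.ai");
}
```

Parsing the origin and comparing the hostname avoids the classic substring mistake, where a check like origin.includes("caiioo.ai") would accept attacker-controlled hosts.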
Version 0.9.4 (February 4, 2025)
New Features
Multilingual Speech-to-Text
- Language Selection: Choose your STT language in Settings for accurate non-English voice input
- 90+ Languages Supported: Works with Whisper and ElevenLabs for comprehensive language coverage
Real-Time Voice Activity Detection
- Low-Latency VAD: New Silero VAD v5 integration detects speech in real-time with minimal delay
- Smarter Recording: Recording automatically starts and stops based on voice activity
- Reduced False Positives: Better distinction between speech and background noise
Suggestions Visibility Toggle
- Hide/Show Suggestions: New toggle to hide AI follow-up suggestions when you want a cleaner interface
- Persistent State: Your preference is saved and remembered across sessions
Improvements
Track Changes Auto-Bake
- Diff-Based Tracking: More efficient change tracking using diff algorithms
- Auto-Save: Changes are periodically saved to prevent data loss during long editing sessions
- Snapshot on Exit: Exiting track changes mode automatically bakes all pending changes
Enhanced Provider Streaming
- Gemini Improvements: Better streaming and reasoning capabilities for Google Gemini models
- Consistent Behavior: Unified streaming behavior across OpenRouter and native providers
Version 0.9.3 (January 31, 2025)
New Features
Guided Onboarding Walkthrough
- Interactive UI Tour: New step-by-step walkthrough guides new users through every part of the interface after entering their API key
- Three Phases: Covers the composer (11 steps), settings panel (7 steps), and mode system (5 steps)
- Spotlight Effect: Each step highlights the relevant UI element with a focused spotlight
- Conversational Onboarding: AI-guided profile building to personalize your experience from the start
Prompt History
- Access Previous Prompts: Your recent prompts are saved and can be accessed in the composer
- Quick Reuse: Easily reuse or modify previous messages
Ad Blocker Levels
- Granular Control: Ad blocking now supports multiple levels instead of just on/off
- Choose Your Protection: Select the level of blocking that works for your browsing needs
DOCX Review Toolbar
- Track Changes Controls: New toolbar when viewing Word documents with tracked changes
- Accept/Reject Actions: Easily review and resolve document edits
Improvements
Thread Retention Settings
- Configurable Cleanup: Set how long to keep old threads before automatic cleanup
- Storage Management: Better control over your conversation history
Caiioo Animation
- Performance Optimizations: Smoother animation with improved rendering
- Wetness Effects: New visual overlay showing water saturation
- Direct Particle Rendering: Cleaner visuals with optimized particle drawing
Profile Switching
- Enhanced State Management: More reliable profile switching with improved protocol handling
- Better Sync: Profile changes sync correctly across the extension
Google Calendar
- Token Management: Improved access token handling for calendar operations
- More Reliable Sync: Better OAuth flow for calendar integration
Version 0.9.2 (January 26, 2025)
New Features
Ad & Tracker Blocking
- Built-in Ad Blocker: Block ads and trackers using the Ghostery engine with MV3-compatible declarativeNetRequest
- Toggle in Settings: Enable or disable ad blocking from the Settings panel
- Filter List Updates: Automatic caching of filter lists for reliable blocking
Just-in-Time Google Permissions
- Incremental Authorization: Google tool permissions are now requested only when needed, not upfront
- Clearer Scope Management: Missing scopes trigger helpful error messages with options to grant access
- Better Privacy: Only request the specific Google scopes required for each tool
Improvements
Caiioo Animation
- Enhanced Physics: Improved particle dynamics with better elevation, speed, and density forces
- Smoother Flow: Particles now follow channel direction on spawn, reducing clumping
- Realistic Stacking: Particles stack naturally when blocked by pebbles or dams
Google OAuth Flow
- Faster Sign-In: Now prioritizes ID token retrieval for faster authentication
- Improved Reliability: Better token handling reduces auth failures on non-Chrome browsers
Onboarding & Settings
- Preview Mode: Test onboarding flows without clearing user data
- Cleaner Free Tier: Removed deprecated provisioned API key restoration UI
Internal
- Minigame System: New MinigameContainer infrastructure for interactive intro experiences
- Hidden Easter Egg: Minigame visibility state persisted across sessions
Version 0.9.1 (January 25, 2025)
New Features
Interactive Water Simulation
- Caiioo Intro: New interactive fluid simulation on the loading screen - watch water flow through a dynamic S-shaped channel
- Draggable Pebbles: Move pebbles around to redirect water flow and create dams
- Sediment Dynamics: Realistic erosion and deposition - fast water picks up sediment, slow water deposits it
- Theme-Adaptive: Water colors automatically match your chosen theme tint
Vega/Vega-Lite Chart Support
- Interactive Visualizations: View and edit Vega and Vega-Lite charts directly in Slate
- Data Visualization: Create bar charts, line graphs, scatter plots, and complex multi-layer visualizations
- Spec Editing: Edit the JSON specification and see changes rendered in real-time
Mermaid Diagram Support
- Diagram Types: Create flowcharts, sequence diagrams, class diagrams, state diagrams, and more
- Live Preview: Edit Mermaid syntax with instant visual preview
- Export Options: Diagrams render as SVG for crisp output at any size
File Creation Templates
- Quick Create Menu: New dropdown menu when creating files in Slate with templates for common file types
- Template Categories: Markdown, code files, data formats, diagrams, and more
- One-Click Start: Jump straight into a new document with the right structure
Improvements
API Error Handling
- Visual Notifications: API errors now display as dismissible toast notifications
- Auto-Dismiss: Non-critical errors (like cancellations) automatically clear after a few seconds
- Clearer Messages: Better error messages help identify and resolve issues faster
Platform Capabilities
- Smart Feature Detection: Features that require specific platforms (like Apple Calendar on macOS) are now detected automatically
- Graceful Fallbacks: Tools adapt to your environment rather than failing silently
- Apple Tool Improvements: Apple Reminders now supports uncomplete and list actions
Safari & Non-Chrome Browsers
- Better OAuth Flow: Improved browser detection for Google sign-in
- Fallback Mechanism: Non-Chrome browsers now have a more reliable authentication path
Version 0.9.0 (January 22, 2025)
New Features
Automatic Data Cleanup
- Storage Management: Old threads and attachments are automatically cleaned up based on your retention preferences
- Configurable Policies: Set how long to keep data before automatic cleanup
GitHub Sync
- Backup to GitHub: Sync your Caiioo data to a GitHub repository for backup and cross-device access
- Smart Conflict Resolution: Changes from multiple devices are automatically merged without data loss
- Selective Sync: Control which data is synced with .gitignore-style patterns
GitHub Tool
- AI GitHub Integration: The AI can now interact with GitHub on your behalf - create issues, browse repositories, manage pull requests, and more
- Repository Browsing: Ask the AI to explore codebases, find files, and understand project structure
Improvements
Browser Automation
- Enhanced Page Interaction: More reliable clicking, scrolling, and form filling on complex web pages
- Smarter Tool Selection: The AI now picks the right tools for each task more accurately
Document Handling
- Better Word Documents: Improved handling of tables, lists, and formatting in DOCX files
- Google Slides: More control over slide formatting, shapes, and layouts
Settings & UI
- Expanded Settings Panel: More configuration options with better organization
- Improved Thread List: Better sorting and filtering of your conversations
- Location Permission: Clearer flow when granting location access for location-based queries
Version 0.8.9 (January 19, 2025)
New Features
Google Slides Integration
- AI-Powered Presentations: New Google Slides tool allows the AI to create, read, and modify slide presentations
- Full Slide Control: Create slides, add text boxes, images, shapes, and tables
- Template Support: Use built-in templates or work from blank presentations
- Collaborative Editing: Works with your existing Google account connection
Improvements
Code Quality
- TypeScript Fixes: Cleaned up type errors and unused imports across the codebase
- Test Coverage: Updated test fixtures to match current type definitions
Version 0.8.8 (January 17, 2025)
New Features
Resemble.ai Text-to-Speech
- Professional Voice Synthesis: New Resemble.ai integration for high-quality AI voice generation
- Streaming Audio: Real-time audio streaming for responsive voice output
- Configurable Voices: Select from multiple professional voice options
- Truncation Warnings: Clear feedback when long text is truncated for synthesis
Calendar Sync Service
- Background Sync: Automatic calendar synchronization with Google Calendar
- Incremental Updates: Only changed events are synced for efficiency
- Alarm-Based Scheduling: Reliable sync scheduling using Chrome alarms
Improvements
Desktop App (Electron)
- Wake Detection: System now detects when your Mac wakes from sleep and refreshes OAuth tokens automatically
- Improved Reliability: OAuth connections stay fresh even after extended sleep periods
Rich Composer Input
- Enhanced Composition: Improved message input with better formatting support
- Tab Autocomplete: More responsive tab reference suggestions
Version 0.8.71 (January 15, 2025)
New Features
Thread Search
- Search Your Conversations: New search box in the thread list to quickly find threads by title or content
- Instant Filtering: Type to filter - matching threads appear immediately with search highlighting
- Smart Debouncing: Search is optimized to not lag even with hundreds of threads
Physics Simulation Tool
- AI-Powered Physics: New physics tool lets the AI perform physics calculations and simulations
- Projectile Motion: Calculate trajectories, predict collisions, and solve motion problems
- Structural Analysis: Analyze stress, beam bending, buckling, and stability of structures
- Material Properties: Built-in database of common engineering materials (steel, aluminum, wood, concrete, etc.)
- Physics Formulas: Kinetic energy, momentum, force, impulse calculations
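The formulas listed above are standard textbook mechanics. The sketch below writes a few of them out as plain functions to show the kind of calculation involved. The function names are illustrative and are not the physics tool's actual interface. The range formula assumes level ground and no air resistance.

```typescript
const STANDARD_GRAVITY = 9.81; // m/s^2, approximate value near Earth's surface

// Kinetic energy in joules: E = 1/2 * m * v^2
function kineticEnergy(massKg: number, speedMs: number): number {
  return 0.5 * massKg * speedMs ** 2;
}

// Linear momentum in kg*m/s: p = m * v
function momentum(massKg: number, speedMs: number): number {
  return massKg * speedMs;
}

// Impulse in N*s for a constant force: J = F * t
function impulse(forceN: number, durationS: number): number {
  return forceN * durationS;
}

// Horizontal range in metres of a projectile launched from and landing on
// level ground, ignoring drag: R = v^2 * sin(2 * theta) / g
function projectileRange(
  speedMs: number,
  angleDeg: number,
  gravity: number = STANDARD_GRAVITY,
): number {
  const theta = (angleDeg * Math.PI) / 180;
  return (speedMs ** 2 * Math.sin(2 * theta)) / gravity;
}
```

For example, a 2 kg body moving at 3 m/s has 9 J of kinetic energy and 6 kg·m/s of momentum.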
Cairn World Builder Enhancements
- Physics Engine: Full Rapier3D physics integration for realistic simulations
- Joints & Constraints: Create hinges, sliders, ball joints, and fixed connections between bodies
- Sensors & Triggers: Define sensor regions that detect when objects enter/exit
- Character Controller: First-person character with gravity, jumping, and collision response
- Game Mode: Real-time physics with fixed timestep for interactive exploration
Improvements
Private Sync v2
- Incremental Sync: Each thread and attachment is now synced individually rather than as one large file
- Faster Sync: Only changed items are uploaded, dramatically reducing sync time for large libraries
- Better Conflict Handling: Per-item vector clocks enable more precise merge resolution
- Reduced API Calls: Smart diffing means fewer Google Drive API requests
Sync Reliability
- Extension/Server Parity: Fixed attachment storage to properly track vector clocks on both platforms
- Tombstone Filtering: Deleted profiles are now correctly hidden on both extension and server
Version 0.8.70 (January 14, 2025)
New Features
Private Sync (Free Tier)
- Cross-Device Sync: Sync your threads, settings, and attachments across all your devices via Google Drive
- End-to-End Encryption: All synced data is encrypted with your passphrase before leaving your device
- Automatic Background Sync: Changes sync automatically every 30 seconds with smart debouncing
- Conflict Resolution: CRDT-style vector clocks ensure changes merge correctly across devices
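Vector clocks let two devices tell whether one copy of an item strictly supersedes the other or whether both were edited concurrently and must be merged. The sketch below is the textbook comparison and merge. The type and function names are illustrative and are not taken from Caiioo's code.

```typescript
// Maps a device ID to the number of edits that device has made to an item.
type VectorClock = Record<string, number>;
type ClockOrdering = "before" | "after" | "equal" | "concurrent";

// Compares clock `a` against clock `b`. Missing entries count as zero.
function compareClocks(a: VectorClock, b: VectorClock): ClockOrdering {
  let aAhead = false;
  let bAhead = false;
  const devices = Object.keys(a).concat(Object.keys(b));
  for (const device of devices) {
    const aCount = a[device] ?? 0;
    const bCount = b[device] ?? 0;
    if (aCount > bCount) aAhead = true;
    if (bCount > aCount) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent"; // both sides have unseen edits
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// After a merge, the new clock takes the per-device maximum of both inputs.
function mergeClocks(a: VectorClock, b: VectorClock): VectorClock {
  const merged: VectorClock = { ...a };
  for (const [device, count] of Object.entries(b)) {
    merged[device] = Math.max(merged[device] ?? 0, count);
  }
  return merged;
}
```

Only the "concurrent" result calls for a content-level merge. With "before" or "after", the newer copy can replace the older one outright.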
Improvements
Private Sync Efficiency
- Reduced Polling: Sync interval increased from 3s to 30s to reduce API calls
- Smart Debouncing: Waits 10 seconds after changes settle before syncing
- In-Flight Protection: Threads being actively processed by the AI are excluded from sync until complete
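The three rules above can be combined into a single check that runs on each 30-second poll. The sketch below is one possible reading. It applies the 10-second settle window per thread, which is an assumption, and the ThreadState fields and function name are hypothetical.

```typescript
const SETTLE_MS = 10 * 1000; // wait this long after the last change

// Hypothetical per-thread bookkeeping, with times in milliseconds.
interface ThreadState {
  id: string;
  lastChangedAt: number;
  lastSyncedAt: number;
  inFlight: boolean; // true while the AI is still processing the thread
}

// Returns the IDs of threads that should be uploaded on this poll.
function threadsReadyToSync(threads: ThreadState[], now: number): string[] {
  return threads
    .filter((t) => !t.inFlight) // in-flight protection
    .filter((t) => t.lastChangedAt > t.lastSyncedAt) // has unsynced changes
    .filter((t) => now - t.lastChangedAt >= SETTLE_MS) // changes have settled
    .map((t) => t.id);
}
```

A thread skipped on one poll, because it is in flight or still changing, is picked up by a later poll once it qualifies.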
Version 0.8.69 (January 14, 2025)
New Features
Slate Revision Navigation
- Navigate Between Changes: New navigation buttons to jump between pending revision chunks in the editor
- Change Counter: Visual indicator shows current position (e.g., "2 of 5") within pending revisions
- Keyboard Shortcuts: Use Alt+Up/Down to quickly navigate between chunks without leaving the keyboard
Improvements
Document Processing
- Remote OCR Fallback: When local PDF text extraction fails or produces poor results, documents are automatically processed via cloud OCR for improved accuracy
- Processing Status: Real-time feedback shows when documents are being processed remotely
Reasoning Model Support
- Multi-Turn Reasoning: Better caching of reasoning details across conversation turns, improving continuity for extended thinking models
- Cleaner Messages: Internal system notes are now stripped from rendered messages, preventing instruction leakage
Version 0.8.68 (January 11, 2025)
New Features
Long-Term Memory
- Context Persistence: AI now maintains long-term memory across conversations, remembering important context about your preferences and workflows
- Usage Tracking: Enhanced tracking of token usage and costs with detailed logging
Model Intelligence
- Smart Model Selection: New model intelligence features help identify optimal models based on your usage patterns
- Ranking System: Models are ranked by performance and value metrics
Improvements
Storage Reliability
- Cross-Platform Storage: Model cache, tool approvals, and learned pages now work reliably across extension, server, and LAN modes
- Auth Timeout: Fixed potential hang when connecting to relay server - now times out gracefully after 5 seconds instead of waiting indefinitely
- Settings Load Speed: Settings and model picker now load faster after extension reload
Content Pagination
- Proper Page Sizing: Web page content is now split based on the actual model's context window (e.g., 131k tokens), not a hardcoded default. This means you see larger page chunks and fewer pages when using high-context models.
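Sizing pages from the model's context window can be sketched as follows. The four-characters-per-token estimate, the share of the window given to a single page, and the function name are all assumptions chosen to keep the example small. A real implementation would count tokens with the model's tokenizer.

```typescript
const CHARS_PER_TOKEN = 4; // rough heuristic, not a real tokenizer

// Splits `text` into pages sized from the model's context window.
// `pageShare` is the fraction of the window a single page may occupy.
function paginateByContextWindow(
  text: string,
  contextWindowTokens: number,
  pageShare: number = 0.5,
): string[] {
  if (contextWindowTokens <= 0 || pageShare <= 0 || pageShare > 1) {
    throw new RangeError("contextWindowTokens must be > 0 and pageShare in (0, 1]");
  }
  const pageChars = Math.max(
    1,
    Math.floor(contextWindowTokens * pageShare * CHARS_PER_TOKEN),
  );
  const pages: string[] = [];
  for (let start = 0; start < text.length; start += pageChars) {
    pages.push(text.slice(start, start + pageChars));
  }
  return pages;
}
```

Under these assumptions a 131k-token model gets pages of roughly 262,000 characters, so most web pages fit in a single chunk, while a small-context model receives many shorter pages of the same text.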
Bug Fixes
- API Key Persistence: Fixed issue where OpenRouter API key would be "forgotten" after briefly enabling then disabling LAN mode
Version 0.8.67 (January 7, 2025)
New Features
Kokoro TTS - Local Neural Text-to-Speech
- High-Quality Voices: Kokoro is a local neural TTS model with natural-sounding speech across multiple voices
- No API Key Required: Runs entirely on-device using WebGPU/WASM - no cloud services or API keys needed
- Multiple Voices: Choose from American, British, and other accent options with male/female variants
- Long Text Support: Properly handles long text via streaming synthesis - no more cutoffs at 30 seconds
- Clean Speech: Automatically strips markdown formatting (bold, italics, etc.) before speaking
Improvements
Voice Settings
- Unified Voice Section: TTS and STT settings consolidated in a cleaner layout
- Voice Preview: Test selected voice before using it
Version 0.8.66 (January 7, 2025)
Improvements
Settings Panel Search
- Filter Search: New search box at the top of Settings to quickly filter sections by keyword
- Instant Results: Type to filter - matching sections appear immediately
- Keyboard Friendly: Search is auto-focused when opening settings
MCP Server Reliability
- Startup Verification: MCP servers are now verified as running before returning success
- Better Error Messages: When MCP servers crash during startup, the actual error is shown instead of generic failure
- Fixed Examples: Corrected example package names to use @modelcontextprotocol/server-* (not @anthropic/mcp-server-*)
macOS App Authentication
- Self-validating Tokens: Fixed 401 errors when adding MCP servers before WebSocket state sync
- Faster Auth: HTTP endpoints no longer require waiting for WebSocket connection
Version 0.8.65 (January 6, 2025)
New Features
ElevenLabs Voice Integration (BYOK)
- Cloud TTS: High-quality text-to-speech using ElevenLabs - choose from multiple voices and models
- Cloud STT: Scribe transcription with real-time streaming (~150ms latency) and 90+ language support
- Voice Selection: Browse and select from ElevenLabs voice library directly in Settings
- Model Options: Choose between Flash v2.5 (ultra-fast ~75ms), Turbo v2.5, or Multilingual v2 (best quality)
- Bring Your Own Key: Uses your ElevenLabs API key - no additional cost from Caiioo
Multilingual Whisper Model
- Whisper Tiny Multilingual: New local STT option supporting 99 languages (~39MB download)
- Same Size as English-only: Same compact 39MB size as Whisper Tiny English
- Language Detection: Automatically detects spoken language
Google Docs Enhanced Reading & Writing
- Markdown by Default: Text is now formatted as markdown by default when writing. Use useMarkdown: false for plain text insertions that preserve existing formatting.
- Rich Formatting: Converts markdown headings, bold, italic, strikethrough, and links to native Google Docs styles
- Lists & Tables: Supports ordered/unordered lists with nesting and markdown tables
- Suggestion Tracking: Pending suggestions shown with semantic tags: <ins>added text</ins> for insertions, <del>removed text</del> for deletions. Adjacent tags indicate replacements.
- Inline Comments: Comments appear inline with author attribution: <comment author="Name" on="quoted text">content</comment> with nested <reply> tags for threads
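To make the suggestion tagging concrete, the sketch below turns a list of document segments into the tagged text the model reads. The Segment type and renderSuggestions name are hypothetical, and the real tool works from the Google Docs API response, not from this simplified structure. The text is not escaped, to keep the example short.

```typescript
// Hypothetical, simplified view of a paragraph with pending suggestions.
type Segment =
  | { kind: "text"; text: string }
  | { kind: "insertion"; text: string }
  | { kind: "deletion"; text: string };

// Renders segments with <ins>/<del> tags around pending suggestions.
function renderSuggestions(segments: Segment[]): string {
  return segments
    .map((segment) => {
      switch (segment.kind) {
        case "insertion":
          return `<ins>${segment.text}</ins>`;
        case "deletion":
          return `<del>${segment.text}</del>`;
        default:
          return segment.text; // accepted text is passed through untagged
      }
    })
    .join("");
}
```

A deletion immediately followed by an insertion renders as adjacent tags, which is how a replacement appears in the output.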
Improvements
Gemini Extended Thinking
- Improved Thought Signatures: Better handling of Gemini's thought_signature across streaming chunks, improving extended thinking continuity with multi-turn tool use
Version 0.8.64 (January 5, 2025)
New Features
Google Sheets Cell Metadata
- Read Hyperlinks & Notes: Use includeMetadata: true to retrieve hyperlinks, notes, and data validation rules from cells
- Add Hyperlinks: New update_cell_metadata action to add clickable hyperlinks with custom display text
- Add Notes: Attach notes/comments to cells programmatically
- Data Validation: Create dropdowns, number ranges, text validation, and custom formula rules on cells
Safari Tiling (macOS)
- Smart Window Positioning: When opening links from the sidepanel, Safari windows automatically position next to the sidepanel for easy side-by-side browsing
- Screen Space Optimization: Tiling logic calculates optimal Safari placement based on available screen space
- Re-tile on Mode Change: Safari windows automatically reposition when the sidepanel changes modes
Copy/Paste in macOS Sidepanel
- Full Copy/Paste Support: Copy and paste now works reliably in the macOS sidepanel app
- System Keyboard Shortcuts: Standard ⌘C/⌘V shortcuts work as expected
Improvements
macOS Server Reliability
- Signal Handling: Improved handling of pipe signals to prevent unexpected app termination
- Restart Reliability: Server stop and restart operations are now more reliable with proper cleanup
- Connection Stability: Better handling of OAuth token refresh with retry logic and exponential backoff
Safari Extension
- Stable Browser IDs: Safari extension now generates stable browser IDs to prevent duplicate entries during reconnections
- Cleaner Reconnection: Server-side browser registration uses client-provided stable IDs for cleaner reconnection handling
Model Selector
- Scroll to Selected: When opening the model dropdown, it now automatically scrolls to the currently selected model
Tab References in macOS App
- Works in Sidepanel: Tab references and context now work in the native macOS sidepanel, not just the Chrome extension
Version 0.8.63 (December 31, 2025)
New Features
Voice Output (Text-to-Speech)
- Read Aloud: AI responses can now be read aloud using Microsoft Edge TTS
- Auto-Play Option: Enable automatic reading of new AI responses in Settings
- Speed Control: Adjust playback speed from 0.5x to 2x
- Pause/Resume: Control playback with pause and resume buttons on each message
Local Speech-to-Text (Whisper)
- Whisper Upgrade: Opt into local Whisper transcription for more accurate voice input
- Model Download: Download the Whisper Tiny model (~40MB) for offline use
- Privacy: Audio processed locally, never sent to external servers
- Fallback: Falls back to Web Speech API if Whisper unavailable
Unified Agenda Tool
- Single Tool: New agenda tool consolidates calendar and reminder operations
- Multi-Provider: Works with Google Calendar, Apple Calendar, Apple Reminders, and Caiioo reminders
- Simpler for Agents: One tool interface for all scheduling needs
Improvements
Settings Panel
- Persistent Collapse State: Section open/closed states are now remembered across sessions
- Voice Settings: New section for configuring TTS and STT preferences
Version 0.8.62 (December 31, 2025)
New Features
DOCX List Support
- Numbered Lists: Word documents with numbered lists now render correctly with proper formatting
- Bulleted Lists: Bullet point lists are preserved and displayed accurately
- Nested Lists: Multi-level list indentation is maintained in the HTML preview
CSV Export for Spreadsheets
- Export as CSV: XLSX files can now be exported as CSV for easy data extraction
- Format Selection: Choose between XLSX or CSV when downloading spreadsheet attachments
Version 0.8.61 (December 30, 2025)
New Features
Tool Approval Workflow
- Interactive Approval: Certain tools now require user approval before execution - you'll see a modal asking to approve or deny the action
- Status Tracking: Tool executions now show 'pending approval' and 'denied' states in the timeline
- Safe by Default: Sensitive operations wait for explicit user consent before proceeding
Current Location Variable
- {{currentLocation}}: New variable for adding your current location context to prompts
- Geolocation Permission: Requires browser geolocation permission when first used
- Context Aware: Great for location-based queries like "restaurants near me" or travel planning
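As a rough illustration of how a variable such as {{currentLocation}} might be expanded into a prompt, here is a minimal sketch. The expandVariables function and the example location value are hypothetical; the notes do not describe how Caiioo performs the substitution.

```typescript
// Hypothetical sketch: expanding {{variable}} placeholders in a prompt.
// `expandVariables` is an illustrative name, not a Caiioo API.
function expandVariables(prompt: string, values: Map<string, string>): string {
  // Matches {{name}} where name consists of letters, digits, or underscores.
  return prompt.replace(/\{\{(\w+)\}\}/g, (placeholder: string, name: string) => {
    const value = values.get(name);
    // Unknown variables are left untouched so the gap stays visible.
    return value !== undefined ? value : placeholder;
  });
}

const values = new Map<string, string>([["currentLocation", "Lisbon, Portugal"]]);
const expanded = expandVariables("Find restaurants near {{currentLocation}}", values);
// expanded === "Find restaurants near Lisbon, Portugal"
```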
Batch Thread Management
- Multi-Select Mode: Toggle selection mode to pick multiple threads at once
- Batch Delete: Delete selected threads in a single action
- Batch Archive/Export: Archive or export multiple threads simultaneously
Image Viewer in Slate
- Dedicated Viewer: Images now open in a full-screen viewer within Slate
- Zoom Controls: Zoom in/out and pan around large images
- Download Option: Quick download button for saving images locally
Improvements
Reasoning Display
- Better Aggregation: Model thinking/reasoning blocks are now properly combined without duplication
- Cleaner Display: Reasoning content from extended thinking models displays more reliably
Ollama Integration
- Streaming Reasoning: Real-time streaming of reasoning/thinking content from local Ollama models
- Better Model Handling: Improved compatibility with Mistral models and strict message ordering requirements
Attachment Management
- Orphaned File Cleanup: New dialog in Settings to manage orphaned attachments that aren't linked to any thread
- Assign to Thread: Move orphaned attachments to existing threads
- Bulk Deletion: Clean up orphaned files to free up storage space
Slate Defaults
- Markdown by Default: When creating a new Slate without specifying type, markdown (.md) is now the default format
Version 0.8.6 (December 19, 2025)
New Features
Wait Action for Browser Automation
- Discrete Wait Types: New wait action in browser automation with 4 specialized wait modes:
  - timeout - Simple delay (default 1000ms, max 30000ms) for basic timing
  - selector - Wait for element to appear or disappear (useful for spinners/loaders)
  - network_idle - Wait for fetch/XHR requests to settle (extension-only)
  - animation - Wait for CSS animations and transitions to complete
- Smart Element Visibility: Selector wait checks display, visibility, opacity, and offsetParent for accurate visibility detection
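The four wait modes and the timeout limits can be modeled along these lines. This is a sketch under stated assumptions: the field names (mode, ms, selector, state) and both helper functions are hypothetical; only the mode names, the 1000ms default, the 30000ms maximum, and the extension-only restriction come from the notes above.

```typescript
// Hypothetical parameter shape for the wait action; field names are assumptions.
type WaitAction =
  | { mode: "timeout"; ms?: number }
  | { mode: "selector"; selector: string; state: "appear" | "disappear" }
  | { mode: "network_idle" }
  | { mode: "animation" };

const DEFAULT_TIMEOUT_MS = 1000; // default stated in the release notes
const MAX_TIMEOUT_MS = 30000; // maximum stated in the release notes

/** Applies the default and clamps the requested delay to the maximum. */
function resolveTimeoutMs(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested) || requested < 0) {
    return DEFAULT_TIMEOUT_MS;
  }
  return Math.min(requested, MAX_TIMEOUT_MS);
}

/** Describes what a wait would do; network_idle is rejected outside the extension. */
function describeWait(action: WaitAction, inExtension: boolean): string {
  switch (action.mode) {
    case "timeout":
      return `sleep ${resolveTimeoutMs(action.ms)}ms`;
    case "selector":
      return `wait for "${action.selector}" to ${action.state}`;
    case "network_idle":
      return inExtension
        ? "wait for fetch/XHR requests to settle"
        : "unsupported: network_idle is extension-only";
    case "animation":
      return "wait for CSS animations and transitions to complete";
  }
}
```

Modeling the modes as a discriminated union lets the compiler check that each mode carries only the parameters that make sense for it.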
Improvements
Browser Automation Architecture
- Unified Script Execution: New executeInPage abstraction ensures consistent behavior across Chrome extension and relay/server contexts
- MAIN World Execution: Proper MAIN world script execution for operations that need to intercept page-level JavaScript (fetch, XHR)
- Graceful Degradation: Extension-only features now provide helpful error messages with alternatives when used in relay mode
Bug Fixes
OAuth Token Refresh
- Chrome Extension Token Refresh: Fixed stale token issue where Chrome's cached OAuth tokens weren't being refreshed properly. Tokens are now verified and stale tokens are cleared before retry.
- PKCE Authorization Flow: OAuth popup now uses authorization code flow with PKCE instead of implicit flow, enabling proper refresh token support without requiring a client secret.
- Direct Token Refresh: Connections with refresh tokens can now be refreshed directly using Google's token endpoint, without requiring a backend server.
- Token Validation: Added token verification step to catch revoked/invalid tokens early and trigger automatic re-authentication.
Version 0.8.5 (December 19, 2025)
New Features
Apple Calendar Integration
- Native Calendar Access: New Apple Calendar tool provides fast, native access to your macOS calendars via EventKit
- Unified Agenda: Combined view of reminders and calendar events from all connected accounts (Google Calendar + Apple Calendar)
- Multi-Account Support: Pull events from multiple Google Calendar accounts and Apple calendars simultaneously
Learned Pages
- Smart Page Learning: Teach Caiioo about specific websites by capturing their structure
- URL Pattern Matching: Learned patterns automatically apply to similar pages on the same site
- DOM Snapshot Storage: Captured page structures help the AI better understand and interact with complex web apps
RTF Document Support
- Rich Text Editing: Upload and edit RTF (Rich Text Format) documents directly in Slate
- Bidirectional Conversion: Convert between RTF and HTML while preserving formatting
- Export Options: Download edited documents as RTF for use in Pages, Word, or other word processors
Improvements
Desktop App Security
- Relay Authentication: Secure HMAC-SHA256 authentication between extension and desktop server
- Per-User Tokens: Auth tokens are now tied to user identity for multi-user security
- Protected Endpoints: All sensitive API endpoints now require authentication
Performance
- Swift Helpers: Native Swift binaries for Apple Reminders, Calendar, and Notes provide 10x faster access than AppleScript
- Direct Callers: When running in desktop context, Apple tools bypass HTTP relay for lower latency
- Shared Utilities: Consolidated relay API client eliminates code duplication
Bug Fixes
- Calendar Event Deduplication: Events appearing in multiple calendars are now properly deduplicated in the unified view
- Prompt Caching: Fixed cache control markers being stripped during token estimation
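The calendar event deduplication fix above can be pictured with a small sketch. The matching rule used here (same title, start, and end) and every name in the block are assumptions for illustration; the notes do not say how Caiioo decides that two events are the same.

```typescript
// Hypothetical event shape and matching rule, for illustration only.
interface CalendarEvent {
  title: string;
  start: string; // ISO 8601 timestamp
  end: string; // ISO 8601 timestamp
  calendar: string;
}

/** Keeps the first occurrence of each event, keyed by normalized title plus start/end. */
function dedupeEvents(events: CalendarEvent[]): CalendarEvent[] {
  const seen = new Map<string, CalendarEvent>();
  for (const event of events) {
    const key = [event.title.trim().toLowerCase(), event.start, event.end].join("|");
    if (!seen.has(key)) {
      seen.set(key, event);
    }
  }
  return Array.from(seen.values());
}
```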
Version 0.8.45 (December 18, 2025)
New Features
- Image Deduplication: When you upload the same image multiple times in a conversation, the AI now recognizes it as a duplicate and references the original instead of processing it again. This saves context tokens and helps the AI understand you're referring to the same image.
Improvements
- Document Deduplication: Improved cross-source document matching - the same document content is now recognized whether it comes from a user upload, Gmail attachment, or web page ingestion.
Bug Fixes
- PDF Auth Errors: Fixed issue where authentication errors (401, 403) when fetching protected PDFs were being masked as generic extraction failures. The actual auth error is now properly surfaced with helpful guidance.
Version 0.8.44 (December 18, 2025)
Bug Fixes
- Reload/Regenerate Button: Fixed issue where clicking reload on an assistant message would show the old response instead of the new one being generated. The UI now properly switches to the new branch during streaming.
Improvements
- Context Window Management: More conservative token estimation (3 chars/token) now used consistently across all pagination and context calculations, reducing the chance of context overflow errors with large web pages.
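The 3 chars/token estimate works out as in the sketch below. Only the ratio comes from the notes; the function names and the pagination helper are hypothetical.

```typescript
// Conservative estimate from the release notes: 3 characters per token.
const CHARS_PER_TOKEN = 3;

/** Estimates tokens from string length (UTF-16 code units), rounding up. */
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

/** Splits text into pages whose estimated size stays within maxTokensPerPage. */
function paginate(text: string, maxTokensPerPage: number): string[] {
  if (!Number.isInteger(maxTokensPerPage) || maxTokensPerPage <= 0) {
    throw new RangeError("maxTokensPerPage must be a positive integer");
  }
  const charsPerPage = maxTokensPerPage * CHARS_PER_TOKEN;
  const pages: string[] = [];
  for (let offset = 0; offset < text.length; offset += charsPerPage) {
    pages.push(text.slice(offset, offset + charsPerPage));
  }
  return pages;
}
```

A lower chars-per-token ratio yields a higher token estimate for the same text, so pages come out smaller and are less likely to overflow the model's context window.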
Version 0.8.43 (December 17, 2025)
Improvements
Timeline & Reasoning Display
- Auto-Collapse on Completion: Reasoning blocks and process timeline now automatically collapse when generation finishes, keeping the conversation clean while still accessible
- Intermediate Thoughts Visible: Agent's "thinking out loud" between tool calls is now displayed inline in the timeline, giving better insight into the agent's decision-making process
- Tool Action Labels: Tool calls now show the specific action in the label (e.g., "Web Browse → click" instead of just "Web Browse")
Tool Results
- Screenshot Display Fix: Screenshots and images no longer auto-expand in tool results - only rendered text content (like search results) auto-expands
Bug Fixes
- Ollama CORS: Fixed connection issues when using Ollama from the Chrome extension - CORS headers are now automatically handled
Version 0.8.41 (December 15, 2025)
Improvements
- Type Safety: Server storage adapter now uses proper TypeScript types instead of the any type for threads, skills, profiles, MCP servers, and license info
- Build System: Backup directory is now opt-in via caiioo_BACKUP_DIR environment variable (no longer hardcoded)
Version 0.8.4 (December 15, 2025)
New Features
DNG/RAW Image Support
- Camera RAW Files: Upload DNG (Digital Negative) files directly from your camera or photo library
- Automatic Preview Extraction: Embedded JPEG previews are extracted from RAW files for fast processing
- Preserve Original Quality: Original RAW data is preserved while AI works with the high-quality preview
Improved Image Handling
- Server-Side Compression: Large images that exceed local compression limits are now processed by the desktop server
- Better Error Feedback: Visual error indicators (red border, alert icon) when image processing fails
- Graceful Fallbacks: Compression automatically falls back to server when offscreen document is unavailable
Improvements
- Shared Agent Architecture: Unified agent runner shared between extension and desktop server for consistent behavior
- Protocol Handler Consolidation: Storage and message handling now uses shared protocol handlers
Version 0.8.3 (December 13, 2025)
New Features
Native Mobile Apps
- Android App: Native Kotlin app (android-app/) with WebView + native bridge channels (CalendarContract, SAF/photo picker, AlarmManager, OkHttp streaming)
- Shared UI: Mobile apps use the same React UI as the web extension for consistent experience
- iOS + Android: Native Swift (iOS) and native Kotlin (Android) with matching bridge APIs
Improvements
- Identity Synchronization: Improved sync of license and profile data between extension and desktop server
- Attachment Management: Better handling of attachments in LAN/relay mode
- Extension Client Tracking: Desktop server now tracks connected extension clients with timestamps
Removed
- Swift Relay App: macOS relay functionality now fully handled by native app server (introduced in 0.8.2)
Version 0.8.2 (December 12, 2025)
New Features
Cross-Platform Desktop Server
- Caiioo Server: New Electron-based desktop application replaces the macOS-only Swift relay app
- Windows Support: Native Windows installer (NSIS) and portable executable
- Linux Support: AppImage and .deb packages for Linux distributions
- Menu Bar Integration: System tray/menu bar app with status indicators and quick controls
Platform-Specific Script Execution
- Unified Script API: New /api/script endpoint auto-detects platform and uses the appropriate script engine
- PowerShell on Windows: Execute PowerShell scripts for system automation on Windows
- Bash on Linux: Execute shell scripts (bash/sh/zsh) on Linux systems
- AppleScript on macOS: Existing AppleScript/JXA support preserved
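The platform auto-detection described above could look something like this sketch. The pickScriptEngine function and the ScriptEngine type are hypothetical; the platform strings are the values Node.js reports in process.platform, which is assumed to be available because the server is Electron-based.

```typescript
// Hypothetical mapping from platform to script engine, following the list above.
type ScriptEngine = "applescript" | "powershell" | "bash";

/** Picks the script engine for a Node.js-style platform identifier. */
function pickScriptEngine(platform: string): ScriptEngine {
  switch (platform) {
    case "darwin": // macOS
      return "applescript";
    case "win32": // Windows, including 64-bit
      return "powershell";
    case "linux":
      return "bash";
    default:
      throw new Error(`Unsupported platform: ${platform}`);
  }
}
```

In a Node.js or Electron process this would be called as pickScriptEngine(process.platform).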
Cross-Platform Helpers
- Desktop Notifications: Display notifications using native APIs on all platforms
- Clipboard Access: Read and write clipboard contents cross-platform
- Active Window Detection: Get the foreground application/window title
- System Info: Retrieve OS, CPU, and memory information
Improvements
- Automated Build Pipeline: Version, icons, and licenses sync from main project during build
- Smaller Distribution: Removed redundant mobile app codebases in preparation for unified builds
Version 0.8.1 (December 11, 2025)
New Features
Flexible Sign-In Options
- Multiple Authentication Methods: Sign in with Google OAuth, email/password, or license key - choose what works best for you
- Account Linking: Link your Google account to an existing email/password account for seamless access across methods
Custom OAuth for MCP Servers
- Bring Your Own OAuth Credentials: For MCP servers that require pre-registered OAuth clients, you can now provide your own client ID and secret
- Dynamic Token Refresh: OAuth tokens are automatically refreshed, ensuring uninterrupted connections to MCP servers
Improvements
- Document Processing Indicators: Clear visual indicators when processing PDFs, Word documents, and Excel spreadsheets
- Settings Navigation by Tier: Settings panel now intelligently shows relevant options based on your subscription tier
- Better OAuth Error Handling: Improved error messages when MCP server OAuth discovery fails
Version 0.8.0 (December 10, 2025)
New Features
MCP Server Browser & Management
- Browse MCP Servers: Search and install MCP servers from both the MCP Registry and npm directly within Settings
- Local MCP Servers: Run MCP servers locally on your machine via the relay app for enhanced privacy and control
- Credential Resolution: MCP servers can now pull credentials from the Credentials Vault for secure authentication
- Health Monitoring: Automatic health checks for connected MCP servers with status indicators
Profile Management
- Multiple Profiles: Create and switch between multiple user profiles
- Profile Switcher: Easy-to-access dropdown for switching profiles in the composer
- Profile Deletion: Remove profiles you no longer need
Google Account Selection
- Account Choice Dialog: When connecting Google services, choose between your Chrome profile account or add a different Google account
- Web OAuth Flow: Option to authenticate via web browser for accounts not signed into Chrome
Enhanced Browser Agent
- Go Back Action: Agent can now navigate back in browser history
- ARIA Snapshot: Capture accessibility tree snapshots for more efficient page analysis with fewer tokens
Mobile Apps
- iOS App: Native iOS client for accessing Caiioo on your local network
- Android App: Native Android client with server discovery and WebView caching
Multi-Device Sync
- Device Identity: Each connected device/extension is tracked with its own identity
- State Synchronization: Real-time state sync across multiple connected extensions
- Web Client Authentication: Secure authentication for web clients connecting to the relay
Improvements
- LLM Provider Key Management: Manage API keys for various LLM providers directly in the Credentials Vault
- MCP Tool Images: MCP tools that return images now display inline in the conversation
- PDF Text Detection: Improved detection of garbled/spaced-out text in PDF extraction with confidence assessment
- CORS Bypass for Images: Images are now fetched via offscreen document to bypass CORS restrictions
Bug Fixes
- Generation State Tracking: Improved final state capture to ensure generation completion is accurately detected
Version 0.7.8 (December 7, 2025)
New Features
Excel Spreadsheet Support
- XLSX Import: Upload Excel spreadsheets (.xlsx) directly to your conversations
- Data Extraction: Spreadsheet content is parsed and made available to the AI for analysis
- Round-Trip Editing: Edit spreadsheets in Slate and export back to XLSX format
What's New Dialog
- Update Notifications: After updating Caiioo, a "What's New" dialog automatically shows release notes for the new version
- Version-Specific Notes: See exactly what changed in your update, with formatted feature lists and improvements
- Non-Intrusive: Dialog only appears once per update, and doesn't show on first install
Version 0.7.6 (December 3, 2025)
New Features
Word Document Support with Tracked Changes
- DOCX Import: Upload Word documents (.docx) directly to your conversations
- Tracked Changes Visible: See insertions (green) and deletions (red strikethrough) with author and date on hover
- Comment Support: Comments are highlighted in yellow with tooltips showing comment text and author
- Slate Editing: View and edit DOCX content in the slate with full tracked changes styling
- Dark Mode Support: All tracked change and comment styles work in both light and dark themes
Version 0.7.4 (December 2, 2025)
New Features
FLUX Image Generator
- AI Image Generation: Generate images from text descriptions using FLUX AI models via OpenRouter
- Image Editing: Edit existing images in your conversation by providing the attachment ID and editing instructions
- Multi-Reference Support: Combine elements from up to 10 images with flux.2-flex model
- Multiple Models: Choose from flux.2-pro (fast, default) or flux.2-flex (max quality)
- Automatic Storage: Generated images are saved to your conversation and displayed inline
- Cost Tracking: Image generation costs are tracked separately and added to thread totals
Version 0.7.3 (December 1, 2025)
New Features
Enhanced Model Selector
- Unified Model Picker: Consistent model selection experience across composer and settings
- Privacy Indicators: Shield icon shows models with Zero Data Retention (ZDR) - your prompts won't be used for training
- Vision Support: Eye icon indicates models that can analyze images
- Recommended Models: Star icon highlights recommended choices (Claude Haiku 4.5, Claude Sonnet 4.5)
- Cost Transparency: See pricing per million tokens directly in the model list - easily spot free models
- Icon Legend: Quick reference in settings explains what each indicator means
Version 0.7.2 (December 1, 2025)
New Features
PDF Document Ingestion
- Upload PDFs Directly: Attach PDF documents to your messages - they're automatically processed via Mistral OCR
- High-Quality Text Extraction: Tables, figures, equations, and formatting are preserved as Markdown
- Agent Document Ingestion: Agents can process PDF URLs they encounter while browsing using the new ingest_document action
- Persistent Storage: Extracted content is stored for future reference without re-processing costs
Bug Fixes
- Large Image Attachments: Images over 5MB are now automatically compressed before sending to LLM APIs, fixing "image exceeds 5 MB maximum" errors with providers like Google/Gemini
Version 0.7.0 (December 1, 2025)
New Features
Multi-Thread Support
- Run Multiple Agents Simultaneously: You can now have up to 3 threads running at the same time
- Visual Running Indicators: Animated dots appear next to thread titles in the sidebar when that thread is actively generating
- Background Processing: Agents work in the background without stealing focus - browse freely while they work
- Thread Isolation: Each thread tracks its own tabs and state independently
Agent Non-Interference
- Stay in Control: When an agent opens tabs or navigates pages, it won't steal your focus if you've moved to a different tab
- Smart Tab Awareness: Agents only work with tabs they created or started with, never following you to new tabs mid-run
Apple Integration (macOS)
- Apple Notes: Read, create, search, and organize notes across folders - AI can help draft and edit notes directly
- Apple Reminders: Manage to-do lists and reminders - create, complete, and organize tasks with AI assistance
Version 0.6.5 (November 30, 2025)
New Features
AI Follow-up Suggestions
- Smart Prompts: After the assistant responds, AI-generated follow-up suggestions appear to help continue the conversation
- Context-Aware: Suggestions are based on the conversation context and what you might want to do next
Improvements
LAN Relay Settings
- Display Network Address: When connected to the relay server, Settings now shows the actual IP address and .local hostname for easy mobile device connection
- Simplified Instructions: Connection info appears automatically once the server is running - no need to check the terminal
Version 0.6.4 (November 29, 2025)
Bug Fixes
Slate Revision System
- Word-Level Accept/Reject: Individual word changes can now be reliably accepted or rejected without breaking subsequent changes
- Stable Change Tracking: Fixed issue where accepting a change would cause other pending changes to become unclickable
- Markdown Rendering in Preview: Change previews now render markdown formatting (bold, italic, etc.) instead of showing raw asterisks
- Infinite Loop Fix: Fixed browser hang when accepting pure text additions
- Complete Diff Display: Change preview now shows all deleted and added words, not just minimal differences
Version 0.6.3 (November 26, 2025)
New Features
- Basic User Default Experience: New users now start as basic users instead of admin
- License Key Upgrade: Added prominent "Upgrade to Pro" button in Settings to unlock admin features with a license key
- Simplified Onboarding: Cleaner first-time user experience focused on core functionality
Version 0.6.2 (November 26, 2025)
Bug Fixes
- Generation Timeout Recovery: Fixed silent failures during long Slate operations where the UI would show "generating" indefinitely
  - Backend now tracks activity during generation and times out after 2 minutes of no progress
  - Streaming chunks, tool execution, and agent decisions all reset the timeout
  - Automatic recovery when service worker restarts mid-generation (orphaned state detection)
  - Error message displayed to user when timeout occurs instead of silent hang
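The activity-based timeout described above can be sketched as a small watchdog. The class name, its methods, and the injectable clock are assumptions for illustration; only the 2-minute limit and the idea that any progress resets the timer come from the notes.

```typescript
// 2 minutes without progress, as stated in the release notes.
const INACTIVITY_LIMIT_MS = 2 * 60 * 1000;

/** Hypothetical inactivity watchdog; the clock is injectable so it can be tested. */
class GenerationWatchdog {
  private readonly now: () => number;
  private readonly limitMs: number;
  private lastActivityAt: number;

  constructor(now: () => number = () => Date.now(), limitMs: number = INACTIVITY_LIMIT_MS) {
    this.now = now;
    this.limitMs = limitMs;
    this.lastActivityAt = now();
  }

  /** Call on every streaming chunk, tool execution, or agent decision. */
  recordActivity(): void {
    this.lastActivityAt = this.now();
  }

  /** True once the limit has elapsed with no recorded activity. */
  hasTimedOut(): boolean {
    return this.now() - this.lastActivityAt >= this.limitMs;
  }
}
```

Because the timer measures time since the last activity rather than total elapsed time, a long generation that keeps making progress is never cut off.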
Version 0.6.1 (November 26, 2025)
New Features
Thread Import/Export
- Export Selected Threads: Toggle selection mode to pick specific threads to export
- Export All Threads: One-click export of all conversations with attachments
- Import from File: Import threads from exported JSON files
- Selective Import: Choose which threads to import from a file
- Duplicate Detection: Automatically detects threads that already exist with option to skip or overwrite
- Attachment Support: All images, PDFs, and files are included in exports
- Archived Thread Support: Both active and archived threads can be exported/imported
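Duplicate detection with a skip-or-overwrite choice might work along these lines. This sketch assumes threads are matched by an id field; the Thread shape, the policy names, and importThreads are all hypothetical.

```typescript
// Hypothetical thread shape; matching by `id` is an assumption.
interface Thread {
  id: string;
  title: string;
}

type DuplicatePolicy = "skip" | "overwrite";

interface ImportResult {
  threads: Thread[];
  duplicateIds: string[];
}

/** Merges imported threads into existing ones, applying the policy to duplicates. */
function importThreads(
  existing: Thread[],
  incoming: Thread[],
  policy: DuplicatePolicy,
): ImportResult {
  const byId = new Map<string, Thread>();
  for (const thread of existing) {
    byId.set(thread.id, thread);
  }
  const duplicateIds: string[] = [];
  for (const thread of incoming) {
    if (byId.has(thread.id)) {
      duplicateIds.push(thread.id);
      if (policy === "skip") {
        continue; // keep the existing copy
      }
    }
    byId.set(thread.id, thread); // new thread, or overwrite of a duplicate
  }
  return { threads: Array.from(byId.values()), duplicateIds };
}
```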
Improved Prompt Caching
- Better cache hit rates for long conversations with Claude models
- Dynamic caching strategy adapts to conversation length
Google Workspace Integration
- Google Drive: Search, create, copy, move, share files and folders
- Gmail: Read emails, manage labels, create drafts (safety-first: drafts only, no auto-send)
- Google Calendar: List calendars, query events, create/update/delete events, find free time slots
- Inline Authorization: Authorize Google access directly from chat without visiting Settings
Web Browsing Improvements
- New click_coordinates action for clicking elements by screen position
- URLs now open in new tabs by default (prevents tab overwriting)
- Tabs automatically grouped per conversation thread
- Better CSS selector detection for reliable element clicking
Model Selection
- Model selector moved to composer area for quick switching
- Model persists globally across mode switches
- Vision-capable models marked with eye icon
Streaming UI
- Tool parameters display as they stream in
- Progressive display of reasoning/thinking blocks
- "Generating..." status indicator during response
Token Usage & Cost Tracking
- Shows input, output, reasoning, and cached tokens
- Displays cost per message and cumulative thread cost
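The per-message and cumulative thread costs reduce to simple arithmetic over token counts and per-million-token prices. The sketch below is illustrative only: the type and function names are hypothetical, and reasoning and cached tokens are left out because how they are billed varies by provider.

```typescript
// Hypothetical shapes; prices are expressed in dollars per million tokens.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface MessageUsage {
  inputTokens: number;
  outputTokens: number;
}

const TOKENS_PER_MILLION = 1000000;

/** Cost of a single message: tokens / 1,000,000 * price per million. */
function messageCost(usage: MessageUsage, pricing: ModelPricing): number {
  return (
    (usage.inputTokens / TOKENS_PER_MILLION) * pricing.inputPerMillion +
    (usage.outputTokens / TOKENS_PER_MILLION) * pricing.outputPerMillion
  );
}

/** Cumulative cost of a thread: the sum of its message costs. */
function threadCost(messages: MessageUsage[], pricing: ModelPricing): number {
  return messages.reduce((total, usage) => total + messageCost(usage, pricing), 0);
}
```

For example, with assumed prices of $3 per million input tokens and $15 per million output tokens, a message using 1,000,000 input tokens and 500,000 output tokens costs 3 + 7.5 = $10.50.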
UI Improvements
- Mode Selector Available to All Users: Non-admin users can now switch between modes in Settings
- Collapsible Settings Sections: Settings panel sections can be collapsed/expanded for easier navigation
  - AI & Model Configuration (LLM Provider, API Key, Model, Temperature, Max Iterations)
  - Personalization (Personal Instructions, Profile Variables)
  - Agent Mode Configuration (Mode Selector, Variables, Instructions, Branding, Tools, MCP Servers)
  - Appearance settings
  - Backup & Restore
  - Documentation
- Mode-Specific Theme Colors: Each built-in mode now has a default color scheme
  - Shopping Agent: Green
  - Travel Agent: Blue
  - Helper Agent: Pink
  - Caiioo (General): Purple
- Chromatic Mode: Enable color rotation to gradually cycle through the spectrum
- Theme Override Management:
  - Visual indicator shows when theme is customized (won't be lost on mode switch)
  - "Save as Mode Defaults" button (admin) persists theme as the mode's new default
  - "Reset to Mode Defaults" restores original mode theme colors
- New Caiioo branding and icon
- Mode selector moved to composer area
- New thread button shows mode selection dropdown
- Thread list toggle in composer top bar
- Vignette border effect on controlled browser tabs
- Floating stop button on controlled tabs
- Dropdown menus properly position near screen edges
Bug Fixes
- Model selector updates immediately after settings change
- New user onboarding now correctly launches Helper Agent
- Settings panel no longer crashes with malformed custom variables
- Tab group names update when thread title changes
- Slate accept/reject widgets positioned correctly
- Extended thinking works with more models (Haiku 4.5, Gemini, etc.)
- Fixed "maximum 4 cache_control blocks" error
Version 0.3.4 (November 24, 2025)
New Features
Text Selection Context
- "Add to Prompt" button appears when selecting text on webpages
- Selected text shown as chips in composer with page context
- DOM location captured for precise agent interaction
Google Calendar Integration
- Full calendar management (list, create, update, delete events)
- Smart availability search across all calendars
- Natural language time parsing ("tomorrow", "next week")
Bug Fixes
Slate Mode
- Accept/Reject buttons no longer hidden behind editor content
- Word-level diff highlighting (not entire lines)
- Multiple propose_change calls now accumulate correctly
- View toggle no longer "bounces back" unexpectedly
Rebrand
- Renamed from "ContextFlow" to "Caiioo"
Version 0.3.0 (November 22, 2025)
New Features
Mobile & LAN Access
- LAN Server: Access Caiioo from any device on your local network
- Conversation Sidebar: Open conversations in a browser tab for larger screen real estate
- Mobile Export: Export conversations as standalone HTML for offline viewing
Bug Fixes
- Fixed critical message branching logic bug
- Fixed archived thread operations
Version 0.2.0 (November 21, 2025)
New Features
Extended Thinking Support
- Claude models now support extended thinking/reasoning blocks
- Collapsible reasoning UI to view model thought process
- Reasoning details cached for multi-turn continuity
New LLM Providers
- Ollama Integration: Run local LLMs through Ollama
- Provider selection in settings panel
Image Format Support
- HEIC/HEIF Support: Apple image formats automatically converted for compatibility
UI Improvements
- Tools menu auto-saves on close
- Better settings panel organization
Bug Fixes
- Variable autocomplete positioning improved
Version 0.1.x (Previous Releases)
Core Features
- Multi-model AI chat (OpenRouter, Anthropic, OpenAI, Ollama)
- Browser automation and web scraping
- Slate for code and document editing
- MCP server integration
- Screenshot and vision capabilities
- Voice input
- Thread management and branching
- Profile and mode system