2.0.28 is live
FEATURES
- Sign in with ChatGPT
Use your ChatGPT Plus or Pro subscription, instead of an OpenAI API key, to access Codex models. OAuth sign-in for xAI Grok also ships.
- Live Performance Stats In Chats
While a response streams, a status line shows elapsed time, decode speed, and context usage; an expandable row adds time to first token and token counts.
- Exa Web Search Provider
Exa joins Tavily, Serper, Brave, and LangSearch: neural search that returns full page text, with 1,000 free searches per month.
- User Profile From Your Memories
Each memory sector's Profile button asks an AI model to summarize that sector's long-term memories into a short paragraph, which is then prepended to every chat in that sector.
- One-Tap Memory Consolidation
A Consolidate button finds near-duplicate long-term memories and merges each cluster into one entry.
- Memory Maintenance In One Tap
Profile and Consolidate first categorize Uncategorized memories and promote short-term notes to long-term.
- Memory Audit History
A new Audit History view lists every add, update, and delete with keyword search.
- On-Device Model Benchmark
A Benchmark button runs a llama-bench-style test that reports speed and peak memory, then suggests context, batch, and thread settings for your device.
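
For readers curious how the figures in Live Performance Stats relate to one another, here is a minimal Python sketch. It is not the app's implementation: the `StreamStats` class, its field names, and the choice to measure decode speed from the first token onward are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StreamStats:
    """Hypothetical container for one response's timing data (times in seconds)."""
    started_at: float        # when the request was sent
    first_token_at: float    # when the first token arrived
    now: float               # current time while streaming
    generated_tokens: int    # tokens produced so far
    prompt_tokens: int       # tokens in the prompt
    context_limit: int       # model context window, in tokens

    @property
    def elapsed(self) -> float:
        return self.now - self.started_at

    @property
    def time_to_first_token(self) -> float:
        return self.first_token_at - self.started_at

    @property
    def decode_speed(self) -> float:
        """Tokens per second, counted from the arrival of the first token."""
        decode_time = self.now - self.first_token_at
        return self.generated_tokens / decode_time if decode_time > 0 else 0.0

    @property
    def context_usage(self) -> float:
        """Fraction of the context window currently in use."""
        return (self.prompt_tokens + self.generated_tokens) / self.context_limit


# Example values chosen only for illustration.
stats = StreamStats(started_at=0.0, first_token_at=0.5, now=4.5,
                    generated_tokens=120, prompt_tokens=392, context_limit=2048)
print(f"{stats.elapsed:.1f}s | {stats.decode_speed:.1f} tok/s | "
      f"{stats.context_usage:.0%} context | TTFT {stats.time_to_first_token:.1f}s")
```

Excluding the wait for the first token from the decode-speed denominator keeps prompt processing time from dragging the tokens-per-second figure down; other tools may define it differently.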
IMPROVEMENTS
- Text Selection In Fullscreen Chat
Selecting text in the fullscreen chat input no longer slides the main menu open, so editing on iPad stays precise.
- Choose Transcription Language
When converting video or audio to text, pick the spoken language for every speech engine, or leave it on Auto.
- Smoother Transcription Progress
Audio-to-text progress now advances steadily instead of jumping backward.
- Smarter Duplicate Detection
Multiple memories from one turn are evaluated in a single AI call, and near-identical entries are skipped.
- Memory Search Knows Profile From Facts
The cached user profile is sent separately from matching memory rows for a cleaner prompt.
- llama.cpp Engine b9553
Updated from b9279 to b9553 (269 changes), adding DeepSeek-OCR 2, Granite 4 Vision, Gemma 4 vision and audio, and DeepSeek V3.2 sparse attention.
- MLX-Swift LM Engine tag-20260607
Faster, more accurate on-device vision: corrected Qwen2.5-VL attention, SmolVLM2 up to 9x faster, plus audio input and more reliable tool calling.
- Sharper Camera Focus for Documents
View Assistant locks focus more reliably on close-up subjects, and you can tap the live preview to focus.
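
The Smoother Transcription Progress entry above describes progress that no longer jumps backward. One common way to get that behavior is to clamp each raw estimate and never display a value lower than the last one shown. The Python sketch below illustrates the idea only; the `MonotonicProgress` class is hypothetical and the app may use a different approach.

```python
class MonotonicProgress:
    """Hypothetical wrapper that only ever lets displayed progress move forward."""

    def __init__(self) -> None:
        self._shown = 0.0

    def update(self, raw: float) -> float:
        # Clamp the raw estimate to the range [0, 1].
        clamped = min(max(raw, 0.0), 1.0)
        # Keep the larger of the new estimate and what is already displayed.
        self._shown = max(self._shown, clamped)
        return self._shown


progress = MonotonicProgress()
# The raw estimates dip (0.4 -> 0.25) and overshoot (1.3); the displayed values do not.
shown = [progress.update(raw) for raw in (0.1, 0.4, 0.25, 0.6, 1.3)]
print(shown)
```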
BUG FIXES
- Soul Greeting Now Appears From the Model View
Starting a chat with a Soul from model settings now shows its configured first message.
- Attaching a Video to Chat
Fixed the "Video Processing Error: file doesn't exist" message that appeared when attaching a video.
- Empty Transcript From iOS Speech
Fixed the iOS speech recognizer returning nothing on a clear recording.
- Convert to Text Could Hang
Fixed the iOS Speech Analyzer getting stuck at "Analyzing audio"; it now falls back to the system recognizer.
- Legacy Memories Now Reachable
Memories with an Uncategorized chip were invisible to Consolidate and Profile; all maintenance flows now pick them up.
- Memory AI Calls No Longer Stall On First Event
Fixed Profile, Consolidate, and Categorize calls returning empty results when the first AI event was misread as the end of the response.
- Memory List Refreshes After Maintenance
After Profile or Consolidate, Uncategorized chips on just-categorized rows disappear immediately.
- VoiceOver Input Stays Reachable After Sending
Fixed the chat input collapsing after sending; with VoiceOver it stays visible and focusable.
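
To illustrate the kind of issue behind the Memory AI Calls fix above, the Python sketch below reads a stream of events and stops only on an explicit end marker, so an initial event that carries no text does not end the response early. The event shapes (`start`, `delta`, `done`) and the `collect_text` function are assumptions for illustration, not the app's actual protocol or code.

```python
from typing import Iterable


def collect_text(events: Iterable[dict]) -> str:
    """Join text deltas from a hypothetical event stream until an explicit 'done'."""
    parts = []
    for event in events:
        kind = event.get("type")
        if kind == "done":
            # Only an explicit end marker stops reading.
            break
        if kind == "delta":
            parts.append(event.get("text", ""))
        # Any other event (such as "start") carries no text and is skipped;
        # treating it as the end would return an empty result.
    return "".join(parts)


# Example stream whose first event carries no text.
events = [
    {"type": "start"},
    {"type": "delta", "text": "Hello"},
    {"type": "delta", "text": ", world"},
    {"type": "done"},
]
result = collect_text(events)
print(result)
```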