Ultron — Implementation Ledger

session 2026-06-12 · branch claude/eloquent-hopper-zz3nej · prod ultron-app--0000127 (7f6afe1) · app.51ultron.com
9 Verified live
4 Shipped
1 Shadow
4 Pending
1 Blocked

Shipped & verified live

Verified
Model-picker persists the picked tier
The first turn of a new conversation now writes conversations.model (the write was previously guarded on the request id, which is null on that turn), so Allegro stays Allegro across a refresh. DB-verified.
9fe41f9
Verified
?c=new refresh loop killed
The URL-sync effect no longer clobbers a real ?c=<id> on first mount (a Next 14.2 replaceState→useSearchParams feedback loop). User-confirmed smooth.
9fe41f9
Verified
Sub-agents run on the SELECTED tier
lite→Kimi, smart→DeepSeek V4 Pro (Azure, new OpenAI adapter), deep→Opus. No Sonnet. Previously every sub-agent was silently forced through the Kimi bridge. Proven: the log shows [subagent] parent_tier=smart -> deepseek, and a real shell tool call worked through the adapter.
2b922bc · e168760
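The tier mapping above can be sketched as a lookup table. This is a minimal illustration, not the shipped code: the names Tier, SubAgentRoute, SUBAGENT_ROUTES and routeSubAgent are hypothetical, and only the tier names, the target models and the log line format come from the entry above.

```typescript
// Hypothetical sketch: none of these identifiers are taken from the Ultron codebase.
type Tier = "lite" | "smart" | "deep";

interface SubAgentRoute {
  provider: "kimi" | "deepseek" | "anthropic";
  label: string;
}

// One entry per tier, so a sub-agent can never fall through to a default bridge.
const SUBAGENT_ROUTES: Record<Tier, SubAgentRoute> = {
  lite: { provider: "kimi", label: "Kimi" },
  smart: { provider: "deepseek", label: "DeepSeek V4 Pro" },
  deep: { provider: "anthropic", label: "Opus" },
};

function routeSubAgent(parentTier: Tier): SubAgentRoute {
  const route = SUBAGENT_ROUTES[parentTier];
  // Mirrors the log line quoted in the ledger entry.
  console.log(`[subagent] parent_tier=${parentTier} -> ${route.provider}`);
  return route;
}
```

Typing the table as Record<Tier, SubAgentRoute> makes the compiler reject a tier without a route, which is one way to rule out the silent fallback described above.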
Verified
Guardrails L2 — side-model sanitizer engages
A flagged scrape logged a sanitized event (593→292 chars). The 8s→20s timeout fix stopped it from failing open silently.
539055d · 40078a3
Verified
Guardrails L1 — scrubber + provenance envelope
An injection scrape logged injection_pattern [ignore-previous, role-reassign]; the model returned only the facts and ignored the embedded deploy_wfp/send_email.
9fe41f9 line
Verified
Background-job auto-wake
fire → settle → bg_trigger wake → resume, with no manual nudge. The phantom messages.user_id column was the "not waking up" bug; now fixed.
earlier fix
Verified
Real end-to-end build on Allegro (paperclip repo)
DeepSeek orchestrator, 3 CF build jobs, auto-woke twice, pushed a real branch, ~$0.49, telemetry captured. Survived client disconnect (headless durability).
live run
Verified
Keep-going watchdog (Stop-hook)
When a user turn ends by asking to continue and no job is pending, the watchdog fires exactly one self-correcting nudge; the bg_trigger nudge is never re-nudged (hard bound, no loop). Correctly silent on two real clean completions.
7f6afe1
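The nudge rule above might reduce to a single predicate. This is a sketch under assumptions: the TurnEnd shape, its field names and the "watchdog" origin value are hypothetical, and treating any non-user turn as un-nudgeable is one possible reading of the "hard bound".

```typescript
// Hypothetical sketch: field and function names are illustrative only.
interface TurnEnd {
  asksToContinue: boolean; // the final assistant message asked whether to keep going
  pendingJobs: number;     // background jobs still in flight
  origin: "user" | "bg_trigger" | "watchdog"; // what started this turn (assumed field)
}

function shouldNudge(turn: TurnEnd): boolean {
  // Hard bound: a turn started by a wake or by a nudge is never nudged again.
  if (turn.origin !== "user") return false;
  // A pending job will wake the conversation by itself.
  if (turn.pendingJobs > 0) return false;
  return turn.asksToContinue;
}
```

Because only user-originated turns qualify, each user turn can produce at most one nudge and the watchdog cannot feed itself.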
Verified
Per-turn cost + consecutive-failure breaker
chat-v2 used to check credits once then run ~120 unbounded rounds. A loop-head guard now trips on a cost ceiling ($10 default, env-overridable) or 4 all-failing rounds, ends the turn cleanly, and logs to guardrail_events. Verified live: a real turn tripped cost_ceiling at round 1 ($0.0123 vs a test $0.001 ceiling) and stopped.
e426892
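A loop-head guard of this kind could look like the sketch below. The class and field names are hypothetical, "consecutive_failures" is an assumed reason string (only cost_ceiling appears in the entry above), and the ceiling is passed in directly because the ledger does not name the environment variable.

```typescript
// Hypothetical sketch of a per-turn breaker; not the chat-v2 implementation.
interface RoundResult {
  costUsd: number;
  toolCalls: number;
  toolFailures: number;
}

type TripReason = "cost_ceiling" | "consecutive_failures";

class TurnBreaker {
  private spentUsd = 0;
  private failingRounds = 0;

  constructor(private ceilingUsd = 10, private maxFailingRounds = 4) {}

  // Call after each round completes.
  record(round: RoundResult): void {
    this.spentUsd += round.costUsd;
    const allFailed = round.toolCalls > 0 && round.toolFailures === round.toolCalls;
    // Any round that is not all-failing resets the streak.
    this.failingRounds = allFailed ? this.failingRounds + 1 : 0;
  }

  // Call at the head of each round; a non-null result means "end the turn now".
  check(): TripReason | null {
    if (this.spentUsd > this.ceilingUsd) return "cost_ceiling";
    if (this.failingRounds >= this.maxFailingRounds) return "consecutive_failures";
    return null;
  }
}
```

With a test ceiling of 0.001, recording a first round that cost 0.0123 makes check() return "cost_ceiling" at the head of the next round, which matches the live trip described above.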

Shipped (code-verified / structural)

Shipped
Phase 0 — terminal-status guard centralized + typed tool-flags
One isTerminalJobStatus()/isTerminalRawStatus() replaces 8 drifting sites; typed tool-flags scaffold (fail-closed). tsc-clean, behaviour-inert.
80e204e
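A centralized terminal-status guard could be sketched as follows. Only the two function names come from the entry above; the concrete status values and the normalization of raw strings are assumptions for illustration.

```typescript
// Assumed status vocabulary: the real set of job statuses is not given in this ledger.
const TERMINAL_JOB_STATUSES = ["succeeded", "failed", "cancelled"] as const;

type TerminalJobStatus = (typeof TERMINAL_JOB_STATUSES)[number];
type JobStatus = TerminalJobStatus | "queued" | "running";

// Typed guard: narrows a known JobStatus to its terminal subset.
function isTerminalJobStatus(status: JobStatus): status is TerminalJobStatus {
  return (TERMINAL_JOB_STATUSES as readonly string[]).indexOf(status) !== -1;
}

// Raw guard: for untyped strings coming from a database row or an API payload.
function isTerminalRawStatus(raw: string | null | undefined): boolean {
  if (!raw) return false;
  const normalized = raw.trim().toLowerCase();
  return (TERMINAL_JOB_STATUSES as readonly string[]).indexOf(normalized) !== -1;
}
```

Keeping both guards on one constant means a new terminal status is added in a single place instead of at each call site.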
Shipped
Phase 1 — lying badge + monitor cap
The lying RUNNING leaf chip now reads neutral "launched" instead of spinning forever; running monitor cards capped at 3 + "+N more". Code-verified (UI — needs a browser to see).
1371342
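The monitor cap can be illustrated with a small pure helper. The function name and return shape are hypothetical; only the cap of 3 and the "+N more" label come from the entry above.

```typescript
// Hypothetical helper: splits running monitor cards into a visible slice and an overflow label.
function capMonitorCards<T>(
  running: T[],
  cap = 3
): { visible: T[]; overflowLabel: string | null } {
  const visible = running.slice(0, cap);
  const hidden = running.length - visible.length;
  return {
    visible,
    overflowLabel: hidden > 0 ? `+${hidden} more` : null,
  };
}
```

For example, five running cards would yield three visible cards and the label "+2 more".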
Shipped
Dreaming cron
Nightly memory consolidation + failure audit: the ultron-cron worker hits /api/cron/dream daily at 08:00 UTC.
dc0d6c2
Shipped
CI gate + @ultron mention workflow
Pre-commit render-harness, PR gate, and an @ultron GitHub action.
earlier

Built but in shadow / partial

Shadow
Capability-tier gate
An outward/destructive tool fired after untrusted-content ingest is LOGGED to guardrail_events (concurrency-safe per-turn guardState), not yet enforced. Flip to enforce once the false-positive rate is known.
39f3dad
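The difference between shadow and enforce mode might be sketched like this. All identifiers, the event shape and the "capability_tier" kind are hypothetical; the sketch only illustrates "log now, block later" for an outward or destructive tool call that follows untrusted-content ingest.

```typescript
// Hypothetical sketch of a shadow-mode gate; not the shipped guardrail code.
type GateMode = "shadow" | "enforce";

interface GuardState {
  untrustedIngested: boolean; // set once the turn has read untrusted content
}

interface GateEvent {
  kind: "capability_tier";
  tool: string;
  mode: GateMode;
}

interface GateDecision {
  allowed: boolean;
  flagged: boolean;
}

function checkCapabilityGate(
  state: GuardState,
  tool: string,
  isOutwardOrDestructive: boolean,
  mode: GateMode,
  logEvent: (event: GateEvent) => void
): GateDecision {
  if (!state.untrustedIngested || !isOutwardOrDestructive) {
    return { allowed: true, flagged: false };
  }
  // Both modes log, so the false-positive rate can be measured before enforcing.
  logEvent({ kind: "capability_tier", tool, mode });
  return { allowed: mode === "shadow", flagged: true };
}
```

Flipping to enforcement is then a one-value change of mode, with the logging path left untouched.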

Pending / next

Pending
Cold-cache tool-result eviction
Drop fat tool bodies on bg_trigger / scheduled re-entries after a long idle. Cheap deterministic win.
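One possible shape for this eviction is sketched below. The idle threshold, the size limit, the stub text and all identifiers are assumptions. The reasoning is that after a long idle the provider-side prompt cache has probably expired, so rewriting old tool results no longer costs cache hits.

```typescript
// Hypothetical sketch: thresholds and names are illustrative, not measured values.
interface Msg {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

interface EvictionOptions {
  idleThresholdMs: number; // treat the cache as cold after this much idle time
  maxToolChars: number;    // tool results longer than this are replaced by a stub
}

const DEFAULT_EVICTION: EvictionOptions = { idleThresholdMs: 300000, maxToolChars: 2000 };

function evictColdToolResults(
  history: Msg[],
  idleMs: number,
  opts: EvictionOptions = DEFAULT_EVICTION
): Msg[] {
  // Warm cache: leave the history untouched so the cached prefix stays identical.
  if (idleMs < opts.idleThresholdMs) return history;
  return history.map((m) =>
    m.role === "tool" && m.content.length > opts.maxToolChars
      ? { ...m, content: `[tool result evicted: ${m.content.length} chars]` }
      : m
  );
}
```

The function is deterministic and never touches user or assistant messages, so running it twice on the same history gives the same result.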
Pending
Anthropic cache breakpoints
Only one cache_control today (system prompt); add tool + history breakpoints. Anthropic-only → helps the deep/Opus path.
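The extra breakpoints might be placed as in the sketch below, which builds a plain request payload rather than calling any SDK. The types and the function name are local and hypothetical. The cache_control value { type: "ephemeral" } follows Anthropic's prompt-caching documentation. Marking the last tool and the last block of the last message is one common placement, not necessarily the one this project will choose.

```typescript
// Hypothetical local types mirroring the documented request shape; no SDK is used.
type CacheControl = { type: "ephemeral" };

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

interface ToolDef {
  name: string;
  description: string;
  input_schema: object;
  cache_control?: CacheControl;
}

interface Message {
  role: "user" | "assistant";
  content: TextBlock[];
}

const EPHEMERAL: CacheControl = { type: "ephemeral" };

function withCacheBreakpoints(system: string, tools: ToolDef[], messages: Message[]) {
  // Breakpoint 1: system prompt (the only one that exists today, per the entry above).
  const systemBlocks: TextBlock[] = [{ type: "text", text: system, cache_control: EPHEMERAL }];

  // Breakpoint 2: the last tool definition, which covers the whole tool list.
  const markedTools: ToolDef[] = tools.map((t, i) =>
    i === tools.length - 1 ? { ...t, cache_control: EPHEMERAL } : t
  );

  // Breakpoint 3: the last content block of the last message, covering prior history.
  const markedMessages: Message[] = messages.map((m, i) => {
    if (i !== messages.length - 1 || m.content.length === 0) return m;
    const lastBlock = m.content.length - 1;
    return {
      ...m,
      content: m.content.map((b, j) => (j === lastBlock ? { ...b, cache_control: EPHEMERAL } : b)),
    };
  });

  return { system: systemBlocks, tools: markedTools, messages: markedMessages };
}
```

This uses three breakpoints, which stays within the documented limit of four per request.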
Pending
Baseline tsc cleanup
25 real app errors (mostly a missing IconPlaceholder import) + a tsconfig exclude for chrome-extension, then drop ignoreBuildErrors so type bugs can't ship.
Pending
Smaller items
Hide the 🌙 Dream conversation from the session list; OnError-persist + other hook gaps; telemetry /api/summary + analytics card.

Blocked

Blocked
deep→Opus end-to-end verification
Anthropic balance empty until Monday. The deep→Opus routing CODE is correct; the whole Forte tier (orchestrator + sub-agents + title-gen, all Anthropic) is down until top-up. Smart (DeepSeek) and Lite (Kimi) unaffected.

Records on the branch

MASTER-PLAN.md · E2E-FINDINGS.md · BACKGROUND-JOBS-RESEARCH.md · MORE-RECYCLABLES.md