fix(slack): persist working pill across bridge restarts by rogeriochaves · Pull Request #83 · langwatch/kanban-code

rogeriochaves · 2026-05-31T16:49:38Z

Summary

When the bridge restarts (config-sync does this on every applied bundle, sometimes several times an hour while iterating), the in-memory active pill map is wiped. The 30s refresh loop is gated on that map being populated, so until the agent's next channel-root text post relights it, the pill stays dark. An agent that goes long on tools without an intermediate text post (common when debugging or planning) ends up looking dead in Slack for the rest of the turn.

Repro from earlier today: bridge restarted at 16:28 UTC, the dependabot-scout agent kept working but didn't emit channel-root text until 18:31 (it was "Manifesting" on a long Bash sequence the entire time). Channel showed no progress and no pill for ~2 hours.

Fix

Persist the `active` map to ~/.kanban-code/active-pills/<slug> (same atomic write-then-rename pattern as thread-root) and restore it on bridge startup. Two guards on restore:

Skip pills older than 10 min — Slack's own idle TTL would have cleared the visual pill by then anyway, and an agent that's been silent that long has likely finished its turn. Re-lighting would falsely advertise active work.
Re-light immediately on restore rather than waiting for the next refresh tick — gap between bridge start and visible "is working…" becomes seconds instead of a minute.
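
The two restore guards can be sketched as a small pure function. This is a minimal illustration in TypeScript, assuming the bridge is a Node/TypeScript service. The `PillRecord` shape, its field names, and `decideRestore` are hypothetical and not taken from the PR; only `MAX_RESTORE_AGE_MS` (10 min) is named in the commit message.

```typescript
// Hypothetical on-disk record shape; the PR does not show the real fields.
interface PillRecord {
  slug: string;
  status: string; // e.g. "is working…"
  updatedAtMs: number; // epoch ms of the last set/refresh
}

// 10 minutes, as stated in the PR description.
const MAX_RESTORE_AGE_MS = 10 * 60 * 1000;

type RestoreDecision =
  | { action: "relight"; record: PillRecord }
  | { action: "skip"; reason: "missing" | "partial" | "stale" };

// Decide what to do with whatever was read from disk at bridge startup.
// `raw` is the parsed file content, or null when the file was missing or unreadable.
function decideRestore(raw: unknown, nowMs: number): RestoreDecision {
  if (raw === null || raw === undefined) {
    return { action: "skip", reason: "missing" };
  }
  if (typeof raw !== "object") {
    return { action: "skip", reason: "partial" };
  }
  const r = raw as Partial<PillRecord>;
  if (
    typeof r.slug !== "string" ||
    typeof r.status !== "string" ||
    typeof r.updatedAtMs !== "number" ||
    !Number.isFinite(r.updatedAtMs)
  ) {
    return { action: "skip", reason: "partial" };
  }
  // Guard 1: a pill older than the cutoff is not restored.
  if (nowMs - r.updatedAtMs > MAX_RESTORE_AGE_MS) {
    return { action: "skip", reason: "stale" };
  }
  // Guard 2: a fresh pill is re-lit right away; the caller should post the
  // status immediately instead of waiting for the next refresh tick.
  return {
    action: "relight",
    record: { slug: r.slug, status: r.status, updatedAtMs: r.updatedAtMs },
  };
}
```

Keeping the decision separate from file I/O and from the Slack call makes the missing-file, partial-record, and stale cases easy to unit test in isolation.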

Persist on every set / refresh / clear so the on-disk state stays in lockstep with the in-memory map.
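
The persist-on-every-mutation idea, combined with write-then-rename, might look roughly like the sketch below. It assumes a Node runtime and one JSON file per slug. `PersistedPills`, `writeJsonAtomic`, `readJsonOrNull`, and the `PillState` fields are hypothetical names for illustration, not the PR's actual code.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical state kept per pill; the real record may differ.
interface PillState {
  status: string;
  updatedAtMs: number;
}

// Write to a temp file in the same directory, then rename over the target.
// A reader sees either the old file or the new one, never a half-written file
// (rename within one filesystem is atomic on POSIX systems).
function writeJsonAtomic(file: string, value: unknown): void {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(value));
  fs.renameSync(tmp, file);
}

// Returns null for a missing or corrupted file instead of throwing.
function readJsonOrNull(file: string): unknown {
  try {
    return JSON.parse(fs.readFileSync(file, "utf8"));
  } catch {
    return null;
  }
}

// In-memory map whose every mutation is mirrored to disk.
class PersistedPills {
  private readonly dir: string;
  private readonly active = new Map<string, PillState>();

  constructor(dir: string) {
    this.dir = dir;
  }

  // Assumes slugs are safe to use as file names.
  private fileFor(slug: string): string {
    return path.join(this.dir, slug);
  }

  set(slug: string, status: string, nowMs: number): void {
    const state: PillState = { status, updatedAtMs: nowMs };
    this.active.set(slug, state);
    writeJsonAtomic(this.fileFor(slug), state);
  }

  // Bumps the timestamp of an existing pill; a no-op for unknown slugs.
  refresh(slug: string, nowMs: number): void {
    const state = this.active.get(slug);
    if (!state) return;
    this.set(slug, state.status, nowMs);
  }

  // Idempotent: clearing an already-cleared pill does not throw.
  clear(slug: string): void {
    this.active.delete(slug);
    fs.rmSync(this.fileFor(slug), { force: true });
  }

  get(slug: string): PillState | undefined {
    return this.active.get(slug);
  }
}
```

Refreshing the timestamp on disk matters for the age guard: without it, a pill that has been refreshed for 20 minutes would look stale after a restart even though the agent is still working.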

Test plan

6 new unit tests for active-pill: round-trip, missing file, corrupted file, partial record, clear, idempotent clear. 242/242 passing.
Build clean.
After merge and once the box pulls main, trigger an agent, restart the bridge (e.g. by force-syncing the bundle), and confirm the pill survives the restart during a long-running tool sequence.

When config-sync restarts the bridge (which happens on every applied bundle, sometimes several times an hour while iterating on the box), the in-memory `active` pill map is wiped. The 30s refresh loop is gated on that map being populated, so until the agent's next channel-root text post relights it, the pill stays dark. An agent that goes long on tools without an intermediate text post (common when debugging or planning) looks dead in Slack for the rest of the turn.

Reproduced in prod today: bridge restarted at 16:28 UTC, agent kept working but didn't emit channel-root text until 18:31 (Manifesting on a long Bash sequence the whole time). Channel showed no progress and no pill for ~2 hours.

Persist the `active` map to ~/.kanban-code/active-pills/<slug> (same atomic write-then-rename pattern as thread-root), restore on bridge startup. Don't restore pills older than MAX_RESTORE_AGE_MS (10 min) to avoid falsely advertising work on a turn that actually finished before the restart; Slack's own idle TTL would have cleared the visual pill by then anyway. Re-light immediately on restore rather than waiting for the next refresh tick so the gap between bridge start and visible "is working…" is seconds, not a minute.

Tests: 6 new for active-pill (round-trip, missing file, corrupt file, partial record, clear, idempotent clear). 242/242 passing.

rogeriochaves merged commit d8591b1 into main May 31, 2026
1 check passed

rogeriochaves merged 1 commit into main from fix/persist-working-pill-across-restarts
