Cooperative Puzzles — baki.io

[VOICE - Baki to revise] Solve a puzzle alone, or invite an agent to play with you

Domain: tools
Archetype: build
Audience: builder
Date:
Tech: Astro 6, Preact, Cloudflare Workers, Durable Objects, WebSocket

An agent isn't a chat box. It's a second cursor in the room.

Metrics

Puzzles: 6. One per design-system room: Color (Find the Six), Foundations (Align to dot grid), Type (Match the scale), Motion (Tune to target), Presence (Reach the fifth state), Module (Assemble a page), plus a hub meta-puzzle on /design-system.
Message types: 9. cursor:move · cursor:click · cursor:state · puzzle:tap · puzzle:drag · puzzle:scale-match · puzzle:tune · puzzle:state-step · puzzle:state. The DO relays each between the two parties. Frame cap 8 KiB; binary frames rejected.
Token TTL: 5 min. Pair tokens are 32 hex chars (128 bits of entropy via crypto.getRandomValues), bound to visitor IP at /pair, single-use on agent manifest fetch, invalidated after 5 min idle.
Relay dev port: 8787. workers/session-relay/ runs as a separate Cloudflare Worker (own wrangler.toml, own deploy lifecycle). Local: pnpm dev on http://localhost:8787. Production: wrangler deploy.
Connections per session: 2. PuzzleSession DO caps at one visitor + one agent. The DO assigns roles (visitor connects first; agent second; ?role= override available). Multi-party (3+) is a future feature, not a V1 bug.
To unlock the meta: 3 of 6. Solving any 3 sub-puzzles unlocks the hub constellation. Reward: a [data-meta-unlocked='true'] attribute on <html>; the visible reward design lives in tokens.css and is voice domain (Baki refines).
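
The token numbers above (32 hex chars, 128 bits, 5 min TTL) can be illustrated with a minimal sketch. The function and constant names are hypothetical and are not taken from the relay's source; IP binding and single-use tracking are left out.

```typescript
// Sketch only: names here are illustrative, not the relay's real API.
const TOKEN_BYTES = 16;             // 16 bytes = 128 bits = 32 hex chars
const TOKEN_TTL_MS = 5 * 60 * 1000; // 5 minute TTL

type RandomBytes = (length: number) => Uint8Array;

// Default source of randomness: the Web Crypto global available in Workers and browsers.
const secureRandomBytes: RandomBytes = (length) =>
  crypto.getRandomValues(new Uint8Array(length));

// Mint a 32-character lowercase hex token.
function mintPairToken(randomBytes: RandomBytes = secureRandomBytes): string {
  const bytes = randomBytes(TOKEN_BYTES);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Compute the ISO expiry timestamp that would be returned next to the token.
function computeExpiresAt(nowMs: number): string {
  return new Date(nowMs + TOKEN_TTL_MS).toISOString();
}

// A token is unusable once its expiry has been reached.
function isExpired(expiresAtIso: string, nowMs: number): boolean {
  return nowMs >= Date.parse(expiresAtIso);
}
```

The random source is injectable only so the sketch can be exercised deterministically; production code would always use the default.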

Cards

Process

  1. Visitor clicks Pair with agent — PairWithAgentButton on the puzzle. POST /pair with { puzzle: name }.
  2. Worker mints a 32-hex token — Creates a PuzzleSession Durable Object keyed by the token. Returns { token, manifestUrl, expiresAt }.
  3. Visitor sees + copies the pair URL — https://baki.io/pair?t=<token>. Token + expiry persisted in sessionStorage for the puzzle.
  4. Visitor sends URL to their agent — Pasted into Claude / their MCP / a friend's browser on another device. The /pair landing page renders both a human invitation and an agent-readable manifest URL.
  5. Agent fetches the manifest — GET /api/manifest/<token>. Returns the puzzle's name, the available action shapes, the WebSocket URL, and the expiry. Single-use on agent side.
  6. Agent connects WebSocket — WSS /api/ws/<token>. DO assigns role=agent (visitor was role=visitor).
  7. Both sides exchange messages — Cursor moves and puzzle actions. The DO's broadcastToOther() forwards every received message to the other party. Visitor's puzzle UI calls applyAction(args, source: 'agent') on incoming messages — same path as visitor input.
  8. Visitor sees the agent move — Cursor messages dispatch as agent-cursor:* CustomEvents. SecondCursor renders an animated cursor with a connection-state indicator dot (green/amber/red) at z-index 700.
  9. Solve broadcasts to both — puzzle:state with { solved: true } fires on solve. Both parties see the victory state. sessionStorage 'baki.puzzle.<name>.solved' = '1' persists for the meta-puzzle's 3-of-N count.
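
Steps 6 and 7 describe role assignment and forwarding. The following is an in-memory model of that behaviour, written as an assumption about how it could work: a real Durable Object would hold WebSocket objects, and PuzzleSessionModel, Peer, join, and leave are hypothetical names. Only broadcastToOther is named in the source.

```typescript
type Role = "visitor" | "agent";

interface Peer {
  role: Role;
  send(frame: string): void;
}

class PuzzleSessionModel {
  private peers: Peer[] = [];

  // First connection becomes the visitor, second the agent, unless a role is requested.
  // Returns null when the session is full or the requested role is already taken.
  join(send: (frame: string) => void, requested?: Role): Peer | null {
    if (this.peers.length >= 2) return null;
    const taken = new Set(this.peers.map((p) => p.role));
    const role: Role = requested ?? (taken.has("visitor") ? "agent" : "visitor");
    if (taken.has(role)) return null;
    const peer: Peer = { role, send };
    this.peers.push(peer);
    return peer;
  }

  // Drop a peer so its slot can be reused.
  leave(peer: Peer): void {
    this.peers = this.peers.filter((p) => p !== peer);
  }

  // Forward a frame to whichever party did not send it.
  broadcastToOther(from: Peer, frame: string): void {
    for (const peer of this.peers) {
      if (peer !== from) peer.send(frame);
    }
  }
}
```

Keeping the relay this thin matches the description of the DO as a router: it never inspects puzzle logic, it only decides who receives a frame.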

What this is

Six thematic puzzles, one per room of the design-system family, plus a hub meta-puzzle. Each puzzle works solo — visitor solves alone — and cooperatively — visitor pairs with an AI agent (Claude, their MCP, a friend’s browser) who acts on the visitor’s behalf via a WebSocket-relayed session.

The agent’s presence is rendered as a second cursor on the visitor’s screen with a small connection-indicator dot. Their actions apply to the puzzle in real time. Both parties can act in parallel; both see each other’s contributions.

Why it exists

The site treats AI agents as first-class collaborators, not just tool consumers. Most “AI on the web” today is one-directional: humans use LLMs to generate content. Cooperative puzzles flip the relationship — the agent is invited into the visitor’s session, given a manifest of available actions, and acts alongside the human in real time.

Solo puzzles are tedious by intention. Without an agent, they require reading the prose, careful drag-and-snap, and visual comparison. With an agent, who can read the canonical answer from the source code in milliseconds, they go instant. The friction makes pairing feel like a relief, not a gimmick. It also makes the agent's help visible and felt rather than seamless and forgotten.

Pair flow (visitor’s view)

  1. Click Pair with agent on any puzzle.
  2. A token + countdown + copy-able pair URL appear: https://baki.io/pair?t=<token>.
  3. Send the URL to your agent (Claude conversation, MCP, friend on another device).
  4. Wait for the link-state pip to turn green — the agent has connected.
  5. Watch the SecondCursor appear and start moving. Each agent action lights up a swatch / drops a fragment / matches a rung / nudges a knob.
  6. Act alongside the agent — your cursor and clicks broadcast back. Both contributions count.
  7. On solve, both sides see the victory state. Token expires; session cleaned up.
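
Step 2 shows a pair URL and a countdown. A minimal sketch of both follows, assuming the expiry is available as an ISO timestamp; buildPairUrl and formatCountdown are hypothetical helper names, not functions from the site.

```typescript
// Build the copy-able pair URL in the documented shape https://baki.io/pair?t=<token>.
function buildPairUrl(token: string, origin: string = "https://baki.io"): string {
  const url = new URL("/pair", origin);
  url.searchParams.set("t", token);
  return url.toString();
}

// Format the time left before the token expires as m:ss, clamped at 0:00.
function formatCountdown(expiresAtIso: string, nowMs: number): string {
  const remainingMs = Math.max(0, Date.parse(expiresAtIso) - nowMs);
  const totalSeconds = Math.ceil(remainingMs / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${String(seconds).padStart(2, "0")}`;
}
```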

Pair flow (agent’s view)

  1. Receive the pair URL from your human collaborator.
  2. Fetch the manifest:
    GET https://relay.baki.io/api/manifest/<token>
  3. Manifest returns:
    {
      "session": "<token>",
      "puzzle": "find-the-six",
      "actions": [
        { "name": "cursor:move",  "args": { "x": "number", "y": "number" } },
        { "name": "cursor:click", "args": { "x": "number", "y": "number", "target?": "string" } },
        { "name": "cursor:state", "args": { "state": "idle | pointing | clicking" } },
        { "name": "puzzle:tap",   "args": { "hex": "string" } }
      ],
      "websocketUrl": "wss://relay.baki.io/api/ws/<token>",
      "expiresAt": "2026-05-09T15:00:00Z"
    }
  4. Open WebSocket to websocketUrl.
  5. Send action messages as JSON frames:
    { "type": "cursor:move", "x": 100, "y": 200 }
    { "type": "puzzle:tap",  "hex": "#a855f7" }
  6. Receive visitor’s actions for symmetry. Adjust your cursor / strategy.
  7. On solve, the relay broadcasts { "type": "puzzle:state", "solved": true } to both parties.

The agent never has to scrape the DOM. Everything it needs to know lives in the manifest.
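
To make that concrete, here is a hedged sketch of how an agent might turn the manifest into outgoing frames. The Manifest shape mirrors the example above; buildFrame is a hypothetical helper, and the rule that a trailing "?" marks an optional argument is inferred from the "target?" key rather than stated anywhere.

```typescript
interface ManifestAction {
  name: string;
  args: Record<string, string>;
}

interface Manifest {
  session: string;
  puzzle: string;
  actions: ManifestAction[];
  websocketUrl: string;
  expiresAt: string;
}

// Build a JSON frame for an action, refusing anything the manifest does not advertise.
function buildFrame(
  manifest: Manifest,
  name: string,
  args: Record<string, string | number>,
): string {
  const action = manifest.actions.find((a) => a.name === name);
  if (!action) throw new Error(`Action not advertised: ${name}`);
  for (const key of Object.keys(action.args)) {
    const optional = key.endsWith("?");
    const field = optional ? key.slice(0, -1) : key;
    if (!optional && !(field in args)) throw new Error(`Missing argument: ${field}`);
  }
  // Frames are flat: the action name goes in "type", arguments sit beside it.
  return JSON.stringify({ type: name, ...args });
}

// Possible usage in an agent runtime that has fetch and WebSocket (not executed here):
//   const manifest = (await (await fetch(manifestUrl)).json()) as Manifest;
//   const ws = new WebSocket(manifest.websocketUrl);
//   ws.addEventListener("open", () =>
//     ws.send(buildFrame(manifest, "puzzle:tap", { hex: "#a855f7" })));
```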

Decisions made

Decisions accumulated across three phases. Captured chronologically with rationale.

Phase 8 — Backend MVP

Phase 9 — Puzzle fan-out

Phase 9.1 — Cooperative wiring

Tech stack

How it works (architecture)

[Architecture diagram. Visitor browser: Puzzle component, applyAction(args, source), SecondCursor overlay, PairWithAgentButton. session-relay (Worker): /pair · /api/manifest · /api/ws, hosting the PuzzleSession DO (2 conns max · 5 min TTL · broadcastToOther()). Agent: Claude / MCP / browser, using manifest + WSS. Edges: visitor to relay via POST /pair · WSS; agent to relay via GET manifest · WSS.]

The DO is a relay — it routes messages but doesn’t validate puzzle logic. Canonical state lives in the visitor’s browser. The DO’s state bag is for opt-in telemetry per puzzle (each puzzle decides what to broadcast in puzzle:state messages).
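
Routing without validating puzzle logic still leaves room for a cheap shape check at the door. The sketch below is an assumption about what such a check could look like, combining the documented 8 KiB frame cap, the binary-frame rejection, and the nine message types. isAcceptableFrame is a hypothetical stand-in, not the relay's actual isRelayMessage, and whether a frame of exactly 8 KiB passes is a guess.

```typescript
const MAX_FRAME_BYTES = 8 * 1024;

const MESSAGE_TYPES = new Set<string>([
  "cursor:move", "cursor:click", "cursor:state",
  "puzzle:tap", "puzzle:drag", "puzzle:scale-match",
  "puzzle:tune", "puzzle:state-step", "puzzle:state",
]);

// Decide whether an incoming WebSocket frame should be relayed at all.
function isAcceptableFrame(frame: string | ArrayBuffer): boolean {
  // Binary frames are rejected outright.
  if (typeof frame !== "string") return false;
  // Measure the UTF-8 byte length, not the character count.
  if (new TextEncoder().encode(frame).byteLength > MAX_FRAME_BYTES) return false;
  let parsed: unknown;
  try {
    parsed = JSON.parse(frame);
  } catch {
    return false;
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const type = (parsed as { type?: unknown }).type;
  // Only the message type is checked; the payload's puzzle meaning is not.
  return typeof type === "string" && MESSAGE_TYPES.has(type);
}
```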

Storage and events

SessionStorage namespaces (per-puzzle, scoped by puzzleName):

Key | Purpose
baki.puzzle.<name>.pairToken | Active pair token
baki.puzzle.<name>.pairExpiresAt | Token expiry
baki.puzzle.<name>.solved | Solve flag
baki.puzzle.meta.solved | Meta-puzzle unlocked

<name> is one of { find-the-six, align-to-grid, match-the-scale, tune-to-target, reach-the-fifth-state, assemble-a-page }.
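
A rough sketch of how the solve flags could feed the 3-of-6 meta unlock. The key names and the '1' value for the solve flag come from this page; the value written to baki.puzzle.meta.solved, the KeyValueStore interface, and the function names are assumptions made for illustration.

```typescript
const PUZZLE_NAMES = [
  "find-the-six", "align-to-grid", "match-the-scale",
  "tune-to-target", "reach-the-fifth-state", "assemble-a-page",
] as const;
type PuzzleName = (typeof PUZZLE_NAMES)[number];

const META_THRESHOLD = 3; // any 3 of the 6 sub-puzzles

// Minimal subset of the Web Storage API so the sketch runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Map-backed stand-in for sessionStorage.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => { data.set(key, value); },
  };
}

const solvedKey = (name: PuzzleName): string => `baki.puzzle.${name}.solved`;

// Record a solve and report whether the meta-puzzle is now unlocked.
function markSolved(store: KeyValueStore, name: PuzzleName): boolean {
  store.setItem(solvedKey(name), "1");
  const solvedCount = PUZZLE_NAMES.filter(
    (n) => store.getItem(solvedKey(n)) === "1",
  ).length;
  const unlocked = solvedCount >= META_THRESHOLD;
  if (unlocked) store.setItem("baki.puzzle.meta.solved", "1"); // assumed value
  return unlocked;
}
```

In the browser, window.sessionStorage satisfies KeyValueStore and can be passed in directly.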

Window CustomEvents:

Event | Detail | Producer | Consumer
puzzle:pair-token | { token, puzzle } | PairWithAgentButton | adaptPairablePuzzle adapters
puzzle:pair-disconnect | { puzzle } | PairWithAgentButton | Each puzzle’s WS lifecycle
puzzle:solved | { puzzle } | Each puzzle’s onSolved | PuzzleMeta
agent-cursor:move | { x, y } | Puzzle WS handler | SecondCursor
agent-cursor:click | { x, y, target? } | Puzzle WS handler | SecondCursor
agent-cursor:state | { state } | Puzzle WS handler | SecondCursor
agent-cursor:connection | { state } | Puzzle setLinkState | SecondCursor (indicator)
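
The cursor events are the relay's cursor messages with an agent- prefix, so the WS handler's job can be sketched as a small pure mapping. toAgentCursorEvent is a hypothetical name, and the real handler may be structured differently.

```typescript
type CursorMessage =
  | { type: "cursor:move"; x: number; y: number }
  | { type: "cursor:click"; x: number; y: number; target?: string }
  | { type: "cursor:state"; state: string };

interface AgentCursorEvent {
  name: string;   // e.g. "agent-cursor:move"
  detail: object; // everything in the message except its type
}

// Translate an incoming relay message into the event name and detail to dispatch.
function toAgentCursorEvent(message: CursorMessage): AgentCursorEvent {
  const { type, ...detail } = message;
  return { name: `agent-${type}`, detail };
}

// In the browser, the handler could then dispatch it for SecondCursor to pick up:
//   const { name, detail } = toAgentCursorEvent(message);
//   window.dispatchEvent(new CustomEvent(name, { detail }));
```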

How to develop locally

# Terminal 1 — backend
cd workers/session-relay
pnpm install                  # one-time
pnpm dev                      # wrangler dev on http://localhost:8787

# Terminal 2 — site
pnpm dev:mobile               # Astro on https://localhost:4142

PUBLIC_RELAY_URL env var (in .env) defaults to http://localhost:8787. Set it to your deployed relay URL for production frontend builds.

To test pairing without an actual agent, open the pair URL in an incognito window — two browser sessions on the same machine count as visitor + agent.
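
Because the relay URL is plain http locally and https in production, the frontend has to derive the matching WebSocket scheme. A minimal sketch, assuming the /api/ws/<token> path described above; relayWebSocketUrl is a hypothetical helper.

```typescript
const DEFAULT_RELAY_URL = "http://localhost:8787";

// Derive the WebSocket endpoint for a token from the relay's HTTP(S) base URL.
function relayWebSocketUrl(token: string, relayUrl: string = DEFAULT_RELAY_URL): string {
  const url = new URL(`/api/ws/${encodeURIComponent(token)}`, relayUrl);
  // http maps to ws, https maps to wss.
  url.protocol = url.protocol === "https:" ? "wss:" : "ws:";
  return url.toString();
}

// In an Astro build the base would likely come from the env var, for example:
//   relayWebSocketUrl(token, import.meta.env.PUBLIC_RELAY_URL)
```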

How to extend

To add a new puzzle:

  1. Create src/components/puzzles/MyPuzzle.tsx following the structural pattern (LinkState, helpers, useEffect keyed on pairToken, message switch, applyAction(args, source) unified path, link-state pip JSX).
  2. Add a new message type to the relay validator at workers/session-relay/src/puzzle-session.ts (isRelayMessage).
  3. Update workers/session-relay/src/index.ts manifest to advertise the new action shape.
  4. Register in src/components/widgets/registry.ts:
    import MyPuzzle from '../puzzles/MyPuzzle';
    // In WIDGET_REGISTRY:
    'my-puzzle': adaptPairablePuzzle(MyPuzzle, 'my-puzzle'),
  5. Mount on a room by adding to that room’s widgets array in MDX frontmatter.
  6. Done. Pair-button + SecondCursor + state machine all wire automatically.
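
The unified applyAction(args, source) path from step 1 is the piece most worth getting right. Below is a hedged sketch for an imaginary tap-style puzzle: the state shape, the target list, and the explicit state parameter (which turns it into a pure reducer) are all assumptions, not the real MyPuzzle.tsx contract.

```typescript
type Source = "visitor" | "agent";

interface TapArgs {
  hex: string;
}

interface PuzzleState {
  targets: string[];          // hypothetical answers to find
  found: Map<string, Source>; // which party found each one
}

// One code path for both parties: only the source label differs.
function applyAction(state: PuzzleState, args: TapArgs, source: Source): PuzzleState {
  const hex = args.hex.toLowerCase();
  // Ignore wrong guesses and repeats, leaving state untouched.
  if (!state.targets.includes(hex) || state.found.has(hex)) return state;
  const found = new Map(state.found).set(hex, source);
  return { ...state, found };
}

// The puzzle is solved once every target has been found by someone.
function isSolved(state: PuzzleState): boolean {
  return state.found.size === state.targets.length;
}
```

Routing visitor input and agent messages through the same function means both contributions count under identical rules.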

Open items

Why second-cursor matters

The cooperative puzzle isn’t a chat interface, isn’t a tool-call wrapper, isn’t an LLM agent harness — it’s a shared room. The visitor and the agent both exist in the same puzzle. They both see each other’s cursors. They both can act. Neither owns the puzzle exclusively. Time is shared, space is shared, the constraint of the puzzle is shared.

This matters because most AI on the web is asymmetric: human asks, AI responds, AI’s presence is a chat box or a generated artifact. Here the AI is co-located, mid-conversation, working a knob alongside you. The friction of a tedious solo puzzle becomes the invitation — pairing isn’t a feature you have to discover, it’s a relief you’ll seek out.

The decision to render a second cursor — not just animate state, not just show a chat log — was the moment this stopped being agent integration and became agent presence. That’s the ground the design walks on.