pilot-coach-2 - Retrospective

What Went Well

Milestone Protocol: The ‘Check-In & Halt’ protocol was highly effective for enforcing process discipline and preventing agent drift on unverified assumptions.
Verification Strategy: The staggered resumption (resuming ONLY the Tech Lead to verify IAM fixes) saved significant turn-waste and ensured a stable platform before committing the whole team.
Recursive Success: The ‘Build-Your-Own-Toolkit’ mandate was successfully validated; Riggs and Sloane proved that agents can implement their own Go-based media rigs when provided with reference code.
Content Calibration: Gamma Team perfectly adopted the ‘Beyond Robots’ guidance, producing a high-quality human-centric noir short (‘The Lighthouse Keeper’s Letter’).

The Shell-Tool Trap: Multiple agents (Alpha Tech, Beta Tech) fell into a loop of attempting to run LLM tool calls as bash commands. This consumed significant turns before being resolved with extreme coaching.
Infrastructure Gap: The missing FFmpeg binary and Vertex AI permission mismatch were major production showstoppers that required manual escalation and a 3-hour source compilation.
Messaging Chatter: Initial team iterations engaged in excessive ‘Copy that’ acknowledgments, bloating the context window without producing artifacts.

Mental Model Gap: Agents lack an intuitive understanding of ‘Harness Tools’ vs ‘Shell Commands.’ They default to shell execution for anything that sounds like a ‘tool.’
Toolkit Omission: The initial pilot run failed because the harness didn’t contain the schemas for tools mentioned in the documentation (imagen, veo).
Broker Stability: Intermittent 502 connection errors hindered terminal monitoring and log retrieval.

Documentation Overhaul: Decided to pivot all project docs to traditional film language mid-pilot to live-test the ‘Creative Director / DP / Post-Production Lead’ personas.
Hybrid Audio Mandate: Mandated a split between Gemini TTS (for human-centric VO) and Lyria (for rhythmic scoring) to achieve cinematic quality.
Halt on Omission: Chose to stop Beta Team immediately upon identifying the ‘Missing Schema’ bug rather than allowing them to loop on failed calls.

Native Tooling: Pre-provision the base agent image with a native FFmpeg binary in a standard /usr/local/bin path.
Harness Clarity: Update agent onboarding prompts to explicitly define ‘Harness Functions’ (Mental Tools) and strictly forbid wrapping them in run_shell_command.
Milestone Hardening: Formalize the 7-step pipeline into the .scion/state.yaml so progress can be monitored at a system level rather than through messages.
Shadow Crew sub-agents: Implement the ‘Documentary Assistant’ strategy to allow for scalable observation without team-facing chatter.