What Went Well
- Milestone Protocol: The ‘Check-In & Halt’ protocol was highly effective for enforcing process discipline and preventing agent drift on unverified assumptions.
- Verification Strategy: The staggered resumption (resuming ONLY the Tech Lead to verify IAM fixes) saved significant turn-waste and ensured a stable platform before committing the whole team.
- Recursive Success: The ‘Build-Your-Own-Toolkit’ mandate was successfully validated; Riggs and Sloane proved that agents can implement their own Go-based media rigs when provided with reference code.
- Content Calibration: Gamma Team perfectly adopted the ‘Beyond Robots’ guidance, producing a high-quality human-centric noir short (‘The Lighthouse Keeper’s Letter’).
What Didn’t Go Well
- The Shell-Tool Trap: Multiple agents (Alpha Tech, Beta Tech) fell into a loop of attempting to run LLM tool calls as bash commands. This consumed significant turns before being resolved with extreme coaching.
- Infrastructure Gap: The missing FFmpeg binary and Vertex AI permission mismatch were major production showstoppers that required manual escalation and a 3-hour source compilation.
- Messaging Chatter: Initial team iterations engaged in excessive ‘Copy that’ acknowledgments, bloating the context window without producing artifacts.
Failure Modes & Bottlenecks
- Mental Model Gap: Agents lack an intuitive understanding of ‘Harness Tools’ vs ‘Shell Commands.’ They default to shell execution for anything that sounds like a ‘tool.’
- Toolkit Omission: The initial pilot run failed because the harness didn’t contain the schemas for tools mentioned in the documentation (imagen, veo).
- Broker Stability: Intermittent 502 connection errors hindered terminal monitoring and log retrieval.
Key Decisions Made
- Documentation Overhaul: Decided to pivot all project docs to traditional film language mid-pilot to live-test the ‘Creative Director / DP / Post-Production Lead’ personas.
- Hybrid Audio Mandate: Mandated a split between Gemini TTS (for human-centric VO) and Lyria (for rhythmic scoring) to achieve cinematic quality.
- Halt on Omission: Chose to stop Beta Team immediately upon identifying the ‘Missing Schema’ bug rather than allowing them to loop on failed calls.
Suggestions for Improvement
- Native Tooling: Pre-provision the base agent image with a native FFmpeg binary in a standard
/usr/local/binpath. - Harness Clarity: Update agent onboarding prompts to explicitly define ‘Harness Functions’ (Mental Tools) and strictly forbid wrapping them in
run_shell_command. - Milestone Hardening: Formalize the 7-step pipeline into the
.scion/state.yamlso progress can be monitored at a system level rather than through messages. - Shadow Crew sub-agents: Implement the ‘Documentary Assistant’ strategy to allow for scalable observation without team-facing chatter.