Role Guide

Technical Lead (Director of Photography)

Synthesis pipeline, character reference chains, video generation, extend safety, and shot manifests.

Technical Lead — Role-Specific Production Guide

This document contains the detailed mandates, checklists, and procedures for the Technical Lead (Director of Photography) across all 7 production steps. Read the main Video Production Playbook first for the overall workflow and cross-role coordination.


Genre Integrity — Tone Anchors

AI video and image models default to moody, dramatic realism. The Idea Person defines Tone Anchor keywords in design_brief.md: genre-defining terms (e.g., “bright, slapstick, warm lighting, absurd expression” for comedy). You MUST include these keywords in every image and video generation prompt, alongside your technical/cinematographic instructions. If the prompt does not state the genre explicitly, the model will tend to generate a drama regardless of the script.
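One way to make the Tone Anchor rule hard to forget is to build every prompt through a small helper. The following is a minimal sketch, not part of the genmedia toolkit: the function names `build_prompt` and `missing_anchors` are hypothetical, and it assumes the anchors have already been copied out of design_brief.md into a list.

```python
def build_prompt(technical_instructions: str, tone_anchors: list[str]) -> str:
    """Append the Tone Anchor keywords to a technical/cinematographic prompt."""
    anchors = [a.strip() for a in tone_anchors if a.strip()]
    if not anchors:
        # Refuse to build a prompt without a genre; the model would default to drama.
        raise ValueError("No Tone Anchors supplied; check design_brief.md")
    return f"{technical_instructions.rstrip('. ')}. Tone: {', '.join(anchors)}."


def missing_anchors(prompt: str, tone_anchors: list[str]) -> list[str]:
    """Return the anchors that do not appear in the prompt (case-insensitive)."""
    lowered = prompt.lower()
    return [a for a in tone_anchors if a.strip().lower() not in lowered]


# Hypothetical example values.
anchors = ["bright", "slapstick", "warm lighting", "absurd expression"]
prompt = build_prompt("Medium shot, slow dolly-in on the chef", anchors)
print(prompt)
print(missing_anchors(prompt, anchors))
```

Running `missing_anchors` over hand-written prompts before submitting them gives a quick check that no keyword was dropped.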


Step 2: The Beat Sheet


Step 2.5: Scene Review & Object Anchoring


Step 3: Character Workshop & Voiceover Generation

Toolkit Preparation

“Prepare the Rig”: review the centralized toolkit documentation at /workspace/tools/genmedia/USAGE.md before generating anything.

Character Generation Checklist (The Reference Chain)

To ensure visual consistency, you MUST use the --reference-image flag in genmedia-image to chain character designs. For each character, generate the following 4 images in sequence, saving them in a dedicated folder (e.g., /workspace/shared-dirs/[team-name]/characters/[character-name]/):

Composite Character Reference Sheet

After generating the 4 individual reference images, you MUST generate a composite character reference sheet for each character. This single image consolidates multiple views and zoom levels onto one clean sheet, and becomes the primary reference image for Steps 4 and 5.

Prompt template:

Professional character reference sheet on a pure white background. Show the same character in four views arranged in a clean grid layout: (1) full-body front-facing standing pose in the upper left, (2) close-up portrait headshot in the upper right, (3) full-body 3/4 action pose with arms gesturing expressively in the lower left, (4) full-body side profile standing pose in the lower right. The character must be completely isolated from any background — pure flat white behind every view. No environmental elements, no shadows on the ground, no props. Clean studio lighting, consistent character appearance across all four views. Character design sheet style, clean linework and professional presentation.

This composite sheet packs body turnaround, facial detail, and pose variety into a single image. Using one character_sheet.png per character (instead of multiple individual references) makes it easier to stay within Veo’s budget of three reference images: 1 character sheet + 1 setting reference = 2 images, leaving room for a second character or an object reference.
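The budget arithmetic above can be checked per shot before any generation call is made. This is an illustrative sketch only: `plan_references` is a hypothetical helper, the file paths are placeholders, and the limit of three is taken from this guide rather than from the toolkit itself.

```python
from typing import Optional

MAX_REFERENCE_IMAGES = 3  # per-shot budget as described in this guide


def plan_references(
    character_sheets: list[str],
    setting_ref: Optional[str] = None,
    object_refs: Optional[list[str]] = None,
) -> list[str]:
    """Collect the reference images for one shot and enforce the budget."""
    refs = list(character_sheets)
    if setting_ref:
        refs.append(setting_ref)
    refs.extend(object_refs or [])
    if len(refs) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"{len(refs)} reference images requested, budget is {MAX_REFERENCE_IMAGES}"
        )
    return refs


# One character sheet + one setting reference = 2 images, one slot left over.
refs = plan_references(["characters/knight/character_sheet.png"], "settings/office.png")
print(len(refs), "of", MAX_REFERENCE_IMAGES, "slots used")
```

With one composite sheet per character, a two-character shot in a Master Setting uses exactly three slots; with four individual references per character it would not fit.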

Setting Reference Image Generation

Master Settings defined by the Idea Person in Step 2 are text descriptions. You MUST generate a reference image for each Master Setting, anchoring the environment visually the same way character headshots anchor faces. For each Master Setting:

These setting reference images are used alongside character references in Steps 4 and 5 to maintain environment consistency across shots. Without them, the same “limo interior” will look different in every generation.

Voiceover Generation

Warning on TTS Durations: TTS models are non-deterministic regarding pacing. A 100-word script might result in 30s of slow speech or 20s of fast speech. Always provide the actual durations of generated stems to the Editor so they can adjust the timeline before assembly.
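Actual stem durations can be measured rather than estimated from word count. The sketch below uses Python’s standard `wave` module and assumes the stems are uncompressed WAV files; if genmedia-voice writes another format, a different reader would be needed. The helper names are hypothetical.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path: Path) -> float:
    """Duration of a WAV file: frame count divided by frame rate."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def stem_durations(voice_dir: str) -> dict[str, float]:
    """Map each .wav stem in a directory to its duration in seconds."""
    return {
        p.name: round(wav_duration_seconds(p), 2)
        for p in sorted(Path(voice_dir).glob("*.wav"))
    }
```

Calling `stem_durations` on the team’s voice directory yields a filename-to-seconds table that can be handed to the Editor as-is.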

You MUST execute genmedia-voice to generate the complete Narrator Voiceover (VO) track based on the lines written in Step 2. Save audio stems to the team’s shared /voice/ directory before Step 5 begins.


Motion Prompt Alignment (Audio Agreement)

You MUST align your motion prompts with the Audio Classification provided by the Idea Person:

You operate the synthesis rig to generate the storyboard.

Mandates


Prompt Sanitization & Safety Tips

If a prompt fails a safety filter (especially in Step 5), use these proven sanitization techniques:

  1. Archetype Substitution: Instead of naming potentially sensitive figures, use descriptive visual tags (e.g., “man in rhinestone jumpsuit” instead of “Elvis”).
  2. Gesture Rephrasing: “Finger guns” or pointing while armored can trigger “weapon” filters. Rephrase as “expressive hand gestures,” “pointing emphatically toward a colleague,” or “hand raised in a beckoning motion.”
  3. Framing as Performance: Explicitly include keywords like “acting,” “theatrical,” “stage performance,” or “comedic” to provide non-threatening context to the model.
  4. Environment Anchoring: Always anchor potentially “gritty” characters (like knights in armor) within bright, mundane environments (like an office) to counterbalance filter triggers.
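The first three techniques above lend themselves to a small rewrite pass applied before resubmitting a rejected prompt. This is a rough sketch under stated assumptions: the substitution table holds only the two examples from this guide, the framing phrase is one arbitrary choice among the suggested keywords, and `sanitize_prompt` is a hypothetical helper, not a toolkit command. Whether a rewritten prompt passes the filter still has to be tested.

```python
import re

# Pattern -> replacement, seeded with the examples from this guide.
SUBSTITUTIONS = {
    r"\bElvis\b": "man in rhinestone jumpsuit",       # archetype substitution
    r"\bfinger guns\b": "expressive hand gestures",   # gesture rephrasing
}

PERFORMANCE_FRAMING = "comedic stage performance"


def sanitize_prompt(prompt: str) -> str:
    """Apply the substitutions, then add performance framing if absent."""
    for pattern, replacement in SUBSTITUTIONS.items():
        prompt = re.sub(pattern, replacement, prompt, flags=re.IGNORECASE)
    if PERFORMANCE_FRAMING not in prompt.lower():
        prompt = f"{prompt.rstrip('. ')}. Framed as a {PERFORMANCE_FRAMING}."
    return prompt


print(sanitize_prompt("Elvis does finger guns at the camera"))
```

Environment anchoring (technique 4) is left to the prompt author, since it depends on the shot’s Master Setting.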

Step 5: Principal Photography

You lead this step.

Video Generation Mandates

Extend Safety

When using genmedia-video extend, the output already contains the original video + the extension. Do NOT concatenate the original clip with the extend output — this produces duplicated footage (the original appearing twice). Use the extend output directly. See USAGE.md Workflow 2 for the correct pattern.
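The rule can be expressed as a clip-selection step when preparing files for the Editor. The sketch below is illustrative only: it assumes that extends are chained (each extend is run on the previous output, so the last output contains everything before it), and the function names and durations are hypothetical. USAGE.md Workflow 2 remains the authoritative pattern.

```python
def final_clip(base_clip: str, extend_outputs: list[str]) -> str:
    """Return the single file that belongs on the timeline for this shot.

    Assumes chained extends: the last extend output already contains the
    original clip plus every earlier extension, so nothing is concatenated.
    """
    return extend_outputs[-1] if extend_outputs else base_clip


def expected_duration(base_seconds: float, extension_seconds: list[float]) -> float:
    """Planned length of the final clip: the original plus each extension, once."""
    return base_seconds + sum(extension_seconds)


# Hypothetical shot: one base clip extended twice.
print(final_clip("shot_03.mp4", ["shot_03_ext1.mp4", "shot_03_ext2.mp4"]))
print(expected_duration(8.0, [7.0, 7.0]))
```

A clip that measures noticeably longer than `expected_duration` is a hint that the original was concatenated with the extend output by mistake.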

Shot Manifest (Mandatory)

You MUST generate a shot-manifest.json alongside the principal photography scripts. This manifest maps each shot filename to its planned duration and number of extend operations. The Editor uses this to run verify-dailies. See /workspace/tools/genmedia/USAGE.md (verify-dailies section) for the format.
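As a rough illustration of what the manifest captures, the sketch below builds a filename-to-plan mapping and serializes it as JSON. The field names `planned_duration_s` and `extends` are placeholders chosen for this example, and the helper `build_manifest` is hypothetical; the real schema expected by verify-dailies is defined in USAGE.md and takes precedence.

```python
import json


def build_manifest(shots: list[dict]) -> dict:
    """Map each shot filename to its planned duration and extend count."""
    manifest = {}
    for shot in shots:
        name = shot["filename"]
        if name in manifest:
            raise ValueError(f"Duplicate shot filename: {name}")
        if shot["extends"] < 0:
            raise ValueError(f"Negative extend count for {name}")
        manifest[name] = {
            "planned_duration_s": shot["planned_duration_s"],  # placeholder key
            "extends": shot["extends"],                        # placeholder key
        }
    return manifest


# Hypothetical shots; durations are example values only.
shots = [
    {"filename": "shot_01.mp4", "planned_duration_s": 8, "extends": 0},
    {"filename": "shot_02.mp4", "planned_duration_s": 15, "extends": 1},
]
print(json.dumps(build_manifest(shots), indent=2))
```

Generating the manifest from the same shot list that drives the principal photography scripts keeps the two from drifting apart.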


Step 6: The Soundstage

Integrated Audio Workflow

  1. Dialogue/Narration: Use TTS (e.g., gemini-3.1-flash-tts-preview) to strictly match the script.
  2. Score/Music: Use a dedicated music model (e.g., Lyria) to generate the music, keeping in mind that a music arc often spans multiple visual scenes.

Note on Ambient/SFX: No separate foley or ambient soundscapes are generated. These are baked into the raw video clips via Veo audio prompting in Step 5.


Step 7: The Editing Suite
