Step 2 Beat Sheet — Technical Lead Review
Reviewer: Marcus Delaney (Tech Lead)
Date: 2026-05-18
Verdict: ⚠️ CONDITIONAL — 3 issues must be resolved before sign-off.
What’s Working
- Camera coverage maps perfectly to the tempo structure. Wide → OTS → MCU → ECU → wide-out arc is exactly what I specified.
- Reference manifests are clean. Every shot stays within the 3-ref Veo budget (max 2 characters + setting). No shot violates the 2-character limit.
- Timing hints present on every shot. Physical gesture vocabulary is rich and specific — great motion prompt material.
- Dialogue lines are properly short (1–3 sentences per shot). Character voices are distinct: Sarah is clipped/structured, Mark is loose/rhetorical.
- Emotional tone tags included on all dialogue entries.
- Genre integrity holds — warm diner setting, no noir drift in shot descriptions.
Issues (3 total — must fix)
ISSUE 1: Runtime at Risk — 186s (3:06) ⚠️
| Scene | Planned | Target |
|---|---|---|
| 1. Allegretto | 46s | ~60s |
| 2. Accelerando | 42s | ~75s |
| 3. Adagio | 54s | ~75s |
| 4. Coda | 44s | ~60s |
| Total | 186s (3:06) | ~270s (4:30) |
3:06 is dangerously close to the 3:00 floor. Once we apply crossfade transitions (which eat ~0.5s per transition) and any trimming in post, we will almost certainly dip below 3:00. We need 30–50 additional seconds.
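To make the risk concrete, here is a rough sketch of the arithmetic. It assumes the worst case, in which every cut between the 27 shots is a crossfade (26 transitions). The beat sheet does not state how many cuts will actually be crossfaded, so treat that count and the names below as illustrative assumptions only.

```python
# Planned scene durations in seconds, taken from the table above.
planned = {"Allegretto": 46, "Accelerando": 42, "Adagio": 54, "Coda": 44}

FLOOR_S = 180            # 3:00 minimum runtime
CROSSFADE_LOSS_S = 0.5   # approximate loss per crossfade


def effective_runtime(durations, transitions, loss=CROSSFADE_LOSS_S):
    """Total planned runtime minus the time consumed by crossfades."""
    return sum(durations.values()) - transitions * loss


# Assumption: 27 shots with a crossfade on every cut -> 26 transitions.
worst_case = effective_runtime(planned, transitions=26)

# Assumption: crossfades only at the 3 scene boundaries.
best_case = effective_runtime(planned, transitions=3)

print(sum(planned.values()), worst_case, best_case)  # 186 173.0 184.5
print(worst_case >= FLOOR_S)                         # False
```

Even in the gentler case the margin is only 4.5s, which any trimming in post would erase.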
Recommendation: Add 2–3 shots to Scene 2 (Accelerando) and 1–2 to Scene 1 (Allegretto). Specific suggestions:
- Scene 1 new shot: An exterior wide of the Starlight Diner in the rain before the interior establishing shot (6s, [SILENT], diner-exterior). Gives us a proper opening frame and adds breathing room.
- Scene 2 new shots: The argument in the story has more beats we haven’t used. Three candidates:
  - Mark’s “You didn’t want a husband, you wanted a project manager!” (4s MCU, [DIALOGUE])
  - Sarah’s “I wanted a partner who showed up!” rebuttal (4s MCU, [DIALOGUE])
  - The truck headlights sweeping across their faces (6s, [SILENT]), a strong transitional insert to break the argument before Scene 3.
- Scene 3: Consider extending Shot 3.6 (hands touching) or adding a two-shot of them sharing the laughter moment (6s).
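Taken at the upper end of each suggestion (2–3 shots in Scene 2, 1–2 in Scene 1, plus the Scene 3 addition), these additions bring us to roughly 220–230s (3:40–3:50), safely inside the 3:00–5:00 window with room for crossfades.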
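As a sanity check, the itemized suggestions can be tallied as below. This sketch counts only the shots that were given an explicit duration above, and it takes the Scene 3 option to be the 6s two-shot. The result is therefore a lower bound, not the final plan.

```python
BASE_S = 186  # current planned total in seconds

# (scene, description, seconds) for each suggestion with a stated duration.
additions = [
    (1, "Diner exterior wide", 6),
    (2, "Mark: project manager line", 4),
    (2, "Sarah: partner rebuttal", 4),
    (2, "Truck headlights insert", 6),
    (3, "Two-shot, shared laughter", 6),
]

added = sum(seconds for _, _, seconds in additions)
total = BASE_S + added


def mmss(seconds):
    """Format a whole number of seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


print(added, total, mmss(total))  # 26 212 3:32
```

Getting from 212s to the 220–230s range takes a further 8–18s, which is where a second Scene 1 shot or the Shot 3.6 extension would come in.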
ISSUE 2: COMPOUND Classification Error — 0 Valid Compounds ❌
Shots 3.1, 3.2, and 3.4 are tagged [COMPOUND] but none of them qualify. A [COMPOUND] shot requires two characters speaking sequentially within a single shot. Each of these shots has only one character delivering dialogue:
| Shot | Tagged | Characters Speaking | Correct Tag |
|---|---|---|---|
| 3.1 | [COMPOUND] | Sarah only | [DIALOGUE] |
| 3.2 | [COMPOUND] | Mark only | [DIALOGUE] |
| 3.4 | [COMPOUND] | Mark only | [DIALOGUE] |
The checklist mandates at least 2–3 genuine COMPOUND shots.
Recommendation: Convert 2–3 existing two-shots into true COMPOUND shots where both characters exchange lines within one clip. Best candidates:
- Shot 1.4/1.5 merge → new COMPOUND: Two-shot over the table. Sarah says “You’re early,” Mark replies “And you’re exactly on time.” Both visible, sequential delivery. This is a natural dialogue ping-pong.
- Shot 2.1/2.2 merge → new COMPOUND: OTS two-shot. Mark’s espresso machine line, Sarah fires back about the manual. Rapid exchange within one frame captures the accelerating tempo.
- Shot 3.4/3.5 merge → new COMPOUND: OTS two-shot. Mark asks about the fire alarm, Sarah laughs about the blueberry tart. Shared memory moment — beautiful compound candidate.
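Reclassify 3.1 and 3.2 as [DIALOGUE]. Shot 3.4 needs no separate retag if it is folded into the 3.4/3.5 merge; otherwise retag it [DIALOGUE] as well.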
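The COMPOUND rule used in this issue can be expressed as a small check. This is a minimal sketch, not the pipeline's actual validator. The `Shot` structure and `correct_tag` helper are hypothetical names introduced for illustration, and the speaker lists are the ones given in the table and merge suggestions above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shot:
    shot_id: str
    tag: str             # e.g. "[COMPOUND]" or "[DIALOGUE]"
    speakers: List[str]  # characters who speak, in delivery order


def correct_tag(shot: Shot) -> str:
    """Return the tag the shot should carry under the COMPOUND rule."""
    if shot.tag != "[COMPOUND]":
        return shot.tag
    # COMPOUND needs two different characters speaking within the one shot.
    if len(set(shot.speakers)) >= 2:
        return "[COMPOUND]"
    return "[DIALOGUE]"


shots = [
    Shot("3.1", "[COMPOUND]", ["Sarah"]),
    Shot("3.2", "[COMPOUND]", ["Mark"]),
    Shot("3.4", "[COMPOUND]", ["Mark"]),
    Shot("1.4/1.5", "[COMPOUND]", ["Sarah", "Mark"]),  # proposed merge
]

for s in shots:
    print(s.shot_id, s.tag, "->", correct_tag(s))
```

Run against the current sheet, the three tagged shots all fall back to [DIALOGUE], while the proposed 1.4/1.5 merge keeps its [COMPOUND] tag.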
ISSUE 3: Shot 1.1 — Betty Visibility Contradiction (Minor)
Shot 1.1 describes “A waitress pours coffee in the background.” We agreed Betty stays off-screen to preserve reference slots. Having her visible, even in background, introduces an uncontrolled third character that the model will render inconsistently across shots.
Fix: Change to “Steam rises from a thick ceramic coffee cup on the counter. A neon blue ‘OPEN 24 HOURS’ sign flickers through a rain-streaked window.” Props-only establishing shot.
Non-Blocking Notes
- Shot 4.5 [VO]: Sarah says “Take care of yourself, Mark” while walking away (back to camera). This is technically off-screen character dialogue, not narrator VO. The [VO] classification works because her mouth isn’t visible and we’ll generate via TTS, but label it as “character off-screen dialogue” in the audio notes so the Editor doesn’t confuse it with narrator voiceover in the mix. Different vocal treatment (Sarah’s voice, not a narrator).
- Accelerando pacing: All 7 shots are 6s. For true acceleration, consider dropping the last 2–3 argument shots to 4s planned duration. The compression of shot length reinforces the tempo name. The Editor may flag this too.
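For the Accelerando note, a quick sketch of what the compression does to Scene 2's duration. It assumes the scene stays at its current seven 6s shots. The `compress_tail` helper is a hypothetical name used only for this illustration.

```python
current = [6] * 7  # seven Accelerando shots at 6s each (42s)


def compress_tail(durations, n_tail, new_len=4):
    """Shorten the last n_tail shots to new_len seconds."""
    head = durations[:len(durations) - n_tail]
    return head + [new_len] * n_tail


for n in (2, 3):
    revised = compress_tail(current, n)
    print(n, revised, sum(revised), sum(revised) - sum(current))
# 2 [6, 6, 6, 6, 6, 4, 4] 38 -4
# 3 [6, 6, 6, 6, 4, 4, 4] 36 -6
```

The compression costs 4–6s of runtime, so if it is adopted, the shots added under Issue 1 need to absorb that loss as well.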
Checklist Verification
| Criterion | Status | Notes |
|---|---|---|
| Audio classification on every shot | ✅ | All 27 shots tagged |
| VO-Safe Rule | ✅ | Shot 4.5 [VO] has no speaking action |
| ≥2-3 COMPOUND shots | ❌ | 0 valid — see Issue 2 |
| Timing hints on all dialogue/VO | ✅ | Present |
| Character voice distinction | ✅ | Sarah clipped, Mark loose |
| Audio-Only Test | ⚠️ | Passes if COMPOUND shots are fixed — currently no sequential exchanges |
| Visual-Only Test | ✅ | [VO] shot 4.5 shows walking, not speaking |
| Max 2 characters per shot | ✅ | All clear |
| Reference manifest within 3-ref budget | ✅ | All clear |
| Total runtime 3:00–5:00 | ⚠️ | 3:06 — needs padding (Issue 1) |
Once Issues 1–3 are resolved, I’ll sign off on Step 2.