Timeline Creative Plan — The Midnight Audit
Author: kappa-editor (Margaux Delacroix)
Date: 2026-05-18
Status: Pre-production planning (will become timeline.json spec for timeline-helper agent)
Global Settings
- Resolution: 1280x720
- FPS: 24
- Master fade_in: 1.0s (from black into title card)
- Master fade_out: 2.5s (final fade to black at the end of the timeline, after the Shot 23 credits)
Transition Strategy
Within-act transitions
- Hard cuts between shots within the same act. Mockumentary convention — sharp editorial cuts maintain documentary authenticity.
- Exception: Interview-to-B-roll transitions get a 0.5s crossfade to soften the visual gear shift.
Between-act transitions
- 1.0s crossfade between acts. These are the “chapter breaks” — the viewer should feel the shift but not be jarred.
- Act III → Act IV (Shot 13 → Shot 14): 0.75s crossfade. Tighter — we want the transition into crisis to feel urgent, not smooth.
Special transitions
- Shot 21 → Shot 22: 1.5s crossfade. Clippy’s final line dissolves into the ultra-wide desk shot. This is the emotional exhale of the film.
- Shot 22 → Shot 23 (Credits): 2.0s crossfade. The rubber band shot bleeds slowly into credits.
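
Since the plan mixes several fade lengths at 24 FPS, it may help to confirm that each one lands on a whole frame before the values go into the timeline spec. The sketch below is only an illustration: the dictionary keys and the `to_frames` helper are hypothetical names, not fields from the timeline.json format.

```python
from fractions import Fraction

FPS = 24  # from Global Settings

# Fade and crossfade lengths named in this plan, in seconds.
# Keys are hypothetical labels, not timeline.json field names.
DURATIONS_S = {
    "interview_broll_xfade": "0.5",
    "act_break_xfade": "1.0",
    "act3_to_act4_xfade": "0.75",
    "shot21_to_shot22_xfade": "1.5",
    "shot22_to_credits_xfade": "2.0",
    "master_fade_in": "1.0",
    "master_fade_out": "2.5",
}


def to_frames(seconds: str, fps: int = FPS) -> int:
    """Convert a decimal-seconds string to a whole frame count, or raise."""
    frames = Fraction(seconds) * fps
    if frames.denominator != 1:
        raise ValueError(f"{seconds}s is not a whole number of frames at {fps} fps")
    return int(frames)


FRAMES = {name: to_frames(value) for name, value in DURATIONS_S.items()}

if __name__ == "__main__":
    for name, count in FRAMES.items():
        print(f"{name}: {DURATIONS_S[name]}s = {count} frames")
```

Using `Fraction` on the decimal strings avoids float rounding, so a value that does not align to the frame grid fails loudly instead of being silently truncated.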
Shot Duration & Timing Map
| Shot | Planned Dur | Transition In | Notes |
|---|---|---|---|
| 0 (Title) | 5s | fade_in (master) | Black → title card |
| 1 | 8s | 0.5s xfade from title | VO starts at 2s |
| 1a | 8s | hard cut | VO starts at 1s |
| 2 | 12s | 0.5s xfade (B-roll→interview) | Dialogue 1s-9s, 3s silence |
| 3 | 6s | 0.5s xfade (interview→B-roll) | Silent + music |
| 4 | 12s | 0.5s xfade | Dialogue 1s-11s |
| 5 | 10s | hard cut | Dialogue 2s-7s, 3s silence |
| --- ACT BREAK --- | | 1.0s xfade | |
| 6 | 5s | 1.0s xfade from Act I | Silent + music |
| 7 | 4s | hard cut | Silent — dramatic reveal |
| 8 | 5s | hard cut | Dialogue 1s-4.5s |
| 9a | 5s | hard cut | Dialogue 1s-4.5s |
| 9b | 6s | hard cut | COMPOUND: Stanton 1-3s, gap, Highlighter 3.5-6s |
| 9c | 4s | hard cut | Dialogue 1.5s-3.5s |
| --- ACT BREAK --- | | 1.0s xfade | |
| 10 | 4s | 1.0s xfade from Act II | VO 0.5s-3.5s (montage energy shift) |
| 10b | 4s | hard cut | Silent — montage |
| 10c | 5s | hard cut | COMPOUND: Stanton 0.5-4s, Clippy 4.2-5s |
| 11 | 8s | 0.5s xfade (action→interview) | Dialogue 1s-7s |
| 12 | 8s | 0.5s xfade (interview→B-roll) | Silent — failed staple attempt |
| 12b | 6s | hard cut | VO 1s-5s |
| 13 | 12s | 0.5s xfade (B-roll→interview) | Dialogue 2s-9s, 3s despair silence |
| --- ACT BREAK --- | | 0.75s xfade (urgent) | |
| 14 | 5s | 0.75s xfade from Act III | VO 1s-4s (crisis!) |
| 15 | 10s | hard cut | VO 2s-8s (slow-mo hero moment) |
| 16 | 6s | hard cut | Silent — tension hold |
| 17 | 4s | hard cut | Silent — resolution of physical tension |
| --- ACT BREAK --- | | 1.0s xfade | |
| 18 | 6s | 1.0s xfade from Act IV | Silent — The Hand descends |
| 19 | 8s | 0.5s xfade | Dialogue 1s-7s |
| 20 | 6s | hard cut | Dialogue 1s-5s |
| 21 | 12s | 0.5s xfade | Dialogue 1s-8s, 4s silence |
| 22 | 6s | 1.5s xfade | Silent — rubber band moment |
| 23 (Credits) | 12s | 2.0s xfade | Silent (or soft music tail) |
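
The timing map can be turned into absolute start times with a short script. This is a rough sketch under two assumptions that the plan does not state: each crossfade overlaps the tail of the previous shot by its full length (so it shortens total runtime), and the title card is counted with Act I. The `SHOTS` tuple layout and the `layout` function are hypothetical, not part of the timeline-helper agent.

```python
# (shot id, act, planned duration in s, crossfade-in in s).
# 0.0 means a hard cut or the master fade_in. Act-break rows add no time:
# their crossfade is the one listed on the first shot of the next act.
SHOTS = [
    ("0", 1, 5, 0.0), ("1", 1, 8, 0.5), ("1a", 1, 8, 0.0), ("2", 1, 12, 0.5),
    ("3", 1, 6, 0.5), ("4", 1, 12, 0.5), ("5", 1, 10, 0.0),
    ("6", 2, 5, 1.0), ("7", 2, 4, 0.0), ("8", 2, 5, 0.0), ("9a", 2, 5, 0.0),
    ("9b", 2, 6, 0.0), ("9c", 2, 4, 0.0),
    ("10", 3, 4, 1.0), ("10b", 3, 4, 0.0), ("10c", 3, 5, 0.0),
    ("11", 3, 8, 0.5), ("12", 3, 8, 0.5), ("12b", 3, 6, 0.0), ("13", 3, 12, 0.5),
    ("14", 4, 5, 0.75), ("15", 4, 10, 0.0), ("16", 4, 6, 0.0), ("17", 4, 4, 0.0),
    ("18", 5, 6, 1.0), ("19", 5, 8, 0.5), ("20", 5, 6, 0.0), ("21", 5, 12, 0.5),
    ("22", 5, 6, 1.5), ("23", 5, 12, 2.0),
]


def layout(shots):
    """Return ({shot id: start time}, total runtime) with overlapping crossfades."""
    starts = {}
    cursor = 0.0  # end time of the previous shot
    for shot_id, _act, duration, xfade in shots:
        start = cursor - xfade  # a crossfade pulls the shot start earlier
        starts[shot_id] = start
        cursor = start + duration
    return starts, cursor


STARTS, TOTAL_RUNTIME = layout(SHOTS)

# Raw per-act sums, ignoring crossfade overlap.
RAW_BY_ACT = {}
for _shot_id, act, duration, _xfade in SHOTS:
    RAW_BY_ACT[act] = RAW_BY_ACT.get(act, 0) + duration

if __name__ == "__main__":
    print(RAW_BY_ACT)
    print(f"total runtime: {TOTAL_RUNTIME}s")
```

Under these assumptions the planned durations sum to 61s, 29s, 47s, 25s, and 50s for Acts I through V (212s raw), and the overlapping crossfades bring the total runtime to 200.25s. If the timeline-helper agent treats crossfades differently, the start times shift accordingly.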
Audio Track Layout
Track: “voice” (VO + Dialogue)
- All narration and dialogue stems
- Track volume: +4 dB (TTS output needs boost per toolkit docs)
- Each stem placed with `source_in`/`source_out` to trim TTS padding
- Timing per scene_list.md Audio Timing column
Speaker Attribution (per Coach directive)
Every audio clip in the timeline JSON must carry speaker identification. Convention:
- FENRIR (Stanton): All `*_stanton.wav` and `*_vo.wav` narrator stems
- PUCK (Clippy): All `*_clippy.wav` stems
- CHARON (Highlighter): All `*_highlighter.wav` stems
The timeline-helper agent should add a "speaker" metadata comment or label to each voice clip entry for production manifest traceability.
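
The filename convention above maps directly onto a small lookup. The sketch below assumes each voice clip entry is a dict with a `source` filename; that key, the `label_clip` helper, and the example filenames are hypothetical, since the real timeline.json schema is not defined in this plan.

```python
from fnmatch import fnmatchcase

# Stem filename pattern -> speaker, per the convention in this plan.
SPEAKER_PATTERNS = [
    ("*_stanton.wav", "FENRIR"),
    ("*_vo.wav", "FENRIR"),
    ("*_clippy.wav", "PUCK"),
    ("*_highlighter.wav", "CHARON"),
]


def speaker_for(stem_filename: str) -> str:
    """Return the speaker for a stem filename, or raise if nothing matches."""
    for pattern, speaker in SPEAKER_PATTERNS:
        if fnmatchcase(stem_filename, pattern):
            return speaker
    raise ValueError(f"no speaker convention matches {stem_filename!r}")


def label_clip(clip: dict) -> dict:
    """Return a copy of a clip entry with a 'speaker' label added.

    The 'source' key is an assumed field name, not a confirmed one.
    """
    return {**clip, "speaker": speaker_for(clip["source"])}


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    for name in ("shot01_vo.wav", "shot09b_highlighter.wav", "shot21_clippy.wav"):
        print(name, "->", speaker_for(name))
```

Raising on an unmatched filename, rather than defaulting to a speaker, keeps an unlabeled stem from slipping into the production manifest.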
Track: “music”
- Need 2-3 music stems from Lyria to cover full runtime
- Track volume: -2 dB base
- duck_under: “voice”, duck_db: -12 (documentary standard)
- Music segments:
- Opening/Act I-II (~90s): “Tense documentary underscore. Low strings, ambient hum, subtle percussion. Think investigation documentary.”
- Act III Montage (~30s): “Uptempo mock-military training montage. Percussion-driven, still documentary in tone. Absurdly serious.”
- Act IV Climax (~25s): “Mock-orchestral crescendo. Full string swell, building to heroic peak. Think Free Solo summit moment — but for a paperclip.”
- Act V Resolution (~50s): “Melancholic single piano or guitar. Sparse. Documentary denouement. Fades to near-silence for Clippy’s final line.”
- fade_in on first music clip: 2.0s
- fade_out on last music clip: 3.0s
Track: “sfx” (optional)
- HVAC rumble for Act IV (if generated)
- Metallic clinks/thuds baked into Veo-generated video audio
- Track volume: -8 dB
- No ducking needed (environmental, stays low)
Ducking Config Summary
Music track:
  volume_db: -2
  duck_under: "voice"
  duck_db: -12
Voice track:
  volume_db: +4
SFX track (if used):
  volume_db: -8
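
To get a feel for what these levels mean in practice, the dB values can be converted to linear amplitude gain. The sketch below assumes that `duck_db` is an offset added on top of the track's `volume_db` while voice is active; the plan does not confirm how the toolkit combines the two, so treat the ducked figure as an estimate. The `TRACKS` dict and `effective_db` helper are hypothetical.

```python
def db_to_gain(db: float) -> float:
    """Convert decibels to a linear amplitude multiplier (20 * log10 convention)."""
    return 10 ** (db / 20)


# Values copied from the ducking config summary.
TRACKS = {
    "voice": {"volume_db": 4.0},
    "music": {"volume_db": -2.0, "duck_under": "voice", "duck_db": -12.0},
    "sfx": {"volume_db": -8.0},
}


def effective_db(track: str, voice_active: bool) -> float:
    """Track level in dB, with duck_db added while voice is active.

    Assumption: duck_db is relative to volume_db, not an absolute level.
    """
    cfg = TRACKS[track]
    ducked = voice_active and cfg.get("duck_under") == "voice"
    return cfg["volume_db"] + (cfg["duck_db"] if ducked else 0.0)


if __name__ == "__main__":
    for track in TRACKS:
        for voice_active in (False, True):
            level = effective_db(track, voice_active)
            print(f"{track:5s} voice_active={voice_active}: "
                  f"{level:+.1f} dB, gain {db_to_gain(level):.3f}")
```

Under that assumption the music sits at -14 dB (about 0.20 linear gain) under dialogue and returns to -2 dB (about 0.79) in the gaps, while the voice track at +4 dB is roughly 1.58 times its source level.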
Key Editorial Decisions
- No music during Clippy’s final line (Shot 21, 1s-8s). Let the dialogue sit in near-silence. Only ambient room tone. Music can return softly at 8s for the pullback.
- Act IV music swell peaks at Shot 15 (Clippy’s leap). The orchestral crescendo should hit maximum at ~4s into Shot 15, then cut abruptly at the end of Shot 16 when Clippy clamps down. Silence for Shot 17 (the relief beat).
- Interview silences are sacred. Shots 2, 5, 13, and 21 all have deliberate trailing silence. Do NOT fill these with music. The awkward pause IS the comedy. Music stays ducked or absent.
- Stanton’s VO in Shots 1/1a sets the documentary tone. Music should be established 2s before VO begins, so the duck transition is audible: the listener hears the music “step back” for the narrator.