Audio Pivot Audit

Kappa Team — "The Midnight Audit"

Audio Strategy Pivot — Impact Audit (RESOLVED: REVERTED)

Author: kappa-editor
Date: 2026-05-18
Trigger: Coordinator directive via Coach — use Veo native audio for character dialogue, keep TTS for narrator VO only.


Stem Classification: KEEP vs DISCARD

KEEP — Narrator [VO] (no visible speaking, TTS stems remain valid)

| Stem | Shot | Audio Class | Why Keep |
|---|---|---|---|
| shot01_vo.wav | 1 | [VO] | Narrator over desk B-roll. No characters visible. |
| shot01a_vo.wav | 1a | [VO] | Narrator over desk B-roll. No characters visible. |
| shot10_vo_stanton.wav | 10 | [VO] | Stanton VO over Clippy action montage. Clippy visible but NOT speaking. |
| shot12b_vo.wav | 12b | [VO] | Narrator over desk wide shot. No characters visible. |
| shot14_vo_stanton.wav | 14 | [VO] | Stanton VO over crisis B-roll. No characters visible. |
| shot15_vo_stanton.wav | 15 | [VO] | Stanton VO over Clippy’s leap. Clippy visible but NOT speaking. |

Total KEEP: 6 stems

DISCARD — Character [DIALOGUE] (visible speaking → Veo native audio)

| Stem | Shot | Audio Class | Character Speaking |
|---|---|---|---|
| shot02_dialogue_stanton.wav | 2 | [DIALOGUE] | Stanton interview, appears to speak |
| shot04_dialogue_clippy.wav | 4 | [DIALOGUE] | Clippy interview, appears to speak |
| shot05_dialogue_highlighter.wav | 5 | [DIALOGUE] | Highlighter interview, appears to speak |
| shot08_dialogue_stanton.wav | 8 | [DIALOGUE] | Stanton B-roll, appears to speak |
| shot09a_dialogue_stanton.wav | 9a | [DIALOGUE] | Stanton B-roll, appears to speak |
| shot09c_dialogue_stanton.wav | 9c | [DIALOGUE] | Stanton B-roll, appears to speak |
| shot11_dialogue_clippy.wav | 11 | [DIALOGUE] | Clippy interview, appears to speak |
| shot13_dialogue_stanton.wav | 13 | [DIALOGUE] | Stanton interview, appears to speak |
| shot19_dialogue_stanton.wav | 19 | [DIALOGUE] | Stanton interview, appears to speak |
| shot20_dialogue_highlighter.wav | 20 | [DIALOGUE] | Highlighter interview, appears to speak |
| shot21_dialogue_clippy.wav | 21 | [DIALOGUE] | Clippy hybrid interview, appears to speak |

Total DISCARD: 11 stems

COMPOUND — Needs Special Handling

| Stems | Shot | Issue |
|---|---|---|
| shot09b_dialogue_stanton.wav + shot09b_dialogue_highlighter.wav | 9b | Both characters visible and speaking sequentially. Veo must generate both voices in one clip. |
| shot10c_dialogue_stanton.wav + shot10c_dialogue_clippy.wav | 10c | Both characters visible. Stanton speaks, Clippy whimpers. Veo must handle both. |

Total DISCARD (compound): 4 stems

Grand total: 6 KEEP, 15 DISCARD
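The KEEP/DISCARD split above follows directly from the stem filenames, so it can be re-derived mechanically. The sketch below is one way to do that, assuming the naming pattern `shot<id>_<vo|dialogue>[_<character>].wav` holds for every stem; the regex and the function names `classify` and `tally` are illustrative, not part of the team's tooling.

```python
import re
from collections import Counter

# Stem filenames exactly as listed in the KEEP, DISCARD and COMPOUND tables.
STEMS = [
    "shot01_vo.wav", "shot01a_vo.wav", "shot10_vo_stanton.wav",
    "shot12b_vo.wav", "shot14_vo_stanton.wav", "shot15_vo_stanton.wav",
    "shot02_dialogue_stanton.wav", "shot04_dialogue_clippy.wav",
    "shot05_dialogue_highlighter.wav", "shot08_dialogue_stanton.wav",
    "shot09a_dialogue_stanton.wav", "shot09c_dialogue_stanton.wav",
    "shot11_dialogue_clippy.wav", "shot13_dialogue_stanton.wav",
    "shot19_dialogue_stanton.wav", "shot20_dialogue_highlighter.wav",
    "shot21_dialogue_clippy.wav",
    "shot09b_dialogue_stanton.wav", "shot09b_dialogue_highlighter.wav",
    "shot10c_dialogue_stanton.wav", "shot10c_dialogue_clippy.wav",
]

# Assumed naming pattern (inferred from the filenames, not a documented spec).
STEM_RE = re.compile(
    r"^shot(?P<shot>\d+[a-z]?)_(?P<kind>vo|dialogue)(?:_(?P<who>[a-z]+))?\.wav$"
)


def classify(stem: str) -> str:
    """Return 'KEEP' for narrator/VO stems and 'DISCARD' for dialogue stems."""
    match = STEM_RE.match(stem)
    if match is None:
        raise ValueError(f"unrecognised stem name: {stem}")
    return "KEEP" if match.group("kind") == "vo" else "DISCARD"


def tally(stems: list[str]) -> Counter:
    """Count how many stems fall into each class."""
    return Counter(classify(stem) for stem in stems)


if __name__ == "__main__":
    print(tally(STEMS))
```

Under the pivot, the compound stems count as DISCARD because their filenames carry `dialogue`, which matches the grand total of 6 KEEP and 15 DISCARD.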


Timeline Impact — Actually Cleaner

The design_brief.md editorial guardrails already specified “complete, awkward silence during interviews (save for the low hum of the HVAC system).” Omitting music from interview and dialogue shots was therefore always the plan, and the pivot is consistent with it:

New Audio Architecture

| Shot Type | Video Audio | Voice Track (TTS) | Music Track |
|---|---|---|---|
| [VO] | --generate-audio=false (no Veo audio) | ✅ TTS stem placed on timeline | ✅ Music plays, ducked under voice |
| [DIALOGUE] | ✅ Veo native audio (embedded dialogue) | ❌ None | ❌ No music (documentary silence) |
| [COMPOUND] | ✅ Veo native audio (multi-character) | ❌ None | ❌ No music |
| [SILENT] | ✅ Veo native audio (ambient/foley) | ❌ None | ✅ Music at full level |
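The table above can be read as a lookup from shot type to audio routing. The sketch below encodes it that way, as a rough illustration only: the `AudioPlan` type, its field names, and the helper functions are hypothetical, and the music values `"ducked"`, `"full"`, and `"none"` are shorthand for the table's wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioPlan:
    veo_audio: bool  # True if the clip keeps Veo-generated audio
    tts_stem: bool   # True if a TTS stem is placed on the voice track
    music: str       # "ducked", "full", or "none"


# One entry per row of the architecture table.
ARCHITECTURE = {
    "VO": AudioPlan(veo_audio=False, tts_stem=True, music="ducked"),
    "DIALOGUE": AudioPlan(veo_audio=True, tts_stem=False, music="none"),
    "COMPOUND": AudioPlan(veo_audio=True, tts_stem=False, music="none"),
    "SILENT": AudioPlan(veo_audio=True, tts_stem=False, music="full"),
}


def plan_for(shot_type: str) -> AudioPlan:
    """Look up the routing for a tag such as '[VO]' or 'dialogue'."""
    return ARCHITECTURE[shot_type.strip("[]").upper()]


def music_shot_types() -> list[str]:
    """Shot types on which the music track plays at all."""
    return sorted(k for k, plan in ARCHITECTURE.items() if plan.music != "none")


if __name__ == "__main__":
    print(plan_for("[VO]"))
    print(music_shot_types())
```

In this encoding, music plays only on VO and SILENT shots, and only VO shots carry a TTS stem, so the voice track is the sole source that music ever needs to duck under.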

Why This Is Actually Better

  1. No ducking conflict. Music only plays during [VO] and [SILENT] shots. The voice track (TTS stems) is the only duck key signal. Clean separation.
  2. Authentic mockumentary feel. Real documentaries cut music during talking-head interviews. The silence makes the deadpan comedy land harder.
  3. Simpler timeline. Voice track has 6 clips instead of 21. Music track only needs to cover ~90s of VO/SILENT footage, not the full runtime.
  4. Natural audio transitions. Veo-generated ambient (HVAC hum, room tone) in dialogue clips provides organic sound beds. No need to manufacture silence.

One Critical Concern

Veo voice consistency. The DP must ensure Veo generates consistent character voices across all dialogue shots. Stanton should always sound like a grizzled baritone. Clippy should always sound nervous. Highlighter should always sound cynical and raspy. This needs explicit prompting in Step 5.

Suggestion for DP: Include detailed voice direction in every Veo dialogue prompt:
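One way to keep that direction consistent is to store each character's voice description once and append it to every dialogue prompt. This is a minimal sketch, assuming prompts are plain strings: the function name `with_voice_direction` and the sentence template are hypothetical, and only the three voice descriptions come from this audit.

```python
# Voice descriptions taken from the concern stated above.
VOICE_DIRECTION = {
    "stanton": "grizzled baritone",
    "clippy": "nervous",
    "highlighter": "cynical and raspy",
}


def with_voice_direction(prompt: str, characters: list[str]) -> str:
    """Append one fixed voice note per speaking character to a prompt.

    The sentence template is illustrative; the actual prompt wording
    is the DP's call.
    """
    notes = [
        f"{name.capitalize()} speaks in a {VOICE_DIRECTION[name.lower()]} voice."
        for name in characters
    ]
    return " ".join([prompt.strip(), *notes])


if __name__ == "__main__":
    # A compound shot gets one note per character.
    print(with_voice_direction("Desk interview.", ["stanton", "highlighter"]))
```

Because every prompt draws on the same dictionary, a change to a character's voice description propagates to all of that character's shots at once.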


RESOLUTION

Coach ruling (04:59 UTC): REVERTED. DP correctly identified that our claymation characters (stapler, paperclip, highlighter) have no lips — lip-sync is physically impossible. The coordinator’s directive does not apply to our project.

Final strategy: Original plan stands. All 21 TTS stems stay on the voice track, including the 15 classified as DISCARD above.