Team Fluorite — Step 2: Mathematical Pacing Review
Editor: fluorite-editor
Film: “Pickling Season”
Date: 2026-05-22
VERDICT: PASS — with one mandatory adjustment
Total Runtime: 229s (3:49) — comfortably within the 3:00-5:00 target.
The scene list is well-structured. 33 shots across 7 scenes with good rhythmic variety. One systematic issue needs correction before we lock: 10 narrator VO shots have insufficient TTS buffer headroom.
1. Runtime Breakdown
| Component | Duration |
|---|---|
| Total shot time | 192s |
| Comedy breaths (2) | +11s |
| Scene boundary gaps (6 × 1.5s) | +9s |
| Opening title | +5s |
| Closing credits | +12s |
| TOTAL | 229s (3:49) |
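The total can be re-derived directly from the component rows. The following is a minimal sketch for checking the arithmetic; the dictionary keys and the `fmt_mmss` helper are illustrative names, not identifiers from the production tooling.

```python
# Runtime components from the breakdown table, in seconds.
# Key names are illustrative labels only.
components = {
    "shot_time": 192,
    "comedy_breaths": 11,      # two breaths: 6s + 5s
    "scene_gaps": 6 * 1.5,     # six scene boundaries at 1.5s each
    "opening_title": 5,
    "closing_credits": 12,
}


def fmt_mmss(seconds: float) -> str:
    """Format a duration in seconds as M:SS."""
    total = int(round(seconds))
    return f"{total // 60}:{total % 60:02d}"


total_runtime = sum(components.values())
within_target = 180 <= total_runtime <= 300  # 3:00-5:00 target window

print(total_runtime, fmt_mmss(total_runtime), within_target)
```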
Scene Durations
| Scene | Name | Duration | Shots | Avg Shot |
|---|---|---|---|---|
| 1 | The Hook | 14s | 3 | 4.7s |
| 2 | The Apartment & Arrival | 35s | 6 | 5.8s |
| 3 | The Takeover | 29s | 4 | 5.8s |
| 4 | The Contamination | 30s | 4 | 6.2s |
| 5 | The Recipe | 43s | 7 | 6.1s |
| 6 | The Crack | 30s | 5 | 6.0s |
| 7 | The Transfer | 22s | 4 | 5.5s |
Pacing arc assessment: The tempo profile is excellent. Scene 1 (The Hook) runs fastest at 4.7s avg: punchy and immediate, it hooks the viewer. Scenes 2-4 slow to ~6s avg, which gives deliberate Kaurismäki pacing with breathing room. Scene 5 (The Recipe) is the longest scene at 43s with the most shots (7), which is correct because this is the heart of the film. Scene 7 (The Transfer) tightens back to 5.5s: a brisk, practical resolution that ends on a joke rather than lingering. This is a well-shaped tempo arc.
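The Avg Shot column can be reproduced as follows. One assumption is worth flagging: the figures for Scenes 3 and 4 only match if the comedy breaths (6s and 5s) are subtracted from the scene duration before dividing. That reading is inferred from the numbers, not stated in the scene list. The same sketch confirms that the scene durations sum to shot time plus breaths.

```python
# (scene, duration_s, shot_count, comedy_breath_s) from the Scene Durations table.
# The comedy_breath_s column is an assumption inferred from Section 5.
scenes = [
    (1, 14, 3, 0),
    (2, 35, 6, 0),
    (3, 29, 4, 6),
    (4, 30, 4, 5),
    (5, 43, 7, 0),
    (6, 30, 5, 0),
    (7, 22, 4, 0),
]


def avg_shot(duration_s: float, shots: int, breath_s: float = 0) -> float:
    """Average shot length, excluding any comedy breath held inside the scene."""
    return (duration_s - breath_s) / shots


averages = {s: avg_shot(d, n, b) for s, d, n, b in scenes}
total_duration = sum(d for _, d, _, _ in scenes)   # includes comedy breaths
total_breaths = sum(b for _, _, _, b in scenes)
total_shots = sum(n for _, _, n, _ in scenes)

for scene, avg in averages.items():
    print(f"Scene {scene}: {avg:.2f}s avg")
print(total_shots, total_duration - total_breaths)
```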
2. TTS Buffer Audit — MANDATORY FIX REQUIRED
Issue: 10 narrator VO shots have only 14-17% headroom (shot duration minus voice duration, as a fraction of voice duration), which is below the mandated 20% TTS temporal buffer.
| Shot | Voice Duration | Shot Duration | Headroom | Required |
|---|---|---|---|---|
| 1 | 7.0s | 8s | 14% | +1s → 9s |
| 4 | 6.0s | 7s | 17% | +1s → 8s |
| 8 | 7.0s | 8s | 14% | +1s → 9s |
| 10 | 6.0s | 7s | 17% | +1s → 8s |
| 13 | 7.0s | 8s | 14% | +1s → 9s |
| 14 | 7.0s | 8s | 14% | +1s → 9s |
| 18 | 6.0s | 7s | 17% | +1s → 8s |
| 20 | 7.0s | 8s | 14% | +1s → 9s |
| 26 | 6.0s | 7s | 17% | +1s → 8s |
| 31 | 7.0s | 8s | 14% | +1s → 9s |
Recommendation: Add 1 second to each of these 10 shots. This adds 10s to total runtime: 229s → 239s (3:59) — still comfortably within target.
Why this matters: TTS models are unpredictable. A 7-second narrator line frequently generates as 7.5-8.5s. With 8s shots, the voice will be clipped at the tail. With 9s shots, we have room to either let the TTS breathe or apply gentle atempo correction. This is the #1 lesson from prior productions (Rho pilot: 15 of 16 VO stems were systematically under-allocated).
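To make the clipping risk concrete, the sketch below computes the speed-up factor a stem would need in order to fit its shot. It assumes "atempo" refers to FFmpeg's audio tempo filter; the function name and the `tail_s` parameter are hypothetical, introduced only for illustration.

```python
def atempo_factor(generated_s: float, shot_s: float, tail_s: float = 0.0) -> float:
    """Tempo factor needed so a generated voice stem fits inside its shot.

    Returns 1.0 when the stem already fits (no correction needed).
    tail_s is an optional silent tail to preserve after the voice.
    """
    available = shot_s - tail_s
    if available <= 0:
        raise ValueError("shot is too short for the requested tail")
    return max(1.0, generated_s / available)


# A 7s line that generates at 8.5s:
in_8s_shot = atempo_factor(8.5, 8)   # must be sped up to avoid clipping
in_9s_shot = atempo_factor(8.5, 9)   # fits untouched

# Filter string in FFmpeg's atempo=<factor> form (assumed target tool).
print(f"atempo={in_8s_shot:.4f}", f"atempo={in_9s_shot:.4f}")
```

With the extra second, the same overlong stem needs no tempo change at all, which is the practical argument for the buffer.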
Impact after fix:
- New total runtime: 239s (3:59)
- Worst-case with 20% TTS inflation: ~264s (4:24) — still well under 5:00 ceiling
- No pacing impact: 1 extra second on a held shot is imperceptible in Kaurismäki-style pacing
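The audit table can be regenerated from the voice and shot durations. This is a sketch under two assumptions: headroom is measured relative to voice duration (which is what reproduces the 14% and 17% figures), and shot durations are kept to whole seconds. The function names are illustrative.

```python
import math

BUFFER = 0.20  # mandated TTS temporal buffer, relative to voice duration

# (shot number, voice duration s, current shot duration s) from the audit table
vo_shots = [
    (1, 7.0, 8), (4, 6.0, 7), (8, 7.0, 8), (10, 6.0, 7), (13, 7.0, 8),
    (14, 7.0, 8), (18, 6.0, 7), (20, 7.0, 8), (26, 6.0, 7), (31, 7.0, 8),
]


def headroom(voice_s: float, shot_s: float) -> float:
    """Spare time in the shot as a fraction of the voice duration."""
    return (shot_s - voice_s) / voice_s


def min_shot_duration(voice_s: float, buffer: float = BUFFER) -> int:
    """Smallest whole-second shot duration with at least `buffer` headroom."""
    # The small epsilon keeps exact cases (e.g. 5.0 * 1.2) from rounding up.
    return math.ceil(voice_s * (1 + buffer) - 1e-9)


extensions = {shot: min_shot_duration(v) - d for shot, v, d in vo_shots}
added_runtime = sum(extensions.values())
new_total = 229 + added_runtime

for shot, v, d in vo_shots:
    print(f"Shot {shot}: {headroom(v, d):.0%} headroom, +{extensions[shot]}s")
print(new_total)
```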
3. Voice Isolation Audit — PASS ✓
All 26 voice transitions verified against the Cross-Shot Voice Gap Rule:
| Gap Type | Min Required | Occurrences | Status |
|---|---|---|---|
| Same track, same scene | 0.5s | 3 | ✓ ALL PASS |
| Same track, cross-scene | 1.0s | 3 | ✓ ALL PASS |
| Different track, same scene | 0.75s | 14 | ✓ ALL PASS |
| Different track, cross-scene | 1.5s | 6 | ✓ ALL PASS |
No overlaps. No insufficient gaps. Clean isolation.
The scene boundary gaps (6 × 1.5s) plus the comedy breaths and food transitions provide generous breathing room between voice segments. The SEQUENCED shots (16 and 19) both have 0.5s internal gaps between speakers — compliant.
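The four minimum gaps in the table lend themselves to a simple lookup. The sketch below shows one way such a check could be automated; the `VoiceSegment` type, the function names, and the sample timings are hypothetical and are not taken from the actual timeline.

```python
from dataclasses import dataclass

# Minimum gap in seconds, keyed by (same_track, same_scene), per the table above.
MIN_GAP_S = {
    (True, True): 0.5,
    (True, False): 1.0,
    (False, True): 0.75,
    (False, False): 1.5,
}


@dataclass
class VoiceSegment:
    track: str
    scene: int
    start: float  # seconds from film start
    end: float


def required_gap(prev: VoiceSegment, nxt: VoiceSegment) -> float:
    return MIN_GAP_S[(prev.track == nxt.track, prev.scene == nxt.scene)]


def gap_violations(segments):
    """Return (prev, next, gap) for each consecutive pair below its minimum gap."""
    ordered = sorted(segments, key=lambda s: s.start)
    out = []
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt.start - prev.end
        if gap < required_gap(prev, nxt):
            out.append((prev, nxt, gap))
    return out


# Hypothetical timings for illustration only.
sample = [
    VoiceSegment("VO-NARRATOR", 2, 0.0, 7.0),
    VoiceSegment("DLG-GRANDMOTHER", 2, 7.75, 10.0),   # 0.75s gap: passes
    VoiceSegment("VO-NARRATOR", 3, 11.0, 17.0),       # 1.0s gap, needs 1.5s
]
violations = gap_violations(sample)
print(len(violations))
```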
4. TTS Inflation Scenario — SAFE ✓
| Scenario | Runtime | Status |
|---|---|---|
| Base (as scripted) | 229s (3:49) | ✓ Within target |
| After +1s buffer fix | 239s (3:59) | ✓ Within target |
| Worst-case 20% TTS inflation (after fix) | ~264s (4:24) | ✓ Under 5:00 ceiling |
| All shots needing atempo correction | 10 shots | Manageable |
Even in the worst case, we have 36 seconds of headroom before hitting the 5:00 hard ceiling. This is a comfortable margin.
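The ~264s worst case is consistent with applying 20% inflation to the 122.5s of voice content cited in Section 5 and adding it to the post-fix runtime. That derivation is an inference from the figures in this review, sketched here so the margin can be rechecked if the numbers change.

```python
import math

CEILING_S = 300            # 5:00 hard ceiling
base_runtime = 229         # as scripted
fixed_runtime = base_runtime + 10 * 1   # ten shots extended by 1s each
voice_content = 122.5      # seconds of voice content (Section 5 figure)
inflation = 0.20           # worst-case TTS inflation

# Assumption: inflation applies only to voice content, not to the full runtime.
worst_case = fixed_runtime + inflation * voice_content
worst_case_rounded = math.ceil(worst_case)   # round up to stay conservative
margin = CEILING_S - worst_case_rounded

print(fixed_runtime, worst_case_rounded, margin)
```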
5. Pacing Quality Assessment
What’s Working
- Hook tempo (Scene 1): Fastest average shot (4.7s). Gets to THE CRACK in 14 seconds. Excellent: hooks the audience immediately.
- Comedy breathing: The two explicit comedy breaths (6s in Scene 3, 5s in Scene 4) are well-placed, after the two biggest punchlines (“Your kitchen is a crime” and the jar mosaic reveal). Plus the post-delivery holds (1.5-2.5s) on all grandmother dialogue. The deadpan has room to land.
- Narration clusters: fluorite-idea delivered on the cluster request. VO comes in bursts of 2-3 shots, then retreats for food transitions, comedy beats, or dialogue. The Appendix C cluster map confirms the architecture.
- Food transitions: 6 food transitions at 3s each (plus 1 final at 4s) = 22s of visual breathing. These serve as structural punctuation: they’re doing the work dissolves would normally do, and they’re spaced evenly across the film.
- The CRACK placement: Shot 27 at approximately 2:40 into the film. Slightly past the 2/3 mark, which is correct for a climax. It gets 5 seconds of total silence (2s pre + 1s post + 2s chew). Maximum sonic impact.
- Scene 5 density: The Recipe scene is the longest (43s, 7 shots) and has the most voice variety (VO + SEQUENCED + DIALOGUE alternating). This is correct: the emotional core of the film gets the most screen time and the most intimate vocal texture.
Flags (Non-blocking)
- Scene 7 is tight at 22s (4 shots). Resolution → transfer → final joke → final image. This moves briskly. It works for deadpan comedy (don’t linger on sentiment), but I’ll be watching this in the rough cut; if the ending feels rushed, I may need to add 2-3s to Shot 31 or 33.
- Shot 27 (THE CRACK) at 5s is generous for a SILENT shot. The playbook warns that SILENT sequences >10s risk comprehension issues, but 5s is fine, especially since the visual action (bite → crunch → chewing → eyes going inward) is unambiguous. No risk.
- The voice-to-silence ratio (4.5:1) is high. 122.5s of voice content vs 27s of silent shots. This is appropriate for this film: the narration IS the storytelling engine, and the silent shots are all brief food transitions or THE CRACK. The Blind Watch test is bullet-proof with this ratio.
6. Vocal Track Budget (for Timeline Construction)
These are the tracks I’ll need at Step 6:
| Track ID | Role | Stem Count | Total Voice Time |
|---|---|---|---|
| VO-NARRATOR | voice | ~12-15 stems | ~72s |
| DLG-GRANDMOTHER | voice | ~7-9 stems | ~22s |
| DLG-GRANDDAUGHTER | voice | 2 stems (from SEQUENCED) | ~3.5s |
| IM-GRANDDAUGHTER | voice | 1 stem | ~6s |
| SCORE | music | 6 stems | ~185s (60-70% coverage) |
| V1 | video | 33 clips | Full runtime |
7. Summary
| Check | Result |
|---|---|
| Total runtime (3:00-5:00) | ✓ PASS — 229s (3:49), or 239s (3:59) after buffer fix |
| Voice isolation | ✓ PASS — all 26 transitions compliant |
| TTS buffer (20%) | ⚠ CONDITIONAL — 10 shots need +1s each |
| Vocal classification | ✓ PASS — all 33 shots correctly tagged |
| Pacing architecture | ✓ PASS — good tempo arc, breathing, comedy timing |
| Blind Watch genre check | ✓ PASS — comedy confirmed, not drama |
| Runtime safety margin | ✓ PASS — 36s headroom to 5:00 ceiling even at worst-case |
ACTION REQUIRED for fluorite-idea:
Extend 10 narrator VO shots by +1 second each (shots 1, 4, 8, 10, 13, 14, 18, 20, 26, 31). This is the only blocking issue. No other changes needed.
Mathematical Pacing Review complete. Ready for Step 2 coach check-in once buffer fix is confirmed.