Team Fluorite — Step 2: Mathematical Pacing Review

Editor: fluorite-editor
Film: “Pickling Season”
Date: 2026-05-22

VERDICT: PASS — with one mandatory adjustment

Total Runtime: 229s (3:49) — comfortably within the 3:00-5:00 target.

The scene list is well-structured. 33 shots across 7 scenes with good rhythmic variety. One systematic issue needs correction before we lock: 10 narrator VO shots have insufficient TTS buffer headroom.

1. Runtime Breakdown

Component	Duration
Total shot time	192s
Comedy breaths (2)	+11s
Scene boundary gaps (6 × 1.5s)	+9s
Opening title	+5s
Closing credits	+12s
TOTAL	229s (3:49)

Scene Durations

Scene	Name	Duration	Shots	Avg Shot
1	The Hook	14s	3	4.7s
2	The Apartment & Arrival	35s	6	5.8s
3	The Takeover	29s	4	5.8s
4	The Contamination	30s	4	6.2s
5	The Recipe	43s	7	6.1s
6	The Crack	30s	5	6.0s
7	The Transfer	22s	4	5.5s

Pacing arc assessment: The tempo profile is excellent. Scene 1 (Hook) runs fastest at 4.7s avg — punchy, immediate, hooks the viewer. Scenes 2-4 slow to ~6s avg — deliberate Kaurismäki pacing with breathing room. Scene 5 (Recipe) is the longest scene at 43s with the most shots (7) — correct, this is the heart of the film. Scene 7 (Resolution) tightens back to 5.5s — brisk, practical, ending on a joke rather than lingering. This is a well-shaped tempo arc.

2. TTS Buffer Audit — MANDATORY FIX REQUIRED

Issue: 10 narrator VO shots have voice durations within 14-17% of the shot duration — below the mandated 20% TTS temporal buffer.

Shot	Voice Duration	Shot Duration	Headroom	Required
1	7.0s	8s	14%	+1s → 9s
4	6.0s	7s	17%	+1s → 8s
8	7.0s	8s	14%	+1s → 9s
10	6.0s	7s	17%	+1s → 8s
13	7.0s	8s	14%	+1s → 9s
14	7.0s	8s	14%	+1s → 9s
18	6.0s	7s	17%	+1s → 8s
20	7.0s	8s	14%	+1s → 9s
26	6.0s	7s	17%	+1s → 8s
31	7.0s	8s	14%	+1s → 9s

Recommendation: Add 1 second to each of these 10 shots. This adds 10s to total runtime: 229s → 239s (3:59) — still comfortably within target.

Why this matters: TTS models are unpredictable. A 7-second narrator line frequently generates as 7.5-8.5s. With 8s shots, the voice will be clipped at the tail. With 9s shots, we have room to either let the TTS breathe or apply gentle atempo correction. This is the #1 lesson from prior productions (Rho pilot: 15 of 16 VO stems were systematically under-allocated).

Impact after fix:

New total runtime: 239s (3:59)
Worst-case with 20% TTS inflation: ~264s (4:24) — still well under 5:00 ceiling
No pacing impact: 1 extra second on a held shot is imperceptible in Kaurismäki-style pacing

3. Voice Isolation Audit — PASS ✓

All 26 voice transitions verified against the Cross-Shot Voice Gap Rule:

Gap Type	Min Required	Occurrences	Status
Same track, same scene	0.5s	3	✓ ALL PASS
Same track, cross-scene	1.0s	3	✓ ALL PASS
Different track, same scene	0.75s	14	✓ ALL PASS
Different track, cross-scene	1.5s	6	✓ ALL PASS

No overlaps. No insufficient gaps. Clean isolation.

The scene boundary gaps (6 × 1.5s) plus the comedy breaths and food transitions provide generous breathing room between voice segments. The SEQUENCED shots (16 and 19) both have 0.5s internal gaps between speakers — compliant.

4. TTS Inflation Scenario — SAFE ✓

Scenario	Runtime	Status
Base (as scripted)	229s (3:49)	✓ Within target
After +1s buffer fix	239s (3:59)	✓ Within target
Worst-case 20% TTS inflation (after fix)	~264s (4:24)	✓ Under 5:00 ceiling
All shots needing atempo correction	10 shots	Manageable

Even in the worst case, we have 36 seconds of headroom before hitting the 5:00 hard ceiling. This is a comfortable margin.

5. Pacing Quality Assessment

What’s Working

Hook tempo (Scene 1): Fastest average shot (4.7s). Gets to THE CRACK in 14 seconds. Excellent — hooks the audience immediately.
Comedy breathing: The two explicit comedy breaths (6s in Scene 3, 5s in Scene 4) are well-placed — after the two biggest punchlines (“Your kitchen is a crime” and the jar mosaic reveal). Plus the post-delivery holds (1.5-2.5s) on all grandmother dialogue. The deadpan has room to land.
Narration clusters: fluorite-idea delivered on the cluster request. VO comes in bursts of 2-3 shots, then retreats for food transitions, comedy beats, or dialogue. The Appendix C cluster map confirms the architecture.
Food transitions: 6 food transitions at 3s each (plus 1 final at 4s) = 22s of visual breathing. These serve as structural punctuation — they’re doing the work dissolves would normally do, and they’re spaced evenly across the film.
The CRACK placement: Shot 27 at approximately 2:40 into the film. Slightly past the 2/3 mark — correct for a climax. It gets 5 seconds of total silence (2s pre + 1s post + 2s chew). Maximum sonic impact.
Scene 5 density: The Recipe scene is the longest (43s, 7 shots) and has the most voice variety (VO + SEQUENCED + DIALOGUE alternating). This is correct — the emotional core of the film gets the most screen time and the most intimate vocal texture.

Flags (Non-blocking)

Scene 7 is tight at 22s (4 shots). Resolution → transfer → final joke → final image. This moves briskly. It works for deadpan comedy (don’t linger on sentiment), but I’ll be watching this in the rough cut — if the ending feels rushed, I may need to add 2-3s to Shot 31 or 33.
Shot 27 (THE CRACK) at 5s is generous for a SILENT shot. The playbook warns that SILENT sequences >10s risk comprehension issues, but 5s is fine — especially since the visual action (bite → crunch → chewing → eyes going inward) is unambiguous. No risk.
The voice-to-silence ratio (4.5:1) is high. 122.5s of voice content vs 27s of silent shots. This is appropriate for this film — the narration IS the storytelling engine, and the silent shots are all brief food transitions or THE CRACK. The Blind Watch test is bullet-proof with this ratio.

6. Vocal Track Budget (for Timeline Construction)

These are the tracks I’ll need at Step 6:

Track ID	Role	Stem Count	Total Voice Time
VO-NARRATOR	voice	~12-15 stems	~72s
DLG-GRANDMOTHER	voice	~7-9 stems	~22s
DLG-GRANDDAUGHTER	voice	2 stems (from SEQUENCED)	~3.5s
IM-GRANDDAUGHTER	voice	1 stem	~6s
SCORE	music	6 stems	~185s (60-70% coverage)
V1	video	33 clips	Full runtime

7. Summary

Check	Result
Total runtime (3:00-5:00)	✓ PASS — 229s (3:49), or 239s (3:59) after buffer fix
Voice isolation	✓ PASS — all 26 transitions compliant
TTS buffer (20%)	⚠ CONDITIONAL — 10 shots need +1s each
Vocal classification	✓ PASS — all 33 shots correctly tagged
Pacing architecture	✓ PASS — good tempo arc, breathing, comedy timing
Blind Watch genre check	✓ PASS — comedy confirmed, not drama
Runtime safety margin	✓ PASS — 36s headroom to 5:00 ceiling even at worst-case

ACTION REQUIRED for fluorite-idea:

Extend 10 narrator VO shots by +1 second each (shots 1, 4, 8, 10, 13, 14, 18, 20, 26, 31). This is the only blocking issue. No other changes needed.

Mathematical Pacing Review complete. Ready for Step 2 coach check-in once buffer fix is confirmed.