Autonomous F1 content pipeline.
Autonomous Formula 1 content pipeline: Veo 3 + NotebookLM + Nano Banana 2 generate Latin-American Spanish reels, Remotion edits, and Meta/YouTube/TikTok publish — daily.
One orchestrator, five engines, three publishers.
The pipeline is a single Python orchestrator that wakes up on a schedule, talks to five generation engines through whichever interface each one actually exposes (browser, API, or local GPU), composes the results in Remotion, and posts the final reel to three platforms, emitting a structured telemetry event at every step.
System architecture — every node emits telemetry events
The pipeline is not a creative tool. It is a constraint-solver wearing creative clothes — 90% engineering around platform limits, 10% aesthetics.
The orchestrator picks the day's fact.
A daily run starts with the orchestrator pulling the schedule, deciding what is worth publishing today, and writing the script and prompt set for every downstream stage. Decisions are deterministic — same input data produces the same fact selection — so a failed run can be replayed without skewing the calendar.
Read sources, score candidates, write the script.
The schedule lives in SQLite and tracks which facts have already been published, which are reserved for race weekends, and which platform-quota windows are open. The orchestrator queries FastF1 for historical race telemetry and the OpenF1 API for current-season data, scores the candidates against simple recency and engagement heuristics, and writes the chosen fact into a typed Python object that travels through the rest of the pipeline.
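A minimal sketch of that selection step, assuming a `published` table in SQLite and a made-up scoring formula. The `Fact` fields, the weights, and the table name are illustrative assumptions, not the pipeline's actual schema. The point it shows is that the choice depends only on the inputs, with ties broken on the fact ID, so a replay picks the same fact.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    """Typed record that travels through the rest of the pipeline (hypothetical fields)."""
    fact_id: str
    topic: str
    source: str        # "FastF1" or "OpenF1"
    event_date: date   # when the underlying race event happened
    engagement: float  # assumed 0..1 engagement estimate


def score(fact: Fact, today: date) -> float:
    """Assumed heuristic: newer facts and higher engagement score higher."""
    age_days = max((today - fact.event_date).days, 0)
    recency = 1.0 / (1.0 + age_days / 30.0)
    return 0.5 * recency + 0.5 * fact.engagement


def pick_fact(conn: sqlite3.Connection, candidates: list[Fact], today: date) -> Fact:
    """Choose the best unpublished candidate; ties break on fact_id so replays agree."""
    published = {row[0] for row in conn.execute("SELECT fact_id FROM published")}
    fresh = [f for f in candidates if f.fact_id not in published]
    if not fresh:
        raise LookupError("no unpublished candidates for today")
    # The sort key is fully determined by the inputs: no randomness, no wall clock.
    return min(fresh, key=lambda f: (-score(f, today), f.fact_id))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE published (fact_id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO published VALUES ('f_0001')")

candidates = [
    Fact("f_0001", "tyre_degradation", "FastF1", date(2026, 5, 3), 0.9),
    Fact("f_0002", "race_pace_delta", "FastF1", date(2026, 5, 3), 0.6),
    Fact("f_0003", "sector_comparison", "OpenF1", date(2025, 5, 3), 0.6),
]
chosen = pick_fact(conn, candidates, today=date(2026, 5, 13))
print(chosen.fact_id)
```

Passing `today` in as an argument, instead of reading the clock inside `pick_fact`, is what lets a failed run be replayed for the same calendar day.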
From the same fact, the orchestrator generates three artefacts that the downstream stages will consume: the Spanish narration script (used by Qwen3-TTS), the natural-language prompt for Veo 3, and the data-plot specification (channel, range, presenter overlay) for matplotlib. Each is captured in code so the agent never drifts between runs.
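The three artefacts could be modelled as one frozen record built from templates, roughly as below. This is a sketch under assumptions: `PlotSpec`, `Artefacts`, `VEO_TEMPLATE`, and `build_artefacts` are hypothetical names, and the template wording is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlotSpec:
    channel: str                      # telemetry channel to plot
    value_range: tuple[float, float]  # y-axis range
    presenter_overlay: bool           # whether the presenter is composited on top


@dataclass(frozen=True)
class Artefacts:
    narration_es: str  # Spanish script, consumed by Qwen3-TTS
    veo_prompt: str    # natural-language prompt for Veo 3
    plot: PlotSpec     # specification for the matplotlib stage


# Hypothetical template; real wording would live in versioned prompt files.
VEO_TEMPLATE = (
    "A presenter in a racing paddock talks to camera about {topic}. "
    "Natural lighting, steady shot, 16:9, 8 seconds."
)


def build_artefacts(
    topic: str,
    narration_es: str,
    channel: str,
    value_range: tuple[float, float],
) -> Artefacts:
    """Derive all three artefacts from the same inputs, with no hidden state."""
    return Artefacts(
        narration_es=narration_es.strip(),
        veo_prompt=VEO_TEMPLATE.format(topic=topic.replace("_", " ")),
        plot=PlotSpec(channel=channel, value_range=value_range, presenter_overlay=True),
    )


first = build_artefacts("race_pace_delta", "Hoy comparamos el ritmo de carrera.", "LapTime", (88.0, 96.0))
second = build_artefacts("race_pace_delta", "Hoy comparamos el ritmo de carrera.", "LapTime", (88.0, 96.0))
print(first == second)
```

Because the dataclasses are frozen and the function is pure, two runs over the same fact compare equal, which is one way to check that nothing drifts between runs.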
Five engines, five interfaces, one record.
This is the most asymmetric part of the pipeline. Three Google products are driven through a browser via Playwright MCP because they have no public API at the price point this pipeline runs at. Voice synthesis runs locally on a 6 GB GPU because no commercial TTS produced acceptable Latin-American Spanish at that price point. Data plots are rendered with the standard Python plotting stack against real F1 telemetry.
Veo 3 — 8-second presenter clip.
Veo 3 is invoked through the Google AI Studio web UI under a real Chrome profile, driven by Playwright MCP. The prompt is written in natural language because that is the style Veo 3 actually obeys — JSON-schema prompts produce drift. The output is a 16:9 clip capped at 8 seconds per call by the free tier; the 9:16 reframe happens later in Remotion. The orchestrator polls Google's job queue until the render completes, downloads the MP4, and writes a `video_generated` event to telemetry.
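The wait-for-render step can be sketched as a generic polling loop. The status strings, the `check_status` callback, and the default interval and timeout are assumptions for illustration; in the real pipeline the status would come from the browser session, which is not shown here.

```python
import time
from typing import Callable


def wait_for_render(
    check_status: Callable[[], str],
    *,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Poll until check_status() returns "done"; return the elapsed seconds."""
    start = clock()
    while True:
        status = check_status()
        if status == "done":
            return clock() - start
        if status == "failed":
            raise RuntimeError("render failed")
        if clock() - start >= timeout_s:
            raise TimeoutError(f"render not finished after {timeout_s} s")
        sleep(interval_s)


# Stand-in for the real job queue: three polls, the third one succeeds.
statuses = iter(["queued", "rendering", "done"])
naps: list[float] = []
elapsed = wait_for_render(lambda: next(statuses), sleep=naps.append)

event = {
    "event": "video_generated",
    "engine": "veo3",
    "duration_ms": int(elapsed * 1000),
}
print(len(naps), event["event"])
```

Injecting `sleep` and `clock` keeps the loop testable without waiting for real time to pass.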
Prompts are kept in prompts/veo3/ so style is consistent across daily runs.
NotebookLM — explainer narration.
NotebookLM generates the conversational explainer audio that sits under the data-plot section of the reel. It is driven through the same Playwright MCP + real Chrome path as Veo 3. The orchestrator uploads the relevant source documents, asks for a single-host narration on the matching topic, waits for the render, and downloads the resulting MP3. The output is mixed with the Qwen3-TTS Spanish voice in Remotion.
Nano Banana 2 — still imagery.
Nano Banana 2 produces the still backgrounds and presenter overlays. Prompts are written as JSON here because that is the format the model obeys — natural-language prompts produce drift in the opposite direction from Veo 3. The orchestrator submits the prompt, polls for completion, downloads the PNG, and tags it with the fact ID so Remotion can pick it up.
Qwen3-TTS — Spanish voice, local GPU.
No commercial TTS produced acceptable Latin-American Spanish at the volume this pipeline needs. The fix was to run Qwen3-TTS locally on a 6 GB GPU, using the eric speaker with a Latin-American instruct prompt. Cost dropped to "free after electricity" and quality went up. The synthesis call streams the WAV directly to disk; the orchestrator writes a voice_synthesised event with duration and word count.
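The event written after synthesis might be assembled as follows. The Qwen3-TTS call itself is not shown; this sketch only measures a finished WAV with the standard library and counts the words in the script. The `duration_ms` field name is an assumption, and the silent WAV stands in for real synthesis output.

```python
import io
import wave
from datetime import datetime, timezone


def wav_duration_ms(wav_file) -> int:
    """Duration of a WAV file (path or file object) in milliseconds."""
    with wave.open(wav_file, "rb") as wav:
        return round(1000 * wav.getnframes() / wav.getframerate())


def voice_event(wav_file, script: str, *, speaker: str = "eric", locale: str = "es-419") -> dict:
    """Build a voice_synthesised telemetry record with duration and word count."""
    return {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "event": "voice_synthesised",
        "engine": "qwen3_tts",
        "speaker": speaker,
        "locale": locale,
        "words": len(script.split()),
        "duration_ms": wav_duration_ms(wav_file),  # assumed field name
    }


# Stand-in for the synthesis output: two seconds of 16-bit mono silence.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(24000)
    wav.writeframes(b"\x00\x00" * 24000 * 2)
buffer.seek(0)

event = voice_event(buffer, "Hola, bienvenidos a la Fórmula 1")
print(event["words"], event["duration_ms"])
```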
Real telemetry, real plots.
The data overlays — race-pace deltas, tyre-degradation curves, sector comparisons — are rendered from real F1 telemetry pulled from FastF1 or the OpenF1 API. Plotting is done with the standard Python stack (matplotlib, plotly, seaborn) writing PNGs at the exact aspect ratio Remotion expects. All data work uses Polars — Pandas is explicitly forbidden in the codebase. The streaming Parquet path keeps the data layer fast and predictable as the catalogue grows.
Remotion stitches the reel.
Once the five generation engines have done their work, the orchestrator hands a folder of MP4s, PNGs and WAVs to Remotion. The editing logic itself is versioned in the repository — a React-style composition tree that turns asset bundles into final 9:16 vertical reels, deterministically.
Composition is code, not GUI.
The Remotion project lives inside the monorepo and is invoked via its CLI. A single composition consumes the asset bundle and produces a final MP4 with: the 16:9 → 9:16 reframe (presenter-centred crop), the Qwen3-TTS Spanish narration on the primary track, the NotebookLM bed on the secondary track, the burned-in TikTok-style subtitles (per-word emphasis on the punchlines), and the royalty-free outro sting. Because the composition is just code, the editing decisions are reviewable in pull requests and rerunnable across the entire back-catalogue if the template changes.
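From the Python side, the CLI call might be assembled like this. The entry point `remotion/src/index.ts`, the composition ID `DailyReel`, and the bundle keys are hypothetical. The `render` subcommand and the `--props` flag (a serialized JSON string) are part of the Remotion CLI.

```python
import json
import subprocess
from pathlib import Path


def remotion_command(
    bundle: dict,
    output: Path,
    *,
    entry: str = "remotion/src/index.ts",  # hypothetical entry point
    composition_id: str = "DailyReel",     # hypothetical composition ID
) -> list[str]:
    """Build the argv for `npx remotion render` with the asset bundle as input props."""
    return [
        "npx", "remotion", "render",
        entry, composition_id, str(output),
        f"--props={json.dumps(bundle, sort_keys=True)}",
    ]


def render(bundle: dict, output: Path) -> None:
    """Run the render; raises CalledProcessError if Remotion exits non-zero."""
    subprocess.run(remotion_command(bundle, output), check=True)


bundle = {
    "factId": "f_4831",
    "presenterClip": "assets/f_4831/veo3.mp4",
    "narration": "assets/f_4831/voice.wav",
    "plots": ["assets/f_4831/plot_01.png"],
}
cmd = remotion_command(bundle, Path("out/f_4831.mp4"))
print(cmd[:5])
```

Passing the command as a list avoids shell quoting problems with the JSON payload, and `sort_keys=True` keeps the argument byte-identical across reruns of the same bundle.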
Same reel, three platforms.
The orchestrator hands the final MP4 to three publisher modules, each of which speaks to a single platform's public API. The schedule pre-computes the publish time per platform per timezone so that, for instance, an evening LATAM post happens in the local prime-time window without operator intervention.
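Pre-computing the publish time can be done with the standard `zoneinfo` module. The prime-time table below is an invented example, not the pipeline's real schedule; the conversion logic is the part being illustrated.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

# Hypothetical local prime-time slots per (platform, audience timezone).
PRIME_TIME = {
    ("instagram", "America/Bogota"): time(20, 0),
    ("youtube", "America/Bogota"): time(19, 30),
    ("tiktok", "America/Argentina/Buenos_Aires"): time(21, 0),
}


def publish_time_utc(platform: str, tz_name: str, day: date) -> datetime:
    """Convert the local prime-time slot on `day` to a UTC timestamp."""
    local = datetime.combine(day, PRIME_TIME[(platform, tz_name)], tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


day = date(2026, 5, 13)
instagram_at = publish_time_utc("instagram", "America/Bogota", day)
tiktok_at = publish_time_utc("tiktok", "America/Argentina/Buenos_Aires", day)
print(instagram_at.isoformat(), tiktok_at.isoformat())
```

An evening slot in LATAM usually lands on the next calendar day in UTC, which matters when the same schedule also tracks UTC-based quota windows.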
Three public APIs, one schedule.
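Instagram Reels go out through the Meta Graph API, YouTube Shorts through the YouTube Data API v3 (videos.insert), and TikTok through TikTok Content Posting.
A real Chrome profile, not a bundled bot.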
Three of the five generation engines — Veo 3, NotebookLM and Nano Banana 2 — have no public API at the price point this pipeline runs at. The pipeline drives them through their web UI, but with a deliberate twist: it attaches to a real Chrome profile over the remote debugging port instead of Playwright's bundled Chromium, because Google fingerprints and blocks the latter as a bot.
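The attach-over-debug-port idea looks roughly like this with Playwright's Python API, used here as a stand-in for the Playwright MCP path the pipeline actually uses. The port, the binary name, and the profile path are assumptions. `connect_over_cdp` is the Playwright call for attaching to an already running Chromium-based browser.

```python
def chrome_launch_command(profile_dir: str, port: int = 9222, binary: str = "google-chrome") -> list[str]:
    """argv that starts a real Chrome with a persistent profile and a debug port."""
    return [
        binary,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",
    ]


def cdp_endpoint(port: int = 9222) -> str:
    """HTTP endpoint Playwright uses to discover the DevTools WebSocket."""
    return f"http://127.0.0.1:{port}"


def open_page_title(url: str, port: int = 9222) -> str:
    """Attach to the running Chrome, open `url` in its existing context, return the title."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(cdp_endpoint(port))
        # Reuse the profile's default context so cookies and logins carry over.
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        page.goto(url)
        title = page.title()
        page.close()
        return title


print(chrome_launch_command("/home/user/.config/pipeline-chrome"))
```

Chrome has to be started with the launch command first; `open_page_title` then attaches to that session instead of launching Playwright's bundled Chromium.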
Every step writes a typed event.
Every component in the pipeline writes structured events to data/telemetry/events.jsonl AND broadcasts the same events live over a WebSocket. The dashboard subscribes to the WebSocket; the post-mortem reads the JSONL. One vocabulary, two consumers, zero log archaeology.
// data/telemetry/events.jsonl: excerpt from a daily run
{"ts": "2026-05-13T03:00:01Z", "event": "run_started", "run_id": "r_2026-05-13"}
{"ts": "2026-05-13T03:00:04Z", "event": "fact_selected", "fact_id": "f_4831", "topic": "race_pace_delta", "source": "FastF1"}
{"ts": "2026-05-13T03:00:12Z", "event": "video_generation_queued", "engine": "veo3", "duration_target": 8}
{"ts": "2026-05-13T03:03:46Z", "event": "video_generated", "engine": "veo3", "duration_ms": 214000, "bytes": 2147483}
{"ts": "2026-05-13T03:04:02Z", "event": "voice_synthesised", "engine": "qwen3_tts", "speaker": "eric", "locale": "es-419", "words": 94}
{"ts": "2026-05-13T03:05:18Z", "event": "remotion_render_complete", "output_mp4": "out/f_4831.mp4", "frames": 630}
{"ts": "2026-05-13T20:15:00Z", "event": "published", "platform": "instagram", "status": "ok", "post_id": "ig_18024…"}
{"ts": "2026-05-13T20:15:18Z", "event": "published", "platform": "youtube", "status": "ok", "video_id": "yt_xK9p…"}
{"ts": "2026-05-13T20:16:02Z", "event": "publish_warning", "platform": "tiktok", "reason": "quota_near_limit", "remaining": 3}
{"ts": "2026-05-13T20:16:15Z", "event": "run_complete", "duration_ms": 62173000, "status": "ok"}
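Events like those in the excerpt could be produced by a small emitter that appends to the JSONL file and pushes the same line to live subscribers. This is a sketch: the `Telemetry` class is hypothetical, and plain callbacks stand in for the WebSocket connections, which would need a server library not shown here.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


class Telemetry:
    """Append-only JSONL log plus fan-out of the same line to live subscribers."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.subscribers: list[Callable[[str], None]] = []  # stand-ins for WebSocket sends
        path.parent.mkdir(parents=True, exist_ok=True)

    def emit(self, event: str, **fields) -> dict:
        record = {
            "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "event": event,
            **fields,
        }
        line = json.dumps(record, ensure_ascii=False)
        # Write to disk first so the post-mortem log never misses an event.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(line + "\n")
        for send in self.subscribers:
            send(line)
        return record


with tempfile.TemporaryDirectory() as tmp:
    telemetry = Telemetry(Path(tmp) / "telemetry" / "events.jsonl")
    received: list[str] = []
    telemetry.subscribers.append(received.append)

    telemetry.emit("run_started", run_id="r_2026-05-13")
    telemetry.emit("fact_selected", fact_id="f_4831", topic="race_pace_delta", source="FastF1")

    logged = telemetry.path.read_text(encoding="utf-8").splitlines()

print(logged == received)
```

Serialising once and sending the identical string to both consumers is what keeps the dashboard and the post-mortem on one vocabulary.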
manual is the special tag for CAPTCHAs and auth walls; the dashboard surfaces it as a banner so a human can step in fast.
24 hours, one orchestrator.
The daily run is intentionally split across the calendar day. Generation happens in the early morning when the LATAM market is asleep; publishing happens in the evening prime-time window per platform. Quota resets are honoured at midnight UTC.
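Honouring a midnight-UTC reset can be sketched with a small counter that rolls over when the reset time passes. The `QuotaWindow` class and the quota value are hypothetical; all timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone


def next_quota_reset(now: datetime) -> datetime:
    """First midnight UTC strictly after `now` (which must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


class QuotaWindow:
    """Per-platform daily counter that resets at midnight UTC."""

    def __init__(self, daily_quota: int, now: datetime) -> None:
        self.daily_quota = daily_quota
        self.used = 0
        self.resets_at = next_quota_reset(now)

    def try_consume(self, now: datetime) -> bool:
        """Reserve one publish slot; return False if today's quota is spent."""
        if now >= self.resets_at:
            self.used = 0
            self.resets_at = next_quota_reset(now)
        if self.used >= self.daily_quota:
            return False
        self.used += 1
        return True


evening = datetime(2026, 5, 13, 20, 16, tzinfo=timezone.utc)
window = QuotaWindow(daily_quota=1, now=evening)

first = window.try_consume(evening)
second = window.try_consume(evening + timedelta(minutes=1))
after_reset = window.try_consume(datetime(2026, 5, 14, 0, 0, 1, tzinfo=timezone.utc))
print(first, second, after_reset)
```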
Everything in the codebase.
A breakdown by category. Choices favour local execution where possible, typed contracts at every boundary, and append-only telemetry over rotating logs.
Generation
- Veo 3: 8-second video clip
- NotebookLM: Explainer narration
- Nano Banana 2: Still imagery
- Qwen3-TTS: Spanish voice (local)
- matplotlib · plotly · seaborn: Data plots
Data
- FastF1: Historic telemetry
- OpenF1 API: Live data
- Polars: DataFrames · no Pandas
- SQLite: State store
- JSONL: Telemetry stream
Runtime
- Python: Orchestrator language
- Claude Code: Development partner
- Playwright MCP: Browser automation
- Real Chrome (CDP): Anti-bot detection
- WebSocket: Live dashboard channel
Output
- Remotion: Video composition (CLI)
- Meta Graph API: Instagram Reels
- YouTube Data API v3: YouTube Shorts
- TikTok Content Posting: TikTok publish
- MP4 (9:16): Single deliverable
The architecture is shaped by limits.
Every architectural choice traces back to a hard cap. The system is not a creative tool — it is a constraint-solver wearing creative clothes.
Qwen3-TTS uses the eric speaker with a Latin-American instruct prompt. Reviewed weekly to catch any drift toward Iberian pronunciation.
Cheap because most of the heavy lifting is local.
The architecture's defining cost choice is running TTS on a local 6 GB GPU instead of a commercial API. That single decision moves Spanish voice synthesis from a monthly bill to an electricity line, while delivering quality that the commercial APIs only hit at premium tiers.
Voice synthesis
Video generation
Data / telemetry
Publishing
What this build taught me.
The same pl.scan_parquet call handles both a single race and a full season.