
Clever_PO · F1 Power-Unit Validation System · Solo Engineer · 2026

F1 Power-Unit Validation System.

A solo engineer R&D project proving what frontier AI can do for F1 powertrain testing and validation — a full Windows-native app shipped end-to-end with Claude Code as the engineering partner.

Four surfaces, one backend

Service modules

Engineering-owned Data Folder

No admin perms needed to install

01 · System overview

One backend, four surfaces, one rule.

Clever_PO is a monorepo with three deployment surfaces sharing a single FastAPI backend, plus a fourth deployment target — an MCP server — that exposes the same business logic as tools Claude Code and Claude Desktop can call directly. An external Data Folder, versioned by the engineering team, sits beside the codebase and carries the TR configs, limits, parameter lists, and post-process functions the runtime loads at startup. The rule that holds the whole thing together: business logic lives in FastAPI, never in a UI.

System architecture — engineering owns the rules, FastAPI owns the logic, every surface speaks the same language

The rule is that business logic lives in FastAPI. Once that holds, adding the desktop wrapper, the MCP server, and the REST API stops feeling like product work and starts feeling like wiring.

02 · How it works · seven steps

From dyno telemetry to certified pass-off.

A validation session has a fixed shape. An operator opens a subtest, the app generates the test-session metadata, ingests Parquet telemetry, evaluates engineering limits against the TR config, attributes any failures to either the Rig or the Unit Under Test, and then either certifies the unit, raises a concession with an audit trail, or fires off a structured notification email. Every step is logged through the telemetry layer; every step is callable from the MCP server.

Step 01 · Configure

Engineering-owned Data Folder loads first.

At startup the app validates that a configured folder — on OneDrive, a NAS, or a local disk — contains the four engineering-owned subfolders: json_configs/ for TR definitions, limits/ for per-channel envelopes, parameter_lists/ for the allowed parameter sets, and post_process_functions/ for any custom analysis scripts the engineering team wants to drop in. Each is OK-checked before the app proceeds. The whole flow is designed so that changing a limit, adding a TR, or revising a parameter list does not require a code release. The right people own the right artefacts.

Clever_PO Data Folder configuration modal: the startup validator checks all four engineering-owned subfolders (TR configs, limits, parameter lists, post-process functions) before the app proceeds.

Source

json_configs

Per-TR test definitions — what subtests run, in what order, against which channels.

Source

limits

Per-channel envelope rules — t-matrix expressions, scalar bounds, and per-channel Owner attribution (Rig | UUT).

Source

parameter_lists

Allowed parameter sets per test type — the form's drop-downs read from here, so adding a new dyno line doesn't need code.

Source

post_process_functions

Custom Python analysis scripts engineers can drop in — picked up at runtime, runnable per subtest.
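A minimal sketch of the startup check described above, using only the standard library. The four subfolder names come from the text; the function names and the per-folder OK dictionary are illustrative assumptions, not the app's actual API.

```python
from pathlib import Path

# Subfolder names are taken from the text; everything else is illustrative.
REQUIRED_SUBFOLDERS = (
    "json_configs",
    "limits",
    "parameter_lists",
    "post_process_functions",
)


def check_data_folder(root: Path) -> dict:
    """Return an OK flag for each required subfolder of the Data Folder."""
    return {name: (root / name).is_dir() for name in REQUIRED_SUBFOLDERS}


def data_folder_is_valid(root: Path) -> bool:
    """True only when every engineering-owned subfolder is present."""
    return all(check_data_folder(root).values())
```

Returning a flag per subfolder rather than a single boolean lets a configuration dialog show which folder is missing instead of a generic failure.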

Step 02 · Metadata

The app generates the test-session metadata.

Every test session opens with a structured metadata surface: Serial Number, Sequence, Part #, Mileage, TR Number, Test Type (e.g. Test 02 — Full Pass Off), Dyno location, Operator, Assigned PU, PU Type, Run Number, and Associated Serial Number. The TR Number is auto-looked-up against the Data Folder the moment it is typed, so a wrong code never makes it past the form. Each subtest the TR defines (Subtest_01, Subtest_02, Subtest_03) is gated behind the previous one in the bottom tab strip, preventing an operator from skipping ahead. The right-hand RESULTS panel shows Rig and UUT in Pending until telemetry has been imported and the limits engine has run.

Field

Serial / Part / Mileage

Unit identification — what is being tested, what hardware revision, how many cycles already on it.

Field

TR Number

Auto-looked-up against the Data Folder — a typo never propagates downstream.

Field

Test Type · Dyno · Operator

Session classification — which test, on which rig, by whom.

Field

Assigned PU · Run # · Assoc. Serial

Engineering linkage — connects this run to the powertrain identity, the run sequence, and any related serials.
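The TR lookup and the subtest gating can be sketched as follows. The one-file-per-TR layout matches the json_configs sample shown later on this page; the function names and the "completed" set are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Optional


def lookup_tr(data_folder: Path, tr_number: str) -> Optional[dict]:
    """Return the TR definition if json_configs/<tr_number>.json exists."""
    config_path = data_folder / "json_configs" / f"{tr_number}.json"
    if not config_path.is_file():
        return None  # unknown TR code: the form can reject it immediately
    return json.loads(config_path.read_text(encoding="utf-8"))


def unlocked_subtests(tr_config: dict, completed: set) -> list:
    """Subtests an operator may open: completed ones plus the next in order."""
    unlocked = []
    for subtest in tr_config["subtests"]:
        unlocked.append(subtest["id"])
        if subtest["id"] not in completed:
            break  # everything after the first incomplete subtest stays locked
    return unlocked
```

With nothing completed, only the first subtest is unlocked; finishing it unlocks the second, and so on.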

Step 03 · Evaluate

Polars parses, the t-matrix evaluates, the Owner column attributes.

The operator selects a Parquet log and the app's high-throughput Parquet processor — Polars with pl.scan_parquet streaming — ingests it. Pandas is explicitly forbidden in the codebase; the streaming path is what lets a dyno-cell workstation handle multi-GB telemetry files without paging out. Once parsed, the limits engine evaluates each engineering channel against its configured envelope per Pass-Off Point, computes the running value, and renders the comparison table. The most distinctive column is Owner — every failure is attributed to either the Rig (test rig issue — environment, instrumentation, harness) or the UUT (unit under test — the actual hardware being validated). That classification is what makes the rest of the workflow legitimate: a Rig failure does not mean the unit is bad, and the concession path treats the two cases categorically differently.

Limit comparison after Parquet import: Running Value vs Limit Value per channel, with the Owner column attributing each failure to Rig or UUT and the RESULTS panel flipped to FAIL.

Ingest

Polars · pl.scan_parquet

Streaming reader — handles multi-GB Parquet on a workstation without paging out. Pandas is forbidden codebase-wide.

Schema

t-matrix JSONB

Per-channel × per-pass-off-point limit expressions stored as JSONB — engineering edits in the Data Folder, the API evaluates against the running value.

Attribution

Rig | UUT owner

The single most important column the system adds. Routes a Rig failure to a different conversation than a UUT failure.

Output

Comparator table

Channel × Pass Off Point × Limit expression × Running Value × Limit Value × Owner — the artefact engineers act on.
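To make the evaluation step concrete, here is a small pure-Python sketch of checking one running value against one limit expression and carrying the Owner through to the verdict. The only expression form shown on this page is "x == 2", so the operator set, the regular expression, and the names Verdict, evaluate_point and failures_by_owner are all assumptions; the real limits engine may accept a richer grammar.

```python
import operator
import re
from dataclasses import dataclass

# Comparison operators this sketch accepts (an assumption, not the real grammar).
_OPERATORS = {
    "==": operator.eq, "!=": operator.ne,
    "<=": operator.le, ">=": operator.ge,
    "<": operator.lt, ">": operator.gt,
}
_EXPRESSION = re.compile(r"^\s*x\s*(==|!=|<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$")


@dataclass(frozen=True)
class Verdict:
    channel: str
    pass_off_point: str
    expression: str
    running_value: float
    passed: bool
    owner: str  # "Rig" or "UUT", copied from the channel's limit file


def evaluate_point(channel, pass_off_point, expression, running_value, owner):
    """Evaluate 'x <op> <number>' without eval(); reject anything else."""
    match = _EXPRESSION.match(expression)
    if match is None:
        raise ValueError(f"unsupported limit expression: {expression!r}")
    symbol, limit = match.group(1), float(match.group(2))
    passed = _OPERATORS[symbol](running_value, limit)
    return Verdict(channel, pass_off_point, expression, running_value, passed, owner)


def failures_by_owner(verdicts):
    """Count failed verdicts per owner, the split the concession path relies on."""
    counts = {"Rig": 0, "UUT": 0}
    for verdict in verdicts:
        if not verdict.passed:
            counts[verdict.owner] += 1
    return counts
```

Parsing the expression instead of calling eval() matters here because the limit files are edited outside the codebase; a malformed or hostile expression raises an error rather than executing.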

Step 04 · Concession

Audit-trailed exception handling.

When a failure is reviewable rather than rejectable, the engineer raises a concession. The dialog enforces a Standard Concession or Fail Concession type, a reason of at least two sentences (no one-line "looks fine" approvals), and a concession code. The decision is written to the database against the test session, so a later auditor can reconstruct exactly which channel failed, by how much, and on what justification the deviation was accepted. The concession is part of the audit trail, not a paper trail that lives separately in someone's inbox.

Concession Required dialog: Standard vs Fail Concession, a structured reason (≥ 2 sentences), and a concession code, all persisted against the session.

Classification

Standard | Fail Concession

Engineering distinguishes between deviations that fit known tolerances and deviations that require explicit failure acceptance.

Reason

≥ 2 sentences enforced

The form refuses one-liners. Forces the engineer to explain the deviation in enough detail that a future auditor can understand.

Identifier

Concession code

Traceable identifier — links the decision to engineering's broader concession ledger.

Persistence

Audit row on the session

Stored against the test session record — reconstructable years later without asking around.
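The three rules the dialog enforces can be sketched as a small validator. The two concession types and the two-sentence minimum come from the text; the sentence-counting heuristic, the Concession dataclass, and the field names are assumptions made for illustration.

```python
import re
from dataclasses import dataclass

CONCESSION_TYPES = ("Standard Concession", "Fail Concession")


def count_sentences(text: str) -> int:
    """Count chunks that contain a word and end in '.', '!' or '?'."""
    chunks = re.split(r"(?<=[.!?])\s+", text.strip())
    return sum(1 for chunk in chunks if re.search(r"\w", chunk) and chunk[-1] in ".!?")


@dataclass(frozen=True)
class Concession:
    session_id: int
    concession_type: str
    reason: str
    code: str


def validate_concession(concession: Concession) -> list:
    """Return a list of human-readable problems; empty means acceptable."""
    errors = []
    if concession.concession_type not in CONCESSION_TYPES:
        errors.append("type must be Standard Concession or Fail Concession")
    if count_sentences(concession.reason) < 2:
        errors.append("reason must contain at least two sentences")
    if not concession.code.strip():
        errors.append("concession code is required")
    return errors
```

Because the check lives in one function rather than in a form component, a web dialog and a chat-driven tool call could both reject the same one-line reason.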

Step 05 · Notify

Deterministic fail-email notification.

When a fail email is needed, the app generates it for the operator instead of leaving them to write it by hand. The body is deterministic — TR Code, Serial Number, Sub-test, ISO timestamp, Total Failures, then a fixed ASCII table of every breached channel (Channel · NPOTP · Limit · Actual · Expected · Owner), then a summary line splitting Rig vs UUT failure counts. The email opens in Outlook addressed to the engineer and test-lead distribution lists, ready to send. The same email goes out the same way every time, so the receiving engineers can grep their inbox by TR Code and reconstruct a unit's history without having to ask around.

Auto-composed failure email body, opened directly in Outlook: TR Code, Serial, Sub-test, timestamp and total failures, then an ASCII table of failed limits and a Rig-vs-UUT summary. The body has the same shape every time, so the receiving inbox stays grep-able.

Subject

TR · Serial · Sub-test

A consistent subject line means the receiving engineers can filter, search, and triage by TR Code without parsing the body.

Body

ASCII fail table

Channel · NPOTP · Limit · Actual · Expected · Owner — reads in plain text in any inbox, no rendering surprises.

Summary

Rig vs UUT counts

The triage signal: five Rig failures and seven UUT failures mean something different from the reverse.

Destination

Outlook compose

Pre-addressed to the engineer + test-lead distribution lists. Ready to send; no copy-paste fingerprints.
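A sketch of how a deterministic body of this shape could be assembled. The header fields and the six column names are taken from the text; the exact labels, the table borders, and the build_fail_email name are assumptions. The timestamp is passed in rather than read from the clock so the same inputs always produce the same text.

```python
from datetime import datetime, timezone

COLUMNS = ("Channel", "NPOTP", "Limit", "Actual", "Expected", "Owner")


def build_fail_email(tr_code, serial, subtest, timestamp, failures):
    """failures: list of dicts keyed by COLUMNS; input order is preserved."""
    widths = [
        max([len(col)] + [len(str(row[col])) for row in failures])
        for col in COLUMNS
    ]

    def line(cells):
        padded = (str(cell).ljust(width) for cell, width in zip(cells, widths))
        return "| " + " | ".join(padded) + " |"

    rule = "+-" + "-+-".join("-" * width for width in widths) + "-+"
    rig = sum(1 for row in failures if row["Owner"] == "Rig")
    uut = sum(1 for row in failures if row["Owner"] == "UUT")
    parts = [
        f"TR Code: {tr_code}",
        f"Serial Number: {serial}",
        f"Sub-test: {subtest}",
        f"Timestamp: {timestamp.isoformat()}",
        f"Total Failures: {len(failures)}",
        "",
        rule,
        line(COLUMNS),
        rule,
        *[line([row[col] for col in COLUMNS]) for row in failures],
        rule,
        "",
        f"Summary: Rig failures: {rig}, UUT failures: {uut}",
    ]
    return "\n".join(parts)
```

Fixed-width plain text keeps the table aligned in any mail client that uses a monospaced font, and the stable field labels are what make searching by TR Code practical.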

Step 06 · Compare

Multi-system Comparator — cross-serial trend analysis.

Once a session is recorded, it becomes part of a queryable history. The Comparator surface lets an engineer pick N serial numbers, choose a channel (for example TemperatureChannelNeg), and overlay every selected unit's curve against the configured t-matrix envelope (the dashed yellow upper and lower bounds). The Raw / t-Matrix toggle controls whether the engineer is looking at raw measurements or matrix-normalised values, and the date and sub-test filters narrow the dataset. This is the surface where a development engineer answers "is this drift unit-specific or fleet-wide?" without exporting CSVs or opening Excel.

Clever_PO Comparator: multiple serials overlaid on one channel against the t-matrix envelope (dashed yellow), with a Raw vs t-Matrix toggle and date-range and sub-test filters. This is the engineer's answer to "unit-specific or fleet-wide?"

Selection

N serial numbers

Side-by-side overlay across units. Hard to do in Excel; trivial here.

View

Raw / t-Matrix toggle

Switch between raw measurements and matrix-normalised values — different questions need different views.

Filter

Date · sub-test · status

Narrow the dataset to the question being asked. Filters stack.

Reference

t-matrix envelope

Dashed yellow upper/lower bounds — the engineering-defined acceptable region, drawn on every chart.
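The "unit-specific or fleet-wide?" question can be approximated in a few lines once each serial's curve and the envelope bounds are available. This is a deliberately simplified sketch: it assumes a single constant lower and upper bound per channel and a crude all-or-some classification, neither of which is described in the text, and the function names are hypothetical.

```python
def out_of_envelope(curves: dict, lower: float, upper: float) -> dict:
    """Map each serial to the indices of samples outside [lower, upper]."""
    return {
        serial: [i for i, value in enumerate(values) if not (lower <= value <= upper)]
        for serial, values in curves.items()
    }


def classify_drift(curves: dict, lower: float, upper: float) -> str:
    """Rough triage label; the thresholds here are illustrative only."""
    breaches = out_of_envelope(curves, lower, upper)
    offenders = sorted(serial for serial, indices in breaches.items() if indices)
    if not offenders:
        return "within envelope"
    if len(offenders) == len(curves):
        return "fleet-wide"
    return "unit-specific: " + ", ".join(offenders)
```

A chart answers the same question visually; the value of a function like this is that the same check can also be asked for programmatically, for example from a tool call.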

Step 07 · Install

Native Windows installer — no admin password needed.

The desktop build ships as an NSIS Windows installer that bundles the Python runtime, the FastAPI backend, the Next.js UI assets, and the Tauri shell. The installer extracts everything into the user's profile so an admin password is not required, and the result is a Start-menu app that runs entirely offline against a local SQLite database — same UI, same limits engine, same MCP surface, just on a workstation instead of a server. The dyno cell does not need network connectivity to validate a unit.

Bundle

Python + FastAPI + Next.js + Tauri

Everything the desktop needs in one .exe. The Tauri shell runs the FastAPI backend as a sidecar process.

Install scope

User profile · no admin

Per-user install means a dyno-cell engineer doesn't need to chase IT for a privileged install. Friction removed.

Runtime

SQLite local DB

Single-file database in the user profile — offline-capable, no server connectivity required.

Distribution

NSIS Windows installer

Standard Windows installer flow — predictable for IT teams that need to whitelist the binary.
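As an illustration of the offline runtime, the sketch below resolves a per-user database path and opens it with the standard-library sqlite3 module. The project itself uses SQLModel; sqlite3 stands in here to keep the example dependency-free. The directory name, file name, table and columns are all hypothetical.

```python
import os
import sqlite3
from pathlib import Path


def default_db_path(app_dir_name: str = "Clever_PO") -> Path:
    """Per-user location: LOCALAPPDATA on Windows, home directory otherwise."""
    base = os.environ.get("LOCALAPPDATA") or str(Path.home())
    return Path(base) / app_dir_name / "clever_po.sqlite3"


def open_local_db(db_path: Path) -> sqlite3.Connection:
    """Create the folder and a minimal table if needed, then return a connection."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_session ("
        "id INTEGER PRIMARY KEY, "
        "serial_number TEXT NOT NULL, "
        "tr_number TEXT NOT NULL)"
    )
    conn.commit()
    return conn
```

Writing only inside the user's profile is what makes a no-admin install workable: nothing in this path needs elevated permissions or a network connection.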

03 · MCP-native architecture

The MCP server is a first-class deployment target.

Clever_PO is MCP-native by design: the same FastAPI service modules that the web UI and the desktop sidecar call are also wrapped as Model Context Protocol tools the MCP server exposes to Claude Code and Claude Desktop. An engineer can run an entire pass-off conversation from a chat session — "open subtest 1 of TR CBA_001 for serial XXX_001, import the latest Parquet, evaluate limits, tell me which channels failed on the UUT side" — and the MCP server fulfils each step by calling the exact same code path the web UI uses. There is no separate "chat surface" codebase; every new capability added to FastAPI is one thin wrapper away from being conversational.

MCP sequence — Claude Desktop drives a real validation, calling the same FastAPI service modules the web UI uses

Why MCP

Validation that can be driven from a chat

"Open subtest, evaluate limits, raise a concession with reason X" — a complete pass-off without touching the web UI.

Tool surface

Tools follow the workflow

Open subtest, import telemetry, evaluate limits, raise concession, draft fail email, query comparator, dump audit history.

No code duplication

One FastAPI module per business action

Each MCP tool is a thin wrapper around an existing service module. The chat path and the UI path cannot diverge.

Same persistence

Same SQLModel, same audit trail

A concession raised via chat lands in the database identically to one raised via UI. The auditor cannot tell which path was used — by design.
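The "thin wrapper" idea can be shown without any framework. In this sketch a plain dictionary stands in for the MCP tool registry and a plain function stands in for a REST handler; both forward to one service function. None of these names come from the project or from an MCP SDK, and the service body is a placeholder.

```python
from typing import Callable, Dict


def evaluate_limits(serial: str, subtest: str) -> dict:
    """Service layer: the only place business logic would live (placeholder body)."""
    return {"serial": serial, "subtest": subtest, "status": "Pending"}


# Stand-in for an MCP tool registry; a real server would use an MCP SDK instead.
TOOLS: Dict[str, Callable[..., dict]] = {}


def tool(name: str):
    """Register a function under a tool name and return it unchanged."""
    def register(func):
        TOOLS[name] = func
        return func
    return register


@tool("evaluate_limits")
def evaluate_limits_tool(serial: str, subtest: str) -> dict:
    """Thin wrapper: no logic of its own, it only forwards to the service."""
    return evaluate_limits(serial, subtest)


def http_evaluate_limits(payload: dict) -> dict:
    """Stand-in for a REST handler calling the same service function."""
    return evaluate_limits(payload["serial"], payload["subtest"])
```

Because both entry points delegate to evaluate_limits, a change to the business rule is made once and is visible on every surface.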

04 · Telemetry + HITL development loop

Observability is the AI coder's nervous system.

A custom telemetry layer is built into the application from the beginning. While the app runs — both in production and during development — it emits structured events to a JSONL log file and broadcasts the same events live over a WebSocket stream. One vocabulary, two consumers. When Playwright drives the UI during development, Claude Code and a human reviewer both subscribe to the WebSocket; when a session passes silently but an internal error event fires, the loop catches the discrepancy in real time.

The dev loop — telemetry collapses "Playwright says green" + "telemetry says error" into one observable stream
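A minimal sketch of the "one vocabulary, two consumers" pattern: each event is appended to a JSONL file and handed to every live subscriber. In-process queues stand in for the WebSocket stream, and the class name, event names and fields are assumptions; timestamps are omitted to keep the example deterministic.

```python
import json
import queue
from pathlib import Path


class Telemetry:
    """Write each event to a JSONL log and fan it out to subscribers."""

    def __init__(self, log_path: Path):
        self._log_path = log_path
        self._subscribers = []

    def subscribe(self) -> queue.Queue:
        """Return a queue that receives every event emitted from now on."""
        subscriber = queue.Queue()
        self._subscribers.append(subscriber)
        return subscriber

    def emit(self, event: str, level: str = "info", **fields) -> dict:
        record = {"event": event, "level": level, **fields}
        with self._log_path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(record, sort_keys=True) + "\n")
        for subscriber in self._subscribers:
            subscriber.put(record)
        return record


def silent_failures(events: list, ui_passed: bool) -> list:
    """Error events that fired even though the UI run looked green."""
    return [event for event in events if ui_passed and event["level"] == "error"]
```

The last function is the discrepancy check in miniature: a UI run that passed while an error event was emitted is exactly the case a UI-only test would miss.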

05 · Engineering-owned config sample

A TR config, a per-channel limit, and a Polars evaluation.

The first two artefacts live in the Data Folder: they are owned by the engineering team, versioned in their git repo, and loaded by the Clever_PO API at startup. The Python at the bottom lives in the codebase and is what FastAPI runs against the rendered envelope.

// data-folder/json_configs/CBA_001.json — engineering-owned TR definition
{
  "tr_code": "CBA_001",
  "test_type": "Test 02 — Full Pass Off",
  "subtests": [
    { "id": "Subtest_01", "channels": ["TemperatureChannelNetPost", "newNPO"] },
    { "id": "Subtest_02", "channels": ["TemperatureChannelPos_mean"] }
  ]
}

// data-folder/limits/TemperatureChannelNetPost.json — per-channel envelope rule
{
  "channel": "TemperatureChannelNetPost",
  "owner": "UUT",
  "envelope": {
    "type": "tmatrix",
    "pass_off_points": {
      "1":   "x == 2",
      "101": "x == 2",
      "201": "x == 2",
      "202": "x == 2",
      "203": "x == 2"
    }
  }
}

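# services/api/limits.py: Polars evaluates the envelope against the running value
import polars as pl

# parquet_path, channel and envelope are supplied by the caller.
# Lazy scan: nothing is read until .collect(), so the filter is applied while scanning.
lf = pl.scan_parquet(parquet_path)
running = lf.filter(pl.col("channel") == channel).select("value").collect()
# evaluate_envelope is the limits-engine entry point, defined elsewhere in the codebase.
verdict = evaluate_envelope(envelope, running, owner="UUT")
# → pass / fail with Rig|UUT attribution, ready for the comparator table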

06 · Engineering constraints

The decisions that shaped the architecture.

Every choice below traces back to a specific constraint or principle that the project committed to from day one. Treating these as architecture, not implementation detail, kept the four-surface build coherent across months.

Data layer

Polars-only · no Pandas anywhere

pl.scan_parquet streaming for multi-GB Parquet on a workstation. The "no Pandas" rule is enforced in CI.

Failure attribution

Rig vs UUT Owner as a primitive

The single most useful column the system adds. Routes Rig failures to a different conversation than UUT failures.

Chat surface

MCP-as-deployment, not chat-skin

The MCP server is a first-class target alongside web + desktop + REST. Same FastAPI modules; cannot drift.

Desktop

Tauri sidecar runs FastAPI

The desktop app wraps the same Next.js UI; the Tauri shell launches the Python backend as a child process. No reimplementation.

Persistence

SQLModel · dual-DB (Postgres + SQLite)

One schema package targets PostgreSQL in server mode and SQLite in desktop mode. Engineering writes models once.

Engineering rules

JSONB t-matrix in DB · files in Data Folder

Engineering edits the limits as JSON files; the API reads them, stores t-matrix configurations in JSONB columns, evaluates them per session.

Distribution

NSIS installer · no admin needed

Per-user install removes IT friction. A dyno-cell engineer installs in 30 seconds; the app runs entirely offline.

Quality gate

400 / 9 / 0 pytest baseline · no regressions

Every bug becomes a failing test before the fix. Every commit holds the baseline. The auditor sees the test counts on every PR.
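The text says the "no Pandas" rule is enforced in CI but not how. One plausible way to do it, sketched here as an assumption, is a small script that parses every Python file and fails the build if any of them imports pandas. The function names are hypothetical.

```python
import ast
from pathlib import Path


def imports_pandas(source: str) -> bool:
    """True if the source has 'import pandas...' or 'from pandas... import ...'."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] == "pandas" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # level == 0 excludes relative imports such as 'from .pandas import x'
            if node.level == 0 and node.module and node.module.split(".")[0] == "pandas":
                return True
    return False


def find_violations(root: Path) -> list:
    """Return every .py file under root that imports pandas, in sorted order."""
    return sorted(
        path for path in root.rglob("*.py")
        if imports_pandas(path.read_text(encoding="utf-8"))
    )
```

Parsing with ast rather than searching for the string "pandas" avoids false positives from comments, docstrings, and similarly named packages.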

07 · Full tech stack

Everything in the codebase.

Broken down by category. Choices favour typed contracts at every boundary, local execution where possible, and one backend module per business action over per-surface duplication.

Backend

Python: FastAPI orchestrator
FastAPI: HTTP + OpenAPI surface
Pydantic v2: Boundary types
SQLModel: Postgres + SQLite schemas
Celery · Redis 7: Long-running jobs

Frontend

Next.js 15: Web app
TypeScript: Typed everywhere
Tailwind v4: Utility CSS
shadcn/ui: Composable primitives
Recharts: Comparator visualisation

Desktop · Data

Tauri 2.x: Native desktop shell
NSIS: Windows installer
SQLite: Local DB (desktop mode)
Polars: DataFrames (no Pandas)
DuckDB: Ad-hoc analytical SQL

Infra · Dev partner

Claude Code: Development partner
MCP server: Conversational deployment target
Podman: Container runtime
Ansible: Server provisioning
Caddy: Reverse proxy + TLS

08 · Lessons learned

What this build taught me.

Lesson 01

Configuration ownership is architecture, not file-system bureaucracy. Putting TR configs, limits, parameter lists and post-process functions in a Data Folder — owned by engineering, not by developers — collapsed an entire class of "ship a release to change a number" conversations. The right people own the right artefacts.

Lesson 02

Owner attribution is the single most valuable column the system adds. Classifying every breached limit as a Rig issue or a UUT issue is what makes the concession workflow legitimate. Without that split, every failed test ends up in the same blunt bucket and every concession reads like an excuse.

Lesson 03

MCP is a first-class deployment target, not chat-skin novelty. Treating Claude Code and Claude Desktop as legitimate users of the application, rather than as a wrapper over it, let me drive complete validation workflows from a conversation, against the same FastAPI service modules the web UI uses. Any new endpoint is one thin wrapper away from being a new tool.

Lesson 04

Observability is the AI coder's nervous system. Pairing Playwright for visual reality with a structured telemetry stream gave both the agent and the human the same evidence at the same time. Claude could no longer claim "done" if the telemetry contradicted the UI; the human reviewer could no longer feel out of the loop, because every event was streamed live.

Lesson 05

Choose the data layer once, choose it well. Polars from day one kept the pipeline fast and predictable as the catalogue grew. No retrofit, no rewrite, no Pandas fallback — the same call to pl.scan_parquet handles a single subtest and a full multi-GB session.