# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## What this project is

PhysCom (Physical Combinatorics) is an innovation discovery engine. It generates entity combinations across dimensions (e.g. platform × power_source), filters them by physical constraints, scores them against domain-specific metrics, and ranks the results. It includes a CLI, a Flask/HTMX web UI, and a 5-pass pipeline (constraints → estimation → scoring → LLM review → human review).

## Commands

- **Install**: `pip install -e ".[dev,web]"` (editable install with test and web deps)
- **Tests (all)**: `python -m pytest tests/ -q` (48 tests, ~5s). Run after every change.
- **Single test file**: `python -m pytest tests/test_scorer.py -q`
- **Single test**: `python -m pytest tests/test_scorer.py::test_score_combination -q`
- **Web dev server**: `python -m physcom_web`
- **CLI**: `python -m physcom` (or `physcom` if installed)
- **Docker**: `docker compose up web` / `docker compose run cli physcom seed`
- **Seed data**: loaded automatically on first DB init (SQLite, `physcom.db` or `$PHYSCOM_DB`)

## Architecture

```
src/physcom/               # Core library (no web dependency)
  models/                  # Dataclasses: Entity, Dependency, Combination, Domain, MetricBound
  db/schema.py             # DDL (all CREATE TABLE statements)
  db/repository.py         # All DB access — single Repository class, sqlite3 row_factory=Row
  engine/combinator.py     # Cartesian product of entities across dimensions
  engine/constraint_resolver.py  # Pass 1: requires/excludes/mutex/range/force checks
  engine/scorer.py         # Pass 3: log-normalize raw→0-1, weighted geometric mean composite
  engine/pipeline.py       # Orchestrator: combo-first loop, incremental saves, resume, cancel
  llm/base.py              # LLMProvider ABC (estimate_physics, review_plausibility)
  llm/providers/mock.py    # MockLLMProvider for tests
  seed/transport_example.py  # 9 platforms + 9 power sources, 2 domains

src/physcom_web/           # Flask web UI
  app.py                   # App factory, get_repo(), DB path resolution
  routes/pipeline.py       # Background thread pipeline execution, HTMX status/cancel endpoints
  routes/results.py        # Results browse, detail view, human review submission
  routes/entities.py       # Entity CRUD
  routes/domains.py        # Domain listing
  templates/               # Jinja2, extends base.html, uses HTMX for polling
  static/style.css         # Single stylesheet

tests/                     # pytest, uses seeded_repo fixture from conftest.py
```
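
As a rough illustration of what `engine/combinator.py` does, the Cartesian product across dimensions can be sketched with `itertools.product`. The function name `generate_combinations`, the dict layout, and the `human_pedalling` entity are assumptions for this sketch, not the actual API or seed data.

```python
from itertools import product

# Hypothetical stand-in data; the real seed has 9 platforms and 9 power sources.
dimensions = {
    "platform": ["bicycle", "spaceship"],
    "power_source": ["human_pedalling", "solar_sail", "battery"],
}


def generate_combinations(dimensions):
    """Yield one dict per combination, picking exactly one entity per dimension."""
    names = sorted(dimensions)  # fixed dimension order keeps output deterministic
    for choice in product(*(dimensions[name] for name in names)):
        yield dict(zip(names, choice))


combos = list(generate_combinations(dimensions))
print(len(combos))  # 2 platforms * 3 power sources = 6
```

The combination count is the product of the dimension sizes, so the transport seed (9 × 9) yields 81 combinations before constraints are applied.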

## Key patterns

- **Repository is the only DB interface.** No raw SQL outside `repository.py`.
- **Pipeline is combo-first**: each combo goes through all requested passes before the next combo starts. Progress is persisted per-combo (crash-safe, resumable).
- **`pipeline_runs` table** tracks run lifecycle: pending → running → completed/failed/cancelled. The web route creates the record, then starts a background thread with its own `sqlite3.Connection`.
- **`combination_results`** has rows for ALL combos, including blocked ones (`pass_reached=1`, `composite_score=0.0`). Scored combos have `pass_reached` of 3 or higher.
- **Status guard**: `update_combination_status` refuses to downgrade `reviewed` → `scored`.
- **`save_combination`** deduplicates: if the combination already exists, it loads the stored `status` and `block_reason` (important for resume).
- **`ensure_metric`** backfills unit if the row already exists with an empty unit.
- **MetricBound** carries `unit`, which flows through seed → ensure_metric → metrics table → get_combination_scores → template display.
- **HTMX polling**: `_run_status.html` partial polls every 2s while run is pending/running; stops polling when terminal.
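
The status guard listed above can be sketched as follows. This is a minimal illustration, not the real `update_combination_status`: the `combinations` table, its columns, and the function name `update_status_guarded` are assumptions. Only the `reviewed` → `scored` rule comes from the notes above.

```python
import sqlite3


def update_status_guarded(conn, combo_id, new_status):
    """Apply new_status unless it would downgrade reviewed -> scored.

    Returns True if the update was applied, False if it was refused.
    """
    row = conn.execute(
        "SELECT status FROM combinations WHERE id = ?", (combo_id,)
    ).fetchone()
    if row is None:
        return False
    if row["status"] == "reviewed" and new_status == "scored":
        return False  # keep the human/LLM review result on re-runs
    conn.execute(
        "UPDATE combinations SET status = ? WHERE id = ?", (new_status, combo_id)
    )
    conn.commit()
    return True


# Hypothetical schema, just enough to exercise the guard.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE combinations (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO combinations (id, status) VALUES (1, 'reviewed'), (2, 'scored')")

downgraded = update_status_guarded(conn, 1, "scored")   # refused
upgraded = update_status_guarded(conn, 2, "reviewed")   # applied
final_status = conn.execute(
    "SELECT status FROM combinations WHERE id = 1"
).fetchone()["status"]
```

A guard like this matters when a pipeline run is resumed or repeated: re-scoring a combo should not erase a review that has already been recorded.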

## Data flow (pipeline passes)

1. **Pass 1 — Constraints**: `ConstraintResolver.resolve()` → blocked/conditional/valid. Blocked combos get a result row and `continue`.
2. **Pass 2 — Estimation**: LLM or `_stub_estimate()` → raw metric values. Saved immediately via `save_raw_estimates()` (normalized_score=NULL).
3. **Pass 3 — Scoring**: `Scorer.score_combination()` → log-normalized scores + weighted geometric mean composite. Saves via `save_scores()` + `save_result()`.
4. **Pass 4 — LLM Review**: Only for above-threshold combos with an LLM provider. No real provider yet (only `MockLLMProvider`).
5. **Pass 5 — Human Review**: Manual via web UI results page.
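
For orientation, Pass 3 can be approximated as below. The exact formulas in `engine/scorer.py` are not reproduced here. The clamping behaviour, the zero handling, and the names `log_normalize` and `weighted_geometric_mean` are assumptions chosen for this sketch.

```python
import math


def log_normalize(raw, lo, hi):
    """Map raw onto 0-1 on a log scale between lo and hi (requires 0 < lo < hi).

    Values outside the bounds are clamped; this clamping is an assumption.
    """
    if raw <= lo:
        return 0.0
    if raw >= hi:
        return 1.0
    return (math.log(raw) - math.log(lo)) / (math.log(hi) - math.log(lo))


def weighted_geometric_mean(scores, weights):
    """Composite = exp(sum(w * ln(s)) / sum(w)); any zero score gives 0.0."""
    if any(score <= 0.0 for score in scores):
        return 0.0
    total_weight = sum(weights)
    log_sum = sum(w * math.log(s) for s, w in zip(scores, weights))
    return math.exp(log_sum / total_weight)


# Hypothetical bounds and raw values, for illustration only.
speed_score = log_normalize(100.0, 1.0, 10000.0)  # halfway on the log scale
floor_score = log_normalize(10.0, 10.0, 1000.0)   # at the lower bound
composite = weighted_geometric_mean([0.25, 1.0], [1.0, 1.0])
zeroed = weighted_geometric_mean([0.9, 0.0], [1.0, 1.0])
```

A geometric mean penalises a single very low metric much more than an arithmetic mean would. In this sketch one zero score drives the composite to zero, which may or may not match the real scorer.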

## Testing

- Tests use `seeded_repo` fixture (in-memory SQLite with transport seed data: 9 platforms, 9 power sources, 2 domains). There's also a bare `repo` fixture for tests that seed their own data.
- Individual entity fixtures (walking, bicycle, spaceship, solar_sail, etc.) are defined in `conftest.py`.

## Conventions

- Python 3.11+, `from __future__ import annotations` everywhere.
- Dataclasses for models, no ORM.
- Don't use `cd` in Bash commands — run from the working directory so pre-approved permission patterns match.
- Don't add docstrings/comments/type annotations to code you didn't change.
- `INSERT OR IGNORE` won't update existing rows — if adding a new column/field to seed data, also add an UPDATE for backfill.
- Jinja2 `0.0` is falsy — use `is not none` not `if value` when displaying scores that can legitimately be zero.
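
The `INSERT OR IGNORE` pitfall can be reproduced with the standard `sqlite3` module. The `metrics` schema and the helper name `ensure_metric_sketch` below are hypothetical. The real `ensure_metric` lives in `repository.py` and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: unit defaults to an empty string.
conn.execute("CREATE TABLE metrics (name TEXT PRIMARY KEY, unit TEXT NOT NULL DEFAULT '')")
# Simulate a row seeded before the unit field existed in the seed data.
conn.execute("INSERT INTO metrics (name) VALUES ('speed')")


def ensure_metric_sketch(conn, name, unit):
    """Insert the metric if missing, then backfill an empty unit on existing rows."""
    conn.execute(
        "INSERT OR IGNORE INTO metrics (name, unit) VALUES (?, ?)", (name, unit)
    )
    conn.execute(
        "UPDATE metrics SET unit = ? WHERE name = ? AND unit = ''", (unit, name)
    )
    conn.commit()


# INSERT OR IGNORE alone leaves the existing row untouched.
conn.execute("INSERT OR IGNORE INTO metrics (name, unit) VALUES ('speed', 'km/h')")
unit_after_ignore = conn.execute(
    "SELECT unit FROM metrics WHERE name = 'speed'"
).fetchone()[0]

# Adding the UPDATE backfills the missing value.
ensure_metric_sketch(conn, "speed", "km/h")
unit_after_backfill = conn.execute(
    "SELECT unit FROM metrics WHERE name = 'speed'"
).fetchone()[0]
```

The `AND unit = ''` condition keeps the backfill from overwriting a unit that was already set.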