# Physical Combinatorics — Implementation Plan

## Context

This project systematically generates novel concepts by computing the Cartesian product of real-world attribute dimensions (platforms, power sources, and dimensions added later), then filters the combinatorial explosion through a multi-pass viability pipeline. The goal is to surface "bizarre but plausible" innovations (like hydrogen-powered bicycles) while eliminating the vast majority of nonsensical pairings. The core insight is that attributes are real things — the risk isn't bad input, it's how much noise survives the filters.

**Stack:** Python, SQLite, abstract LLM interface, CLI-first.

---

## 1. Project Structure

```
physicalCombinatorics/
├── README.md
├── IMPLEMENTATION_PLAN.md
├── pyproject.toml                  # Package config, dependencies
├── src/
│   └── physcom/
│       ├── __init__.py
│       ├── cli.py                  # CLI entry point (argparse/click)
│       ├── db/
│       │   ├── __init__.py
│       │   ├── schema.py           # DDL, table creation, migrations
│       │   └── repository.py      # CRUD operations for all entities
│       ├── models/
│       │   ├── __init__.py
│       │   ├── entity.py           # Entity, Dependency dataclasses
│       │   ├── domain.py           # Domain, MetricBound dataclasses
│       │   └── combination.py     # Combination, Score dataclasses
│       ├── engine/
│       │   ├── __init__.py
│       │   ├── combinator.py       # Cartesian product generator
│       │   ├── constraint_resolver.py  # Dependency contradiction detection
│       │   ├── scorer.py           # Multiplicative logarithmic scoring
│       │   └── pipeline.py        # Multi-pass orchestrator
│       ├── llm/
│       │   ├── __init__.py
│       │   ├── base.py             # Abstract LLM interface
│       │   ├── prompts.py          # Prompt templates for physics/social passes
│       │   └── providers/         # Concrete implementations (future)
│       │       └── __init__.py
│       └── seed/
│           └── transport_example.py  # Seed data from the README example
├── tests/
│   ├── test_constraint_resolver.py
│   ├── test_scorer.py
│   ├── test_combinator.py
│   ├── test_pipeline.py
│   └── test_repository.py
└── data/
    └── physcom.db                  # SQLite database (gitignored)
```

---

## 2. Database Schema

### Table: `dimensions`

Defines attribute categories (e.g., "platform", "power_source"). Adding a new dimension is just inserting a row — no schema change needed.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `name` | TEXT UNIQUE NOT NULL | e.g., "platform", "power_source" |
| `description` | TEXT | Human-readable purpose |

### Table: `entities`

Individual attributes within a dimension.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `dimension_id` | INTEGER FK → dimensions.id | Which dimension this belongs to |
| `name` | TEXT NOT NULL | e.g., "Bicycle", "Solar Sail" |
| `description` | TEXT | Longer description |
| UNIQUE | (dimension_id, name) | No duplicate names within a dimension |

### Table: `dependencies`

The core metadata on every entity. Uses a flexible category/key/value/unit structure so any type of dependency can be expressed without schema changes.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `entity_id` | INTEGER FK → entities.id | Owner entity |
| `category` | TEXT NOT NULL | One of: "environment", "force", "material", "physical", "infrastructure" |
| `key` | TEXT NOT NULL | e.g., "requires_ground", "min_mass_kg", "force_output_watts" |
| `value` | TEXT NOT NULL | e.g., "true", "500", "vacuum" |
| `unit` | TEXT | Optional unit: "kg", "watts", "celsius", etc. |
| `constraint_type` | TEXT NOT NULL | One of: "requires", "provides", "range_min", "range_max", "excludes" |

**Constraint types explained:**
- `requires` — this entity needs this condition to function (walking requires ground=true)
- `provides` — this entity supplies this condition (a sealed cabin provides atmosphere=true)
- `range_min` / `range_max` — numeric bounds (nuclear reactor: min_mass_kg=2000)
- `excludes` — this entity cannot coexist with this condition (solar sail excludes atmosphere=dense)
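
As a concrete sketch, the `dependencies` DDL might look like this in stdlib `sqlite3` (the `CHECK` clause and column layout are assumptions; `schema.py` may enforce the enum in Python instead):

```python
import sqlite3

DEPENDENCIES_DDL = """
CREATE TABLE IF NOT EXISTS dependencies (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_id       INTEGER NOT NULL REFERENCES entities(id),
    category        TEXT NOT NULL,
    key             TEXT NOT NULL,
    value           TEXT NOT NULL,
    unit            TEXT,
    constraint_type TEXT NOT NULL CHECK (constraint_type IN
        ('requires', 'provides', 'range_min', 'range_max', 'excludes'))
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(DEPENDENCIES_DDL)
conn.execute(
    "INSERT INTO dependencies (entity_id, category, key, value, unit, constraint_type) "
    "VALUES (1, 'physical', 'min_mass_kg', '2000', 'kg', 'range_min')"
)
row = conn.execute("SELECT constraint_type FROM dependencies").fetchone()
print(row[0])  # range_min
```

The `CHECK` constraint makes SQLite reject rows with an unknown `constraint_type`, which keeps the enum honest even when rows are inserted outside the repository layer.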

### Table: `domains`

Context frames that define what "good" means.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `name` | TEXT UNIQUE NOT NULL | e.g., "urban_commuting", "interplanetary_travel" |
| `description` | TEXT | What this domain represents |

### Table: `metrics`

The measurable dimensions of viability.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `name` | TEXT UNIQUE NOT NULL | e.g., "speed", "cost_efficiency", "safety" |
| `unit` | TEXT | e.g., "km/h", "usd/km", "score_0_1" |
| `description` | TEXT | What this measures |

### Table: `domain_metric_weights`

Per-domain weighting and normalization bounds for each metric.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `domain_id` | INTEGER FK → domains.id | |
| `metric_id` | INTEGER FK → metrics.id | |
| `weight` | REAL NOT NULL | 0.0–1.0, weights within a domain should sum to 1.0 |
| `norm_min` | REAL | Lower bound for normalization (below this → score 0) |
| `norm_max` | REAL | Upper bound for normalization (above this → score 1) |
| UNIQUE | (domain_id, metric_id) | |

### Table: `combinations`

Each generated combination of entities.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `hash` | TEXT UNIQUE NOT NULL | Deterministic hash of sorted entity IDs (dedup) |
| `status` | TEXT NOT NULL DEFAULT 'pending' | One of: "pending", "valid", "blocked", "scored", "reviewed" |
| `block_reason` | TEXT | If blocked, why (which dependencies contradicted) |
| `created_at` | TIMESTAMP | |
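
The `hash` column can be derived like this (a sketch; the choice of SHA-256 and the join format are assumptions, any stable digest of the sorted IDs works):

```python
import hashlib

def combination_hash(entity_ids: list[int]) -> str:
    """Deterministic hash of sorted entity IDs, independent of input order."""
    canonical = "-".join(str(i) for i in sorted(entity_ids))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Same entities in any order → same hash → the UNIQUE constraint dedups them
print(combination_hash([7, 3]) == combination_hash([3, 7]))  # True
```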

### Table: `combination_entities`

Junction table linking combinations to their constituent entities.

| Column | Type | Notes |
|---|---|---|
| `combination_id` | INTEGER FK → combinations.id | |
| `entity_id` | INTEGER FK → entities.id | |
| PRIMARY KEY | (combination_id, entity_id) | |

### Table: `combination_scores`

Per-metric scores for each combination within a domain.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `combination_id` | INTEGER FK → combinations.id | |
| `domain_id` | INTEGER FK → domains.id | |
| `metric_id` | INTEGER FK → metrics.id | |
| `raw_value` | REAL | The estimated raw metric value (e.g., 40.0 km/h) |
| `normalized_score` | REAL | 0.0–1.0 after log-normalization against domain bounds |
| `estimation_method` | TEXT | "physics_calc", "llm_estimate", "human_input" |
| `confidence` | REAL | 0.0–1.0 confidence in the estimate |
| UNIQUE | (combination_id, domain_id, metric_id) | |

### Table: `combination_results`

Final composite viability scores per domain.

| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `combination_id` | INTEGER FK → combinations.id | |
| `domain_id` | INTEGER FK → domains.id | |
| `composite_score` | REAL | Weighted geometric mean of normalized metric scores |
| `novelty_flag` | TEXT | "novel", "exists", "researched", or NULL |
| `llm_review` | TEXT | LLM-generated plausibility summary |
| `human_notes` | TEXT | Human reviewer notes |
| `pass_reached` | INTEGER | Highest pass this combination survived (1–5) |
| UNIQUE | (combination_id, domain_id) | |

### Indexes

```sql
CREATE INDEX idx_deps_entity ON dependencies(entity_id);
CREATE INDEX idx_deps_category_key ON dependencies(category, key);
CREATE INDEX idx_combo_status ON combinations(status);
CREATE INDEX idx_scores_combo_domain ON combination_scores(combination_id, domain_id);
CREATE INDEX idx_results_domain_score ON combination_results(domain_id, composite_score DESC);
```

---

## 3. Entity & Dependency Data Model (Python)

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    category: str         # "environment", "force", "material", "physical", "infrastructure"
    key: str              # "requires_ground", "force_output_watts", "min_mass_kg"
    value: str            # "true", "75", "vacuum"
    unit: str | None      # "kg", "watts", etc.
    constraint_type: str  # "requires", "provides", "range_min", "range_max", "excludes"


@dataclass
class Entity:
    id: int | None
    dimension: str        # "platform", "power_source"
    name: str             # "Bicycle"
    description: str
    dependencies: list[Dependency]


@dataclass
class Combination:
    id: int | None
    entities: list[Entity]  # One per dimension
    status: str             # "pending" → "valid"/"blocked" → "scored" → "reviewed"
    block_reason: str | None
```

### Example: Entities with Full Dependencies

```python
Entity(
    name="Solar Sail",
    dimension="power_source",
    description="Propulsion via radiation pressure from a star",
    dependencies=[
        Dependency("environment", "atmosphere", "vacuum_or_thin", None, "requires"),
        Dependency("environment", "star_proximity", "true", None, "requires"),
        Dependency("physical", "surface_area", "100", "m^2", "range_min"),
        Dependency("force", "force_output_watts", "0.001", "N", "provides"),
        Dependency("force", "thrust_profile", "continuous_low", None, "provides"),
    ]
)

Entity(
    name="Walking",
    dimension="platform",
    description="Bipedal locomotion",
    dependencies=[
        Dependency("environment", "ground_surface", "true", None, "requires"),
        Dependency("environment", "gravity", "true", None, "requires"),
        Dependency("physical", "max_mass_kg", "150", "kg", "range_max"),
        Dependency("force", "force_output_watts", "75", "watts", "provides"),
        Dependency("infrastructure", "fuel_infrastructure", "none", None, "requires"),
    ]
)

Entity(
    name="Modular Nuclear Reactor",
    dimension="power_source",
    description="Small-scale fission reactor for sustained high power output",
    dependencies=[
        Dependency("physical", "min_mass_kg", "2000", "kg", "range_min"),
        Dependency("material", "radiation_shielding", "true", None, "requires"),
        Dependency("material", "coolant_system", "true", None, "requires"),
        Dependency("force", "force_output_watts", "1000000", "watts", "provides"),
        Dependency("infrastructure", "nuclear_fuel", "enriched_uranium", None, "requires"),
        Dependency("infrastructure", "regulatory_approval", "nuclear", None, "requires"),
    ]
)

Entity(
    name="Bicycle",
    dimension="platform",
    description="Two-wheeled human-scale vehicle",
    dependencies=[
        Dependency("environment", "ground_surface", "true", None, "requires"),
        Dependency("environment", "atmosphere", "standard", None, "requires"),
        Dependency("physical", "max_mass_kg", "30", "kg", "range_max"),
        Dependency("physical", "max_payload_kg", "120", "kg", "range_max"),
        Dependency("force", "force_required_watts", "50", "watts", "range_min"),
        Dependency("force", "force_required_watts", "500", "watts", "range_max"),
    ]
)
```

Note how **Bicycle + Nuclear Reactor** is caught by Rule 3: the reactor's `min_mass_kg=2000` exceeds the bicycle's `max_mass_kg=30`. Meanwhile **Bicycle + Human Pedalling** passes all checks — the force ranges overlap, the mass is compatible, and no environmental contradictions exist.

---

## 4. Constraint Resolution Engine

**File:** `src/physcom/engine/constraint_resolver.py`

The resolver takes a `Combination` (a set of entities, one per dimension) and checks all dependencies for contradictions. It returns `VALID`, `BLOCKED`, or `CONDITIONAL` with reasons.

### Contradiction Rules

**Rule 1: Requires vs. Excludes**
If entity A `requires` key=X and entity B `excludes` key=X → BLOCKED.

> Walking requires `ground_surface=true`; if a hypothetical power source excluded ground operation, the combination is impossible.

**Rule 2: Mutual Exclusion**
If entity A `requires` key=X and entity B `requires` key=Y where X and Y are mutually exclusive values of the same key → BLOCKED.

> Solar sail requires `atmosphere=vacuum_or_thin`; a ground platform requires `atmosphere=standard` → contradiction on the `atmosphere` key.

This requires a mutual exclusion registry:

```python
MUTEX_VALUES = {
    "atmosphere": [{"vacuum", "vacuum_or_thin"}, {"dense", "standard"}],
    "medium": [{"ground"}, {"water"}, {"air"}, {"space"}],
}
```
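
Given such a registry, Rule 2 reduces to a group lookup. A minimal sketch (the function name and registry shape are assumptions, not the actual resolver API):

```python
def values_conflict(registry: dict[str, list[set[str]]],
                    key: str, v1: str, v2: str) -> bool:
    """True if v1 and v2 fall into *different* mutual-exclusion groups for key."""
    groups = registry.get(key, [])
    g1 = next((g for g in groups if v1 in g), None)
    g2 = next((g for g in groups if v2 in g), None)
    # Values in the same group (or unknown values) are compatible
    return g1 is not None and g2 is not None and g1 is not g2

registry = {"atmosphere": [{"vacuum", "vacuum_or_thin"}, {"dense", "standard"}]}
print(values_conflict(registry, "atmosphere", "vacuum_or_thin", "standard"))  # True
print(values_conflict(registry, "atmosphere", "vacuum", "vacuum_or_thin"))   # False
```

Unknown keys or values fall through to "compatible", so an incomplete registry produces false negatives rather than spurious blocks.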

**Rule 3: Range Incompatibility**
If entity A has `range_min` for key=K at value V1 and entity B has `range_max` for the same key at value V2, and V1 > V2 → BLOCKED.

> Nuclear reactor `range_min` mass=2000kg vs. Bicycle `range_max` mass=30kg → the reactor cannot physically fit on the bicycle.

**Rule 4: Force Scale Mismatch**
A specialized range check on force-related keys. If a platform requires a minimum force that the power source cannot provide, or the power source outputs force orders of magnitude beyond what the platform can structurally handle → flag. This may be a soft constraint (CONDITIONAL with warning) rather than a hard BLOCKED, since some edge cases are debatable.

**Rule 5: Unmet Requirements**
If entity A `requires` key=K but no other entity in the combination `provides` that key, AND it is not an ambient environmental assumption → `CONDITIONAL`. The combination works only if the missing requirement is externally supplied.

> A hydrogen engine requires `fuel_infrastructure=hydrogen_station`. If no other entity provides it, the combination is conditionally viable — it works where hydrogen stations exist.

### Resolver Interface

```python
from dataclasses import dataclass


@dataclass
class ConstraintResult:
    status: str            # "valid", "blocked", "conditional"
    violations: list[str]  # Human-readable descriptions of hard blocks
    warnings: list[str]    # Soft constraint notes


class ConstraintResolver:
    def __init__(self, mutex_registry: dict): ...

    def resolve(self, combination: Combination) -> ConstraintResult: ...
```

---

## 5. Scoring Pipeline

**File:** `src/physcom/engine/scorer.py`

### Normalization — Logarithmic Scaling

Raw metric values are normalized to 0.0–1.0 against domain-specific bounds. The log scale reflects the expected behavior: most combinations cluster near 0 (useless) or near 1 (fully competitive) within a given domain.

```python
import math


def normalize(raw_value: float, norm_min: float, norm_max: float) -> float:
    """Log-normalize a raw value to 0-1 within domain bounds.

    Values at or below norm_min → 0.0
    Values at or above norm_max → 1.0
    Values between are log-interpolated.
    """
    if raw_value <= norm_min:
        return 0.0
    if raw_value >= norm_max:
        return 1.0
    log_min = math.log1p(norm_min)
    log_max = math.log1p(norm_max)
    log_val = math.log1p(raw_value)
    return (log_val - log_min) / (log_max - log_min)
```

### Composite Score — Weighted Geometric Mean

Scores are **multiplied**, not averaged. This is the key design decision: a single near-zero metric kills the overall viability, filtering out "technically possible but completely pointless in practice" concepts.

```python
def composite_score(scores: list[float], weights: list[float]) -> float:
    """Weighted geometric mean. Any score near 0 drives the result toward 0.

    Assumes the weights sum to 1.0:
    composite = product(score_i ^ weight_i) for all metrics i
    """
    result = 1.0
    for score, weight in zip(scores, weights):
        result *= score ** weight
    return result
```

**Properties:**
- If any `score = 0.0` → composite = 0.0 regardless of other scores
- If all scores = 1.0 → composite = 1.0
- The resulting distribution is heavily skewed toward 0 — this is intended as the primary noise filter
- A person pushing a car: speed ≈ 0 in the ground transport domain → composite ≈ 0 → eliminated
- A rocket-powered car: speed ≈ 1 in the ground transport domain, but safety ≈ 0 → composite ≈ 0 → also eliminated
- Only combinations that score reasonably across ALL metrics survive
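
The first two properties can be checked directly (restating `composite_score` so the snippet runs standalone; the weights are illustrative values summing to 1.0):

```python
def composite_score(scores: list[float], weights: list[float]) -> float:
    """Weighted geometric mean: product(score_i ** weight_i)."""
    result = 1.0
    for score, weight in zip(scores, weights):
        result *= score ** weight
    return result

weights = [0.25, 0.25, 0.25, 0.15, 0.10]
print(composite_score([0.9, 0.8, 0.0, 0.7, 0.6], weights))  # 0.0 — one dead metric kills it
print(composite_score([1.0] * 5, weights))                  # 1.0
```

A third property worth a unit test: with weights summing to 1.0, equal scores `s` across all metrics yield a composite of exactly `s` (the geometric mean identity).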

### Scorer Interface

```python
class Scorer:
    def __init__(self, domain: Domain): ...

    def score_combination(self, combination: Combination,
                          raw_metrics: dict[str, float]) -> ScoredResult: ...
```

---

## 6. Domain System

**File:** `src/physcom/models/domain.py`

Domains define the context in which combinations are evaluated. The same combination may be viable in one domain and worthless in another.

```python
from dataclasses import dataclass


@dataclass
class MetricBound:
    metric_name: str
    weight: float    # 0.0–1.0, all weights in a domain sum to 1.0
    norm_min: float  # Below this = score 0 in this domain
    norm_max: float  # Above this = score 1 in this domain


@dataclass
class Domain:
    name: str
    description: str
    metric_bounds: list[MetricBound]
```

### Example Domains

**Urban Commuting** (daily city travel, 1–50km):

| Metric | Weight | Min (score=0) | Max (score=1) |
|---|---|---|---|
| speed | 0.25 | 5 km/h | 120 km/h |
| cost_efficiency | 0.25 | $0.01/km | $2.00/km |
| safety | 0.25 | 0.0 | 1.0 |
| availability | 0.15 | 0.0 | 1.0 |
| range_fuel | 0.10 | 5 km | 500 km |

**Interplanetary Travel** (between planets in a solar system):

| Metric | Weight | Min (score=0) | Max (score=1) |
|---|---|---|---|
| speed | 0.30 | 1,000 km/s | 300,000 km/s |
| range_fuel | 0.30 | 1M km | 10B km |
| safety | 0.20 | 0.0 | 1.0 |
| cost_efficiency | 0.10 | $1K/km | $1B/km |
| range_degradation | 0.10 | 100 days | 36,500 days |

A car at 100 km/h scores ~0.94 for speed in urban commuting (log-interpolated between the 5 and 120 km/h bounds) but ~0.0 in interplanetary travel. This is by design — domain context determines relevance.
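
As a spot check, the urban speed score for a 100 km/h car under the log1p scheme (restating the Section 5 `normalize` so the snippet runs standalone):

```python
import math

def normalize(raw_value: float, norm_min: float, norm_max: float) -> float:
    # Same log1p interpolation as scorer.py's normalize()
    if raw_value <= norm_min:
        return 0.0
    if raw_value >= norm_max:
        return 1.0
    span = math.log1p(norm_max) - math.log1p(norm_min)
    return (math.log1p(raw_value) - math.log1p(norm_min)) / span

speed_score = normalize(100, 5, 120)  # car speed vs. urban commuting bounds
print(round(speed_score, 2))  # 0.94
```

Note the log compression: linear interpolation would give (100-5)/(120-5) ≈ 0.83, but the log scale rewards clearing the lower bound more steeply.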
---

## 7. Multi-Pass Pipeline

**File:** `src/physcom/engine/pipeline.py`

```
All Combinations (Cartesian product)
        │
Pass 1: Constraint Resolution (hard physics, deterministic)
        │  filter: BLOCKED removed
        ▼
Valid + Conditional Combinations
        │
Pass 2: Physics Estimation (compute or LLM-assisted)
        │  estimates raw metric values per combination
        ▼
Estimated Combinations
        │
Pass 3: Scoring & Ranking (per domain)
        │  filter: composite_score < threshold removed
        ▼
High-Scoring Shortlist
        │
Pass 4: LLM Review (social factors, novelty, plausibility)
        │  annotates with natural-language assessment
        ▼
Annotated Shortlist
        │
Pass 5: Human Review (manual, via CLI)
        │  human adds notes, approves/rejects
        ▼
Final Curated Concepts
```

### Design Notes

- **Pass 1 is deterministic and cheap.** It prunes the bulk of the combinatorial explosion using only dependency logic. No LLM calls, no estimation. This is the first and most aggressive filter.
- **Pass 2 is the most expensive.** Physics estimation (whether formula-based or LLM-assisted) runs only on combinations that survived Pass 1. For the initial transport example (81 combinations, many blocked), this might mean ~20–40 surviving combinations need estimation.
- **Pass 3 applies the multiplicative filter.** The logarithmic scoring distribution means most estimated combinations still score near 0. Only a handful survive the threshold.
- **Passes 4–5 are human-scale.** By the time concepts reach LLM review and human review, the list should be small enough for thoughtful individual assessment.
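
The combinator feeding Pass 1 is essentially `itertools.product` over per-dimension entity lists; a minimal sketch (entity names are illustrative, not the full seed set):

```python
from itertools import product

dimensions = {
    "platform": ["Bicycle", "Walking", "Car"],
    "power_source": ["Human Pedalling", "Hydrogen Engine", "Solar Sail"],
}

# One dict per combination, keyed by dimension name
combinations = [dict(zip(dimensions, combo)) for combo in product(*dimensions.values())]
print(len(combinations))  # 9
print(combinations[0])    # {'platform': 'Bicycle', 'power_source': 'Human Pedalling'}
```

Adding a dimension is just another key in the dict; the product grows multiplicatively, which is why Pass 1 must be cheap.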

### Pipeline Interface

```python
class Pipeline:
    def __init__(self, db: Repository, resolver: ConstraintResolver,
                 scorer: Scorer, llm: LLMProvider | None): ...

    def run(self, domain: Domain, dimensions: list[str],
            score_threshold: float = 0.1,
            passes: tuple[int, ...] = (1, 2, 3, 4, 5)) -> PipelineResult: ...
```

The `passes` parameter allows partial runs (e.g., `(1, 2, 3)` skips LLM and human review). A tuple default avoids Python's shared-mutable-default pitfall.

---

## 8. LLM Interface (Abstract)

**File:** `src/physcom/llm/base.py`

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Provider-agnostic LLM interface."""

    @abstractmethod
    def estimate_physics(self, combination_description: str,
                         metrics: list[str]) -> dict[str, float]:
        """Estimate raw metric values for a combination using
        order-of-magnitude physics reasoning.

        Returns {metric_name: estimated_value}."""
        ...

    @abstractmethod
    def review_plausibility(self, combination_description: str,
                            scores: dict[str, float]) -> str:
        """Return a natural-language assessment of plausibility,
        novelty, social viability, and practical barriers."""
        ...
```

**File:** `src/physcom/llm/prompts.py`

Structured prompt templates for:
- `PHYSICS_ESTIMATION_PROMPT` — asks for order-of-magnitude metric estimates with reasoning
- `PLAUSIBILITY_REVIEW_PROMPT` — asks for social viability, barriers, novelty, and prior art

A `MockLLMProvider` is included for deterministic testing. Concrete providers (Anthropic, OpenAI, local) are implemented in `src/physcom/llm/providers/` and registered by name.
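
A deterministic mock might look like this (a sketch: the canned-value scheme is an assumption, and the ABC base class is omitted so the snippet runs standalone):

```python
class MockLLMProvider:
    """Deterministic stand-in for testing: canned values, no network calls."""

    def estimate_physics(self, combination_description: str,
                         metrics: list[str]) -> dict[str, float]:
        # Derive a stable pseudo-estimate per metric from the description length,
        # so repeated runs over the same seed data are reproducible
        base = float(len(combination_description) % 10 + 1)
        return {m: base * (i + 1) for i, m in enumerate(metrics)}

    def review_plausibility(self, combination_description: str,
                            scores: dict[str, float]) -> str:
        return f"[mock review] {combination_description}: {len(scores)} metrics scored"

mock = MockLLMProvider()
est = mock.estimate_physics("Bicycle + Hydrogen Engine", ["speed", "safety"])
print(est)  # {'speed': 6.0, 'safety': 12.0}
```

What matters for the tests is the contract: correct types, one value per requested metric, same input → same output. The numbers themselves are meaningless.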

---

## 9. CLI Interface

**File:** `src/physcom/cli.py`

```
physcom init                          # Create database, initialize schema
physcom seed <seed_name>              # Load seed data (e.g., "transport")
physcom entity add <dim> <name>       # Add an entity interactively
physcom entity list [--dimension X]   # List entities with dependencies
physcom domain add <name>             # Add a domain interactively
physcom domain list                   # List domains with metric weights
physcom run <domain> [--passes 1,2,3] [--threshold 0.1]
physcom results <domain> [--top N]    # View ranked results
physcom review <combination_id>       # Interactive human review
physcom export <domain> --format md   # Export to markdown report
```
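
A minimal `argparse` skeleton for a few of these commands (a sketch; the plan leaves argparse vs. click open, and only three subcommands are shown):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="physcom")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("init", help="Create database, initialize schema")

    seed = sub.add_parser("seed", help="Load seed data")
    seed.add_argument("seed_name")

    run = sub.add_parser("run", help="Run the pipeline for a domain")
    run.add_argument("domain")
    run.add_argument("--passes", default="1,2,3,4,5")
    run.add_argument("--threshold", type=float, default=0.1)
    return parser

args = build_parser().parse_args(["run", "urban_commuting", "--passes", "1,2,3"])
print(args.command, args.domain, args.passes)  # run urban_commuting 1,2,3
```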

---

## 10. Implementation Phases

### Phase A: Foundation (database + models)
1. Set up `pyproject.toml` with dependencies (`click` for the CLI; `sqlite3` ships with the standard library)
2. Implement `db/schema.py` — all table DDL, `init_db()` function, indexes
3. Implement `models/` — all dataclasses (`Entity`, `Dependency`, `Domain`, `Combination`, etc.)
4. Implement `db/repository.py` — CRUD for entities, dependencies, domains, metrics
5. Implement `seed/transport_example.py` — the README example with full dependency metadata on all 18 entities
6. Implement `cli.py` — `init` and `seed` commands
7. **Verify:** Create the DB, load the seed, query entities, confirm all dependency data persists correctly

### Phase B: Constraint Engine
1. Implement `engine/combinator.py` — Cartesian product generator across N dimensions
2. Implement the mutual exclusion registry (dict initially, DB table later)
3. Implement `engine/constraint_resolver.py` — all five contradiction rules
4. Wire Pass 1 into the pipeline and CLI
5. **Verify:** Solar sail + walking → BLOCKED (atmosphere contradiction). Hydrogen engine + bicycle → CONDITIONAL (fuel infrastructure unmet, per Rule 5). Nuclear reactor + bicycle → BLOCKED (mass range). Confirm expected blocked/valid counts for the full 81-combination transport example.

### Phase C: Scoring
1. Implement `engine/scorer.py` — log normalization + weighted geometric mean
2. Implement domain metric weights in the database
3. Wire Pass 2 with stub physics estimates (hardcoded for the transport seed)
4. Wire Pass 3 into the pipeline
5. **Verify:** Score distribution is heavily logarithmic. A single zero-metric kills composite score. Domain bounds shift rankings (same combination scores differently across domains).

### Phase D: LLM Integration
1. Implement `llm/base.py` — abstract interface
2. Implement `llm/prompts.py` — prompt templates
3. Implement `MockLLMProvider` for testing
4. Wire Pass 2 to fall back to LLM when no hardcoded estimate exists
5. Wire Pass 4 (plausibility review)
6. **Verify:** Mock provider returns structured data. Pipeline processes LLM output correctly. Full pipeline runs end-to-end with mock.

### Phase E: Human Review & Output
1. Implement Pass 5 — interactive CLI review workflow
2. Implement `export` command — markdown report generation
3. Implement `results` and `review` CLI commands
4. **Verify:** Full pipeline from seed data through all 5 passes to final curated output.

### Phase F: Extension & Refinement
1. Add new attribute dimensions (terrain, passenger count, cargo type)
2. Add new evaluation domains (military logistics, recreational, emergency services)
3. Refine dependency metadata based on pipeline output analysis
4. Implement a concrete LLM provider
5. Optional: Streamlit dashboard for interactive exploration

---

## 11. Testing Strategy

| Test Type | Scope | Purpose |
|---|---|---|
| **Unit** | Each engine module | Constraint rules, scoring math, combinator logic |
| **Integration** | Full pipeline with seed + mock LLM | End-to-end data flow |
| **Property-based** | Scorer | Multiplicative zero-kill, score bounds [0,1], geometric mean identity |
| **Snapshot** | Transport example | Known combinations produce known constraint/score results |
| **Interface contract** | LLM providers | Structured output schema compliance (not output quality) |

LLM output quality is inherently non-deterministic and is NOT tested. Only the interface contract (correct types, valid JSON, expected keys) is validated.
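
The scorer's property tests can be approximated without an external property-testing library; a seeded random-sampling sketch of the bounds and zero-kill properties:

```python
import random

def composite_score(scores: list[float], weights: list[float]) -> float:
    """Weighted geometric mean: product(score_i ** weight_i)."""
    result = 1.0
    for s, w in zip(scores, weights):
        result *= s ** w
    return result

rng = random.Random(42)  # fixed seed for reproducible test runs
for _ in range(1000):
    n = rng.randint(1, 6)
    scores = [rng.random() for _ in range(n)]
    weights = [1.0 / n] * n  # equal weights summing to 1.0
    c = composite_score(scores, weights)
    assert 0.0 <= c <= 1.0                                       # bounded in [0, 1]
    assert composite_score(scores[:-1] + [0.0], weights) == 0.0  # zero-kill
print("1000 sampled cases passed")
```

A `hypothesis`-based version would generalize the same assertions over generated inputs; the random loop above is just the dependency-free equivalent.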

---

## 12. Verification Checkpoints

After each phase, run:

1. `python -m pytest tests/` — all tests green
2. `physcom init && physcom seed transport` — seed loads without error
3. `physcom run urban_commuting --passes 1` — constraint pass produces the expected blocked/valid split
4. `physcom run urban_commuting --passes 1,2,3 --threshold 0.05` — scoring produces a ranked shortlist
5. `physcom results urban_commuting --top 10` — top concepts are plausible, not nonsense
6. **Manual smell test:** Do obviously-absurd combinations survive? If so, dependency metadata or constraint rules need refinement — that's where the real signal-to-noise quality lives.