pnxt Project Status
Last updated: 2026-04-06 (Phase 7 complete — Sprint 15)
Current State
Phase 7 (“Self-Hosting Paradigm”) is complete. All three milestones achieved: M2 (External Task Expression), M3 (LLM-Native Programming), and M4 (Self-Modification). Sprint 15 (“Verified Self-Modification + Research Frontier”) delivered a Causal Impact Analyzer, Modification Confidence Scorer, Self-Modification Orchestrator, 5 Real Self-Modification Scenarios, and Phase 7 Comprehensive Evaluation. The system can now express tasks in VPIR, generate VPIR autonomously, and modify its own pipeline with verified correctness. Total: 21 formally verified Z3 properties, 83 test suites, 1485+ tests. Advisory panel composite score: 9.5/10. See status.md for full details.
Completed Work
| Phase | Focus | Deliverables |
|---|---|---|
| Phase 1 | Core Architecture, State Separation & FFI | Foundational architecture design (external) |
| Phase 2 | Bridge Layer & Mathematical Spec | Mathematical formalization, bridge grammar spec (external) |
| Phase 3 | Deep analysis of pillars, patterns, and architecture | Six research documents covering ACI, memory, coordination, trust, comparative analysis, and reference architecture |
Phase 3 Deliverables (Complete)
- Agent-Computer Interface Specification — Protocol layers, message taxonomy, capability discovery, error handling
- Semantic Memory Architecture — Three-layer memory model (working, semantic, episodic) with lifecycle management
- Multi-Agent Coordination Patterns — Topology models, task decomposition, conflict resolution
- Trust, Safety, and Governance Framework — Graduated trust model, capability-based permissions, sandboxing
- Comparative Analysis — ANP positioned against OOP, Actor Model, Microservices, EDA, FP
- Implementation Reference Architecture — Concrete system design with deployment topologies and migration strategy
Phase 4: Infrastructure Prototype (Complete)
Priority 1: Core Infrastructure
- Project scaffolding — Initialize package.json, TypeScript config, test infrastructure, and CI pipeline
- Memory Service prototype — Three-layer memory model with pluggable `StorageBackend` interface, `InMemoryStorageBackend` for testing, and `FileStorageBackend` for persistent JSON-file storage across sessions
- ACI Gateway prototype — Structured protocol layer with graduated trust checking (5 levels, side-effect-based requirements), `TrustResolver` for agent trust lookup, append-only `AuditLogger` recording all invocations/denials, and `InMemoryAuditLogger` implementation
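The gateway behaviour described above (side-effect-based trust requirements plus an append-only audit trail) can be sketched roughly as follows. Only the names `TrustResolver`, `AuditLogger`, and `InMemoryAuditLogger` come from this document; the method signatures, the side-effect categories, and the numeric thresholds are assumptions made for illustration and may not match the actual implementation.

```typescript
// Hypothetical shapes: only the type names TrustResolver, AuditLogger and
// InMemoryAuditLogger appear in the status document.
type TrustLevel = 0 | 1 | 2 | 3 | 4; // five graduated levels
type SideEffect = "none" | "read" | "write" | "network"; // assumed categories

// Assumed mapping from a tool's side effect to the minimum trust level.
const REQUIRED_TRUST: Record<SideEffect, TrustLevel> = {
  none: 0,
  read: 1,
  write: 3,
  network: 4,
};

interface TrustResolver {
  resolve(agentId: string): TrustLevel;
}

interface AuditEntry {
  agentId: string;
  tool: string;
  outcome: "invoked" | "denied";
}

interface AuditLogger {
  record(entry: AuditEntry): void;
  entries(): readonly AuditEntry[];
}

class InMemoryAuditLogger implements AuditLogger {
  private readonly log: AuditEntry[] = [];

  record(entry: AuditEntry): void {
    this.log.push({ ...entry }); // append only: no update or delete method
  }

  entries(): readonly AuditEntry[] {
    return this.log;
  }
}

// Check trust first, then log the outcome whether or not the call is allowed.
function invoke(
  agentId: string,
  tool: string,
  sideEffect: SideEffect,
  trust: TrustResolver,
  audit: AuditLogger,
): boolean {
  const allowed = trust.resolve(agentId) >= REQUIRED_TRUST[sideEffect];
  audit.record({ agentId, tool, outcome: allowed ? "invoked" : "denied" });
  return allowed;
}
```

Recording denials as well as successful invocations keeps the log useful for security review, which is the property the list item above calls out.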
Priority 2: Agent Runtime
- Agent runtime environment — Basic agent lifecycle management (registration, execution, teardown)
- Capability negotiation — Versioned capability discovery with 3-phase handshake, semantic versioning, trust-based constraint tightening, revocation, and expiry support
- Trust engine — Graduated trust model (5 levels) with multi-dimensional trust, observable metric-based scoring (0–100), automatic calibration, per-dimension overrides, and trust reset/manual adjustment
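As a rough illustration of metric-based scoring feeding a five-level model, the sketch below maps observed outcomes to a 0–100 score and buckets it into levels. The metric names, the formula, the 20-point buckets, and the override handling are all assumptions; the document states only that scoring is observable, metric-based, and overridable per dimension.

```typescript
// Assumed metrics; the status document only says scoring is metric-based.
interface TrustMetrics {
  successes: number;
  failures: number;
  violations: number; // policy or security violations observed
}

type GraduatedLevel = 0 | 1 | 2 | 3 | 4;

// Map observable metrics to a score in [0, 100]. The formula is illustrative.
function trustScore(m: TrustMetrics): number {
  const total = m.successes + m.failures;
  if (total === 0) return 0; // no evidence yet: start untrusted
  const successRate = m.successes / total;
  const penalty = Math.min(1, m.violations * 0.25);
  return Math.round(100 * successRate * (1 - penalty));
}

// Bucket a score into one of five graduated levels (0 = lowest).
function trustLevel(score: number): GraduatedLevel {
  const clamped = Math.max(0, Math.min(100, score));
  return Math.min(4, Math.floor(clamped / 20)) as GraduatedLevel;
}

// Manual adjustment: an explicit override wins over the calibrated level.
function effectiveLevel(score: number, override?: GraduatedLevel): GraduatedLevel {
  return override ?? trustLevel(score);
}
```

In a multi-dimensional setup, one such score and optional override would be kept per dimension.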
Priority 3: Validation and Evaluation
- Empirical evaluation — Multi-agent coordination scenarios (delegation pattern, trust escalation, failure recovery) exercising full system integration (runtime + trust + ACI + capabilities + memory)
- Benchmark development — `BenchmarkSuite` framework with standardized benchmarks for agent registration, trust calibration, ACI invocation, capability negotiation, memory store/query, and agent lifecycle throughput
- Security hardening — `SecurityTestSuite` with adversarial tests across 5 categories: privilege escalation, trust manipulation, capability abuse, audit integrity, and resource exhaustion
Phase 5: Paradigm Foundation (Complete)
Following the Advisory Review Panel’s alignment assessment (3/10), Phase 5 implements the core paradigm components that distinguish pnxt from conventional agent frameworks.
Sprint 1: DPN + IFC + VPIR (Complete)
- Channel<T> and DPN primitives — Typed async FIFO channels with backpressure, Process actors, DataflowGraph composition
- IFC security labels — `SecurityLabel` type with lattice-based flow control
- VPIR node types and validator — `VPIRNode`, `VPIRGraph` types with structural validator
- Runtime integration — AgentRuntime supports channel-based inter-agent communication
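Lattice-based flow control can be illustrated with a small linear lattice. The `SecurityLabel` name comes from the list above, but its fields, the four level names, and the helper functions are hypothetical; the real label type may well carry more structure, such as integrity or ownership.

```typescript
// Hypothetical label shape: a linear confidentiality lattice.
const LEVELS = ["public", "internal", "confidential", "secret"] as const;
type Confidentiality = (typeof LEVELS)[number];

interface SecurityLabel {
  level: Confidentiality;
}

const rank = (label: SecurityLabel): number => LEVELS.indexOf(label.level);

// Information may flow only to an equal or more restrictive label.
function canFlowTo(from: SecurityLabel, to: SecurityLabel): boolean {
  return rank(from) <= rank(to);
}

// Least upper bound: the label for data derived from both inputs.
function join(a: SecurityLabel, b: SecurityLabel): SecurityLabel {
  return rank(a) >= rank(b) ? a : b;
}
```

A channel send or tool invocation would call something like `canFlowTo(dataLabel, sinkLabel)` and reject the operation when it returns `false`.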
Sprint 2: Bridge Grammar + Formal Verification (Complete)
- Bridge Grammar JSON Schema — Constrained-decoding schemas forcing LLMs to output valid VPIR nodes
- Constrained output formatters — LLM-specific schema formats (function calling, Anthropic tools, structured output)
- Z3 SMT integration — Four verified properties: capability grant consistency, trust transition monotonicity, IFC flow lattice, side-effect trust requirements
- IFC label enforcement completion — Extended IFC checking to ACI tool invocations and Channel sends
- Causal trust scoring — Difficulty-weighted trust scoring replacing fixed-weight scorer
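To make the constrained-decoding idea concrete, here is a sketch of a JSON Schema for a single node plus a tiny conformance check. The field names and the `observation` node kind are assumptions (this document mentions inference and action nodes only); the project's actual Bridge Grammar schema is not reproduced here.

```typescript
// Illustrative JSON Schema for one node; field names and node kinds are
// assumptions, not the project's published schema.
const vpirNodeSchema = {
  type: "object",
  required: ["id", "kind", "inputs"],
  additionalProperties: false,
  properties: {
    id: { type: "string" },
    kind: { enum: ["observation", "inference", "action"] },
    inputs: { type: "array", items: { type: "string" } },
  },
} as const;

// Hand-rolled check covering only the keywords used above; a real pipeline
// would pass the schema to a full JSON Schema validator.
function conforms(candidate: unknown): string[] {
  if (typeof candidate !== "object" || candidate === null || Array.isArray(candidate)) {
    return ["node must be an object"];
  }
  const node = candidate as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of vpirNodeSchema.required) {
    if (!(key in node)) errors.push(`missing field: ${key}`);
  }
  for (const key of Object.keys(node)) {
    if (!(key in vpirNodeSchema.properties)) errors.push(`unexpected field: ${key}`);
  }
  const id = node["id"];
  if (id !== undefined && typeof id !== "string") errors.push("id must be a string");
  const kind = node["kind"];
  const kinds: readonly string[] = vpirNodeSchema.properties.kind.enum;
  if (kind !== undefined && !(typeof kind === "string" && kinds.indexOf(kind) !== -1)) {
    errors.push("kind is not an allowed value");
  }
  const inputs = node["inputs"];
  if (inputs !== undefined &&
      !(Array.isArray(inputs) && inputs.every((i) => typeof i === "string"))) {
    errors.push("inputs must be an array of strings");
  }
  return errors;
}
```

With constrained decoding, the schema is given to the model provider so that invalid shapes cannot be emitted at all; a check like `conforms` is then a second line of defence.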
Sprint 3: VPIR Execution + NL Protocols + Visualization (Complete)
- VPIR Interpreter — Executes validated VPIR graphs in topological order with IFC enforcement
- Natural Language Protocol Design — Three protocol state machines: task-delegation, capability-negotiation, conflict-resolution
- VPIR Visualization (Text-Based) — Human-readable rendering of VPIR graphs and execution traces
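A protocol state machine of the kind listed above can be expressed as a transition table. The states and events below are invented for illustration; this document names the task-delegation protocol but does not spell out its transitions.

```typescript
// States and events are hypothetical; only the protocol name is from the
// status document.
type DelegationState = "idle" | "proposed" | "accepted" | "completed" | "rejected";
type DelegationEvent = "propose" | "accept" | "reject" | "complete";

// Each state lists the events it accepts and the state each one leads to.
const TRANSITIONS: Record<DelegationState, Partial<Record<DelegationEvent, DelegationState>>> = {
  idle: { propose: "proposed" },
  proposed: { accept: "accepted", reject: "rejected" },
  accepted: { complete: "completed" },
  completed: {},
  rejected: {},
};

// Apply one event, rejecting anything the current state does not allow.
function step(state: DelegationState, event: DelegationEvent): DelegationState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`illegal transition: ${event} in state ${state}`);
  }
  return next;
}

// Replay a whole conversation from the initial state.
function runProtocol(events: DelegationEvent[]): DelegationState {
  return events.reduce<DelegationState>(step, "idle");
}
```

Keeping the table as data makes it straightforward to render or verify the protocol separately from the code that drives it.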
Sprint 4: Protocol-Channel Integration + VPIR Optimizations (Complete)
- Protocol-Channel Integration — Bidirectional protocol channels with IFC enforcement
- VPIR Parallel Execution — Wave-based execution with Kahn’s algorithm and semaphore concurrency
- VPIR Result Caching — Deterministic node caching by ID + input hash
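Wave-based scheduling with Kahn's algorithm can be sketched as below. The dependency-map shape is an assumption, and the sketch covers only the grouping of nodes into waves; the semaphore-limited execution and result caching mentioned above are not shown.

```typescript
// Hypothetical graph shape: node id -> ids of the nodes it depends on.
type Deps = Record<string, string[]>;

// Group nodes into waves with Kahn's algorithm: every node in a wave has all
// of its dependencies in earlier waves, so a wave can run concurrently.
function waves(deps: Deps): string[][] {
  const ids = Object.keys(deps);
  const indegree: Record<string, number> = {};
  const dependents: Record<string, string[]> = {};
  for (const id of ids) {
    indegree[id] = deps[id].length;
    dependents[id] = [];
  }
  for (const id of ids) {
    for (const input of deps[id]) {
      if (ids.indexOf(input) === -1) throw new Error(`unknown dependency: ${input}`);
      dependents[input].push(id);
    }
  }
  const result: string[][] = [];
  let ready = ids.filter((id) => indegree[id] === 0);
  let scheduled = 0;
  while (ready.length > 0) {
    result.push(ready);
    scheduled += ready.length;
    const next: string[] = [];
    for (const id of ready) {
      for (const child of dependents[id]) {
        indegree[child] -= 1;
        if (indegree[child] === 0) next.push(child);
      }
    }
    ready = next;
  }
  // Nodes left unscheduled can only be part of a cycle.
  if (scheduled !== ids.length) throw new Error("graph contains a cycle");
  return result;
}
```

An executor could then run each wave with something like `Promise.all`, bounded by a semaphore, before moving on to the next wave.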
Sprint 5: HoTT Foundations + Knowledge Graph + End-to-End Pipeline (Complete)
- HoTT Type Foundations — `HoTTObject`, `Morphism`, `HoTTPath`, and `Category` types with categorical structure
- Tree-sitter DKB Knowledge Graph — Typed graph with 8 entity kinds, 8 relation types, traversal, and HoTT conversion
- VPIR-to-HoTT Bridge — Converts VPIR reasoning DAGs into HoTT categories
- Z3 Categorical Verification — Two new properties: morphism composition associativity, identity morphism laws. Total: 6 properties
- End-to-End Pipeline Scenarios — Three integration scenarios proving paradigm pillars work together
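The two categorical laws named above (associativity of composition and the identity laws) can be demonstrated on a free category, where a morphism is simply a path of edge labels. The `HoTTObject` and `Morphism` names are from the list above, but the fields and helper functions are assumptions. The sketch checks the laws on concrete values, which is weaker than the Z3 verification this document refers to.

```typescript
// Field names are assumptions; only the type names appear in the document.
interface HoTTObject {
  id: string;
}

// A morphism in the free category on a graph: a path of edge labels.
interface Morphism {
  source: string;
  target: string;
  path: string[];
}

// The identity morphism is the empty path on an object.
function identity(obj: HoTTObject): Morphism {
  return { source: obj.id, target: obj.id, path: [] };
}

// A generating morphism: a single labelled edge.
function edge(label: string, source: HoTTObject, target: HoTTObject): Morphism {
  return { source: source.id, target: target.id, path: [label] };
}

// compose(f, g) means "f, then g"; defined only when the endpoints match.
function compose(f: Morphism, g: Morphism): Morphism {
  if (f.target !== g.source) throw new Error("endpoints do not match");
  return { source: f.source, target: g.target, path: [...f.path, ...g.path] };
}

function equalMorphisms(a: Morphism, b: Morphism): boolean {
  return (
    a.source === b.source &&
    a.target === b.target &&
    a.path.length === b.path.length &&
    a.path.every((label, i) => label === b.path[i])
  );
}
```

Because composition is path concatenation, both laws hold by construction here; a category with extra equations between morphisms would need them tracked explicitly.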
Phase 6: Integration & Deepening (Complete — 9 Sprints)
Phase 6 focused on connecting and validating the paradigm pillars together with real-world inputs.
Sprint 6: Type Identity — Univalence Axiom + LLMbda Decision (Complete)
- Univalence Axiom Encoding — True HoTT univalence with equivalence-to-path and inverse
- Transport Along Paths — Property transfer between equivalent VPIR graphs without re-verification
- LLMbda as Semantic Foundation — VPIR nodes carry lambda calculus denotations
- Typed LLMbda Calculus ADR — Formal justification for typed over untyped lambda calculus
- Z3 Univalence Verification — Total: 15 formally verified Z3 properties
Sprint 7: Verification Maturity — User-Program Verification + Bisimulation (Complete)
- User-Program Property Verification — `ProgramVerifier` with preconditions, postconditions, invariants, assertions
- CVC5 Integration — Alternative solver via subprocess with `MultiSolverVerifier` orchestration
- DPN Bisimulation Checking — Strong bisimulation + observational equivalence via partition refinement
- Multi-Agent Delegation Benchmark — Three agents coordinating with trust boundaries and IFC enforcement
- Secure Data Pipeline Benchmark — PII redaction with IFC analysis and label propagation
- Z3 Properties — Total: 17 formally verified properties
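Strong bisimulation via partition refinement, as mentioned in the list above, can be sketched with a naive signature-based refinement loop. The labelled-transition-system shape is an assumption, the sketch assumes every target state is also a key of the map and that labels contain no `,` or `->`, and it does not cover observational (weak) equivalence.

```typescript
// Hypothetical LTS shape: state -> list of [label, next state] transitions.
type LTS = Record<string, Array<[string, string]>>;

// Naive partition refinement: start with all states in one block and split
// blocks by signature (the set of label/target-block pairs a state offers)
// until the number of blocks stops growing. States that end up in the same
// block are strongly bisimilar.
function bisimulationClasses(lts: LTS): Record<string, number> {
  const states = Object.keys(lts);
  let block: Record<string, number> = {};
  for (const s of states) block[s] = 0;
  let blockCount = 1;
  for (;;) {
    const signatureIds: Record<string, number> = {};
    const next: Record<string, number> = {};
    let nextCount = 0;
    for (const s of states) {
      const moves = lts[s].map(([label, target]) => `${label}->${block[target]}`);
      const unique = moves.filter((m, i) => moves.indexOf(m) === i).sort();
      // Prefix with the current block so refinement only ever splits blocks.
      const signature = `${block[s]}|${unique.join(",")}`;
      if (!(signature in signatureIds)) signatureIds[signature] = nextCount++;
      next[s] = signatureIds[signature];
    }
    if (nextCount === blockCount) return next; // stable partition reached
    block = next;
    blockCount = nextCount;
  }
}

function bisimilar(lts: LTS, a: string, b: string): boolean {
  const classes = bisimulationClasses(lts);
  return classes[a] === classes[b];
}
```

The classic distinguishing case is `a.(b + c)` versus `a.b + a.c`: both accept the same traces, but refinement separates their initial states because the choice between `b` and `c` is made at different points.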
Sprint 8: Neurosymbolic Bridge — P-ASP + Active Inference (Complete)
- P-ASP Integration Prototype — Probabilistic ASP for VPIR node confidence scoring
- Active Inference Engine — Free-energy minimization for iterative VPIR graph patching
- Refinement Pipeline — Combines P-ASP scoring with Active Inference in an iterative loop
Sprint 9: Categorical Frontier — Native Tokenization + Self-Hosting Vision (Complete)
- Categorical Tokenization Experiment — 42-token vocabulary with 23 morphism composition rules
- Self-Hosting Proof of Concept — pnxt describes, validates, categorizes, and executes itself as VPIR (M1)
- Paradigm Transition Roadmap — M1-M5 milestones from self-description to self-hosting
- Advisory Review Alignment Package — All 10 advisor concerns addressed. Score: 7.5 → 9.2
Phase 7: Self-Hosting Paradigm (Complete)
Phase 7 transitions pnxt from verified prototype to self-modifying, LLM-programmable system.
See docs/roadmap/paradigm-transition.md for the complete transition roadmap.
Sprint 10: Handler Library + Tool Registry (Complete)
- Standard Handler Library — 8 pre-built tool handlers (http-fetch, json-transform, file-read, file-write, string-format, math-eval, data-validate, unit-convert)
- Declarative Tool Registry — Operation-to-handler mapping with auto-registration, discovery API, and trust pre-validation
- DPN Supervisor — Supervisor actor pattern with bounded restart strategies, priority mailbox, full event log
- DPN Runtime Integration — Tool registry support in inference and action nodes; backward compatible
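A declarative registry with discovery and trust pre-validation might look roughly like the sketch below. The handler names `unit-convert` and `file-write` are taken from the list above; the `ToolRegistry` class, its methods, the trust thresholds, and the handler bodies are hypothetical stand-ins.

```typescript
// Hypothetical handler and registry shapes; only the handler names come from
// the status document.
type ToolHandler = (input: Record<string, unknown>) => unknown;

interface Registration {
  operation: string;
  requiredTrust: number; // assumed: minimum trust level, 0 to 4
  handler: ToolHandler;
}

class ToolRegistry {
  private readonly tools: Record<string, Registration> = {};

  register(reg: Registration): void {
    if (this.tools[reg.operation] !== undefined) {
      throw new Error(`duplicate operation: ${reg.operation}`);
    }
    this.tools[reg.operation] = reg;
  }

  // Discovery: list the operations available at a given trust level.
  discover(trust: number): string[] {
    return Object.keys(this.tools)
      .filter((op) => this.tools[op].requiredTrust <= trust)
      .sort();
  }

  // Trust pre-validation happens before the handler runs.
  invoke(operation: string, trust: number, input: Record<string, unknown>): unknown {
    const reg = this.tools[operation];
    if (reg === undefined) throw new Error(`unknown operation: ${operation}`);
    if (trust < reg.requiredTrust) throw new Error(`insufficient trust for ${operation}`);
    return reg.handler(input);
  }
}

const registry = new ToolRegistry();
registry.register({
  operation: "unit-convert",
  requiredTrust: 0,
  // Placeholder conversion: Celsius to Fahrenheit.
  handler: (input) => (Number(input["celsius"]) * 9) / 5 + 32,
});
registry.register({
  operation: "file-write",
  requiredTrust: 3,
  handler: () => "written", // placeholder, performs no I/O
});
```

Mapping operation names to handlers in one place is what lets a graph node refer to a tool by name and have availability checked before execution.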
Sprint 11: VPIR Authoring + External Tasks — M2 Complete (Complete)
- VPIR Graph Builder — Fluent API and `fromJSON()` for constructing validated `VPIRGraph` from pure JSON. Auto-computes roots/terminals, validates tool availability via registry
- External Task Runner — `TaskRunner` orchestrating JSON spec → build → verify → DPN execute pipeline
- Task-Aware Bridge Grammar — Enhanced LLM generation with handler documentation and validation
- External Task Benchmarks — Temperature Conversion and Math Expression end-to-end benchmarks
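The "auto-computes roots/terminals" step can be illustrated as follows. The `NodeSpec` and `BuiltGraph` shapes and the `graphFromJSON` function are hypothetical; they stand in for the builder's `fromJSON()` and the `VPIRGraph` type, whose real definitions are not shown in this document.

```typescript
// Hypothetical JSON spec shape, used only for this illustration.
interface NodeSpec {
  id: string;
  inputs: string[];
}

interface BuiltGraph {
  nodes: NodeSpec[];
  roots: string[]; // nodes with no inputs
  terminals: string[]; // nodes that no other node consumes
}

function graphFromJSON(json: string): BuiltGraph {
  const nodes = JSON.parse(json) as NodeSpec[];
  const ids = nodes.map((n) => n.id);
  const consumed: string[] = [];
  for (const n of nodes) {
    for (const input of n.inputs) {
      // Structural validation: every input must name a node in the graph.
      if (ids.indexOf(input) === -1) {
        throw new Error(`node ${n.id} references unknown input ${input}`);
      }
      consumed.push(input);
    }
  }
  return {
    nodes,
    roots: nodes.filter((n) => n.inputs.length === 0).map((n) => n.id),
    terminals: ids.filter((id) => consumed.indexOf(id) === -1),
  };
}
```

Deriving roots and terminals from the edges means an external task author only has to list nodes and their inputs, which keeps the JSON spec small.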
Sprint 12: Reliable Bridge Grammar + Error Recovery — M3 Foundation (Complete)
- Bridge Grammar Error Taxonomy — 6 error categories with repair hints and structured LLM feedback
- Auto-Repair Engine — 6 repair strategies: truncated JSON, missing fields, fuzzy enums, duplicate IDs, topology, default labels
- Confidence Scorer — 4-dimension P-ASP-inspired scoring (structural, semantic, handler coverage, topological)
- Z3 Graph Pre-Verification — 4 formal properties: acyclicity, input completeness, IFC monotonicity, handler trust
- Reliable Generation Pipeline — 7-stage orchestration: generate → diagnose → repair → re-validate → score → verify
- Error Recovery Benchmark — 7 scenarios covering all error categories
- Z3 Properties — Total: 21 formally verified properties
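One of the repair strategies above, fuzzy enum matching, can be sketched with Levenshtein edit distance. The function names, the distance threshold of 2, and the example vocabulary are assumptions; the project's actual repair engine may use a different similarity measure.

```typescript
// Classic dynamic-programming Levenshtein edit distance.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Snap an out-of-vocabulary value to the closest allowed value, but only
// when it is close enough to be a plausible typo. Returns null otherwise so
// the caller can fall back to asking the model again.
function repairEnum(value: string, allowed: string[], maxDistance = 2): string | null {
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  for (const candidate of allowed) {
    const d = editDistance(value.toLowerCase(), candidate.toLowerCase());
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return best;
}
```

Returning `null` for distant values matters: silently mapping an unrelated string onto a valid enum member would hide a genuine generation error instead of repairing a typo.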
Test Coverage
| Phase / Sprint | Test Suites (cumulative) | Tests (cumulative) | LOC (tests, cumulative) |
|---|---|---|---|
| Phase 4 | 12 | 194 | 2,736 |
| Sprint 1 | 14 | 194+ | — |
| Sprint 2 | 17 | 292 | ~3,800 |
| Sprint 3 | 20 | 355 | ~5,200 |
| Sprint 4 | 22 | ~415 | ~6,600 |
| Sprint 5 | 26 | 479 | ~8,200 |
| Sprint 6 | 30 | 557 | ~9,400 |
| Sprint 7 | 49 | 882 | ~18,100 |
| Sprint 8 | 53 | 932 | ~19,800 |
| Sprint 9 | 55 | 974 | ~21,000 |
| Sprint 10 | 58 | 1073+ | ~23,000 |
| Sprint 11 | 62 | 1128+ | ~25,000 |
| Sprint 12 | 68 | 1220+ | ~27,000 |
Future Goals
Phase 7 Remaining (M4–M5)
- M4: Self-Modification — pnxt modifies its own pipeline through VPIR
- M5: Self-Hosting — pnxt’s core components expressed in pnxt
Long-Term (Phase 8+)
- Web-based visualization frontend — Interactive node-graph renderer consuming the JSON export format
- Multi-language Tree-sitter parsers — Extend KG parsing beyond TypeScript to Python, Rust, Go
- Categorical token embeddings — Transformer fine-tuning with morphism-structured embeddings
- Distributed DPN — Multi-node actor execution for scale
- Community and ecosystem — Open specification, reference implementations, and adoption tooling
Key Decisions and Constraints
- Research-first approach: Theoretical soundness before implementation speed
- Incremental adoption: Every component designed for phased introduction
- Structural safety: Correct behavior made easy by design, not by discipline
- No legacy syntax: This is a new paradigm for LLMs, not a wrapper around existing languages
Repository Structure
    pnxt/
    ├── AGENTS.md                 # Agent development guidelines (CLAUDE.md symlinks here)
    ├── README.md                 # Project overview
    ├── QuickStart.md             # Hands-on getting started guide
    ├── status.md                 # This file — project status and roadmap
    ├── package.json              # Node.js project configuration
    ├── tsconfig.json             # TypeScript compiler configuration
    ├── jest.config.js            # Jest test configuration
    ├── eslint.config.js          # ESLint configuration
    ├── .prettierrc               # Prettier configuration
    ├── .github/workflows/
    │   ├── ci.yml                # CI pipeline (typecheck, lint, test, build)
    │   ├── deploy-website.yml    # Website deployment
    │   └── validate-website.yml
    ├── src/
    │   ├── index.ts              # Package entry point
    │   ├── types/                # Shared type definitions (18 files)
    │   ├── memory/               # Memory Service — three-layer model with IFC
    │   ├── aci/                  # ACI Gateway — trust + IFC checking, audit logging
    │   ├── agent/                # Agent Runtime — lifecycle management
    │   ├── capability/           # Capability Negotiation — 3-phase handshake
    │   ├── trust/                # Trust Engine — 5-level graduated trust, causal scoring
    │   ├── vpir/                 # VPIR — validator, interpreter, optimizer, renderer, export
    │   ├── bridge-grammar/       # Bridge Grammar — JSON Schema + Claude API integration
    │   ├── channel/              # DPN — channels, processes, DPN runtime, bisimulation
    │   ├── hott/                 # HoTT — categories, higher paths, univalence, transport
    │   ├── knowledge-graph/      # Tree-sitter DKB — typed graph + TypeScript parser
    │   ├── lambda/               # LLMbda Calculus — typed lambda with IFC, VPIR bridge
    │   ├── protocol/             # NL Protocols — state machines over DPN channels
    │   ├── verification/         # Formal Verification — Z3, noninterference, liveness, CVC5
    │   ├── benchmarks/           # Benchmarks — weather API, multi-agent delegation, pipeline
    │   ├── evaluation/           # Evaluation — integration scenarios, security tests
    │   ├── neurosymbolic/        # Neurosymbolic — P-ASP, Active Inference, refinement
    │   ├── experiments/          # Experiments — categorical tokenizer, self-hosting PoC
    │   └── errors/               # Error hierarchy
    ├── docs/
    │   ├── research/             # Research documents (original prompt, Phase 3)
    │   ├── decisions/            # Architecture Decision Records
    │   ├── reviews/              # Advisory panel reviews
    │   ├── roadmap/              # Paradigm transition roadmap (M1-M5)
    │   └── sprints/              # Sprint documentation (4-12)
    └── website/                  # Astro Starlight documentation site