pnxt Project Status

Last updated: 2026-04-06 (Phase 7 complete — Sprint 15)

Current State

Phase 7 (“Self-Hosting Paradigm”) is complete. All three milestones achieved: M2 (External Task Expression), M3 (LLM-Native Programming), and M4 (Self-Modification). Sprint 15 (“Verified Self-Modification + Research Frontier”) delivered a Causal Impact Analyzer, Modification Confidence Scorer, Self-Modification Orchestrator, 5 Real Self-Modification Scenarios, and Phase 7 Comprehensive Evaluation. The system can now express tasks in VPIR, generate VPIR autonomously, and modify its own pipeline with verified correctness. Total: 21 formally verified Z3 properties, 83 test suites, 1485+ tests. Advisory panel composite score: 9.5/10. See status.md for full details.

Completed Work

Phase	Focus	Deliverables
Phase 1	Core Architecture, State Separation & FFI	Foundational architecture design (external)
Phase 2	Bridge Layer & Mathematical Spec	Mathematical formalization, bridge grammar spec (external)
Phase 3	Deep analysis of pillars, patterns, and architecture	Six research documents covering ACI, memory, coordination, trust, comparative analysis, and reference architecture

Phase 3 Deliverables (Complete)

Agent-Computer Interface Specification — Protocol layers, message taxonomy, capability discovery, error handling
Semantic Memory Architecture — Three-layer memory model (working, semantic, episodic) with lifecycle management
Multi-Agent Coordination Patterns — Topology models, task decomposition, conflict resolution
Trust, Safety, and Governance Framework — Graduated trust model, capability-based permissions, sandboxing
Comparative Analysis — ANP positioned against OOP, Actor Model, Microservices, EDA, FP
Implementation Reference Architecture — Concrete system design with deployment topologies and migration strategy

Phase 4: Infrastructure Prototype (Complete)

Priority 1: Core Infrastructure

Project scaffolding — Initialize package.json, TypeScript config, test infrastructure, and CI pipeline
Memory Service prototype — Three-layer memory model with pluggable StorageBackend interface, InMemoryStorageBackend for testing, and FileStorageBackend for persistent JSON-file storage across sessions
ACI Gateway prototype — Structured protocol layer with graduated trust checking (5 levels, side-effect-based requirements), TrustResolver for agent trust lookup, append-only AuditLogger recording all invocations/denials, and InMemoryAuditLogger implementation
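
The pluggable storage idea behind the Memory Service can be sketched as follows. This is a minimal illustration, not the project's actual API: StorageBackend and InMemoryStorageBackend are named in the list above, but the method names, the synchronous signatures, the LayeredMemory wrapper, and the "layer:id" key scheme are all assumptions made for this example.

```typescript
// Hypothetical backend contract: method names and synchronous signatures are
// assumptions for this sketch, not the project's real interface.
interface StorageBackend<V> {
  get(key: string): V | undefined;
  set(key: string, value: V): void;
  delete(key: string): boolean;
  keys(): string[];
}

// Map-backed implementation, suitable for tests.
class InMemoryStorageBackend<V> implements StorageBackend<V> {
  private readonly entries = new Map<string, V>();
  get(key: string): V | undefined { return this.entries.get(key); }
  set(key: string, value: V): void { this.entries.set(key, value); }
  delete(key: string): boolean { return this.entries.delete(key); }
  keys(): string[] { return Array.from(this.entries.keys()); }
}

type MemoryLayer = "working" | "semantic" | "episodic";

// Hypothetical wrapper that namespaces keys by memory layer ("layer:id").
class LayeredMemory<V> {
  constructor(private readonly backend: StorageBackend<V>) {}

  private key(layer: MemoryLayer, id: string): string {
    return `${layer}:${id}`;
  }

  store(layer: MemoryLayer, id: string, value: V): void {
    this.backend.set(this.key(layer, id), value);
  }

  recall(layer: MemoryLayer, id: string): V | undefined {
    return this.backend.get(this.key(layer, id));
  }

  forget(layer: MemoryLayer, id: string): boolean {
    return this.backend.delete(this.key(layer, id));
  }

  // Lists the ids stored in one layer, without the layer prefix.
  list(layer: MemoryLayer): string[] {
    const prefix = `${layer}:`;
    return this.backend
      .keys()
      .filter((k) => k.startsWith(prefix))
      .map((k) => k.slice(prefix.length));
  }
}
```

Because LayeredMemory depends only on the interface, a file-backed implementation could be swapped in without touching callers, which is the property the pluggable design is meant to provide.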

Priority 2: Agent Runtime

Agent runtime environment — Basic agent lifecycle management (registration, execution, teardown)
Capability negotiation — Versioned capability discovery with 3-phase handshake, semantic versioning, trust-based constraint tightening, revocation, and expiry support
Trust engine — Graduated trust model (5 levels) with multi-dimensional trust, observable metric-based scoring (0–100), automatic calibration, per-dimension overrides, and trust reset/manual adjustment
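
A rough sketch of how a 0–100 trust score might map onto five levels and gate an invocation by its side effects. The even 20-point bands, the side-effect names, and the required-level table are illustrative assumptions; only the five-level model, the 0–100 score range, and the append-only audit log come from the descriptions above.

```typescript
// Five graduated levels, 0 (lowest) to 4 (highest).
type TrustLevel = 0 | 1 | 2 | 3 | 4;

// Hypothetical side-effect categories and the level each one requires.
type SideEffect = "none" | "read" | "write" | "network" | "process";
const REQUIRED_LEVEL: Record<SideEffect, TrustLevel> = {
  none: 0,
  read: 1,
  write: 2,
  network: 3,
  process: 4,
};

// Assumed banding: 0-19 -> 0, 20-39 -> 1, 40-59 -> 2, 60-79 -> 3, 80-100 -> 4.
function levelFromScore(score: number): TrustLevel {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score out of range: ${score}`);
  }
  return Math.min(4, Math.floor(score / 20)) as TrustLevel;
}

interface AuditEntry {
  agentId: string;
  tool: string;
  allowed: boolean;
}

// Append-only log: entries can be added and read, never edited or removed.
class InMemoryAuditLogger {
  private readonly entries: AuditEntry[] = [];
  record(entry: AuditEntry): void {
    this.entries.push(Object.freeze({ ...entry }));
  }
  all(): readonly AuditEntry[] {
    return this.entries.slice();
  }
}

// Allows the call only if the agent's level covers the strictest side effect,
// and records the decision either way.
function checkInvocation(
  agentId: string,
  score: number,
  tool: string,
  effects: readonly SideEffect[],
  log: InMemoryAuditLogger
): boolean {
  const level = levelFromScore(score);
  const required = effects.reduce<TrustLevel>(
    (max, e) => (REQUIRED_LEVEL[e] > max ? REQUIRED_LEVEL[e] : max),
    0
  );
  const allowed = level >= required;
  log.record({ agentId, tool, allowed });
  return allowed;
}
```

Under these assumed thresholds, an agent scoring 30 sits at level 1 and would be denied a tool that writes files, while the denial itself still lands in the audit log.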

Priority 3: Validation and Evaluation

Empirical evaluation — Multi-agent coordination scenarios (delegation pattern, trust escalation, failure recovery) exercising full system integration (runtime + trust + ACI + capabilities + memory)
Benchmark development — BenchmarkSuite framework with standardized benchmarks for agent registration, trust calibration, ACI invocation, capability negotiation, memory store/query, and agent lifecycle throughput
Security hardening — SecurityTestSuite with adversarial tests across 5 categories: privilege escalation, trust manipulation, capability abuse, audit integrity, and resource exhaustion

Phase 5: Paradigm Foundation (Complete)

Following the Advisory Review Panel’s alignment assessment (3/10), Phase 5 implements the core paradigm components that distinguish pnxt from conventional agent frameworks.

Sprint 1: DPN + IFC + VPIR (Complete)

Channel<T> and DPN primitives — Typed async FIFO channels with backpressure, Process actors, DataflowGraph composition
IFC security labels — SecurityLabel type with lattice-based flow control
VPIR node types and validator — VPIRNode, VPIRGraph types with structural validator
Runtime integration — AgentRuntime supports channel-based inter-agent communication
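
To make "typed async FIFO channels with backpressure" concrete, here is a minimal bounded channel. It is a sketch under stated assumptions: the class shape, the capacity parameter, and the size/blockedSenders accessors are hypothetical and not taken from the pnxt source.

```typescript
// Bounded FIFO channel: send() resolves immediately while there is room and
// otherwise stays pending until a receiver frees a slot (backpressure).
class Channel<T> {
  private readonly buffer: T[] = [];
  private readonly blockedSends: Array<{ value: T; resume: () => void }> = [];
  private readonly waitingReceives: Array<(value: T) => void> = [];

  constructor(private readonly capacity: number) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
  }

  get size(): number {
    return this.buffer.length;
  }

  get blockedSenders(): number {
    return this.blockedSends.length;
  }

  send(value: T): Promise<void> {
    // A receiver is already waiting: hand the value over directly.
    const receiver = this.waitingReceives.shift();
    if (receiver) {
      receiver(value);
      return Promise.resolve();
    }
    if (this.buffer.length < this.capacity) {
      this.buffer.push(value);
      return Promise.resolve();
    }
    // Buffer full: park the sender until a slot opens.
    return new Promise<void>((resume) => {
      this.blockedSends.push({ value, resume });
    });
  }

  receive(): Promise<T> {
    if (this.buffer.length === 0) {
      return new Promise<T>((resolve) => {
        this.waitingReceives.push(resolve);
      });
    }
    const value = this.buffer.shift() as T;
    // A slot just opened: admit the oldest blocked sender, preserving FIFO order.
    const sender = this.blockedSends.shift();
    if (sender) {
      this.buffer.push(sender.value);
      sender.resume();
    }
    return Promise.resolve(value);
  }
}
```

A producer that writes `await ch.send(x)` in a loop slows to the consumer's pace once the buffer fills, which is the behaviour backpressure is meant to give.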

Sprint 2: Bridge Grammar + Formal Verification (Complete)

Bridge Grammar JSON Schema — Constrained-decoding schemas forcing LLMs to output valid VPIR nodes
Constrained output formatters — LLM-specific schema formats (function calling, Anthropic tools, structured output)
Z3 SMT integration — Four verified properties: capability grant consistency, trust transition monotonicity, IFC flow lattice, side-effect trust requirements
IFC label enforcement completion — Extended IFC checking to ACI tool invocations and Channel sends
Causal trust scoring — Difficulty-weighted trust scoring replacing fixed-weight scorer
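
The lattice-based flow check, and its extension to sends, can be illustrated with a small example. The four-point linear ordering, the field name `level`, and the LabeledSink class are assumptions chosen for brevity; the real SecurityLabel type may well be richer.

```typescript
// Assumed linear lattice, ordered from least to most restrictive.
const IFC_LEVELS = ["public", "internal", "confidential", "secret"] as const;
type IfcLevel = (typeof IFC_LEVELS)[number];

interface SecurityLabel {
  readonly level: IfcLevel;
}

// Information may flow only upward: to an equally or more restrictive label.
function canFlowTo(source: SecurityLabel, sink: SecurityLabel): boolean {
  return IFC_LEVELS.indexOf(source.level) <= IFC_LEVELS.indexOf(sink.level);
}

// Least upper bound: the label for data derived from both inputs.
function joinLabels(a: SecurityLabel, b: SecurityLabel): SecurityLabel {
  return canFlowTo(a, b) ? b : a;
}

// Hypothetical labeled sink standing in for a channel or tool invocation.
class LabeledSink<T> {
  readonly received: T[] = [];
  constructor(readonly label: SecurityLabel) {}

  send(value: T, valueLabel: SecurityLabel): void {
    if (!canFlowTo(valueLabel, this.label)) {
      throw new Error(`IFC violation: ${valueLabel.level} -> ${this.label.level}`);
    }
    this.received.push(value);
  }
}
```

With this ordering, public data can be sent to a secret sink, while a secret value sent to a public sink is rejected before it is delivered.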

Sprint 3: VPIR Execution + NL Protocols + Visualization (Complete)

VPIR Interpreter — Executes validated VPIR graphs in topological order with IFC enforcement
Natural Language Protocol Design — Three protocol state machines: task-delegation, capability-negotiation, conflict-resolution
VPIR Visualization (Text-Based) — Human-readable rendering of VPIR graphs and execution traces

Sprint 4: Protocol-Channel Integration + VPIR Optimizations (Complete)

Protocol-Channel Integration — Bidirectional protocol channels with IFC enforcement
VPIR Parallel Execution — Wave-based execution with Kahn’s algorithm and semaphore concurrency
VPIR Result Caching — Deterministic node caching by ID + input hash
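
Wave-based scheduling with Kahn's algorithm can be sketched as below: every node whose dependencies are satisfied joins the current wave, and all nodes in a wave could run concurrently. The function name and the edge-list input format are assumptions for this example, and the semaphore that bounds concurrency within a wave is omitted.

```typescript
type Edge = readonly [string, string]; // [from, to]: "to" depends on "from"

// Groups DAG nodes into waves using Kahn's algorithm. Throws on cycles.
function executionWaves(nodes: readonly string[], edges: readonly Edge[]): string[][] {
  const indegree = new Map<string, number>();
  const successors = new Map<string, string[]>();
  for (const n of nodes) {
    indegree.set(n, 0);
    successors.set(n, []);
  }
  for (const [from, to] of edges) {
    if (!indegree.has(from) || !indegree.has(to)) {
      throw new Error(`edge references unknown node: ${from} -> ${to}`);
    }
    successors.get(from)!.push(to);
    indegree.set(to, indegree.get(to)! + 1);
  }

  const waves: string[][] = [];
  let ready = nodes.filter((n) => indegree.get(n) === 0);
  let visited = 0;
  while (ready.length > 0) {
    waves.push(ready);
    visited += ready.length;
    const next: string[] = [];
    for (const n of ready) {
      for (const s of successors.get(n)!) {
        const remaining = indegree.get(s)! - 1;
        indegree.set(s, remaining);
        if (remaining === 0) next.push(s); // all dependencies satisfied
      }
    }
    ready = next;
  }
  // Any unvisited node sits on a cycle and can never become ready.
  if (visited !== nodes.length) throw new Error("cycle detected");
  return waves;
}
```

For a graph where `a` and `b` both feed `c`, and `c` feeds `d`, the function yields three waves: `a` and `b` together, then `c`, then `d`.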

Sprint 5: HoTT Foundations + Knowledge Graph + End-to-End Pipeline (Complete)

HoTT Type Foundations — HoTTObject, Morphism, HoTTPath, and Category types with categorical structure
Tree-sitter DKB Knowledge Graph — Typed graph with 8 entity kinds, 8 relation types, traversal, and HoTT conversion
VPIR-to-HoTT Bridge — Converts VPIR reasoning DAGs into HoTT categories
Z3 Categorical Verification — Two new properties: morphism composition associativity, identity morphism laws. Total: 6 properties
End-to-End Pipeline Scenarios — Three integration scenarios proving paradigm pillars work together
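
The two categorical laws named above (associativity of composition and the identity laws) can be illustrated on a free category, where a morphism is simply the path of primitive arrows it traverses. This is an illustrative model only: the project verifies these properties with Z3, and the types below are hypothetical and unrelated to its HoTTObject or Morphism definitions.

```typescript
// A morphism in a free category: a typed path of primitive arrow names.
interface PathMorphism {
  readonly source: string;
  readonly target: string;
  readonly path: readonly string[];
}

// The identity on an object is the empty path.
function identityOn(object: string): PathMorphism {
  return { source: object, target: object, path: [] };
}

function arrow(name: string, source: string, target: string): PathMorphism {
  return { source, target, path: [name] };
}

// compose(g, f) is "g after f"; defined only when f ends where g starts.
function compose(g: PathMorphism, f: PathMorphism): PathMorphism {
  if (f.target !== g.source) {
    throw new Error(`cannot compose: ${f.target} does not match ${g.source}`);
  }
  return { source: f.source, target: g.target, path: [...f.path, ...g.path] };
}

function sameMorphism(a: PathMorphism, b: PathMorphism): boolean {
  return (
    a.source === b.source &&
    a.target === b.target &&
    a.path.length === b.path.length &&
    a.path.every((step, i) => step === b.path[i])
  );
}
```

Because composition is list concatenation, `compose(h, compose(g, f))` and `compose(compose(h, g), f)` produce the same path, and composing with an empty-path identity leaves a morphism unchanged.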

Phase 6: Integration & Deepening (Complete — 9 Sprints)

Phase 6 focused on connecting and validating the paradigm pillars together with real-world inputs.

Sprint 6: Type Identity — Univalence Axiom + LLMbda Decision (Complete)

Univalence Axiom Encoding — True HoTT univalence with equivalence-to-path and inverse
Transport Along Paths — Property transfer between equivalent VPIR graphs without re-verification
LLMbda as Semantic Foundation — VPIR nodes carry lambda calculus denotations
Typed LLMbda Calculus ADR — Formal justification for typed over untyped lambda calculus
Z3 Univalence Verification — Total: 15 formally verified Z3 properties

Sprint 7: Verification Maturity — User-Program Verification + Bisimulation (Complete)

User-Program Property Verification — ProgramVerifier with preconditions, postconditions, invariants, assertions
CVC5 Integration — Alternative solver via subprocess with MultiSolverVerifier orchestration
DPN Bisimulation Checking — Strong bisimulation + observational equivalence via partition refinement
Multi-Agent Delegation Benchmark — Three agents coordinating with trust boundaries and IFC enforcement
Secure Data Pipeline Benchmark — PII redaction with IFC analysis and label propagation
Z3 Properties — Total: 17 formally verified properties
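
Strong bisimulation by partition refinement can be sketched with a naive fixed-point loop over a labeled transition system. The function names and the triple-based transition format are assumptions; this version favours clarity over the efficiency of algorithms such as Paige-Tarjan, and it does not cover observational equivalence.

```typescript
type Transition = readonly [string, string, string]; // [from, action label, to]

// Returns a map from state to equivalence-class id. Two states are strongly
// bisimilar exactly when they end up in the same class.
function bisimulationClasses(
  states: readonly string[],
  transitions: readonly Transition[]
): Map<string, number> {
  let block = new Map<string, number>();
  for (const s of states) block.set(s, 0); // start with a single block
  for (const [from, , to] of transitions) {
    if (!block.has(from) || !block.has(to)) {
      throw new Error(`transition uses unknown state: ${from} -> ${to}`);
    }
  }

  let blockCount = states.length > 0 ? 1 : 0;
  for (;;) {
    const signatureIds = new Map<string, number>();
    const refined = new Map<string, number>();
    for (const s of states) {
      // Signature: current block plus the set of (label, target block) moves.
      const moves = transitions
        .filter(([from]) => from === s)
        .map(([, label, to]) => JSON.stringify([label, block.get(to)]));
      const signature = `${block.get(s)}|${Array.from(new Set(moves)).sort().join(";")}`;
      let id = signatureIds.get(signature);
      if (id === undefined) {
        id = signatureIds.size;
        signatureIds.set(signature, id);
      }
      refined.set(s, id);
    }
    // Blocks only ever split, so an unchanged count means the partition is stable.
    if (signatureIds.size === blockCount) return refined;
    blockCount = signatureIds.size;
    block = refined;
  }
}

function bisimilar(classes: Map<string, number>, a: string, b: string): boolean {
  const classOfA = classes.get(a);
  return classOfA !== undefined && classOfA === classes.get(b);
}
```

Two processes that each perform `a` then `b` land in the same class, while a process that performs `a` and then deadlocks is separated from them in the second refinement round.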

Sprint 8: Neurosymbolic Bridge — P-ASP + Active Inference (Complete)

P-ASP Integration Prototype — Probabilistic ASP for VPIR node confidence scoring
Active Inference Engine — Free-energy minimization for iterative VPIR graph patching
Refinement Pipeline — Combines P-ASP scoring with Active Inference in an iterative loop

Sprint 9: Categorical Frontier — Native Tokenization + Self-Hosting Vision (Complete)

Categorical Tokenization Experiment — 42-token vocabulary with 23 morphism composition rules
Self-Hosting Proof of Concept — pnxt describes, validates, categorizes, and executes itself as VPIR (M1)
Paradigm Transition Roadmap — M1-M5 milestones from self-description to self-hosting
Advisory Review Alignment Package — All 10 advisor concerns addressed. Score: 7.5 → 9.2

Phase 7: Self-Hosting Paradigm (Complete)


Phase 7 transitions pnxt from verified prototype to self-modifying, LLM-programmable system.

See docs/roadmap/paradigm-transition.md for the complete transition roadmap.

Sprint 10: Handler Library + Tool Registry (Complete)

Standard Handler Library — 8 pre-built tool handlers (http-fetch, json-transform, file-read, file-write, string-format, math-eval, data-validate, unit-convert)
Declarative Tool Registry — Operation-to-handler mapping with auto-registration, discovery API, and trust pre-validation
DPN Supervisor — Supervisor actor pattern with bounded restart strategies, priority mailbox, full event log
DPN Runtime Integration — Tool registry support in inference and action nodes; backward compatible
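
An operation-to-handler registry with discovery and trust pre-validation might look roughly like this. The handler names `unit-convert` and `file-write` come from the Standard Handler Library list above; the ToolRegistry API, the numeric trust levels, and the handler bodies are assumptions for illustration.

```typescript
type ToolInput = Record<string, unknown>;
type ToolHandler = (input: ToolInput) => unknown;

interface ToolRegistration {
  operation: string;
  handler: ToolHandler;
  requiredTrust: number; // assumed scale: 0 (lowest) to 4 (highest)
}

class ToolRegistry {
  private readonly tools = new Map<string, ToolRegistration>();

  register(registration: ToolRegistration): void {
    if (this.tools.has(registration.operation)) {
      throw new Error(`duplicate operation: ${registration.operation}`);
    }
    this.tools.set(registration.operation, registration);
  }

  // Discovery: the sorted list of registered operation names.
  discover(): string[] {
    return Array.from(this.tools.keys()).sort();
  }

  // Trust pre-validation: refuse to hand out a handler the agent may not use.
  resolve(operation: string, agentTrust: number): ToolHandler {
    const registration = this.tools.get(operation);
    if (!registration) throw new Error(`no handler registered for ${operation}`);
    if (agentTrust < registration.requiredTrust) {
      throw new Error(`${operation} requires trust level ${registration.requiredTrust}`);
    }
    return registration.handler;
  }
}

// Hypothetical default set; the handler bodies are placeholders.
function createSketchRegistry(): ToolRegistry {
  const registry = new ToolRegistry();
  registry.register({
    operation: "unit-convert",
    requiredTrust: 0,
    handler: (input) => {
      const celsius = Number(input["celsius"]);
      if (Number.isNaN(celsius)) throw new Error("celsius must be numeric");
      return (celsius * 9) / 5 + 32; // Celsius to Fahrenheit
    },
  });
  registry.register({
    operation: "file-write",
    requiredTrust: 3,
    handler: () => {
      throw new Error("not implemented in this sketch");
    },
  });
  return registry;
}
```

Resolving at graph-build time rather than at execution time means a graph that references an unavailable or over-privileged tool can be rejected before any node runs.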

Sprint 11: VPIR Authoring + External Tasks — M2 Complete (Complete)

VPIR Graph Builder — Fluent API and fromJSON() for constructing validated VPIRGraph from pure JSON. Auto-computes roots/terminals, validates tool availability via registry
External Task Runner — TaskRunner orchestrating JSON spec → build → verify → DPN execute pipeline
Task-Aware Bridge Grammar — Enhanced LLM generation with handler documentation and validation
External Task Benchmarks — Temperature Conversion and Math Expression end-to-end benchmarks
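
The "auto-computes roots/terminals" step of the graph builder can be approximated as follows. The node shape (id, operation, inputs) and the function name graphFromJSON are assumptions for this sketch; the real fromJSON() also validates tool availability against the registry, which is omitted here.

```typescript
// Hypothetical minimal node shape for a JSON task spec.
interface NodeSpec {
  id: string;
  operation: string;
  inputs: string[]; // ids of the nodes this node consumes
}

interface BuiltGraph {
  nodes: Map<string, NodeSpec>;
  roots: string[];     // nodes with no inputs
  terminals: string[]; // nodes that no other node consumes
}

function isNodeSpec(value: unknown): value is NodeSpec {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const inputs = record["inputs"];
  return (
    typeof record["id"] === "string" &&
    typeof record["operation"] === "string" &&
    Array.isArray(inputs) &&
    inputs.every((input) => typeof input === "string")
  );
}

function graphFromJSON(json: string): BuiltGraph {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) throw new Error("expected a JSON array of nodes");

  const nodes = new Map<string, NodeSpec>();
  for (const item of parsed) {
    if (!isNodeSpec(item)) throw new Error("malformed node specification");
    if (nodes.has(item.id)) throw new Error(`duplicate node id: ${item.id}`);
    nodes.set(item.id, item);
  }

  const all = Array.from(nodes.values());
  const consumed = new Set<string>();
  for (const node of all) {
    for (const input of node.inputs) {
      if (!nodes.has(input)) throw new Error(`${node.id} references unknown node ${input}`);
      consumed.add(input);
    }
  }

  return {
    nodes,
    roots: all.filter((n) => n.inputs.length === 0).map((n) => n.id),
    terminals: all.filter((n) => !consumed.has(n.id)).map((n) => n.id),
  };
}
```

A three-node chain such as read, convert, format would report `read` as the only root and `format` as the only terminal, so a task author does not need to declare either explicitly.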

Sprint 12: Reliable Bridge Grammar + Error Recovery — M3 Foundation (Complete)

Bridge Grammar Error Taxonomy — 6 error categories with repair hints and structured LLM feedback
Auto-Repair Engine — 6 repair strategies: truncated JSON, missing fields, fuzzy enums, duplicate IDs, topology, default labels
Confidence Scorer — 4-dimension P-ASP-inspired scoring (structural, semantic, handler coverage, topological)
Z3 Graph Pre-Verification — 4 formal properties: acyclicity, input completeness, IFC monotonicity, handler trust
Reliable Generation Pipeline — 7-stage orchestration: generate → diagnose → repair → re-validate → score → verify
Error Recovery Benchmark — 7 scenarios covering all error categories
Z3 Properties — Total: 21 formally verified properties
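
Two of the repair strategies listed above, fuzzy enums and duplicate IDs, lend themselves to a short sketch. Using Levenshtein distance with a cutoff of 2 and renaming duplicates with a numeric suffix are assumptions about how such repairs could work, not a description of the actual Auto-Repair Engine.

```typescript
// Levenshtein edit distance using a single rolling row.
function editDistance(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // D[i-1][j], saved before it is overwritten
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Maps a near-miss value onto the closest allowed value, or returns undefined
// when nothing is within maxDistance (assumed default: 2).
function repairEnum(
  value: string,
  allowed: readonly string[],
  maxDistance = 2
): string | undefined {
  const needle = value.trim().toLowerCase();
  let best: string | undefined;
  let bestDistance = maxDistance + 1;
  for (const candidate of allowed) {
    const distance = editDistance(needle, candidate.toLowerCase());
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

// Renames repeated ids with a numeric suffix so that every id is unique.
function dedupeIds(ids: readonly string[]): string[] {
  const seen = new Set<string>();
  return ids.map((id) => {
    let candidate = id;
    let suffix = 2;
    while (seen.has(candidate)) candidate = `${id}_${suffix++}`;
    seen.add(candidate);
    return candidate;
  });
}
```

A repair like this is only safe when the match is unambiguous, so a conservative cutoff and a structured error for anything beyond it are probably preferable to guessing.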

Test Coverage

Sprint	Test Suites	Tests	LOC (tests)
Phase 4	12	194	2,736
Sprint 1	14	194+	—
Sprint 2	17	292	~3,800
Sprint 3	20	355	~5,200
Sprint 4	22	~415	~6,600
Sprint 5	26	479	~8,200
Sprint 6	30	557	~9,400
Sprint 7	49	882	~18,100
Sprint 8	53	932	~19,800
Sprint 9	55	974	~21,000
Sprint 10	58	1073+	~23,000
Sprint 11	62	1128+	~25,000
Sprint 12	68	1220+	~27,000

Future Goals

Phase 7 Milestones (M4–M5)

M4: Self-Modification — pnxt modifies its own pipeline through VPIR (achieved in Sprint 15)
M5: Self-Hosting — pnxt’s core components expressed in pnxt

Long-Term (Phase 8+)

Web-based visualization frontend — Interactive node-graph renderer consuming the JSON export format
Multi-language Tree-sitter parsers — Extend KG parsing beyond TypeScript to Python, Rust, Go
Categorical token embeddings — Transformer fine-tuning with morphism-structured embeddings
Distributed DPN — Multi-node actor execution for scale
Community and ecosystem — Open specification, reference implementations, and adoption tooling

Key Decisions and Constraints

Research-first approach: Theoretical soundness before implementation speed
Incremental adoption: Every component designed for phased introduction
Structural safety: Correct behavior made easy by design, not by discipline
No legacy syntax: This is a new paradigm for LLMs, not a wrapper around existing languages

Repository Structure

pnxt/
├── AGENTS.md              # Agent development guidelines (CLAUDE.md symlinks here)
├── README.md              # Project overview
├── QuickStart.md          # Hands-on getting started guide
├── status.md              # This file — project status and roadmap
├── package.json           # Node.js project configuration
├── tsconfig.json          # TypeScript compiler configuration
├── jest.config.js         # Jest test configuration
├── eslint.config.js       # ESLint configuration
├── .prettierrc            # Prettier configuration
├── .github/workflows/
│   ├── ci.yml             # CI pipeline (typecheck, lint, test, build)
│   ├── deploy-website.yml # Website deployment
│   └── validate-website.yml
├── src/
│   ├── index.ts           # Package entry point
│   ├── types/             # Shared type definitions (18 files)
│   ├── memory/            # Memory Service — three-layer model with IFC
│   ├── aci/               # ACI Gateway — trust + IFC checking, audit logging
│   ├── agent/             # Agent Runtime — lifecycle management
│   ├── capability/        # Capability Negotiation — 3-phase handshake
│   ├── trust/             # Trust Engine — 5-level graduated trust, causal scoring
│   ├── vpir/              # VPIR — validator, interpreter, optimizer, renderer, export
│   ├── bridge-grammar/    # Bridge Grammar — JSON Schema + Claude API integration
│   ├── channel/           # DPN — channels, processes, DPN runtime, bisimulation
│   ├── hott/              # HoTT — categories, higher paths, univalence, transport
│   ├── knowledge-graph/   # Tree-sitter DKB — typed graph + TypeScript parser
│   ├── lambda/            # LLMbda Calculus — typed lambda with IFC, VPIR bridge
│   ├── protocol/          # NL Protocols — state machines over DPN channels
│   ├── verification/      # Formal Verification — Z3, noninterference, liveness, CVC5
│   ├── benchmarks/        # Benchmarks — weather API, multi-agent delegation, pipeline
│   ├── evaluation/        # Evaluation — integration scenarios, security tests
│   ├── neurosymbolic/     # Neurosymbolic — P-ASP, Active Inference, refinement
│   ├── experiments/       # Experiments — categorical tokenizer, self-hosting PoC
│   └── errors/            # Error hierarchy
├── docs/
│   ├── research/          # Research documents (original prompt, Phase 3)
│   ├── decisions/         # Architecture Decision Records
│   ├── reviews/           # Advisory panel reviews
│   ├── roadmap/           # Paradigm transition roadmap (M1-M5)
│   └── sprints/           # Sprint documentation (4-12)
└── website/               # Astro Starlight documentation site