Project Status

pnxt Project Status

Last updated: 2026-04-06 (Phase 7 complete — Sprint 15)


Current State

Phase 7 (“Self-Hosting Paradigm”) is complete. All three milestones achieved: M2 (External Task Expression), M3 (LLM-Native Programming), and M4 (Self-Modification). Sprint 15 (“Verified Self-Modification + Research Frontier”) delivered a Causal Impact Analyzer, Modification Confidence Scorer, Self-Modification Orchestrator, 5 Real Self-Modification Scenarios, and Phase 7 Comprehensive Evaluation. The system can now express tasks in VPIR, generate VPIR autonomously, and modify its own pipeline with verified correctness. Total: 21 formally verified Z3 properties, 83 test suites, 1485+ tests. Advisory panel composite score: 9.5/10. See status.md for full details.

Completed Work

Phase | Focus | Deliverables
--- | --- | ---
Phase 1 | Core Architecture, State Separation & FFI | Foundational architecture design (external)
Phase 2 | Bridge Layer & Mathematical Spec | Mathematical formalization, bridge grammar spec (external)
Phase 3 | Deep analysis of pillars, patterns, and architecture | Six research documents covering ACI, memory, coordination, trust, comparative analysis, and reference architecture

Phase 3 Deliverables (Complete)

  1. Agent-Computer Interface Specification — Protocol layers, message taxonomy, capability discovery, error handling
  2. Semantic Memory Architecture — Three-layer memory model (working, semantic, episodic) with lifecycle management
  3. Multi-Agent Coordination Patterns — Topology models, task decomposition, conflict resolution
  4. Trust, Safety, and Governance Framework — Graduated trust model, capability-based permissions, sandboxing
  5. Comparative Analysis — ANP positioned against OOP, Actor Model, Microservices, EDA, FP
  6. Implementation Reference Architecture — Concrete system design with deployment topologies and migration strategy

Phase 4: Infrastructure Prototype (Complete)

Priority 1: Core Infrastructure

  • Project scaffolding — Initialize package.json, TypeScript config, test infrastructure, and CI pipeline
  • Memory Service prototype — Three-layer memory model with pluggable StorageBackend interface, InMemoryStorageBackend for testing, and FileStorageBackend for persistent JSON-file storage across sessions
  • ACI Gateway prototype — Structured protocol layer with graduated trust checking (5 levels, side-effect-based requirements), TrustResolver for agent trust lookup, append-only AuditLogger recording all invocations/denials, and InMemoryAuditLogger implementation
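The gateway behavior described above (a trust check driven by side effects, with every decision written to an append-only log) can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the numeric level encoding, the side-effect names, the threshold table, and the names `SketchAuditLog` and `checkInvocation` are all assumptions made for the example.

```typescript
// Hypothetical sketch: the level numbering, side-effect names, and thresholds
// below are illustrative assumptions, not the project's actual values.
type TrustLevel = 0 | 1 | 2 | 3 | 4;
type SideEffect = "none" | "read" | "write" | "network" | "process";

const REQUIRED_TRUST: Record<SideEffect, TrustLevel> = {
  none: 0,
  read: 1,
  write: 2,
  network: 3,
  process: 4,
};

interface AuditEntry {
  readonly agentId: string;
  readonly tool: string;
  readonly outcome: "allowed" | "denied";
}

// Append-only log: entries can be added and read back, never edited or removed.
class SketchAuditLog {
  private readonly entries: AuditEntry[] = [];

  append(entry: AuditEntry): void {
    this.entries.push(Object.freeze({ ...entry }));
  }

  all(): readonly AuditEntry[] {
    return this.entries.slice(); // callers receive a copy, not the backing array
  }
}

type TrustResolver = (agentId: string) => TrustLevel;

// An invocation is allowed only if the agent's trust level meets the highest
// requirement among the tool's declared side effects. Both outcomes are logged.
function checkInvocation(
  agentId: string,
  tool: string,
  sideEffects: readonly SideEffect[],
  resolveTrust: TrustResolver,
  log: SketchAuditLog,
): boolean {
  let required: TrustLevel = 0;
  for (const effect of sideEffects) {
    if (REQUIRED_TRUST[effect] > required) required = REQUIRED_TRUST[effect];
  }
  const allowed = resolveTrust(agentId) >= required;
  log.append({ agentId, tool, outcome: allowed ? "allowed" : "denied" });
  return allowed;
}
```

Logging denials as well as successful invocations matches the description above; a persistent logger would presumably expose the same append/read surface.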

Priority 2: Agent Runtime

  • Agent runtime environment — Basic agent lifecycle management (registration, execution, teardown)
  • Capability negotiation — Versioned capability discovery with 3-phase handshake, semantic versioning, trust-based constraint tightening, revocation, and expiry support
  • Trust engine — Graduated trust model (5 levels) with multi-dimensional trust, observable metric-based scoring (0–100), automatic calibration, per-dimension overrides, and trust reset/manual adjustment
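As a rough illustration of how a 0–100 metric score might map onto five graduated levels with per-dimension overrides, consider the sketch below. The equal-width 20-point bands, the "weakest dimension wins" aggregation, and the dimension names are assumptions chosen for the example; the source does not state the actual calibration rules.

```typescript
// Hypothetical sketch: band widths and the min-aggregation rule are assumptions.
type TrustLevel = 0 | 1 | 2 | 3 | 4;

// Map a 0-100 score onto one of five levels using equal 20-point bands.
function levelForScore(score: number): TrustLevel {
  if (!isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score out of range: ${score}`);
  }
  return Math.min(4, Math.floor(score / 20)) as TrustLevel;
}

// Combine per-dimension scores conservatively: the overall level is the
// lowest level across dimensions. An override replaces the computed level
// for that one dimension only.
function overallLevel(
  scores: Record<string, number>,
  overrides: Record<string, TrustLevel> = {},
): TrustLevel {
  const dimensions = Object.keys(scores);
  if (dimensions.length === 0) {
    throw new Error("at least one trust dimension is required");
  }
  let overall: TrustLevel = 4;
  for (const dimension of dimensions) {
    const override = overrides[dimension];
    const level = override !== undefined ? override : levelForScore(scores[dimension]);
    if (level < overall) overall = level;
  }
  return overall;
}
```

For instance, `overallLevel({ reliability: 90, security: 45 })` yields level 2 under these assumed bands, because the weaker dimension caps the result.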

Priority 3: Validation and Evaluation

  • Empirical evaluation — Multi-agent coordination scenarios (delegation pattern, trust escalation, failure recovery) exercising full system integration (runtime + trust + ACI + capabilities + memory)
  • Benchmark development — BenchmarkSuite framework with standardized benchmarks for agent registration, trust calibration, ACI invocation, capability negotiation, memory store/query, and agent lifecycle throughput
  • Security hardening — SecurityTestSuite with adversarial tests across 5 categories: privilege escalation, trust manipulation, capability abuse, audit integrity, and resource exhaustion

Phase 5: Paradigm Foundation (Complete)

Following the Advisory Review Panel’s alignment assessment (3/10), Phase 5 implements the core paradigm components that distinguish pnxt from conventional agent frameworks.

Sprint 1: DPN + IFC + VPIR (Complete)

  • Channel<T> and DPN primitives — Typed async FIFO channels with backpressure, Process actors, DataflowGraph composition
  • IFC security labels — SecurityLabel type with lattice-based flow control
  • VPIR node types and validator — VPIRNode, VPIRGraph types with structural validator
  • Runtime integration — AgentRuntime supports channel-based inter-agent communication
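A typed async FIFO channel with backpressure, as listed above, could look roughly like the following. This is a sketch under stated assumptions: the class name `BoundedChannel`, the fixed capacity, and the `trySend`/`send`/`receive` method names are hypothetical and are not taken from the pnxt codebase.

```typescript
// Hypothetical sketch of a bounded FIFO channel; names and API are assumptions.
class BoundedChannel<T> {
  private readonly capacity: number;
  private readonly buffer: T[] = [];
  private readonly waitingSenders: Array<() => void> = [];
  private readonly waitingReceivers: Array<(value: T) => void> = [];

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError("capacity must be >= 1");
    this.capacity = capacity;
  }

  get size(): number {
    return this.buffer.length;
  }

  // Non-blocking send: hands the value to a waiting receiver if there is one,
  // otherwise buffers it. Returns false when the buffer is full.
  trySend(value: T): boolean {
    const receiver = this.waitingReceivers.shift();
    if (receiver) {
      receiver(value);
      return true;
    }
    if (this.buffer.length >= this.capacity) return false;
    this.buffer.push(value);
    return true;
  }

  // Backpressure: a sender suspends while the buffer is full and retries
  // each time a receiver frees a slot.
  async send(value: T): Promise<void> {
    while (!this.trySend(value)) {
      await new Promise<void>((resume) => this.waitingSenders.push(resume));
    }
  }

  // Returns the oldest buffered value, or suspends until one is sent.
  async receive(): Promise<T> {
    if (this.buffer.length > 0) {
      const value = this.buffer.shift() as T;
      const sender = this.waitingSenders.shift();
      if (sender) sender(); // wake one suspended sender
      return value;
    }
    return new Promise<T>((resolve) => this.waitingReceivers.push(resolve));
  }
}
```

Receivers only wait while the buffer is empty, so handing a value directly to a waiting receiver preserves FIFO order. Closing, cancellation, and label checks are left out of this sketch.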

Sprint 2: Bridge Grammar + Formal Verification (Complete)

  • Bridge Grammar JSON Schema — Constrained-decoding schemas forcing LLMs to output valid VPIR nodes
  • Constrained output formatters — LLM-specific schema formats (function calling, Anthropic tools, structured output)
  • Z3 SMT integration — Four verified properties: capability grant consistency, trust transition monotonicity, IFC flow lattice, side-effect trust requirements
  • IFC label enforcement completion — Extended IFC checking to ACI tool invocations and Channel sends
  • Causal trust scoring — Difficulty-weighted trust scoring replacing fixed-weight scorer
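Lattice-based flow control of the kind enforced on tool invocations and channel sends can be illustrated with a small total order of labels. The four level names, the `Label` shape, and the `guardedSend` helper are assumptions for this sketch; the project's SecurityLabel type may carry more structure than a single level.

```typescript
// Hypothetical sketch: a four-level chain (a simple lattice) of labels.
const LEVELS = ["public", "internal", "confidential", "restricted"] as const;
type Level = (typeof LEVELS)[number];

interface Label {
  readonly level: Level;
}

function rank(label: Label): number {
  return LEVELS.indexOf(label.level);
}

// Information may flow only upward (or sideways) in the lattice.
function canFlowTo(from: Label, to: Label): boolean {
  return rank(from) <= rank(to);
}

// Least upper bound: data derived from two inputs carries the higher label.
function join(a: Label, b: Label): Label {
  return rank(a) >= rank(b) ? a : b;
}

// Enforce the flow rule at a send boundary before delivering the value.
function guardedSend<T>(
  value: T,
  valueLabel: Label,
  channelLabel: Label,
  sink: (value: T) => void,
): void {
  if (!canFlowTo(valueLabel, channelLabel)) {
    throw new Error(
      `IFC violation: ${valueLabel.level} data cannot flow to a ${channelLabel.level} channel`,
    );
  }
  sink(value);
}
```

Under this rule a "restricted" value cannot be sent on a "public" channel, while the reverse direction is always permitted.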

Sprint 3: VPIR Execution + NL Protocols + Visualization (Complete)

  • VPIR Interpreter — Executes validated VPIR graphs in topological order with IFC enforcement
  • Natural Language Protocol Design — Three protocol state machines: task-delegation, capability-negotiation, conflict-resolution
  • VPIR Visualization (Text-Based) — Human-readable rendering of VPIR graphs and execution traces

Sprint 4: Protocol-Channel Integration + VPIR Optimizations (Complete)

  • Protocol-Channel Integration — Bidirectional protocol channels with IFC enforcement
  • VPIR Parallel Execution — Wave-based execution with Kahn’s algorithm and semaphore concurrency
  • VPIR Result Caching — Deterministic node caching by ID + input hash
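The wave-based scheduling mentioned above can be sketched with a level-by-level variant of Kahn's algorithm: each wave contains the nodes whose inputs have all been produced by earlier waves, so nodes within a wave are independent and could run concurrently. The `GraphNode` shape and function name are assumptions, and the semaphore that would bound concurrency inside a wave is omitted.

```typescript
// Hypothetical sketch: node shape and function name are assumptions.
interface GraphNode {
  id: string;
  inputs: string[]; // ids of nodes whose outputs this node consumes
}

// Group nodes into waves using Kahn's algorithm, one level at a time.
// Throws if an input is unknown or if the graph contains a cycle.
function executionWaves(nodes: GraphNode[]): string[][] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const node of nodes) {
    indegree.set(node.id, node.inputs.length);
    dependents.set(node.id, []);
  }
  for (const node of nodes) {
    for (const dep of node.inputs) {
      const list = dependents.get(dep);
      if (!list) throw new Error(`unknown input: ${dep}`);
      list.push(node.id);
    }
  }

  const waves: string[][] = [];
  let wave = nodes.filter((n) => n.inputs.length === 0).map((n) => n.id);
  let visited = 0;
  while (wave.length > 0) {
    waves.push(wave);
    visited += wave.length;
    const next: string[] = [];
    for (const id of wave) {
      for (const dependent of dependents.get(id) ?? []) {
        const remaining = (indegree.get(dependent) ?? 0) - 1;
        indegree.set(dependent, remaining);
        if (remaining === 0) next.push(dependent); // all inputs satisfied
      }
    }
    wave = next;
  }
  if (visited !== nodes.length) throw new Error("cycle detected");
  return waves;
}
```

For a diamond-shaped graph (one root feeding two nodes that feed a single sink), this produces three waves, with the two middle nodes sharing the second wave.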

Sprint 5: HoTT Foundations + Knowledge Graph + End-to-End Pipeline (Complete)

  • HoTT Type Foundations — HoTTObject, Morphism, HoTTPath, and Category types with categorical structure
  • Tree-sitter DKB Knowledge Graph — Typed graph with 8 entity kinds, 8 relation types, traversal, and HoTT conversion
  • VPIR-to-HoTT Bridge — Converts VPIR reasoning DAGs into HoTT categories
  • Z3 Categorical Verification — Two new properties: morphism composition associativity, identity morphism laws. Total: 6 properties
  • End-to-End Pipeline Scenarios — Three integration scenarios proving paradigm pillars work together
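The two categorical laws named above (associativity of composition and the identity laws) can be made concrete with a small free-category sketch in which a morphism is a path of edge names. The `PathMorphism` shape and helper names are assumptions for illustration, not the project's HoTT types; the sketch checks the laws on sample values, whereas the Z3 properties establish them formally.

```typescript
// Hypothetical sketch: morphisms modeled as paths of named edges.
interface PathMorphism {
  readonly source: string;
  readonly target: string;
  readonly path: readonly string[]; // edge names, in order of application
}

// The identity on an object is the empty path from the object to itself.
function identity(object: string): PathMorphism {
  return { source: object, target: object, path: [] };
}

// compose(g, f) is "g after f": defined only when f ends where g starts.
function compose(g: PathMorphism, f: PathMorphism): PathMorphism {
  if (f.target !== g.source) {
    throw new Error(`cannot compose: ${f.target} does not match ${g.source}`);
  }
  return { source: f.source, target: g.target, path: f.path.concat(g.path) };
}

function sameMorphism(a: PathMorphism, b: PathMorphism): boolean {
  return (
    a.source === b.source &&
    a.target === b.target &&
    JSON.stringify(a.path) === JSON.stringify(b.path)
  );
}
```

Because composition is path concatenation and identities are empty paths, both laws hold by construction in this model, which makes it a convenient reference point when testing a richer implementation.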

Phase 6: Integration & Deepening (Complete — 9 Sprints)

Phase 6 focused on connecting and validating the paradigm pillars together with real-world inputs.

Sprint 6: Type Identity — Univalence Axiom + LLMbda Decision (Complete)

  • Univalence Axiom Encoding — True HoTT univalence with equivalence-to-path and inverse
  • Transport Along Paths — Property transfer between equivalent VPIR graphs without re-verification
  • LLMbda as Semantic Foundation — VPIR nodes carry lambda calculus denotations
  • Typed LLMbda Calculus ADR — Formal justification for typed over untyped lambda calculus
  • Z3 Univalence Verification — Total: 15 formally verified Z3 properties

Sprint 7: Verification Maturity — User-Program Verification + Bisimulation (Complete)

  • User-Program Property Verification — ProgramVerifier with preconditions, postconditions, invariants, assertions
  • CVC5 Integration — Alternative solver via subprocess with MultiSolverVerifier orchestration
  • DPN Bisimulation Checking — Strong bisimulation + observational equivalence via partition refinement
  • Multi-Agent Delegation Benchmark — Three agents coordinating with trust boundaries and IFC enforcement
  • Secure Data Pipeline Benchmark — PII redaction with IFC analysis and label propagation
  • Z3 Properties — Total: 17 formally verified properties
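Strong bisimulation checking by partition refinement, as listed above, can be sketched with a simple signature-based refinement loop over a labeled transition system. The `Transition` tuple encoding and function names are assumptions; this naive version recomputes every signature each round and does not cover observational equivalence, which would additionally require handling internal steps.

```typescript
// Hypothetical sketch: naive signature-based partition refinement.
type Transition = [string, string, string]; // [from, label, to]

// Returns a block id per state; two states are strongly bisimilar exactly
// when they end up in the same block. Assumes every transition endpoint
// is listed in `states`.
function bisimulationClasses(
  states: string[],
  transitions: Transition[],
): Map<string, number> {
  let block = new Map<string, number>();
  for (const s of states) block.set(s, 0);
  let blockCount = 1;

  for (;;) {
    const signatureIds = new Map<string, number>();
    const next = new Map<string, number>();
    for (const s of states) {
      // Signature: current block plus the set of (label, target block) moves.
      const moves = new Set<string>();
      for (const [from, label, to] of transitions) {
        if (from === s) moves.add(JSON.stringify([label, block.get(to)]));
      }
      const signature = `${block.get(s)}|${Array.from(moves).sort().join(";")}`;
      if (!signatureIds.has(signature)) {
        signatureIds.set(signature, signatureIds.size);
      }
      next.set(s, signatureIds.get(signature) as number);
    }
    // Each round only splits blocks, so an unchanged count means stability.
    if (signatureIds.size === blockCount) return next;
    block = next;
    blockCount = signatureIds.size;
  }
}

function bisimilar(
  a: string,
  b: string,
  states: string[],
  transitions: Transition[],
): boolean {
  const classes = bisimulationClasses(states, transitions);
  return classes.get(a) === classes.get(b);
}
```

To compare two separate processes, place both state sets and transition lists in one system and ask whether their initial states share a block.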

Sprint 8: Neurosymbolic Bridge — P-ASP + Active Inference (Complete)

  • P-ASP Integration Prototype — Probabilistic ASP for VPIR node confidence scoring
  • Active Inference Engine — Free-energy minimization for iterative VPIR graph patching
  • Refinement Pipeline — Combines P-ASP scoring with Active Inference in an iterative loop

Sprint 9: Categorical Frontier — Native Tokenization + Self-Hosting Vision (Complete)

  • Categorical Tokenization Experiment — 42-token vocabulary with 23 morphism composition rules
  • Self-Hosting Proof of Concept — pnxt describes, validates, categorizes, and executes itself as VPIR (M1)
  • Paradigm Transition Roadmap — M1-M5 milestones from self-description to self-hosting
  • Advisory Review Alignment Package — All 10 advisor concerns addressed. Score: 7.5 → 9.2

Phase 7: Self-Hosting Paradigm (Complete)

Phase 7 transitions pnxt from verified prototype to self-modifying, LLM-programmable system.

See docs/roadmap/paradigm-transition.md for the complete transition roadmap.

Sprint 10: Handler Library + Tool Registry (Complete)

  • Standard Handler Library — 8 pre-built tool handlers (http-fetch, json-transform, file-read, file-write, string-format, math-eval, data-validate, unit-convert)
  • Declarative Tool Registry — Operation-to-handler mapping with auto-registration, discovery API, and trust pre-validation
  • DPN Supervisor — Supervisor actor pattern with bounded restart strategies, priority mailbox, full event log
  • DPN Runtime Integration — Tool registry support in inference and action nodes; backward compatible
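An operation-to-handler registry with discovery and trust pre-validation might be shaped roughly like this. Everything here is an assumption for illustration: the `SketchToolRegistry` class, the numeric `requiredTrust` field, and the `demo-uppercase` operation are invented for the sketch and do not reflect the signatures of the eight standard handlers.

```typescript
// Hypothetical sketch: registry API and handler shape are assumptions.
type Handler = (input: unknown) => unknown;

interface ToolRegistration {
  readonly operation: string;
  readonly requiredTrust: number;
  readonly handler: Handler;
}

class SketchToolRegistry {
  private readonly tools = new Map<string, ToolRegistration>();

  register(tool: ToolRegistration): void {
    if (this.tools.has(tool.operation)) {
      throw new Error(`duplicate operation: ${tool.operation}`);
    }
    this.tools.set(tool.operation, tool);
  }

  // Discovery: list the registered operation names in a stable order.
  discover(): string[] {
    return Array.from(this.tools.keys()).sort();
  }

  // Trust is validated before the handler runs, so an under-privileged
  // caller never triggers the handler's side effects.
  invoke(operation: string, input: unknown, callerTrust: number): unknown {
    const tool = this.tools.get(operation);
    if (!tool) throw new Error(`no handler registered for: ${operation}`);
    if (callerTrust < tool.requiredTrust) {
      throw new Error(`trust ${callerTrust} is below required ${tool.requiredTrust}`);
    }
    return tool.handler(input);
  }
}
```

A usage example under the same assumptions: after registering `{ operation: "demo-uppercase", requiredTrust: 1, handler: (x) => String(x).toUpperCase() }`, a caller at trust 2 can invoke it, while a caller at trust 0 is rejected before the handler executes.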

Sprint 11: VPIR Authoring + External Tasks — M2 Complete (Complete)

  • VPIR Graph Builder — Fluent API and fromJSON() for constructing validated VPIRGraph from pure JSON. Auto-computes roots/terminals, validates tool availability via registry
  • External Task Runner — TaskRunner orchestrating JSON spec → build → verify → DPN execute pipeline
  • Task-Aware Bridge Grammar — Enhanced LLM generation with handler documentation and validation
  • External Task Benchmarks — Temperature Conversion and Math Expression end-to-end benchmarks

Sprint 12: Reliable Bridge Grammar + Error Recovery — M3 Foundation (Complete)

  • Bridge Grammar Error Taxonomy — 6 error categories with repair hints and structured LLM feedback
  • Auto-Repair Engine — 6 repair strategies: truncated JSON, missing fields, fuzzy enums, duplicate IDs, topology, default labels
  • Confidence Scorer — 4-dimension P-ASP-inspired scoring (structural, semantic, handler coverage, topological)
  • Z3 Graph Pre-Verification — 4 formal properties: acyclicity, input completeness, IFC monotonicity, handler trust
  • Reliable Generation Pipeline — 7-stage orchestration: generate → diagnose → repair → re-validate → score → verify
  • Error Recovery Benchmark — 7 scenarios covering all error categories
  • Z3 Properties — Total: 21 formally verified properties
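Of the repair strategies listed above, fuzzy enum matching is the easiest to illustrate: a near-miss value produced by an LLM is snapped to the closest allowed value if it is close enough. The sketch below uses Levenshtein edit distance with a maximum distance of 2; the metric, the threshold, and the sample enum values are assumptions, since the source does not say how the Auto-Repair Engine measures closeness.

```typescript
// Hypothetical sketch: edit-distance metric and threshold are assumptions.

// Classic Levenshtein distance using two rolling rows.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1, // deletion
          curr[j - 1] + 1, // insertion
          prev[j - 1] + cost, // substitution or match
        ),
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the closest allowed value (case-insensitive), or undefined when
// nothing is within maxDistance. Ties resolve to the earliest allowed value.
function repairEnum(
  value: string,
  allowed: readonly string[],
  maxDistance = 2,
): string | undefined {
  let best: string | undefined;
  let bestDistance = maxDistance + 1;
  for (const candidate of allowed) {
    const distance = editDistance(value.toLowerCase(), candidate.toLowerCase());
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}
```

Returning `undefined` instead of guessing lets the caller fall back to structured feedback for the LLM when no candidate is plausibly intended.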

Test Coverage

Sprint | Test Suites | Tests | LOC (tests)
--- | --- | --- | ---
Phase 4 | 12 | 194 | 2,736
Sprint 1 | 14 | 194+ |
Sprint 2 | 17 | 292 | ~3,800
Sprint 3 | 20 | 355 | ~5,200
Sprint 4 | 22 | ~415 | ~6,600
Sprint 5 | 26 | 479 | ~8,200
Sprint 6 | 30 | 557 | ~9,400
Sprint 7 | 49 | 882 | ~18,100
Sprint 8 | 53 | 932 | ~19,800
Sprint 9 | 55 | 974 | ~21,000
Sprint 10 | 58 | 1073+ | ~23,000
Sprint 11 | 62 | 1128+ | ~25,000
Sprint 12 | 68 | 1220+ | ~27,000

Future Goals

Phase 7 Remaining (M4–M5)

  • M4: Self-Modification — pnxt modifies its own pipeline through VPIR
  • M5: Self-Hosting — pnxt’s core components expressed in pnxt

Long-Term (Phase 8+)

  • Web-based visualization frontend — Interactive node-graph renderer consuming the JSON export format
  • Multi-language Tree-sitter parsers — Extend KG parsing beyond TypeScript to Python, Rust, Go
  • Categorical token embeddings — Transformer fine-tuning with morphism-structured embeddings
  • Distributed DPN — Multi-node actor execution for scale
  • Community and ecosystem — Open specification, reference implementations, and adoption tooling

Key Decisions and Constraints

  • Research-first approach: Theoretical soundness before implementation speed
  • Incremental adoption: Every component designed for phased introduction
  • Structural safety: Correct behavior made easy by design, not by discipline
  • No legacy syntax: This is a new paradigm for LLMs, not a wrapper around existing languages

Repository Structure

pnxt/
├── AGENTS.md # Agent development guidelines (CLAUDE.md symlinks here)
├── README.md # Project overview
├── QuickStart.md # Hands-on getting started guide
├── status.md # This file — project status and roadmap
├── package.json # Node.js project configuration
├── tsconfig.json # TypeScript compiler configuration
├── jest.config.js # Jest test configuration
├── eslint.config.js # ESLint configuration
├── .prettierrc # Prettier configuration
├── .github/workflows/
│   ├── ci.yml # CI pipeline (typecheck, lint, test, build)
│   ├── deploy-website.yml # Website deployment
│   └── validate-website.yml
├── src/
│   ├── index.ts # Package entry point
│   ├── types/ # Shared type definitions (18 files)
│   ├── memory/ # Memory Service — three-layer model with IFC
│   ├── aci/ # ACI Gateway — trust + IFC checking, audit logging
│   ├── agent/ # Agent Runtime — lifecycle management
│   ├── capability/ # Capability Negotiation — 3-phase handshake
│   ├── trust/ # Trust Engine — 5-level graduated trust, causal scoring
│   ├── vpir/ # VPIR — validator, interpreter, optimizer, renderer, export
│   ├── bridge-grammar/ # Bridge Grammar — JSON Schema + Claude API integration
│   ├── channel/ # DPN — channels, processes, DPN runtime, bisimulation
│   ├── hott/ # HoTT — categories, higher paths, univalence, transport
│   ├── knowledge-graph/ # Tree-sitter DKB — typed graph + TypeScript parser
│   ├── lambda/ # LLMbda Calculus — typed lambda with IFC, VPIR bridge
│   ├── protocol/ # NL Protocols — state machines over DPN channels
│   ├── verification/ # Formal Verification — Z3, noninterference, liveness, CVC5
│   ├── benchmarks/ # Benchmarks — weather API, multi-agent delegation, pipeline
│   ├── evaluation/ # Evaluation — integration scenarios, security tests
│   ├── neurosymbolic/ # Neurosymbolic — P-ASP, Active Inference, refinement
│   ├── experiments/ # Experiments — categorical tokenizer, self-hosting PoC
│   └── errors/ # Error hierarchy
├── docs/
│   ├── research/ # Research documents (original prompt, Phase 3)
│   ├── decisions/ # Architecture Decision Records
│   ├── reviews/ # Advisory panel reviews
│   ├── roadmap/ # Paradigm transition roadmap (M1-M5)
│   └── sprints/ # Sprint documentation (4-12)
└── website/ # Astro Starlight documentation site