ORK
Internals of the @mdk/ork kernel — modules, the pull-only model, and system recovery
@mdk/ork is the trusted coordination layer of the stack. It splits internal responsibilities across single-purpose modules with
their own state machines, so domains can evolve independently without coupling to each other.
The design is inspired by Kubernetes: a pull-only model bounds the pace of execution so the kernel cannot be overwhelmed by upstream pressure.
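To make the pull-only idea concrete, here is a minimal sketch of a consumer that sets its own pace. Every name in it (`Source`, `QueueSource`, `PullLoop`, `batchSize`) is hypothetical and chosen for illustration; none of it is taken from the @mdk/ork API. Producers may enqueue at any rate, but work enters the consumer only when the consumer asks for it, and never more than `batchSize` items per tick.

```typescript
// Illustrative sketch only: these types are assumptions, not @mdk/ork exports.
interface Source<T> {
  // Returns up to `max` pending items. The consumer decides when to call this.
  pull(max: number): T[];
}

class QueueSource<T> implements Source<T> {
  private pending: T[] = [];

  // Upstream may offer items at any rate; nothing is pushed into the consumer.
  offer(item: T): void {
    this.pending.push(item);
  }

  pull(max: number): T[] {
    // splice removes and returns at most `max` items from the front.
    return this.pending.splice(0, max);
  }

  get backlog(): number {
    return this.pending.length;
  }
}

class PullLoop<T> {
  private source: Source<T>;
  private batchSize: number;
  private handle: (item: T) => void;

  constructor(source: Source<T>, batchSize: number, handle: (item: T) => void) {
    this.source = source;
    this.batchSize = batchSize;
    this.handle = handle;
  }

  // One scheduler tick: pull a bounded batch and process it.
  // Returns the number of items handled in this tick.
  tick(): number {
    const batch = this.source.pull(this.batchSize);
    for (const item of batch) {
      this.handle(item);
    }
    return batch.length;
  }
}
```

With a `batchSize` of 3 and ten queued items, the first tick handles three and leaves seven in the backlog. Upstream pressure shows up as a growing backlog at the source rather than as load inside the consumer.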
Module overview
Module catalogue
@mdk/ork's coordination splits across single-purpose modules. Each owns its own state machine, persistence boundary,
and scaling characteristics. Six modules ship in v0.0.1; two more are deferred to a later release.
| Module | Role |
|---|---|
| Command Dispatcher | Validates and resolves the destination Worker for incoming commands. |
| Command State Machine | Tracks command lifecycle from QUEUED to SUCCESS or FAILED. |
| Worker Registry | Authoritative lookup of active Workers, their RPC keys, and managed devices. |
| Telemetry Collector | Stateless proxy between callers and Worker-local telemetry stores. |
| Scheduler | System metronome; drives all interval-based pulls (telemetry, state, health). |
| Health Monitor | Liveness probes against Workers; reports status to the Registry. |
| Fault Supervisor (deferred) | Circuit-breaker patterns to contain cascading failures. |
| Concurrency Manager (deferred) | Per-device locking and queue-depth limits. |
For the full state machines, transition rules, interface signatures, and recovery details, see the ORK modules reference.
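As a rough illustration of the Command State Machine row in the table above, a lifecycle can be modelled as a table of allowed transitions. The states QUEUED, EXECUTING, SUCCESS, and FAILED appear on this page; the specific transition rules below, along with the names `ALLOWED`, `transition`, and `isTerminal`, are assumptions made for this sketch. The authoritative rules are in the ORK modules reference.

```typescript
// Illustrative sketch only: the transition table is an assumption.
type CommandState = "QUEUED" | "EXECUTING" | "SUCCESS" | "FAILED";

// For each state, the states it may move to next.
const ALLOWED: Record<CommandState, CommandState[]> = {
  QUEUED: ["EXECUTING", "FAILED"],
  EXECUTING: ["SUCCESS", "FAILED"],
  SUCCESS: [],
  FAILED: [],
};

// Returns the new state, or throws if the move is not in the table.
function transition(from: CommandState, to: CommandState): CommandState {
  if (!ALLOWED[from].includes(to)) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  return to;
}

// A state is terminal when no further transitions are allowed from it.
function isTerminal(state: CommandState): boolean {
  return ALLOWED[state].length === 0;
}
```

Keeping the rules in one table makes illegal moves, such as going from SUCCESS back to EXECUTING, fail loudly instead of silently corrupting the lifecycle.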
System recovery
On a full system crash and restart, @mdk/ork modules orchestrate recovery without user intervention:
- Worker Registry loads last known Worker and device states from Hyperbee.
- Command State Machine sweeps the WAL for stranded EXECUTING tasks and forces them to time out or retry.
- Health Monitor begins firing immediate pings to verify which Workers are still active.
- Connections: the network layer awaits incoming HRPC reconnect storms from persistent Workers.
Recovery is local and predictable. Worker crashes do not bring down the runtime; supervisors (PM2, Docker, Kubernetes) handle process restarts in multi-process deployments, and Workers rejoin the system after recovery.
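The WAL sweep in the recovery steps above could look roughly like the following. This is a sketch under stated assumptions: the `WalEntry` shape, the `attempts` counter, the `maxAttempts` limit, and the choice to retry by moving a task back to QUEUED are all hypothetical. The page only says that stranded EXECUTING tasks are forced to time out or retry.

```typescript
// Illustrative sketch only: the entry shape and retry policy are assumptions.
interface WalEntry {
  id: string;
  state: "QUEUED" | "EXECUTING" | "SUCCESS" | "FAILED";
  attempts: number; // how many times execution has started (assumed field)
  reason?: string;  // set when the sweep fails a task (assumed field)
}

// Returns a new list in which no entry is left in EXECUTING.
// Stranded tasks with attempts remaining are requeued; the rest are failed
// with a timeout reason. Entries in other states pass through unchanged.
function sweepStranded(entries: WalEntry[], maxAttempts: number): WalEntry[] {
  return entries.map((entry): WalEntry => {
    if (entry.state !== "EXECUTING") {
      return entry;
    }
    if (entry.attempts < maxAttempts) {
      return { ...entry, state: "QUEUED" };
    }
    return { ...entry, state: "FAILED", reason: "timeout" };
  });
}
```

The function is pure: it reads the log entries and returns the corrected set, so it produces the same result if a restart interrupts recovery and the sweep runs again.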
Next steps
- Architecture — how ORK fits the broader MDK stack
- ORK modules reference — per-module state machines, interfaces, and transition rules