# Architecture Deltas: Recent Hardening Phase

This document catalogs all architectural and design changes made in the recent hardening phase.

## Runtime Model: Daemon-Like Resident Process

**Status**: Completed in `src/main.rs`

**Change**: Converted from startup-IPL-then-exit to a resident, coordinated-shutdown runtime.

**Details**:
- IPL loads config, validates services, initializes broker pools, then enters an event loop.
- Loop waits for either:
  - Global shutdown signal (broadcast from dispatcher when AMQP `shutdown` command received).
  - User interrupt (Ctrl+C).
- On signal, loop cleanly shuts down Tokio tasks and exits with status code 0.

**Why**: Aligns with operational daemon expectations (systemd, orchestrators). Ensures graceful lifecycle rather than abrupt termination. Supports hot-reload/redeployment workflows.

---

## Broker Dispatch: Unified Consumer with Shutdown Semantics

**Status**: Completed in `src/brokers/dispatcher.rs` and `src/brokers/mod.rs`

**Change**: Integrated shutdown command handling into the unified dispatcher consumer.

**Details**:
- Dispatcher pool now receives a global `shutdown_tx` channel at spawn time.
- Each dispatcher consumer listens for AMQP `shutdown` operation.
- On `shutdown`: acknowledge the message, broadcast shutdown signal to all peers, and exit cleanly.
- All dispatchers also listen on the global shutdown channel and exit if signaled externally.

**Why**: Enables coordinated, multi-node shutdown without forceful process kill. Aligns with AMQP message semantics (shutdown is a standard operation, not a runtime hack).

---

## Logger: Explicit IPL Persistence to MongoDB

**Status**: Completed in `src/main.rs` and `src/brokers/logger_store.rs`

**Change**: IPL startup/failure events now explicitly persisted to `msLogs` collection with structured context.

**Details**:
- Root GUID generated at IPL start; all startup events tagged with this root ID.
- Structured log entries include:
  - `root_event_id`: chains all startup events to a single root.
  - `timestamp`: human-readable ISO 8601 format.
  - `node_id`: configured node name/role.
  - `event_type`: IPL phase (e.g., "ipl_start", "service_validated", "broker_pool_spawned", "ipl_complete").
  - `message`: human-readable summary.
  - `metadata`: optional structured context (validation results, latency, etc.).
- If IPL fails, the failure event is logged to Mongo on a best-effort basis before the process exits.
- After IPL succeeds, sample entries at each log level (INFO, WARN, ERROR) are emitted for visibility.

**Why**: Startup is traditionally the hardest phase to debug because its logs are often lost. Persistent, queryable startup context enables post-mortem analysis of deployment/initialization issues. Root GUID enables chain-crawl diagnostics across distributed startup events.

---

## Developer Diagnostics: Root GUID Lineage and Chain Tracing

**Status**: Completed in `src/brokers/logger_store.rs` and `src/bin/log_dumper.rs`

**Change**: Added root GUID-based event chain tracing and query layer.

**Details**:
- `logger_store::fetch_chain(root_event_id, limit)`: retrieve all events tagged with a root ID, sorted by timestamp.
- `logger_store::fetch_root_record(root_event_id)`: retrieve the initiating root event.
- `log_dumper` web UI exposes:
  - Root GUID input field to query and visualize entire event chain.
  - Single-record view at `/record?root_event_id=...` to inspect individual startup context.
  - Arrow-trigger UX for expanding compact row summaries without constant page reload.

**Why**: Enables developers to rapidly correlate events across a single startup sequence or transaction. Reduces manual log sifting. Scales from single node to multi-node deployments.

---

## Configuration: Trace-On and Logger Admin Controls

**Status**: Completed in `src/config/structs.rs` and `config/env_dev.toml`

**Change**: Added two new config namespaces for developer and administrative control.

**Details**:

### `[runtime.trace_on]`
- Boolean flag (default: false in production, true in `env_dev.toml`).
- When true, logs method entry/exit at TRACE level for all broker consumers and core trait implementations.
- Lets developers narrow down causality in complex message flows without adding ad hoc instrumentation.

### `[logger_admin]`
- `purge_on_ipl` (boolean, default: false): during IPL, automatically purge the named collections before startup logging begins.
- `purge_collections` (array of strings): list of collection names to purge (e.g., `["msLogs", "msErrors"]`).
- Enables clean dev iteration: each `cargo run` in dev automatically resets logger state.

**Why**: Reduces friction in dev loops. Trace-on avoids printf debugging. Purge-on-IPL ensures each test iteration starts fresh without manual `mongo` CLI cleanup.

---

## Observability Utility: Modern Logger Reader (log_dumper)

**Status**: Completed in `src/bin/log_dumper.rs`

**Change**: Built a modern Rust equivalent to legacy PHP `utilities/dumper.php` for browsing `msLogs`.

**Details**:
- **Web UI** (Axum):
  - Dashboard route `/` with seed-write action, quick filter by level/node, root GUID chain input.
  - Compact row layout: timestamp | level | node | message snippet | arrow (expand).
  - Single-record view `/record?root_event_id=...` showing full event context.
  - Arrow-trigger expansion shows full message without full-page refresh.
- **Features**:
  - Human-readable timestamps (ISO 8601 formatted).
  - Seed-write to create test events and validate logger pipeline.
  - Root chain traversal via GUID input.
  - Dev-centric UX: minimal clicks, maximum information density.

**Why**: Centralizes log browsing in a single web interface. Replaces CLI-based manual querying. Makes startup diagnostics visible to the entire team without requiring MongoDB knowledge.

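
The root chain traversal and compact row layout described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the `LogEvent` struct, the slice-based `fetch_chain` signature, and the `compact_row` helper are hypothetical stand-ins, and the real `logger_store::fetch_chain` queries MongoDB rather than a slice. It also assumes timestamps are ISO 8601 strings in a single timezone, so that string order matches time order.

```rust
/// Hypothetical in-memory stand-in for one `msLogs` document.
/// Field names follow the structured log entry described in this document.
#[derive(Debug, Clone, PartialEq)]
struct LogEvent {
    root_event_id: String,
    /// Assumed ISO 8601 in one timezone, so lexicographic order is time order.
    timestamp: String,
    level: String,
    node_id: String,
    message: String,
}

/// Return up to `limit` events tagged with `root_event_id`, oldest first.
/// Models the filter / sort-by-timestamp / limit behaviour only; the real
/// function talks to MongoDB and its signature may differ.
fn fetch_chain(events: &[LogEvent], root_event_id: &str, limit: usize) -> Vec<LogEvent> {
    let mut chain: Vec<LogEvent> = events
        .iter()
        .filter(|e| e.root_event_id == root_event_id)
        .cloned()
        .collect();
    chain.sort_by(|a, b| a.timestamp.cmp(&b.timestamp));
    chain.truncate(limit);
    chain
}

/// Render the compact row: timestamp | level | node | message snippet.
/// The snippet is cut at `max_chars` characters (not bytes) and marked
/// with "..." when the message was shortened.
fn compact_row(event: &LogEvent, max_chars: usize) -> String {
    let snippet: String = event.message.chars().take(max_chars).collect();
    let ellipsis = if event.message.chars().count() > max_chars {
        "..."
    } else {
        ""
    };
    format!(
        "{} | {} | {} | {}{}",
        event.timestamp, event.level, event.node_id, snippet, ellipsis
    )
}
```

Calling `fetch_chain(&events, root_id, limit)` and mapping each result through `compact_row` yields the lines a dashboard table would show for one startup sequence; the expand arrow would then reveal the untruncated `message`.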

---

## Operational Safety: Dev-Only Purge Controls

**Status**: Completed in `src/main.rs` and config system

**Change**: Added dev-only purge logic to reset logger collections on IPL in non-production environments.

**Details**:
- IPL checks `config.logger_admin.purge_on_ipl` flag.
- If true and node is not production, purges collections listed in `config.logger_admin.purge_collections` before logging startup events.
- Prevents accidental production data loss (flag only honored in non-prod node roles).
- `env_dev.toml` enables this by default for frictionless dev iteration.

**Why**: Closes dev/prod gap. Enables safe, repeatable testing without manual intervention. Prevents stale logger state from polluting diagnostics.

---

## Commit Summary

This hardening phase encompasses:

1. **Runtime lifecycle**: Daemon model, coordinated shutdown, graceful exit.
2. **Broker semantics**: Shutdown operation integration, channel-based signaling.
3. **Logging infrastructure**: Persistent IPL events, root GUID lineage, structured context.
4. **Developer experience**: Trace control, purge controls, web-based observability.
5. **Configuration**: New `trace_on` and `logger_admin` namespaces.
6. **Tooling**: Modern Rust observability utility replacing legacy PHP dumper.

**Files Changed**:
- `src/main.rs`: resident runtime loop, IPL logging, shutdown coordination, trace control.
- `src/brokers/dispatcher.rs`: shutdown operation handling, global shutdown listening.
- `src/brokers/mod.rs`: dispatcher pool accepts shutdown channels.
- `src/brokers/logger_store.rs`: root GUID chain fetch operations, structured logging helpers.
- `src/config/structs.rs`: `trace_on`, `logger_admin` config types.
- `src/bin/log_dumper.rs`: new modern observability utility (Axum web UI).
- `config/env_dev.toml`: dev overrides enabling trace/purge controls.
- `Cargo.toml` / `Cargo.lock`: added `axum`, `chrono`, `uuid` dependencies.
- Wiki updates: `Home.md`, `04-ipl.md`, `06-queue-topology.md`, `10-modernization-roadmap.md`, new `11-beds-architecture-visual-brief.md`.

**Next Phase**: Autoscaling heuristics, metric collection, and cross-node coordinator election (deferred).
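
For reference, the dev-only purge guard from the Operational Safety section can be sketched as a pure decision function. This is a sketch under stated assumptions: `NodeRole`, `LoggerAdmin`, and `collections_to_purge` are hypothetical names. The source only states that the flag is honored on non-production nodes and that the collections come from `config.logger_admin.purge_collections`; how the node role is actually represented is not specified.

```rust
/// Hypothetical node role. The document only distinguishes production
/// from non-production; the real config may model roles differently.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NodeRole {
    Production,
    Development,
}

/// Mirrors the two `[logger_admin]` fields named in this document.
struct LoggerAdmin {
    purge_on_ipl: bool,
    purge_collections: Vec<String>,
}

/// Decide which collections IPL should purge before startup logging.
/// Returns `None` when the flag is off or the node is production, so a
/// misconfigured production node never purges anything.
fn collections_to_purge(cfg: &LoggerAdmin, role: NodeRole) -> Option<&[String]> {
    if cfg.purge_on_ipl && role != NodeRole::Production {
        Some(cfg.purge_collections.as_slice())
    } else {
        None
    }
}
```

Keeping the decision separate from the MongoDB calls makes the production safeguard easy to unit test: the caller purges only when this returns `Some(..)`.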