# Architecture Deltas — Recent Hardening Phase
This document catalogs all architectural and design changes made in the recent hardening phase.
## Runtime Model: Daemon-Like Resident Process

**Status:** Completed in `src/main.rs`

**Change:** Converted from a startup-IPL-then-exit model to a resident runtime with coordinated shutdown.

**Details:**
- IPL loads config, validates services, initializes broker pools, then enters an event loop.
- The loop waits for either:
  - a global shutdown signal (broadcast from the dispatcher when an AMQP `shutdown` command is received), or
  - a user interrupt (Ctrl+C).
- On either signal, the loop cleanly shuts down all Tokio tasks and exits with status code 0.
**Why:** Aligns with operational daemon expectations (systemd, orchestrators). Ensures a graceful lifecycle rather than abrupt termination, and supports hot-reload/redeployment workflows.
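The resident-loop contract can be sketched runtime-agnostically. The real loop in `src/main.rs` uses Tokio tasks and a broadcast channel; this minimal std-threads version only illustrates "block until the first shutdown source fires, then exit cleanly" (the names `ShutdownReason` and `wait_for_shutdown` are invented for illustration):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical names; the real runtime uses Tokio tasks and a broadcast channel.
#[derive(Debug, PartialEq)]
enum ShutdownReason {
    BrokerCommand, // AMQP `shutdown` received by a dispatcher
    UserInterrupt, // Ctrl+C
}

/// Blocks until any shutdown source reports a reason, then returns it so the
/// caller can tear down worker tasks and exit with status code 0.
fn wait_for_shutdown(rx: mpsc::Receiver<ShutdownReason>) -> ShutdownReason {
    rx.recv().expect("all shutdown sources disconnected")
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Simulate a dispatcher consumer observing the AMQP `shutdown` operation.
    let broker_tx = tx.clone();
    thread::spawn(move || {
        let _ = broker_tx.send(ShutdownReason::BrokerCommand);
    });
    let reason = wait_for_shutdown(rx);
    println!("shutting down cleanly: {:?}", reason);
}
```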
## Broker Dispatch: Unified Consumer with Shutdown Semantics

**Status:** Completed in `src/brokers/dispatcher.rs` and `src/brokers/mod.rs`

**Change:** Integrated shutdown command handling into the unified dispatcher consumer.

**Details:**
- The dispatcher pool now receives a global `shutdown_tx` channel at spawn time.
- Each dispatcher consumer listens for the AMQP `shutdown` operation.
- On `shutdown`: acknowledge the message, broadcast the shutdown signal to all peers, and exit cleanly.
- All dispatchers also listen on the global shutdown channel and exit if signaled externally.
**Why:** Enables coordinated, multi-node shutdown without a forceful process kill. Aligns with AMQP message semantics (`shutdown` is a standard operation, not a runtime hack).
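The broadcast-to-peers pattern described above can be sketched with plain std channels. The real consumers are Tokio tasks fed by AMQP; `Msg` and `run_dispatcher` are hypothetical names for illustration:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Simplified stand-in for a dispatched AMQP message.
enum Msg {
    Op(String), // a normal operation, e.g. "shutdown"
    Shutdown,   // the internally broadcast shutdown signal
}

/// One dispatcher consumer: handles operations until it either sees the
/// `shutdown` operation (then broadcasts to all peers) or is signaled.
fn run_dispatcher(id: usize, rx: Receiver<Msg>, peers: Vec<Sender<Msg>>) -> usize {
    loop {
        match rx.recv() {
            Ok(Msg::Op(op)) if op == "shutdown" => {
                // The real code acknowledges the AMQP message first.
                for p in &peers {
                    let _ = p.send(Msg::Shutdown);
                }
                return id;
            }
            Ok(Msg::Op(_)) => { /* handle a normal operation */ }
            Ok(Msg::Shutdown) | Err(_) => return id, // signaled externally
        }
    }
}

fn main() {
    // One channel per dispatcher; the sender list doubles as the broadcast fan-out.
    let (txs, rxs): (Vec<_>, Vec<_>) = (0..3).map(|_| channel::<Msg>()).unzip();
    let handles: Vec<_> = rxs
        .into_iter()
        .enumerate()
        .map(|(id, rx)| {
            let peers = txs.clone();
            thread::spawn(move || run_dispatcher(id, rx, peers))
        })
        .collect();
    // Deliver the AMQP `shutdown` operation to one consumer; all exit cleanly.
    txs[0].send(Msg::Op("shutdown".into())).unwrap();
    drop(txs);
    for h in handles {
        println!("dispatcher {} exited cleanly", h.join().unwrap());
    }
}
```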
## Logger: Explicit IPL Persistence to MongoDB

**Status:** Completed in `src/main.rs` and `src/brokers/logger_store.rs`

**Change:** IPL startup and failure events are now explicitly persisted to the `msLogs` collection with structured context.

**Details:**
- Root GUID generated at IPL start; all startup events tagged with this root ID.
- Structured log entries include:
  - `root_event_id`: chains all startup events to a single root.
  - `timestamp`: human-readable ISO 8601 format.
  - `node_id`: configured node name/role.
  - `event_type`: IPL phase (e.g., `"ipl_start"`, `"service_validated"`, `"broker_pool_spawned"`, `"ipl_complete"`).
  - `message`: human-readable summary.
  - `metadata`: optional structured context (validation results, latency, etc.).
- If IPL fails, the failure event is logged to Mongo on a best-effort basis before process exit.
- After a successful IPL, example entries at each log level (INFO, WARN, ERROR) are written for visibility.
**Why:** Startup is traditionally the hardest phase to debug (logs are often lost). Persistent, queryable startup context enables post-mortem analysis of deployment and initialization issues, and the root GUID enables chain-crawl diagnostics across distributed startup events.
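A rough sketch of the entry shape, with simplified types: only the field names come from the schema above, while the real code uses `uuid`, `chrono`, and BSON documents, and `IplLogEntry`/`ipl_event` are invented names.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Field names mirror the msLogs schema described above; types are simplified.
#[derive(Debug)]
struct IplLogEntry {
    root_event_id: String,    // GUID generated once at IPL start
    timestamp: u64,           // the real code stores ISO 8601 via chrono
    node_id: String,          // configured node name/role
    event_type: String,       // "ipl_start", "service_validated", ...
    message: String,          // human-readable summary
    metadata: Option<String>, // optional structured context
}

fn ipl_event(root_event_id: &str, node_id: &str, event_type: &str, message: &str) -> IplLogEntry {
    IplLogEntry {
        root_event_id: root_event_id.to_string(),
        timestamp: SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs(),
        node_id: node_id.to_string(),
        event_type: event_type.to_string(),
        message: message.to_string(),
        metadata: None,
    }
}

fn main() {
    // Every startup event carries the same root GUID, so one chain query can
    // recover the whole IPL sequence later (real code: uuid::Uuid::new_v4()).
    let root = "00000000-0000-0000-0000-000000000001";
    for (event_type, msg) in [("ipl_start", "IPL begin"), ("ipl_complete", "IPL ok")] {
        println!("{:?}", ipl_event(root, "node-a", event_type, msg));
    }
}
```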
## Developer Diagnostics: Root GUID Lineage and Chain Tracing

**Status:** Completed in `src/brokers/logger_store.rs` and `src/bin/log_dumper.rs`

**Change:** Added a root GUID-based event chain tracing and query layer.

**Details:**
- `logger_store::fetch_chain(root_event_id, limit)`: retrieves all events tagged with a root ID, sorted by timestamp.
- `logger_store::fetch_root_record(root_event_id)`: retrieves the initiating root event.
- The `log_dumper` web UI exposes:
  - a root GUID input field to query and visualize an entire event chain,
  - a single-record view at `/record?root_event_id=...` to inspect individual startup context,
  - an arrow-trigger UX for expanding compact row summaries without constant page reloads.
**Why:** Enables developers to rapidly correlate events across a single startup sequence or transaction. Reduces manual log sifting. Scales from single-node to multi-node deployments.
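The semantics of `fetch_chain` can be shown with an in-memory stand-in. The real implementation issues a MongoDB query against `msLogs`; the `Event` struct here is a simplified placeholder:

```rust
// In-memory sketch of fetch_chain semantics: filter by root ID, sort by
// timestamp, cap at `limit`. The real version runs this as a Mongo query.
#[derive(Clone, Debug, PartialEq)]
struct Event {
    root_event_id: String,
    timestamp: u64,
    message: String,
}

fn fetch_chain(events: &[Event], root_event_id: &str, limit: usize) -> Vec<Event> {
    let mut chain: Vec<Event> = events
        .iter()
        .filter(|e| e.root_event_id == root_event_id) // tag match
        .cloned()
        .collect();
    chain.sort_by_key(|e| e.timestamp); // oldest first
    chain.truncate(limit);              // cap result size
    chain
}

fn main() {
    let events = vec![
        Event { root_event_id: "r1".into(), timestamp: 2, message: "ipl_complete".into() },
        Event { root_event_id: "r2".into(), timestamp: 1, message: "other chain".into() },
        Event { root_event_id: "r1".into(), timestamp: 1, message: "ipl_start".into() },
    ];
    for e in fetch_chain(&events, "r1", 10) {
        println!("{} {}", e.timestamp, e.message);
    }
}
```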
## Configuration: Trace-On and Logger Admin Controls

**Status:** Completed in `src/config/structs.rs` and `config/env_dev.toml`

**Change:** Added two new config namespaces for developer and administrative control.

**Details:**
`[runtime.trace_on]`

- Boolean flag (default: `false` in production, `true` in `env_dev.toml`).
- When true, logs method entry and exit at TRACE level for all broker consumers and core trait implementations.
- Lets developers narrow down causality in complex message flows without instrumenting code.

`[logger_admin]`

- `purge_on_ipl` (boolean, default: `false`): on successful IPL, automatically purge the named collections before startup logging begins.
- `purge_collections` (array of strings): list of collection names to purge (e.g., `["msLogs", "msErrors"]`).
- Enables clean dev iteration: each `cargo run` in dev automatically resets logger state.
**Why:** Reduces friction in dev loops. Trace-on avoids printf debugging; purge-on-IPL ensures each test iteration starts fresh without manual Mongo CLI cleanup.
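Putting the two namespaces together, the dev overrides might look like the following sketch; the exact key layout of `config/env_dev.toml` may differ from this illustration:

```toml
# Illustrative sketch only — key layout may differ from config/env_dev.toml.
[runtime]
trace_on = true  # TRACE-level method entry/exit logging for dev

[logger_admin]
purge_on_ipl = true                       # only honored on non-production nodes
purge_collections = ["msLogs", "msErrors"]
```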
## Observability Utility: Modern Logger Reader (log_dumper)

**Status:** Completed in `src/bin/log_dumper.rs`

**Change:** Built a modern Rust equivalent of the legacy PHP `utilities/dumper.php` for browsing `msLogs`.

**Details:**
- Web UI (Axum):
  - Dashboard route `/` with a seed-write action, quick filtering by level/node, and a root GUID chain input.
  - Compact row layout: timestamp | level | node | message snippet | arrow (expand).
  - Single-record view `/record?root_event_id=...` showing full event context.
  - Arrow-trigger expansion shows the full message without a full-page refresh.
- Features:
  - Human-readable timestamps (ISO 8601 formatted).
  - Seed-write to create test events and validate the logger pipeline.
  - Root chain traversal via GUID input.
  - Dev-centric UX: minimal clicks, maximum information density.
**Why:** Centralizes all observability into a single web interface. Replaces CLI-based manual querying. Makes startup diagnostics visible to the entire team without MongoDB knowledge.
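As a small illustration of the `/record` route contract, here is a hand-rolled query-string lookup. The real `log_dumper` would get this for free from Axum's extractors, so this std-only version is purely illustrative and `query_param` is an invented name:

```rust
/// Return the value of `key` from a raw query string such as
/// "root_event_id=abc-123&level=INFO". (Axum's Query extractor does the
/// equivalent in the real utility.)
fn query_param<'a>(query: &'a str, key: &str) -> Option<&'a str> {
    query.split('&').find_map(|pair| {
        let (k, v) = pair.split_once('=')?;
        (k == key).then_some(v)
    })
}

fn main() {
    let query = "root_event_id=abc-123&level=INFO";
    match query_param(query, "root_event_id") {
        Some(id) => println!("render single-record view for root {}", id),
        None => println!("render dashboard"),
    }
}
```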
## Operational Safety: Dev-Only Purge Controls

**Status:** Completed in `src/main.rs` and the config system

**Change:** Added dev-only purge logic to reset logger collections on IPL in non-production environments.

**Details:**
- IPL checks the `config.logger_admin.purge_on_ipl` flag.
- If true and the node is not production, IPL purges the collections listed in `config.logger_admin.purge_collections` before logging startup events.
- Prevents accidental production data loss (the flag is only honored in non-prod node roles).
- `env_dev.toml` enables this by default for frictionless dev iteration.
**Why:** Closes the dev/prod gap. Enables safe, repeatable testing without manual intervention. Prevents stale logger state from polluting diagnostics.
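The guard described above distills to a single predicate. `should_purge_on_ipl` and the `"production"` role string are assumptions for illustration; the real check lives in `src/main.rs`:

```rust
/// Purge is performed only when the config flag is set AND the node role is
/// non-production — the flag alone is never enough.
fn should_purge_on_ipl(purge_on_ipl: bool, node_role: &str) -> bool {
    purge_on_ipl && node_role != "production"
}

fn main() {
    for (flag, role) in [(true, "dev"), (true, "production"), (false, "dev")] {
        println!(
            "purge_on_ipl={} role={} -> purge: {}",
            flag,
            role,
            should_purge_on_ipl(flag, role)
        );
    }
}
```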
## Commit Summary

This hardening phase encompasses:
- Runtime lifecycle: Daemon model, coordinated shutdown, graceful exit.
- Broker semantics: Shutdown operation integration, channel-based signaling.
- Logging infrastructure: Persistent IPL events, root GUID lineage, structured context.
- Developer experience: Trace control, purge controls, web-based observability.
- Configuration: New `trace_on` and `logger_admin` namespaces.
- Tooling: Modern Rust observability utility replacing the legacy PHP dumper.
**Files Changed:**

- `src/main.rs`: resident runtime loop, IPL logging, shutdown coordination, trace control.
- `src/brokers/dispatcher.rs`: shutdown operation handling, global shutdown listening.
- `src/brokers/mod.rs`: dispatcher pool accepts shutdown channels.
- `src/brokers/logger_store.rs`: root GUID chain fetch operations, structured logging helpers.
- `src/config/structs.rs`: `trace_on`, `logger_admin` config types.
- `src/bin/log_dumper.rs`: new modern observability utility (Axum web UI).
- `config/env_dev.toml`: dev overrides enabling trace/purge controls.
- `Cargo.toml` / `Cargo.lock`: added `axum`, `chrono`, `uuid` dependencies.
- Wiki updates: `Home.md`, `04-ipl.md`, `06-queue-topology.md`, `10-modernization-roadmap.md`, new `11-beds-architecture-visual-brief.md`.
**Next Phase:** Autoscaling heuristics, metric collection, and cross-node coordinator election (deferred).