rustybeds/wiki/12-architecture-deltas.md
gramps 3c54635924 docs: comprehensive architecture delta record for hardening phase
Catalogs all architectural changes from resident runtime implementation:
- Runtime model: daemon-like process with coordinated shutdown
- Broker dispatch: shutdown operation integration
- Logger persistence: explicit IPL logging to MongoDB with root GUID lineage
- Developer diagnostics: chain tracing and web-based observability
- Config system: trace_on and logger_admin controls
- Observability utility: modern log_dumper web UI (replaces legacy PHP dumper)
- Operational safety: dev-only purge-on-IPL controls

Files modified: 13 (src/main.rs, brokers/*, config/*, bin/log_dumper.rs, Cargo.*, wiki/*)
Dependencies added: axum, chrono, uuid

See wiki/12-architecture-deltas.md for full details.
2026-04-10 17:12:01 -07:00

# Architecture Deltas — Recent Hardening Phase
This document catalogs all architectural and design changes made in the recent hardening phase.
## Runtime Model: Daemon-Like Resident Process
**Status**: Completed in `src/main.rs`
**Change**: Converted from startup-IPL-then-exit to a resident, coordinated-shutdown runtime.
**Details**:
- IPL loads config, validates services, initializes broker pools, then enters an event loop.
- Loop waits for either:
  - Global shutdown signal (broadcast from dispatcher when AMQP `shutdown` command received).
  - User interrupt (Ctrl+C).
- On signal, loop cleanly shuts down Tokio tasks and exits with status code 0.
**Why**: Aligns with operational daemon expectations (systemd, orchestrators). Ensures graceful lifecycle rather than abrupt termination. Supports hot-reload/redeployment workflows.
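A minimal sketch of the resident loop described above, under stated assumptions: it uses only `std` threads and channels so it stays self-contained, whereas the real `src/main.rs` shuts down Tokio tasks. `ShutdownReason`, `Worker`, `spawn_worker`, and `run_until_shutdown` are hypothetical names chosen for illustration, not identifiers taken from the codebase.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// Why the resident loop is stopping (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShutdownReason {
    /// A dispatcher received the AMQP `shutdown` command.
    BrokerCommand,
    /// The operator pressed Ctrl+C.
    Interrupt,
}

/// Stand-in for a long-running broker task.
pub struct Worker {
    stop: Sender<()>,
    handle: thread::JoinHandle<()>,
}

/// Spawns a worker that parks until it is told to stop.
pub fn spawn_worker() -> Worker {
    let (stop, stopped) = mpsc::channel::<()>();
    let handle = thread::spawn(move || {
        // Returns on an explicit stop message or when the sender is dropped.
        let _ = stopped.recv();
    });
    Worker { stop, handle }
}

/// Blocks until a shutdown reason arrives, stops and joins every worker,
/// then returns the reason together with the process exit code (0).
pub fn run_until_shutdown(
    signals: Receiver<ShutdownReason>,
    workers: Vec<Worker>,
) -> (ShutdownReason, i32) {
    // If every sender is gone, treat the closed channel like an interrupt.
    let reason = signals.recv().unwrap_or(ShutdownReason::Interrupt);
    for worker in workers {
        let _ = worker.stop.send(());
        // A panicked worker must not prevent the others from being joined.
        let _ = worker.handle.join();
    }
    (reason, 0)
}
```

In this sketch both shutdown sources feed one channel, so the loop has a single wait point and the exit path is the same regardless of which source fired.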
---
## Broker Dispatch: Unified Consumer with Shutdown Semantics
**Status**: Completed in `src/brokers/dispatcher.rs` and `src/brokers/mod.rs`
**Change**: Integrated shutdown command handling into the unified dispatcher consumer.
**Details**:
- Dispatcher pool now receives a global `shutdown_tx` channel at spawn time.
- Each dispatcher consumer listens for AMQP `shutdown` operation.
- On `shutdown`: acknowledge the message, broadcast shutdown signal to all peers, and exit cleanly.
- All dispatchers also listen on the global shutdown channel and exit if signaled externally.
**Why**: Enables coordinated, multi-node shutdown without forceful process kill. Aligns with AMQP message semantics (shutdown is a standard operation, not a runtime hack).
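The ordering in the `shutdown` path (acknowledge, broadcast, exit) can be sketched as a pure decision function. This is an illustration only: `Step`, `plan_for_operation`, and `plan_for_external_shutdown` are hypothetical names, and the real dispatcher performs these steps against an AMQP channel rather than returning a plan.

```rust
/// One action a dispatcher consumer takes for a message (illustrative).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Step {
    /// Acknowledge the AMQP delivery so it is not redelivered.
    Ack,
    /// Notify every peer dispatcher through the global shutdown channel.
    BroadcastShutdown,
    /// Leave the consume loop.
    ExitConsumer,
    /// Hand any other operation to its normal handler.
    Dispatch(String),
}

/// Plans the steps for an incoming operation name.
pub fn plan_for_operation(operation: &str) -> Vec<Step> {
    match operation {
        // Ack first, so the shutdown message is not redelivered to a peer
        // after this consumer has exited.
        "shutdown" => vec![Step::Ack, Step::BroadcastShutdown, Step::ExitConsumer],
        other => vec![Step::Dispatch(other.to_string())],
    }
}

/// Plans the steps when the global shutdown channel fires externally:
/// there is no message to acknowledge and no need to re-broadcast.
pub fn plan_for_external_shutdown() -> Vec<Step> {
    vec![Step::ExitConsumer]
}
```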
---
## Logger: Explicit IPL Persistence to MongoDB
**Status**: Completed in `src/main.rs` and `src/brokers/logger_store.rs`
**Change**: IPL startup/failure events now explicitly persisted to `msLogs` collection with structured context.
**Details**:
- Root GUID generated at IPL start; all startup events tagged with this root ID.
- Structured log entries include:
  - `root_event_id`: chains all startup events to a single root.
  - `timestamp`: human-readable ISO 8601 format.
  - `node_id`: configured node name/role.
  - `event_type`: IPL phase (e.g., "ipl_start", "service_validated", "broker_pool_spawned", "ipl_complete").
  - `message`: human-readable summary.
  - `metadata`: optional structured context (validation results, latency, etc.).
- If IPL fails, best-effort logging of failure event to Mongo before process exit.
- After a successful IPL, sample entries at the INFO, WARN, and ERROR levels are written for visibility.
**Why**: Startup is traditionally the hardest phase to debug because its logs are often lost. Persistent, queryable startup context enables post-mortem analysis of deployment and initialization issues. The root GUID enables chain-crawl diagnostics across distributed startup events.
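To make the entry shape concrete, here is a hedged sketch of a structured IPL log entry carrying the fields listed above. The field names come from this document; the Rust types, the `IplLogger` helper, and its methods are assumptions for illustration. The root ID and timestamp are passed in as strings because the real code presumably generates them with `uuid` and `chrono`, which this std-only sketch avoids.

```rust
use std::collections::BTreeMap;

/// One structured startup event (field names follow the list above;
/// the Rust types are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct IplLogEntry {
    pub root_event_id: String,
    pub timestamp: String,
    pub node_id: String,
    pub event_type: String,
    pub message: String,
    pub metadata: Option<BTreeMap<String, String>>,
}

impl IplLogEntry {
    /// Attaches one key/value pair of optional context.
    pub fn with_metadata(mut self, key: &str, value: &str) -> Self {
        self.metadata
            .get_or_insert_with(BTreeMap::new)
            .insert(key.to_string(), value.to_string());
        self
    }
}

/// Hypothetical helper that stamps every event with the same root ID,
/// so all startup events chain back to one root.
pub struct IplLogger {
    root_event_id: String,
    node_id: String,
}

impl IplLogger {
    pub fn new(root_event_id: &str, node_id: &str) -> Self {
        IplLogger {
            root_event_id: root_event_id.to_string(),
            node_id: node_id.to_string(),
        }
    }

    /// Builds an entry for one IPL phase; `timestamp` is expected to be
    /// an ISO 8601 string supplied by the caller.
    pub fn event(&self, timestamp: &str, event_type: &str, message: &str) -> IplLogEntry {
        IplLogEntry {
            root_event_id: self.root_event_id.clone(),
            timestamp: timestamp.to_string(),
            node_id: self.node_id.clone(),
            event_type: event_type.to_string(),
            message: message.to_string(),
            metadata: None,
        }
    }
}
```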
---
## Developer Diagnostics: Root GUID Lineage and Chain Tracing
**Status**: Completed in `src/brokers/logger_store.rs` and `src/bin/log_dumper.rs`
**Change**: Added root GUID-based event chain tracing and query layer.
**Details**:
- `logger_store::fetch_chain(root_event_id, limit)`: retrieve all events tagged with a root ID, sorted by timestamp.
- `logger_store::fetch_root_record(root_event_id)`: retrieve the initiating root event.
- `log_dumper` web UI exposes:
  - Root GUID input field to query and visualize entire event chain.
  - Single-record view at `/record?root_event_id=...` to inspect individual startup context.
  - Arrow-trigger UX for expanding compact row summaries without constant page reload.
**Why**: Enables developers to rapidly correlate events across a single startup sequence or transaction. Reduces manual log sifting. Scales from single node to multi-node deployments.
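The chain query can be illustrated with an in-memory stand-in. This is a sketch, not the `logger_store` implementation: the real functions query MongoDB and their exact signatures are not shown in this document. `LogRecord`, `fetch_chain_in_memory`, and `fetch_root_record_in_memory` are hypothetical, and the sketch assumes that timestamps share one ISO 8601 format (so string order matches time order) and that the root record is the earliest event in its chain.

```rust
/// Minimal record shape for the sketch (a subset of the logged fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LogRecord {
    pub root_event_id: String,
    pub timestamp: String,
    pub event_type: String,
}

/// Returns up to `limit` records tagged with `root_event_id`,
/// sorted by timestamp ascending.
pub fn fetch_chain_in_memory(
    records: &[LogRecord],
    root_event_id: &str,
    limit: usize,
) -> Vec<LogRecord> {
    let mut chain: Vec<LogRecord> = records
        .iter()
        .filter(|r| r.root_event_id == root_event_id)
        .cloned()
        .collect();
    // Uniformly formatted ISO 8601 strings sort chronologically.
    chain.sort_by(|a, b| a.timestamp.cmp(&b.timestamp));
    chain.truncate(limit);
    chain
}

/// Returns the initiating event of a chain, assumed here to be the
/// earliest record carrying the root ID.
pub fn fetch_root_record_in_memory(
    records: &[LogRecord],
    root_event_id: &str,
) -> Option<LogRecord> {
    fetch_chain_in_memory(records, root_event_id, 1).into_iter().next()
}
```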
---
## Configuration: Trace-On and Logger Admin Controls
**Status**: Completed in `src/config/structs.rs` and `config/env_dev.toml`
**Change**: Added two new config namespaces for developer and administrative control.
**Details**:
### `[runtime.trace_on]`
- Boolean flag (default: false in production, true in `env_dev.toml`).
- When true, logs method entry/exit at TRACE level for all broker consumers and core trait implementations.
- Lets developers narrow down causality in complex message flows without adding manual instrumentation.
### `[logger_admin]`
- `purge_on_ipl` (boolean, default: false): when true, IPL automatically purges the named collections before startup logging begins. The flag is honored only on non-production nodes (see Operational Safety).
- `purge_collections` (array of strings): list of collection names to purge (e.g., `["msLogs", "msErrors"]`).
- Enables clean dev iteration: each `cargo run` in dev automatically resets logger state.
**Why**: Reduces friction in dev loops. Trace-on avoids printf debugging. Purge-on-IPL ensures each test iteration starts fresh without manual `mongo` CLI cleanup.
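A possible shape for these config types, as a sketch only. The actual definitions live in `src/config/structs.rs` and are presumably deserialized from TOML; this version omits deserialization to stay std-only. The struct names and `dev_overrides` are hypothetical, and the collection names are the example values from the list above, not confirmed contents of `env_dev.toml`.

```rust
/// Runtime switches; `trace_on` defaults to false, matching production.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct RuntimeConfig {
    pub trace_on: bool,
}

/// Logger administration controls; purging is off by default.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct LoggerAdminConfig {
    pub purge_on_ipl: bool,
    pub purge_collections: Vec<String>,
}

/// Hypothetical top-level container for the two namespaces.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct AppConfig {
    pub runtime: RuntimeConfig,
    pub logger_admin: LoggerAdminConfig,
}

/// Mirrors what a dev override file might enable: tracing on and
/// purge-on-IPL active for the example collections.
pub fn dev_overrides() -> AppConfig {
    AppConfig {
        runtime: RuntimeConfig { trace_on: true },
        logger_admin: LoggerAdminConfig {
            purge_on_ipl: true,
            purge_collections: vec!["msLogs".to_string(), "msErrors".to_string()],
        },
    }
}
```

Deriving `Default` keeps the safe values (no tracing, no purging) as the fallback whenever an environment file omits a field.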
---
## Observability Utility: Modern Logger Reader (log_dumper)
**Status**: Completed in `src/bin/log_dumper.rs`
**Change**: Built a modern Rust equivalent to legacy PHP `utilities/dumper.php` for browsing `msLogs`.
**Details**:
- **Web UI** (Axum):
  - Dashboard route `/` with seed-write action, quick filter by level/node, root GUID chain input.
  - Compact row layout: timestamp | level | node | message snippet | arrow (expand).
  - Single-record view `/record?root_event_id=...` showing full event context.
  - Arrow-trigger expansion shows full message without full-page refresh.
- **Features**:
  - Human-readable timestamps (ISO 8601 formatted).
  - Seed-write to create test events and validate logger pipeline.
  - Root chain traversal via GUID input.
  - Dev-centric UX: minimal clicks, maximum information density.
**Why**: Centralizes observability in a single web interface. Replaces manual CLI-based querying. Makes startup diagnostics visible to the entire team without requiring MongoDB knowledge.
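The compact row layout can be sketched as a small formatting function. This is illustrative only: `compact_row` and its `max_chars` parameter are hypothetical, and the real `log_dumper` renders HTML through Axum rather than plain text.

```rust
/// Formats one log record as a compact row:
/// timestamp | level | node | message snippet | arrow.
pub fn compact_row(
    timestamp: &str,
    level: &str,
    node: &str,
    message: &str,
    max_chars: usize,
) -> String {
    // Truncate by characters, not bytes, so multi-byte UTF-8 text is never split.
    let mut snippet: String = message.chars().take(max_chars).collect();
    if message.chars().count() > max_chars {
        snippet.push_str("...");
    }
    // The trailing ">" stands in for the expand arrow.
    format!("{} | {} | {} | {} | >", timestamp, level, node, snippet)
}
```

For example, `compact_row("2026-04-10T17:12:01Z", "INFO", "node-a", "broker pool spawned", 6)` yields `2026-04-10T17:12:01Z | INFO | node-a | broker... | >`.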
---
## Operational Safety: Dev-Only Purge Controls
**Status**: Completed in `src/main.rs` and config system
**Change**: Added dev-only purge logic to reset logger collections on IPL in non-production environments.
**Details**:
- IPL checks `config.logger_admin.purge_on_ipl` flag.
- If true and node is not production, purges collections listed in `config.logger_admin.purge_collections` before logging startup events.
- Prevents accidental production data loss (flag only honored in non-prod node roles).
- `env_dev.toml` enables this by default for frictionless dev iteration.
**Why**: Closes dev/prod gap. Enables safe, repeatable testing without manual intervention. Prevents stale logger state from polluting diagnostics.
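The production guard reduces to a two-condition check, sketched here under stated assumptions. `LoggerAdmin` and `collections_to_purge` are hypothetical names, and how the real IPL decides that a node is production (the document says only "non-prod node roles") is not modeled; the sketch takes that decision as a boolean input.

```rust
/// Purge-related settings (field names follow `logger_admin` above).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LoggerAdmin {
    pub purge_on_ipl: bool,
    pub purge_collections: Vec<String>,
}

/// Returns the collections IPL should purge before logging startup events.
/// The result is empty unless the flag is set AND the node is non-production,
/// so a misconfigured production node never purges anything.
pub fn collections_to_purge(admin: &LoggerAdmin, is_production: bool) -> Vec<String> {
    if admin.purge_on_ipl && !is_production {
        admin.purge_collections.clone()
    } else {
        Vec::new()
    }
}
```

Keeping the role check inside the function, rather than relying on the config file alone, means the flag cannot take effect in production even if `purge_on_ipl` is accidentally set there.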
---
## Commit Summary
This hardening phase encompasses:
1. **Runtime lifecycle**: Daemon model, coordinated shutdown, graceful exit.
2. **Broker semantics**: Shutdown operation integration, channel-based signaling.
3. **Logging infrastructure**: Persistent IPL events, root GUID lineage, structured context.
4. **Developer experience**: Trace control, purge controls, web-based observability.
5. **Configuration**: New `trace_on` and `logger_admin` namespaces.
6. **Tooling**: Modern Rust observability utility replacing legacy PHP dumper.
**Files Changed**:
- `src/main.rs`: resident runtime loop, IPL logging, shutdown coordination, trace control.
- `src/brokers/dispatcher.rs`: shutdown operation handling, global shutdown listening.
- `src/brokers/mod.rs`: dispatcher pool accepts shutdown channels.
- `src/brokers/logger_store.rs`: root GUID chain fetch operations, structured logging helpers.
- `src/config/structs.rs`: `trace_on`, `logger_admin` config types.
- `src/bin/log_dumper.rs`: new modern observability utility (Axum web UI).
- `config/env_dev.toml`: dev overrides enabling trace/purge controls.
- `Cargo.toml` / `Cargo.lock`: added `axum`, `chrono`, `uuid` dependencies.
- Wiki updates: `Home.md`, `04-ipl.md`, `06-queue-topology.md`, `10-modernization-roadmap.md`, new `11-beds-architecture-visual-brief.md`.
**Next Phase**: Autoscaling heuristics, metric collection, and cross-node coordinator election (deferred).