rustybeds/wiki/12-architecture-deltas.md
gramps 3c54635924 docs: comprehensive architecture delta record for hardening phase
Catalogs all architectural changes from resident runtime implementation:
- Runtime model: daemon-like process with coordinated shutdown
- Broker dispatch: shutdown operation integration
- Logger persistence: explicit IPL logging to MongoDB with root GUID lineage
- Developer diagnostics: chain tracing and web-based observability
- Config system: trace_on and logger_admin controls
- Observability utility: modern log_dumper web UI (replaces legacy PHP dumper)
- Operational safety: dev-only purge-on-IPL controls

Files modified: 13 (src/main.rs, brokers/*, config/*, bin/log_dumper.rs, Cargo.*, wiki/*)
Dependencies added: axum, chrono, uuid

See wiki/12-architecture-deltas.md for full details.
2026-04-10 17:12:01 -07:00


Architecture Deltas — Recent Hardening Phase

This document catalogs all architectural and design changes made in the recent hardening phase.

Runtime Model: Daemon-Like Resident Process

Status: Completed in src/main.rs

Change: Converted the process from a run-IPL-then-exit model to a resident runtime with coordinated shutdown.

Details:

  • IPL loads the config, validates services, initializes the broker pools, then enters an event loop.
  • The loop waits for either:
    • A global shutdown signal (broadcast by a dispatcher when an AMQP shutdown command is received).
    • A user interrupt (Ctrl+C).
  • On either signal, the loop cleanly shuts down the Tokio tasks and exits with status code 0.

Why: Aligns with operational daemon expectations (systemd, orchestrators). Ensures graceful lifecycle rather than abrupt termination. Supports hot-reload/redeployment workflows.
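
The lifecycle above can be sketched with the standard library alone. This is a simplified illustration, not the code in src/main.rs: it swaps Tokio tasks and signals for OS threads and an mpsc channel, and the names StopReason, spawn_worker, and run_resident are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::Receiver;
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Why the resident loop is ending (hypothetical names for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StopReason {
    /// A dispatcher received the shutdown command and broadcast it.
    GlobalShutdown,
    /// The operator pressed Ctrl+C.
    UserInterrupt,
}

/// Stand-in for a broker consumer: loops until the shared flag is set.
pub fn spawn_worker(stopping: Arc<AtomicBool>) -> JoinHandle<()> {
    thread::spawn(move || {
        while !stopping.load(Ordering::SeqCst) {
            // Placeholder for consuming and handling one message.
            thread::sleep(Duration::from_millis(5));
        }
    })
}

/// Starts the workers, blocks until a stop signal arrives, then stops and
/// joins every worker. Returns the stop reason and the exit code (0).
pub fn run_resident(
    stop_rx: Receiver<StopReason>,
    worker_count: usize,
) -> (Option<StopReason>, i32) {
    let stopping = Arc::new(AtomicBool::new(false));
    let workers: Vec<JoinHandle<()>> = (0..worker_count)
        .map(|_| spawn_worker(Arc::clone(&stopping)))
        .collect();

    // Parks the thread until a signal is sent; None means every sender
    // was dropped, which is also treated as a reason to stop.
    let reason = stop_rx.recv().ok();

    // Coordinated shutdown: tell the workers to stop, then wait for them.
    stopping.store(true, Ordering::SeqCst);
    for worker in workers {
        let _ = worker.join();
    }
    (reason, 0)
}
```

In this sketch both signal sources would hold a clone of the channel's sender, so the loop reacts to whichever fires first.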


Broker Dispatch: Unified Consumer with Shutdown Semantics

Status: Completed in src/brokers/dispatcher.rs and src/brokers/mod.rs

Change: Integrated shutdown command handling into the unified dispatcher consumer.

Details:

  • The dispatcher pool now receives a global shutdown_tx channel at spawn time.
  • Each dispatcher consumer listens for the AMQP shutdown operation.
  • On shutdown, a consumer acknowledges the message, broadcasts the shutdown signal to all peers, and exits cleanly.
  • All dispatchers also listen on the global shutdown channel and exit if shutdown is signaled externally.

Why: Enables coordinated, multi-node shutdown without forcefully killing processes. Aligns with AMQP message semantics (shutdown is handled as a standard operation, not a runtime hack).
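
A minimal, std-only sketch of the consumer behavior described above. It is an assumption-laden model rather than the code in src/brokers/dispatcher.rs: an AtomicBool stands in for the broadcast channel behind shutdown_tx, an iterator stands in for AMQP deliveries, and Operation, Action, and run_dispatcher are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Operations a dispatcher may receive (hypothetical, for illustration).
#[derive(Debug, Clone, PartialEq)]
pub enum Operation {
    Work(String),
    Shutdown,
}

/// What the consumer did, recorded so the flow is easy to inspect.
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Processed(String),
    Acked,
    BroadcastShutdown,
    Exited,
}

/// Consumes deliveries until the input ends, a shutdown operation arrives,
/// or the global flag shows that shutdown was signaled elsewhere.
pub fn run_dispatcher<I>(deliveries: I, global_shutdown: &AtomicBool) -> Vec<Action>
where
    I: IntoIterator<Item = Operation>,
{
    let mut actions = Vec::new();
    for op in deliveries {
        // Exit if a peer (or the main loop) already signaled shutdown.
        if global_shutdown.load(Ordering::SeqCst) {
            break;
        }
        match op {
            Operation::Work(payload) => actions.push(Action::Processed(payload)),
            Operation::Shutdown => {
                // Acknowledge first so the broker does not redeliver the command.
                actions.push(Action::Acked);
                // Then notify every peer through the shared flag.
                global_shutdown.store(true, Ordering::SeqCst);
                actions.push(Action::BroadcastShutdown);
                break;
            }
        }
    }
    actions.push(Action::Exited);
    actions
}
```

Acknowledging before broadcasting matters in this model: if the consumer exited without the ack, a broker would typically redeliver the shutdown command to the next consumer that connects.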


Logger: Explicit IPL Persistence to MongoDB

Status: Completed in src/main.rs and src/brokers/logger_store.rs

Change: IPL startup and failure events are now explicitly persisted to the msLogs collection with structured context.

Details:

  • Root GUID generated at IPL start; all startup events tagged with this root ID.
  • Structured log entries include:
    • root_event_id: chains all startup events to a single root.
    • timestamp: human-readable ISO 8601 format.
    • node_id: configured node name/role.
    • event_type: IPL phase (e.g., "ipl_start", "service_validated", "broker_pool_spawned", "ipl_complete").
    • message: human-readable summary.
    • metadata: optional structured context (validation results, latency, etc.).
  • If IPL fails, the failure event is logged to MongoDB on a best-effort basis before the process exits.
  • After a successful IPL, sample entries at each log level (INFO, WARN, ERROR) are emitted for visibility.

Why: Startup is traditionally the hardest phase to debug because early logs are often lost. Persistent, queryable startup context enables post-mortem analysis of deployment and initialization issues. The root GUID enables chain-crawl diagnostics across distributed startup events.
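
The entry shape listed above could be modeled roughly as follows. This is a sketch under assumptions: the field names come from the list, but the Rust types, the IplLogger helper, and the in-memory Vec (standing in for the msLogs collection) are hypothetical, and the root ID and timestamps are passed in as plain strings instead of being generated with uuid and chrono.

```rust
use std::collections::BTreeMap;

/// One structured startup log entry, using the field names listed above.
#[derive(Debug, Clone, PartialEq)]
pub struct IplLogEntry {
    pub root_event_id: String,
    pub timestamp: String,
    pub node_id: String,
    pub event_type: String,
    pub message: String,
    pub metadata: Option<BTreeMap<String, String>>,
}

/// Hypothetical helper that tags every event with the same root ID.
pub struct IplLogger {
    root_event_id: String,
    node_id: String,
    entries: Vec<IplLogEntry>, // stand-in for the msLogs collection
}

impl IplLogger {
    /// The root ID is fixed once, at IPL start.
    pub fn new(root_event_id: &str, node_id: &str) -> Self {
        IplLogger {
            root_event_id: root_event_id.to_string(),
            node_id: node_id.to_string(),
            entries: Vec::new(),
        }
    }

    /// Records one IPL phase; the caller never supplies the root ID again.
    pub fn record(
        &mut self,
        timestamp: &str,
        event_type: &str,
        message: &str,
        metadata: Option<BTreeMap<String, String>>,
    ) {
        self.entries.push(IplLogEntry {
            root_event_id: self.root_event_id.clone(),
            timestamp: timestamp.to_string(),
            node_id: self.node_id.clone(),
            event_type: event_type.to_string(),
            message: message.to_string(),
            metadata,
        });
    }

    pub fn entries(&self) -> &[IplLogEntry] {
        &self.entries
    }
}
```

Fixing the root ID in the constructor is the design point: no call site can forget to tag an event, so every startup event stays reachable from a single ID.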


Developer Diagnostics: Root GUID Lineage and Chain Tracing

Status: Completed in src/brokers/logger_store.rs and src/bin/log_dumper.rs

Change: Added root GUID-based event chain tracing and query layer.

Details:

  • logger_store::fetch_chain(root_event_id, limit): retrieve all events tagged with a root ID, sorted by timestamp.
  • logger_store::fetch_root_record(root_event_id): retrieve the initiating root event.
  • log_dumper web UI exposes:
    • Root GUID input field to query and visualize entire event chain.
    • Single-record view at /record?root_event_id=... to inspect individual startup context.
    • Arrow-trigger UX for expanding compact row summaries without constant page reload.

Why: Enables developers to rapidly correlate events across a single startup sequence or transaction. Reduces manual log sifting. Scales from single node to multi-node deployments.
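
To make the query semantics concrete, here is an in-memory sketch of the two lookups. It is not the logger_store implementation, which queries MongoDB: the functions below take a slice of records instead, the names fetch_chain_in_memory and fetch_root_record_in_memory are hypothetical, and treating the earliest event as the root record is an assumption.

```rust
/// Minimal record shape for the sketch (the real entries carry more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct LogRecord {
    pub root_event_id: String,
    pub timestamp: String,
    pub event_type: String,
}

/// Returns up to `limit` events tagged with `root_event_id`, oldest first.
pub fn fetch_chain_in_memory(
    records: &[LogRecord],
    root_event_id: &str,
    limit: usize,
) -> Vec<LogRecord> {
    let mut chain: Vec<LogRecord> = records
        .iter()
        .filter(|r| r.root_event_id == root_event_id)
        .cloned()
        .collect();
    // Fixed-width ISO 8601 timestamps with the same UTC offset sort
    // chronologically when compared as plain strings.
    chain.sort_by(|a, b| a.timestamp.cmp(&b.timestamp));
    chain.truncate(limit);
    chain
}

/// Assumption: the initiating root event is the earliest event in the chain.
pub fn fetch_root_record_in_memory(
    records: &[LogRecord],
    root_event_id: &str,
) -> Option<LogRecord> {
    fetch_chain_in_memory(records, root_event_id, 1).into_iter().next()
}
```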


Configuration: Trace-On and Logger Admin Controls

Status: Completed in src/config/structs.rs and config/env_dev.toml

Change: Added two new config namespaces for developer and administrative control.

Details:

[runtime.trace_on]

  • Boolean flag (default: false in production, true in env_dev.toml).
  • When true, logs method entry/exit at TRACE level for all broker consumers and core trait implementations.
  • Lets developers narrow down causality in complex message flows without adding instrumentation to the code.

[logger_admin]

  • purge_on_ipl (boolean, default: false): during IPL, automatically purge the named collections before startup logging begins.
  • purge_collections (array of strings): list of collection names to purge (e.g., ["msLogs", "msErrors"]).
  • Enables clean dev iteration: each cargo run in dev automatically resets logger state.

Why: Reduces friction in dev loops. Trace-on avoids printf debugging. Purge-on-IPL ensures each test iteration starts fresh without manual mongo CLI cleanup.
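
As a rough illustration of how these settings might look as types, the sketch below mirrors the documented keys and defaults. It is not the contents of src/config/structs.rs: the struct names, the dev constructor, and the trace_line helper are hypothetical, deserialization from TOML is omitted, and trace_on is modeled as a boolean field of a runtime section, which is one possible reading of the layout above.

```rust
/// Runtime controls; `trace_on` defaults to false.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct RuntimeConfig {
    pub trace_on: bool,
}

/// Logger administration controls; purging is off by default.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct LoggerAdminConfig {
    pub purge_on_ipl: bool,
    pub purge_collections: Vec<String>,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct AppConfig {
    pub runtime: RuntimeConfig,
    pub logger_admin: LoggerAdminConfig,
}

impl AppConfig {
    /// Dev-style overrides in the spirit of env_dev.toml (values illustrative).
    pub fn dev() -> Self {
        AppConfig {
            runtime: RuntimeConfig { trace_on: true },
            logger_admin: LoggerAdminConfig {
                purge_on_ipl: true,
                purge_collections: vec!["msLogs".to_string(), "msErrors".to_string()],
            },
        }
    }
}

/// Builds a method entry/exit trace line only when tracing is enabled.
pub fn trace_line(runtime: &RuntimeConfig, phase: &str, method: &str) -> Option<String> {
    if runtime.trace_on {
        Some(format!("TRACE {} {}", phase, method))
    } else {
        None
    }
}
```

Deriving Default keeps the production-safe values (tracing off, purging off) as the baseline, so a dev file only has to state what it turns on.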


Observability Utility: Modern Logger Reader (log_dumper)

Status: Completed in src/bin/log_dumper.rs

Change: Built a modern Rust equivalent of the legacy PHP utility utilities/dumper.php for browsing msLogs.

Details:

  • Web UI (Axum):

    • Dashboard route / with seed-write action, quick filter by level/node, root GUID chain input.
    • Compact row layout: timestamp | level | node | message snippet | arrow (expand).
    • Single-record view /record?root_event_id=... showing full event context.
    • Arrow-trigger expansion shows full message without full-page refresh.
  • Features:

    • Human-readable timestamps (ISO 8601 formatted).
    • Seed-write to create test events and validate logger pipeline.
    • Root chain traversal via GUID input.
    • Dev-centric UX: minimal clicks, maximum information density.

Why: Centralizes observability in a single web interface. Replaces manual CLI-based querying. Makes startup diagnostics visible to the entire team without requiring MongoDB knowledge.
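
The compact row layout can be illustrated with a small formatting helper. This is only a sketch of the text layout, not the HTML rendering in src/bin/log_dumper.rs: LogRow and compact_row are hypothetical names, and the trailing ">" is a placeholder for the expand arrow.

```rust
/// The fields shown in one compact dashboard row.
pub struct LogRow<'a> {
    pub timestamp: &'a str,
    pub level: &'a str,
    pub node: &'a str,
    pub message: &'a str,
}

/// Formats `timestamp | level | node | message snippet | arrow`,
/// truncating the message to at most `max_chars` characters.
pub fn compact_row(row: &LogRow<'_>, max_chars: usize) -> String {
    // Truncate by characters, not bytes, so multi-byte text is never split.
    let mut snippet: String = row.message.chars().take(max_chars).collect();
    if row.message.chars().count() > max_chars {
        snippet.push_str("...");
    }
    format!(
        "{} | {} | {} | {} | >",
        row.timestamp, row.level, row.node, snippet
    )
}
```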


Operational Safety: Dev-Only Purge Controls

Status: Completed in src/main.rs and config system

Change: Added dev-only purge logic to reset logger collections on IPL in non-production environments.

Details:

  • IPL checks config.logger_admin.purge_on_ipl flag.
  • If true and node is not production, purges collections listed in config.logger_admin.purge_collections before logging startup events.
  • Prevents accidental production data loss (flag only honored in non-prod node roles).
  • env_dev.toml enables this by default for frictionless dev iteration.

Why: Keeps the purge convenience confined to development while leaving production data untouched. Enables safe, repeatable testing without manual intervention. Prevents stale logger state from polluting diagnostics.
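
The guard described above reduces to a two-condition check. The sketch below assumes a NodeRole enum and a collections_to_purge helper, neither of which is taken from the source; how the real code identifies a production node is not documented here.

```rust
/// Hypothetical node roles; only the production check matters for the guard.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NodeRole {
    Dev,
    Staging,
    Production,
}

/// Mirrors the documented logger_admin keys.
pub struct LoggerAdmin {
    pub purge_on_ipl: bool,
    pub purge_collections: Vec<String>,
}

/// Returns the collections that may be purged during this IPL.
/// The flag is honored only on non-production nodes, so a stray
/// `purge_on_ipl = true` in a production config purges nothing.
pub fn collections_to_purge(admin: &LoggerAdmin, role: NodeRole) -> &[String] {
    if admin.purge_on_ipl && role != NodeRole::Production {
        &admin.purge_collections
    } else {
        &[]
    }
}
```

Returning the list of collections, instead of purging inside the check, keeps the safety decision in one pure function that can be tested without a database.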


Commit Summary

This hardening phase encompasses:

  1. Runtime lifecycle: Daemon model, coordinated shutdown, graceful exit.
  2. Broker semantics: Shutdown operation integration, channel-based signaling.
  3. Logging infrastructure: Persistent IPL events, root GUID lineage, structured context.
  4. Developer experience: Trace control, purge controls, web-based observability.
  5. Configuration: New trace_on and logger_admin namespaces.
  6. Tooling: Modern Rust observability utility replacing legacy PHP dumper.

Files Changed:

  • src/main.rs: resident runtime loop, IPL logging, shutdown coordination, trace control.
  • src/brokers/dispatcher.rs: shutdown operation handling, global shutdown listening.
  • src/brokers/mod.rs: dispatcher pool accepts shutdown channels.
  • src/brokers/logger_store.rs: root GUID chain fetch operations, structured logging helpers.
  • src/config/structs.rs: trace_on, logger_admin config types.
  • src/bin/log_dumper.rs: new modern observability utility (Axum web UI).
  • config/env_dev.toml: dev overrides enabling trace/purge controls.
  • Cargo.toml / Cargo.lock: added axum, chrono, uuid dependencies.
  • Wiki updates: Home.md, 04-ipl.md, 06-queue-topology.md, 10-modernization-roadmap.md, new 11-beds-architecture-visual-brief.md.

Next Phase: Autoscaling heuristics, metric collection, and cross-node coordinator election (deferred).