Add MariaDB IPL validation, topology docs in beds.toml, and developer wiki

- Add MariaDB (REL) IPL validation — master required, secondary non-fatal
- Add RelNodeConfig / RelInstanceConfig structs with master/secondary pattern
- Add rel_services section to beds.toml and test fixture
- Add detailed topology commentary to beds.toml covering standalone,
  master/replica, Galera cluster, and multi-DB-per-node configurations
- Add developer wiki (wiki/) covering:
    - Origin story — PHP Namaste history, production record, why Rust
    - Architecture overview — full system diagram, all layers explained
    - The four nodes — appServer, admin, segundo, tercero with real-world context
    - IPL sequence — every step documented with rationale for ordering
    - Configuration system — layering, env selection, adding new sections
    - Queue topology — exchanges, routing keys, broker bindings, vhost isolation
    - Template system — REC/REL, TLA convention, cache map, warehousing
    - Event lineage — compound event IDs, parent/child tracking, msLogs schema
    - Glossary
- Update README with wiki index and MariaDB status

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 15:41:28 -07:00
parent 2ce87710ff
commit 2a9afe7d77
19 changed files with 1798 additions and 12 deletions


@@ -88,6 +88,7 @@ rustybeds/
│ │ ├── mod.rs # Loader — load() and load_from() for testability
│ │ └── structs.rs # Typed config structs (serde Deserialize)
│ ├── amqp.rs # RabbitMQ transport — validate(), future channel/queue ops
│ ├── mariadb.rs # MariaDB transport — validate_all(), future adapter ops
│ ├── mongo.rs # MongoDB transport — validate_all(), future adapter ops
│ ├── lib.rs # Public API surface for integration test harness
│ ├── logging.rs # tracing + journald init
@@ -139,7 +140,7 @@ The `config` crate deep-merges these at startup. Only keys present in the env fi
| RabbitMQ reachability validation | Done |
| Unit test scaffolding + config fixture pattern | Done |
| MongoDB reachability validation | Done |
| MariaDB reachability validation | Done |
| Shared filesystem validation | Next |
| AMQP channel / queue declaration | Planned |
| Broker pool (Tokio tasks) | Planned |
@@ -150,6 +151,22 @@ The `config` crate deep-merges these at startup. Only keys present in the env fi
---
## Developer Wiki
Full framework documentation lives in [`wiki/`](wiki/Home.md):
- [Origin Story](wiki/01-origin-story.md) — Where BEDS came from and why it was built the way it was
- [Architecture Overview](wiki/02-architecture.md) — Full system design and core principles
- [The Four Nodes](wiki/03-nodes.md) — appServer, admin, segundo, tercero
- [IPL — Initial Program Load](wiki/04-ipl.md) — Bootstrap sequence, step by step
- [Configuration System](wiki/05-configuration.md) — Layered TOML, env files, topology options
- [Queue Topology](wiki/06-queue-topology.md) — AMQP exchanges, queues, routing keys
- [Template System](wiki/08-template-system.md) — REC and REL templates, TLA convention
- [Event Lineage](wiki/09-event-lineage.md) — Compound event IDs, parent/child tracking
- [Glossary](wiki/glossary.md) — Terms and abbreviations
---
## Performance Baseline
The PHP predecessor achieved:


@@ -32,6 +32,7 @@ rustybeds/
│ │ ├── mod.rs # load() + load_from() — layered TOML config
│ │ └── structs.rs # Typed config structs (serde Deserialize)
│ ├── amqp.rs # RabbitMQ transport — validate(), future channel/queue ops
│ ├── mariadb.rs # MariaDB transport — validate_all(), master/secondary pattern
│ ├── mongo.rs # MongoDB transport — validate_all(), future adapter ops
│ ├── lib.rs # Public API surface for integration test harness
│ ├── logging.rs # tracing + journald + console mirror init


@@ -1,41 +1,376 @@
# =============================================================================
# beds.toml — BEDS Base Configuration
# =============================================================================
#
# This is the base configuration file for a BEDS node. It contains
# production-safe defaults and is checked into source control. Sensitive
# values (passwords, hostnames) live in the env override file (env_dev.toml,
# env_qa.toml, env_prod.toml) which is NEVER committed.
#
# HOW LAYERING WORKS:
# -------------------
# BEDS loads this file first, then deep-merges the env override on top.
# Only keys present in the env file are overridden — everything else
# inherits from here. This means you can override a single password without
# duplicating the entire config.
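#
# EXAMPLE (illustrative values, not part of this file's defaults): suppose
# this base file contains
#     pass = "changeme"
#     port = 3306
# and your env_dev.toml contains only
#     pass = "devpassword"
# The merged config keeps port = 3306 from here and takes pass = "devpassword"
# from the override. The env file never repeats keys it does not change.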
#
# Set the BEDS_ENV environment variable to select the override file:
# BEDS_ENV=dev → loads config/env_dev.toml (default if unset)
# BEDS_ENV=qa → loads config/env_qa.toml
# BEDS_ENV=prod → loads config/env_prod.toml
#
# AUTHOR: mks
# VERSION: 1.0
#
# HISTORY:
# ========
# 2026-04-02 mks original coding
# 2026-04-04 mks added rel_services and rec_services sections
# added topology commentary blocks
# =============================================================================
# =============================================================================
# ROOT FLAGS
# =============================================================================
# debug: enables debug-level log output. Never true in production — the volume
# of output will bury real events. Your dev env file should set this to true.
debug = false
# syslog: routes log output to journald (systemd journal) instead of stdout.
# Set true in production so logs survive process restarts and are queryable
# with journalctl. Your dev env file will typically set this to false.
syslog = true
# syslog_mirror_console: when syslog is true, also echo log output to the
# console. Useful during staging when you want both. In production on a
# headless server, set to false to avoid redundant output.
syslog_mirror_console = true
# audit_on: master switch for the auditing micro-service. When true, BEDS
# records an audit trail for every declared auditable operation. The template
# controls what is audited — this flag is the global override. Set false
# during development to reduce noise.
audit_on = false
# journal_on: master switch for the journaling micro-service. When true, BEDS
# records a journal entry for every destructive operation (create, update,
# delete) on collections that declare journaling=true. Set false during
# development to reduce noise.
journal_on = false
# =============================================================================
# NODE IDENTITY
# =============================================================================
[id]
# env_name: the name of this environment. Used in log output, routing keys,
# and the env-aware IPL error handling (non-fatal failures in non-production
# environments). Must be one of: development | qa | production
env_name = "production"
# version: the BEDS framework version running on this node. Used in the node
# self-identification record and log output. Match this to your git tag.
version = "1.0"
# wbid: your 2-character corporate identifier. Prepended to every MongoDB
# collection name to namespace your data. Example: wbid "ms" + collection
# "Users" = "msUsers". Choose something meaningful and keep it consistent
# across all nodes in your cluster.
wbid = "ms"
# =============================================================================
# RABBITMQ BROKER SERVICES
# =============================================================================
#
# BEDS is AMQP-first. Every data operation flows through RabbitMQ. No
# component in the application layer touches a database directly — ever.
# The broker layer is the heart of the framework.
#
# TOPOLOGY NOTE:
# --------------
# In a homelab or single-server setup, all nodes share one RabbitMQ instance.
# In production, you typically run RabbitMQ in a cluster. BEDS connects to
# a single broker endpoint per node role — if you're running a RabbitMQ
# cluster, point this at your load balancer or a HAProxy frontend, not
# directly at a cluster node.
#
# VHOST CONVENTION:
# -----------------
# Use a separate vhost per environment to keep traffic isolated:
# prod_ / prod → production
# qa_ / qa → QA / staging
# dev_ / dev → development
#
# The queue_tag prefix is prepended to every queue name so queues from
# different environments can coexist on the same broker without collision.
# =============================================================================
[broker_services]
# queue_tag: prefix applied to every queue name. Keeps envs isolated on a
# shared broker. Example: "prod_rec.read", "dev_rec.read"
queue_tag = "prod_"
# vhost: the RabbitMQ virtual host for this environment. Create one vhost
# per env in the RabbitMQ management UI and grant your BEDS user access.
vhost = "prod"
# timer_violation: milliseconds before a broker round-trip is flagged as
# a slow operation. Violations are logged as warnings. Tune to your
# expected p95 latency. 3000ms is conservative for production.
timer_violation = 3000
# records_per_xfer: maximum records returned in a single broker response.
# Acts as a circuit breaker against runaway fetch queries. Override per
# collection in the template if you need larger transfers (e.g. migrations).
records_per_xfer = 5000
# keepalive: enables TCP keepalive on broker connections. Always true in
# production — brokers behind a firewall or NAT will drop idle connections
# without this.
keepalive = true
# heartbeat: AMQP heartbeat interval in seconds. RabbitMQ will close a
# connection that misses two consecutive heartbeats. 60 seconds is the
# recommended default. Do not set below 30.
heartbeat = 60
# use_ssl: enables TLS for broker connections. Set true in production if
# your RabbitMQ instance is not on localhost or a trusted private network.
use_ssl = false
# cert_path: path to TLS certificates when use_ssl = true. Ignored when
# use_ssl = false. Do not change this path unless you have a specific reason.
cert_path = "/etc/rabbitmq"
# -----------------------------------------------------------------------------
# Broker node: app_server
# -----------------------------------------------------------------------------
# This is the primary application broker — the one your appServer node
# connects to for all client-facing CRUD operations.
#
# SINGLE BROKER INSTANCE (homelab / development):
# host = "localhost"
# port = 5672
#
# REMOTE BROKER (separate server):
# host = "192.168.1.50" # or DNS name: rmq.internal
# port = 5672
#
# RABBITMQ CLUSTER (point at HAProxy or load balancer — NOT a cluster node):
# host = "rmq-lb.internal"
# port = 5672
#
# rpi: Records Per Interval — throttle applied to this broker's fetch rate.
# Prevents a single broker from overwhelming the database. 50 is a
# conservative default; tune upward once you have baseline metrics.
# -----------------------------------------------------------------------------
[broker_services.app_server]
host = "localhost"
port = 5672
api_port = 15672 # RabbitMQ management UI port — used for health checks
user = "beds"
pass = "changeme" # override in your env file — never commit real passwords
rpi = 50
# Instance counts control how many concurrent broker tasks BEDS spawns per
# role. Scale these up as your throughput grows. A good starting point is
# 2 read brokers for every 1 write broker, since reads typically outnumber
# writes by a wide margin.
#
# r_broker: read brokers — handle all non-destructive fetch queries
# w_broker: write brokers — handle all create / update / delete operations
# m_broker: migration brokers — handle bulk data migration events (0 = disabled)
[broker_services.app_server.instances]
r_broker = 2
w_broker = 2
m_broker = 0 # enable only during active data migrations
# =============================================================================
# MONGODB (REC) SERVICES
# =============================================================================
#
# REC (Record) collections live in MongoDB. BEDS uses MongoDB for document
# storage — schema-flexible, high-throughput, append-friendly collections
# like logs, events, user profiles, and audit records.
#
# One rec_services entry per BEDS node role that needs MongoDB access.
# In a single-node homelab setup you'll have one entry. In a distributed
# cluster you'll have one per service (app_server, admin, segundo, tercero)
# pointing at different databases or even different MongoDB instances.
#
# TOPOLOGY OPTIONS:
# -----------------
#
# 1. STANDALONE (homelab / development) — simplest setup, single mongod:
#
# [rec_services.app_server]
# host = "localhost"
# port = 27017
# user = "beds"
# pass = "yourpassword"
# database = "beds_app"
# use_ssl = false
#
# 2. REMOTE STANDALONE (mongod on a separate server):
#
# [rec_services.app_server]
# host = "192.168.1.60" # or mongohost.internal
# port = 27017
# user = "beds"
# pass = "yourpassword"
# database = "beds_app"
# use_ssl = true # strongly recommended over a network
#
# 3. REPLICA SET — 3-node minimum, provides automatic failover.
# Connect to the primary or use a connection string in the adapter.
# Point BEDS at the primary for now; the full replica set URI is
# handled at the adapter layer (not yet implemented):
#
# [rec_services.app_server]
# host = "mongo-primary.internal"
# port = 27017
# user = "beds"
# pass = "yourpassword"
# database = "beds_app"
# use_ssl = true
#
# 4. SHARDED CLUSTER — mongos router sits in front of the shards.
# Connect BEDS to the mongos endpoint, not to individual shards.
# The cluster topology is transparent to BEDS:
#
# [rec_services.app_server]
# host = "mongos.internal" # the mongos router — NOT a shard node
# port = 27017 # mongos listens on 27017 by default (27019 is the config server port)
# user = "beds"
# pass = "yourpassword"
# database = "beds_app"
# use_ssl = true
#
# TIP: Want to run your admin node on a separate MongoDB database?
# Just add a second entry:
#
# [rec_services.admin]
# host = "localhost"
# port = 27017
# database = "beds_admin" # separate DB, same mongod
# ...
# =============================================================================
[rec_services.app_server]
host = "localhost"
port = 27017
user = "beds"
pass = "changeme" # override in your env file
database = "beds_app"
use_ssl = false
# =============================================================================
# MARIADB (REL) SERVICES
# =============================================================================
#
# REL (Relational) collections live in MariaDB. BEDS uses MariaDB for
# structured relational data — anything that benefits from SQL joins,
# transactions, or strict schema enforcement.
#
# Each rel_services entry has a master (required) and an optional secondary
# (read replica). BEDS routes write operations to the master and can route
# reads to the secondary to distribute load. Secondary failure is non-fatal
# — BEDS logs a warning and continues with master-only operation.
#
# TOPOLOGY OPTIONS:
# -----------------
#
# 1. STANDALONE (homelab / development) — single MariaDB instance.
# Set master and omit secondary, or point secondary at the same host
# to keep the config structure consistent:
#
# [rel_services.app_server.master]
# host = "localhost"
# port = 3306
# user = "beds"
# pass = "yourpassword"
# database = "beds_app"
#
# # secondary is optional — comment out if you have only one instance
# # [rel_services.app_server.secondary]
# # host = "localhost"
# # ...
#
# 2. PRIMARY + READ REPLICA — the most common production topology.
# MariaDB replication is asynchronous; secondary reads may be slightly
# behind primary writes. BEDS does not guarantee read-after-write
# consistency when reads are routed to the secondary:
#
# [rel_services.app_server.master]
# host = "db-primary.internal"
# port = 3306
# user = "beds"
# pass = "yourpassword"
# database = "beds_app"
#
# [rel_services.app_server.secondary]
# host = "db-replica.internal" # read replica — receives binlog from master
# port = 3306
# user = "beds_readonly" # read-only user is good practice here
# pass = "yourpassword"
# database = "beds_app"
#
# 3. GALERA CLUSTER — multi-master synchronous replication.
# All nodes accept writes. Point master at one node and secondary at
# another, or use a ProxySQL / HAProxy frontend and point both at it:
#
# [rel_services.app_server.master]
# host = "galera-node1.internal" # or proxysql.internal
# port = 3306
# user = "beds"
# pass = "yourpassword"
# database = "beds_app"
#
# [rel_services.app_server.secondary]
# host = "galera-node2.internal"
# port = 3306
# user = "beds"
# pass = "yourpassword"
# database = "beds_app"
#
# 4. SEPARATE DATABASE PER NODE ROLE — mirrors the PHP production setup
# where admin, segundo, and tercero each had their own database:
#
# [rel_services.app_server.master]
# database = "beds_app"
#
# [rel_services.admin.master]
# database = "beds_admin"
#
# [rel_services.segundo.master]
# database = "beds_warehouse"
#
# TIP: In a homelab running everything on one box, all four entries can
# point at localhost:3306 with different database names. MariaDB handles
# the namespace separation — BEDS just needs a valid connection.
# =============================================================================
[rel_services.app_server.master]
host = "localhost"
port = 3306
user = "beds"
pass = "changeme" # override in your env file
database = "beds_app"
[rel_services.app_server.secondary]
host = "localhost"
port = 3306
user = "beds"
pass = "changeme" # override in your env file
database = "beds_app"


@@ -26,7 +26,7 @@
//! * `2026-04-02` - mks - refactored into load() + load_from() for testability
mod structs;
pub use structs::{BedsConfig, BrokerServicesConfig, RecNodeConfig, RelNodeConfig, RelInstanceConfig};
use config::{Config, File, FileFormat};
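The deep-merge rule the loader applies can be illustrated in isolation. This is a dependency-free sketch under stated assumptions: the real `load()`/`load_from()` delegate merging to the `config` crate, and the `Value` enum and `deep_merge` function below are hypothetical stand-ins for a parsed TOML tree.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

// Minimal stand-in for a parsed TOML document. Enough to show the rule:
// tables merge key by key, and any scalar present in the override wins.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    Table(BTreeMap<String, Value>),
}

fn deep_merge(base: &mut Value, overlay: Value) {
    match (base, overlay) {
        (Value::Table(b), Value::Table(o)) => {
            for (k, v) in o {
                match b.entry(k) {
                    // key exists in both layers: recurse so nested tables merge
                    Entry::Occupied(mut e) => deep_merge(e.get_mut(), v),
                    // key only in the override: adopt it
                    Entry::Vacant(e) => {
                        e.insert(v);
                    }
                }
            }
        }
        // a non-table override replaces the base value wholesale
        (b, o) => *b = o,
    }
}

fn main() {
    let mut base = Value::Table(BTreeMap::from([
        ("pass".to_string(), Value::Str("changeme".to_string())),
        ("port".to_string(), Value::Int(3306)),
    ]));
    // the env override supplies only the key it changes, as env_dev.toml would
    let overlay = Value::Table(BTreeMap::from([(
        "pass".to_string(),
        Value::Str("devpassword".to_string()),
    )]));
    deep_merge(&mut base, overlay);
    println!("{base:?}"); // port survives from base; pass comes from the override
}
```

The same recursion is why overriding `rel_services.app_server.master.pass` in an env file leaves the sibling `host` and `port` keys intact.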


@@ -11,6 +11,7 @@ pub struct BedsConfig {
pub journal_on: bool,
pub broker_services: BrokerServicesConfig,
pub rec_services: HashMap<String, RecNodeConfig>,
pub rel_services: HashMap<String, RelNodeConfig>,
}
#[derive(Debug, Deserialize)]
@@ -60,3 +61,18 @@ pub struct RecNodeConfig {
pub database: String,
pub use_ssl: bool,
}
#[derive(Debug, Deserialize)]
pub struct RelNodeConfig {
pub master: RelInstanceConfig,
pub secondary: Option<RelInstanceConfig>,
}
#[derive(Debug, Deserialize, Clone)]
pub struct RelInstanceConfig {
pub host: String,
pub port: u16,
pub user: String,
pub pass: String,
pub database: String,
}


@@ -16,4 +16,5 @@
pub mod amqp;
pub mod config;
pub mod logging;
pub mod mariadb;
pub mod mongo;


@@ -24,6 +24,7 @@
mod amqp;
mod config;
mod logging;
mod mariadb;
mod mongo;
/// Executes the BEDS Initial Program Load (IPL) sequence.
@@ -81,6 +82,18 @@ fn ipl() -> Result<(), String> {
}
}
// validate MariaDB reachability — fatal in production, non-fatal in all other envs
// secondary instance failures are always non-fatal (handled inside validate_all)
match mariadb::validate_all(&cfg.rel_services) {
Ok(()) => tracing::info!("MariaDB reachable"),
Err(e) => {
if cfg.id.env_name == "production" {
return Err(e);
}
tracing::warn!("MariaDB unreachable (non-fatal in {}): {}", cfg.id.env_name, e);
}
}
Ok(())
}

src/mariadb.rs (new file, 122 lines)

@@ -0,0 +1,122 @@
//! # mariadb.rs — MariaDB (REL) Transport Layer
//!
//! Manages all MariaDB interactions for the BEDS node. At IPL, validates that
//! the master instance of each configured REL service node is reachable before
//! the node proceeds. The secondary instance is optional — its absence or
//! unreachability is logged but never fatal.
//!
//! Future phases will add connection pooling, authentication, and query
//! dispatch via the adapter layer.
//!
//! ## Calling Agents
//! - `ipl()` in main.rs — calls `validate_all()` during the IPL sequence
//!
//! ## Inputs
//! - `HashMap<String, RelNodeConfig>` from the loaded BEDS configuration
//!
//! ## Outputs
//! - `Ok(())` if all configured REL master nodes are reachable
//! - `Err(String)` with node name, host:port, and OS error on first master failure
//!
//! **Author:** mks
//! **Version:** 1.0
//!
//! ## History
//! * `2026-04-04` - mks - original coding
use std::collections::HashMap;
use std::net::TcpStream;
use std::time::Duration;
use crate::config::{RelInstanceConfig, RelNodeConfig};
/// Validates that all configured MariaDB master nodes are reachable.
///
/// Iterates every entry in the `rel_services` config block and validates the
/// master instance. Secondary instances are checked but their failure is
/// non-fatal — a missing or unreachable secondary is logged as a warning.
/// Fails on the first unreachable master.
///
/// # Arguments
///
/// * `nodes` — map of service name → `RelNodeConfig` from `BedsConfig`
///
/// # Returns
///
/// `Ok(())` if every master node responds to a TCP connect within the timeout.
/// `Err(String)` with the service name and address of the first master failure.
///
/// # History
///
/// * `2026-04-04` - mks - original coding
pub fn validate_all(nodes: &HashMap<String, RelNodeConfig>) -> Result<(), String> {
for (name, node) in nodes {
// master is required — failure is propagated to the caller
validate(&format!("{}.master", name), &node.master)?;
// secondary is optional — log absence but do not fail
if let Some(secondary) = &node.secondary {
if let Err(e) = validate(&format!("{}.secondary", name), secondary) {
tracing::warn!("MariaDB secondary unreachable (non-fatal): {}", e);
}
}
}
Ok(())
}
/// Validates that a single MariaDB instance is reachable.
///
/// # Arguments
///
/// * `label` — descriptive label for error messages (e.g. "app_server.master")
/// * `instance` — `RelInstanceConfig` for this instance
///
/// # Returns
///
/// `Ok(())` if the TCP handshake succeeds within 5 seconds.
/// `Err(String)` with a descriptive message on failure.
///
/// # History
///
/// * `2026-04-04` - mks - original coding
pub fn validate(label: &str, instance: &RelInstanceConfig) -> Result<(), String> {
let addr_str = format!("{}:{}", instance.host, instance.port);
let addr = std::net::ToSocketAddrs::to_socket_addrs(&addr_str)
// resolves hostnames (e.g. "localhost", "db-primary.internal") as well as numeric IPs;
// SocketAddr::parse alone would reject every non-numeric host in beds.toml
.map_err(|e| format!("Invalid MariaDB address for rel_services.{} ({}): {}", label, addr_str, e))?
.next()
.ok_or_else(|| format!("MariaDB address for rel_services.{} ({}) resolved to no address", label, addr_str))?;
TcpStream::connect_timeout(&addr, Duration::from_secs(5))
.map_err(|e| format!("MariaDB unreachable at rel_services.{} ({}): {}", label, addr_str, e))?;
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
use crate::config::load_from;
fn test_cfg() -> crate::config::BedsConfig {
load_from("tests/fixtures/beds_test.toml", "")
.expect("test fixture beds_test.toml failed to load")
}
#[test]
fn validate_err_on_closed_port() {
let cfg = test_cfg();
let node = cfg.rel_services.get("app_server").unwrap();
let mut bad = node.master.clone();
bad.port = 1;
assert!(validate("app_server.master", &bad).is_err());
}
#[test]
fn validate_err_on_bad_address() {
let cfg = test_cfg();
let node = cfg.rel_services.get("app_server").unwrap();
let mut bad = node.master.clone();
bad.host = "not_a_valid_host!!!".to_string();
assert!(validate("app_server.master", &bad).is_err());
}
}


@@ -57,3 +57,17 @@ user = "beds"
pass = "changeme"
database = "beds_test"
use_ssl = false
[rel_services.app_server.master]
host = "127.0.0.1"
port = 3306
user = "beds"
pass = "changeme"
database = "beds_test"
[rel_services.app_server.secondary]
host = "127.0.0.1"
port = 3306
user = "beds"
pass = "changeme"
database = "beds_test"

wiki/01-origin-story.md (new file, 67 lines)

@@ -0,0 +1,67 @@
# Origin Story
## The Problem That Started Everything
In 2017, a PHP backend framework called **Namaste** was built for Giving Assistant, a charitable shopping platform based in California. The business had a deceptively simple technical problem: a single application server handling thousands of concurrent user sessions, all hammering a MySQL database through a conventional ORM layer.
The ORM was the problem. Every request spawned its own database connection, held it open for the duration of the request lifecycle, and released it on completion — if it completed cleanly. Memory leaks accumulated. Under load, the connection pool exhausted. Queries that could have been satisfied by a cached result went to the database anyway. There was no circuit breaker, no backpressure, no way to distinguish a read that could tolerate slight staleness from a write that could not.
The standard PHP answer — throw more servers at it — was tried. It worked until it didn't. Horizontal scaling moved the bottleneck from the web tier to the database tier without solving the underlying architectural problem.
## The Design Decision
The core insight was this: **the application layer should never touch the database directly**. Not through an ORM, not through a raw PDO connection, not through any mechanism that gives the application layer visibility into the database topology.
Everything goes through a message broker. A client request hits the application layer, gets packaged as an AMQP event, and is dispatched to a queue. A broker process — completely independent of the web tier — picks it up, executes the database operation, and routes the result back. The application layer never waits on a database connection. It waits on a message.
This had several consequences that turned out to be features:
**Decoupling.** The web tier and the database tier became operationally independent. A slow database didn't block the web tier — it built a queue. The queue was observable, manageable, and bounded.
**Backpressure.** The broker pool was the throttle. You could tune how many concurrent database operations ran by adjusting broker instance counts in a config file, without touching a line of code.
**Database agnosticism.** Because the application layer never called the database directly, the database could be swapped. The same broker call that hit MySQL could be routed to MongoDB by changing a template config. This wasn't theoretical — it was used in production to migrate collections from MySQL to MongoDB without application downtime.
**Planned obsolescence.** PHP worker processes leak memory. This is a known, accepted fact in PHP production operations. The conventional solution is to restart workers periodically — the infamous `SIGCHLD` dance. In Namaste, broker processes were intentionally designed to accept a kill signal, complete their in-flight work, and exit gracefully. A supervisor process immediately spawned a replacement. Memory leaks were managed by design, not fought against.
## The Name
**Namaste** was an internal codename. The framework was formally called **BEDS** — Back End Data System. The name Namaste stuck in the codebase because it was the class prefix (`gaaNamasteCore`, `gacMongoDB`, etc.) and changing it would have broken too many things too early.
When the Rust rewrite began, the codebase was renamed **rustybeds** — a nod to both the language and the framework's history.
## Production History
Namaste ran in production at Giving Assistant from mid-2017. At peak it handled **40,000+ transactions per second** on a single application server node. Round-trip latency from Baja California to West Virginia — across dozens of concurrent database fanout calls per transaction — was consistently **~200 milliseconds**.
It ran for **952 days without an unplanned outage**.
The framework was later deployed in a different configuration at **Pathway Genomics** in California, where the `tercero` node handled user and session management for a patient portal. The separation of PII and PHI from user records — a compliance requirement — was implemented as a configuration choice, not a code change. The `tercero` node ran against a separate database with separate credentials, isolated by AMQP routing.
## The PHP Codebase
The PHP implementation lives in the `namaste` repository. It is the authoritative reference for BEDS architecture and should be consulted when the *intent* behind a design decision is unclear. The Rust rewrite does not copy PHP code — it reimplements the same architecture with Rust's type system, async runtime, and zero-cost abstractions.
Key reference files in the PHP codebase:
| File | What it shows |
|---|---|
| `config/namaste.xml` | Full production config structure — the gold standard for what config covers |
| `config/env.admin.xml` | Admin node env override — shows how node-specific config layering works |
| `classes/templates/gatTestMongo.class.inc` | Canonical REC template — the pattern every data domain follows |
| `common/errorCatalog.php` | Log level constants and integer values — replicated in BEDS Rust |
| `common/functions.php` | `consoleLog` format — the console output format BEDS Rust follows |
| `scripts/startBrokers.php` | Broker startup sequence — the origin of the IPL concept |
| `common/dbCatalog.php` | TLA naming convention — confirmed source of the three-letter abbreviation system |
## Why Rust
The PHP implementation worked. The decision to rewrite in Rust was not driven by a production failure — it was driven by what the framework could become:
1. **Memory leaks, eliminated.** Tokio async tasks do not leak. The `SIGCHLD` planned-obsolescence pattern becomes unnecessary.
2. **Throughput ceiling, raised.** PHP on a single process is fundamentally limited. Rust async on a multi-core machine is not. The expectation is a 5-10x throughput improvement on equivalent hardware.
3. **Single binary deployment.** No PHP interpreter, no extension dependencies, no version conflicts. One binary, copy it to the server, run it.
4. **IP protection.** A compiled binary does not expose source code on deployment.
5. **AI layer.** Phase 2 of BEDS Rust includes an AI-driven database object generation layer — a DBA describes a data domain in natural language and the AI generates the schema, stored procedures, and BEDS template. This is the primary market differentiator and was not feasible in PHP.
The architecture is proven. The Rust rewrite exists to go further.

wiki/02-architecture.md Normal file
# Architecture Overview
## The Central Principle
**BEDS is AMQP-first. No component in the application layer ever touches a database directly. Ever.**
This is not a guideline. It is the architectural constraint that makes everything else possible. If you find yourself writing code that calls a database adapter directly from outside the broker layer, you are breaking the framework.
## System Diagram
```
External Client
       │ HTTP / WebSocket / REST
       ▼
┌─────────────┐
│  appServer  │   ← your application logic lives here
│    node     │
└──────┬──────┘
       │ AMQP event (routing key: rec.write, rel.read, log, etc.)
       ▼
┌─────────────────────────────────┐
│ RabbitMQ Broker │
│ │
│ Exchange: beds.events (topic) │
│ Exchange: beds.logs (topic) │
└──────┬──────────────────┬───────┘
│ │
▼ ▼
┌────────────┐ ┌────────────┐
│ Broker │ │ admin │
│ Pool │ │ node │
│ (Tokio │ │ │
│ tasks) │ │ logging │
└──────┬─────┘ │ auditing │
│ │ metrics │
▼ └─────┬──────┘
┌────────────┐ │
│ Factory │ ▼
│ Dispatch │ ┌────────────┐
└──────┬─────┘ │ MongoDB │
│ │ msLogs │
▼ └────────────┘
┌────────────────────────┐
│ NamasteCore Trait │
│ (unified CRUD iface) │
└──────┬─────────────────┘
├──────────────────────────┐
▼ ▼
┌────────────┐ ┌────────────┐
│ MongoDB │ │ MariaDB │
│ Adapter │ │ Adapter │
│ (REC) │ │ (REL) │
└──────┬─────┘ └──────┬─────┘
│ │
▼ ▼
┌────────────┐ ┌────────────┐
│ MongoDB │ │ MariaDB │
│ Collections│ │ Tables / │
│ │ │ Procs / │
│ │ │ Views │
└────────────┘ └────────────┘
```
## Layers
### 1. Transport Layer (AMQP)
RabbitMQ is the backbone. All inter-component communication flows through it. This includes:
- Client data requests (read, write, update, delete)
- Log events from all nodes
- Audit records
- Migration jobs
- Warehouse operations
The transport layer knows nothing about databases. It routes messages. That is all.
### 2. Broker Pool (Tokio tasks)
Each node runs a pool of async broker tasks. Each task listens on one queue, processes one message at a time, and routes the result back via AMQP. The pool size is configured per broker type in `beds.toml`.
The broker pool is the throttle for the entire system. By adjusting instance counts, you control how many concurrent database operations the node performs — without changing a line of code.
Broker tasks are supervised. A panicked task is logged and replaced. The pool does not shrink on failure.
### 3. Factory Dispatch
A broker task receives an event containing a template name (e.g. `"Users"`, `"Sessions"`). The factory maps that name to the correct adapter — MongoDB for REC templates, MariaDB for REL templates. The factory does not know which template will be requested at compile time; dispatch is runtime.
### 4. NamasteCore Trait
The unified CRUD interface. Every database adapter implements it. Every template is a struct that selects an adapter and delegates to it. The application layer calls `NamasteCore` methods — it never calls adapter methods directly.
```rust
pub trait NamasteCore {
async fn create_record(&self, payload: &Payload) -> Result<Response, BedsError>;
async fn fetch_records(&self, query: &Query) -> Result<Vec<Response>, BedsError>;
async fn update_record(&self, payload: &Payload) -> Result<Response, BedsError>;
async fn delete_record(&self, id: &str) -> Result<Response, BedsError>;
}
```
### 5. Database Adapters
Two adapters, one interface:
- **REC adapter** — MongoDB. Document store. Schema-flexible. High-throughput appends. Used for logs, events, user profiles, audit records, anything that benefits from document structure.
- **REL adapter** — MariaDB. Relational store. SQL joins, transactions, strict schema. Used for anything that benefits from referential integrity.
Adapters do not write SQL or MongoDB queries. They call named database objects — stored procedures, views, functions — that the DBA owns. The adapter layer calls the object by name and passes parameters. It does not construct queries.
### 6. DBA-Owned Schema
The application layer never writes a query. All data access goes through named database objects. This is the separation of concerns that made Namaste maintainable across years and multiple development teams.
Adding a new data domain means:
1. DBA writes the schema (table/collection, views, stored procedures)
2. Developer writes a BEDS template (a TOML config file)
3. BEDS generates the adapter binding
Nothing else changes.
## The CALGON Pattern
Some operations cannot return an immediate result — long-running aggregations, migration jobs, warehouse operations. BEDS handles these with the **CALGON** pattern (async ticket):
1. Client submits a request
2. BEDS immediately returns a GUID ticket
3. The operation executes asynchronously
4. Client polls with the GUID to retrieve the result when ready
The client is never blocked on a long operation. The broker absorbs the work. This is the same pattern used by every major async job queue system, implemented natively in BEDS.
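A minimal in-memory sketch of the ticket flow, with hypothetical names (`TicketStore`, `submit`, `poll` are illustrative — the real implementation runs the work on the broker pool and persists results):

```rust
use std::collections::HashMap;

/// In-memory ticket store: GUID -> None (pending) or Some(result).
#[derive(Default)]
pub struct TicketStore {
    results: HashMap<String, Option<String>>,
}

impl TicketStore {
    /// Step 2: issue a ticket immediately; the work runs asynchronously.
    pub fn submit(&mut self, guid: &str) {
        self.results.insert(guid.to_string(), None);
    }

    /// Step 3: the async worker records the finished result.
    pub fn complete(&mut self, guid: &str, result: String) {
        self.results.insert(guid.to_string(), Some(result));
    }

    /// Step 4: the client polls; None means "not ready yet".
    pub fn poll(&self, guid: &str) -> Option<&str> {
        self.results.get(guid).and_then(|r| r.as_deref())
    }
}
```

The store would live behind the broker layer; the client only ever sees the GUID and the eventual result.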
## Event Lineage
Every BEDS event carries a compound identifier:
```
event_id = "{node}.{env}.{guid}" # e.g. "ms.production.a1b2c3d4..."
parent_id = "" # empty string if this is a root event
depth = 0 # levels from the root event
```
A root event (an incoming client request) has `depth=0` and no parent. Every event it spawns (database calls, log events, audit records) carries the root's `event_id` as its `parent_id` and increments `depth`. This creates a complete, queryable tree of every operation triggered by a single client request.
Event lineage is how you answer "what actually happened when request X came in?" — without distributed tracing infrastructure.
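The lineage fields can be sketched as a small struct (a sketch — struct and method names here are illustrative, not the BEDS API):

```rust
#[derive(Debug, Clone)]
pub struct EventMeta {
    pub event_id: String,  // "{node}.{env}.{guid}"
    pub parent_id: String, // empty string for a root event
    pub depth: u32,        // levels from the root event
}

impl EventMeta {
    /// A root event — an incoming client request.
    pub fn root(node: &str, env: &str, guid: &str) -> Self {
        Self {
            event_id: format!("{node}.{env}.{guid}"),
            parent_id: String::new(),
            depth: 0,
        }
    }

    /// An event spawned by `self` (database call, log event, audit record):
    /// carries the spawning event's id as its parent and increments depth.
    pub fn spawn(&self, node: &str, env: &str, guid: &str) -> Self {
        Self {
            event_id: format!("{node}.{env}.{guid}"),
            parent_id: self.event_id.clone(),
            depth: self.depth + 1,
        }
    }
}
```

Querying `msLogs` by `parent_id` then walks the tree one level at a time.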
## Configuration Drives Everything
BEDS has no node types in code. All nodes run the same binary. The configuration file determines:
- Which services this node runs (`is_local` per service)
- How many brokers of each type to spawn
- Which databases to connect to
- Whether this node is in production mode (fatal IPL failures) or development mode (non-fatal)
Changing a node's role means changing its config file and restarting. No code changes. No redeployment.

wiki/03-nodes.md Normal file
# The Four Nodes
BEDS defines four node roles. All nodes run the same binary — role is determined entirely by configuration. In a homelab or development environment, all four roles run on a single machine. In production, they typically run on separate servers.
The `is_local` flag in the env config file is the declaration: "this service runs on this physical machine." Brokers are only started for services declared as local.
---
## appServer
**The primary application node.** This is where your business logic lives. In the PHP implementation this was also called "namaste" — the application layer that handled all client-facing CRUD operations.
### Responsibilities
- Receives all incoming client requests via AMQP
- Dispatches to the factory layer for database operations
- Returns results to clients
### Broker Types
| Broker | Queue | Purpose |
|---|---|---|
| `rBroker` | `rec.read`, `rel.read` | Non-destructive fetch queries |
| `wBroker` | `rec.write`, `rel.write` | Create / update / delete operations |
| `mBroker` | `rec.obj`, `rel.obj` | Migration and bulk transfer events — disabled by default |
### Databases
- MongoDB: primary application document store
- MariaDB: primary relational store
### Real-world deployment note
In the Giving Assistant production deployment, appServer handled 40,000+ transactions per second on a single node. The broker pool absorbed burst traffic; the queue was the backpressure mechanism. When the database was slow, the queue grew — it did not drop requests.
---
## admin
**The administrative and observability node.** This is the most critical node in the cluster from an operations standpoint. It is the logger, the auditor, the metrics collector, and the system health monitor.
All other nodes route their log events to admin over AMQP. Admin is the single point of truth for what happened in the cluster.
### Responsibilities
- Receives and persists all log events from all nodes
- Routes log events to syslog when configured
- Records audit trails for auditable operations
- Collects and publishes performance metrics and timer data
- Handles administrative AMQP events (node management, config reloads)
### Broker Types
| Broker | Queue | Purpose |
|---|---|---|
| `adminBrokerIn` | `adm` | Inbound administrative events |
| `adminBrokerOut` | `adm` | Outbound administrative responses |
| `adminLogsBroker` | `log` | Log events from all nodes |
| `adminSyslogBroker` | `log` | Syslog routing for log events |
| `adminGraphBroker` | `log` | Metrics and graph data collection |
### Databases
- MongoDB: `msLogs` collection (log event store), audit records
- MariaDB: administrative relational data
### Important: admin is the logger
Non-admin nodes do not write logs directly to MongoDB. They publish log events to the `log` exchange over AMQP. Admin consumes them and writes to `msLogs`. This means:
- If admin is down, log events queue in RabbitMQ — they are not lost
- If MongoDB is down on admin, the queue backs up until it recovers
- No other node needs a direct MongoDB connection for logging
This design was battle-tested: in the Namaste homelab, the admin node was run on a Raspberry Pi to deliberately stress-test the queue backlog behaviour. The Pi was slower than the appServer — logs queued during spikes and drained during lulls. Nothing dropped.
---
## segundo
**The warehousing and cool storage node.** Segundo handles the data lifecycle — moving records from HOT (live production) storage to COOL (warehoused) storage on a defined schedule.
"Segundo" is Spanish for "second" — this was the second node added to the framework after appServer, originally to handle the warehousing workload that was creating performance problems in the primary database.
### Responsibilities
- Automated warehousing — moves eligible records from HOT to COOL storage on a schedule
- On-demand warehousing — responds to explicit warehouse requests
- Manages COOL storage (warehoused data that maintains schema and indexing)
- Data migration support
### Broker Types
| Broker | Queue | Purpose |
|---|---|---|
| `whBroker` | `mig` | Warehouse operations — scheduled and on-demand |
| `cBroker` | `mig` | Consolidation broker — bulk data operations |
### Databases
- MongoDB: COOL storage document collections
- MariaDB: `beds_warehouse` — warehoused relational data
### HOT / COOL / COLD storage model
| Tier | Description | Index changes | Schema changes |
|---|---|---|---|
| HOT | Live production data | No | No |
| COOL | Warehoused, full schema preserved | Allowed | Allowed |
| COLD | Archived, reformatted (typically CSV) | N/A | N/A |
| WARM | Being restored from COLD to HOT | In progress | In progress |
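The tiers from the table above can be modeled as an enum (a sketch — variant names mirror the table, not an existing BEDS type):

```rust
/// Data lifecycle tiers.
#[derive(Debug, PartialEq)]
pub enum StorageTier {
    Hot,  // live production data
    Cool, // warehoused, full schema preserved
    Cold, // archived, reformatted (typically CSV)
    Warm, // being restored from COLD to HOT
}

impl StorageTier {
    /// Index and schema changes are only allowed in COOL storage.
    /// HOT: never; COLD: no live schema to change; WARM: transitional.
    pub fn allows_schema_changes(&self) -> bool {
        matches!(self, StorageTier::Cool)
    }
}
```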
---
## tercero
**The user and session management node.** Tercero was the third node added to the framework, originally driven by a compliance requirement at Pathway Genomics in California.
"Tercero" is Spanish for "third."
### The compliance backstory
Pathway Genomics ran a patient portal for genetic test kits. Patient data included both PII (Personally Identifiable Information) and PHI (Protected Health Information under HIPAA). The compliance requirement was clear: PII and PHI must be physically separated — different databases, different credentials, different access controls.
The solution was to route all user and session data through a dedicated node (tercero) with its own MongoDB instance and MariaDB database. The appServer node never touched the user database directly. It sent AMQP events to tercero and received session tokens back.
This is the canonical demonstration of BEDS' separation-of-concerns design: a compliance requirement that would have required significant application refactoring in a conventional architecture was implemented as a configuration choice.
### Responsibilities
- User record management (registration, profile updates, deactivation)
- Session management (login, logout, session validation, expiry)
- Authentication token lifecycle
### Broker Types
| Broker | Queue | Purpose |
|---|---|---|
| `uBroker` | `rec.read`, `rec.write` | User record operations |
| `sBroker` | `rec.read`, `rec.write` | Session record operations |
### Databases
- MongoDB: `msUsers` (user profiles), `msSessions` (session records)
- MariaDB: `beds_users` — relational user data where joins are needed
---
## Node Configuration in Practice
In `beds.toml`, all four nodes share the same RabbitMQ instance but connect to different queues. The env file declares which services are local to this machine:
```toml
# env_dev.toml — all four on one machine (development)
[app_server]
is_local = true
[admin]
is_local = true
[segundo]
is_local = true
[tercero]
is_local = true
```
```toml
# env_prod.toml — dedicated servers (production)
[app_server]
is_local = true # this file lives on the appServer machine
[admin]
is_local = false # admin runs on a separate server
[segundo]
is_local = false # segundo runs on a separate server
[tercero]
is_local = false # tercero runs on a separate server
```
The binary on each server reads the same `beds.toml` base config but a different env file, which tells it which role to assume.

wiki/04-ipl.md Normal file
# IPL — Initial Program Load
## What Is IPL?
IPL (Initial Program Load) is the BEDS bootstrap sequence. The term comes from IBM mainframe terminology — the process of loading the operating system from disk into memory and starting it. BEDS borrows the term because the concept is identical: a strict, ordered sequence of steps that must all succeed before the node is considered operational.
`ipl()` is the first function called from `main()`. If IPL completes successfully, the node is green and enters its operational state. If any required step fails, IPL aborts and the process exits with a console error report.
## Why Order Matters
The IPL sequence is not arbitrary. Each step depends on the previous one:
1. **Configuration must load first** — every subsequent step reads from it
2. **Logging must initialize second** — every subsequent step may emit log events
3. **RabbitMQ must be reachable third** — it is the transport for everything, including log event routing to admin
4. **MongoDB must be reachable fourth** — it is the log persistence store on the admin node, and the primary document store on appServer
5. **MariaDB must be reachable fifth** — it is the relational store; non-critical in dev but required in production
You cannot initialize logging before loading config because the log destination (syslog vs console, mirror settings) is in the config. You cannot validate RabbitMQ before initializing logging because you need logging to report the result. The order is a dependency chain, not a preference.
## The IPL Sequence
### Step 1: Load Configuration
```rust
let cfg = config::load().map_err(|e| format!("Failed to load config: {}", e))?;
```
Loads `config/beds.toml` as the base, then merges `config/env_{BEDS_ENV}.toml` on top. The `?` operator short-circuits on failure — if the config cannot be loaded, nothing else runs. This is the only step that is always fatal in every environment, including development. A node without a valid config cannot make any correct decision about anything.
**Why fatal everywhere:** A missing config is not a recoverable error. It means the node cannot know what it is, where its services are, or how to behave. Continuing would produce undefined behaviour. Fail fast, fail loudly.
### Step 2: Initialize Logging
```rust
logging::init_from_config(cfg.syslog, cfg.syslog_mirror_console);
```
Initializes the `tracing` subscriber with journald and/or console output based on config flags. This must happen before any `tracing::info!` / `tracing::warn!` / `tracing::error!` calls — the tracing macros are no-ops until a subscriber is registered.
**Why second:** Config is loaded. Logging destination is known. Every step from here on can emit structured log output.
**Note on log routing:** At this point, log output goes to the local console and/or journald. Log events are not yet routed to the admin node's MongoDB `msLogs` collection — that requires RabbitMQ to be up (Step 3). Local logging is the fallback that covers the gap between process start and AMQP connectivity.
### Step 3: Validate RabbitMQ
```rust
match amqp::validate(&cfg.broker_services) { ... }
```
Opens a TCP connection to the configured RabbitMQ broker host and port. Does not authenticate or open an AMQP channel — reachability only. The connection is immediately closed.
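A reachability-only probe along these lines can be written with the standard library (a sketch — `probe` is a hypothetical helper; the actual `amqp::validate` signature may differ):

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Connect to host:port, then drop the stream — reachability only,
/// no authentication, no AMQP channel.
pub fn probe(host: &str, port: u16, timeout: Duration) -> Result<(), String> {
    let addr = format!("{host}:{port}")
        .to_socket_addrs()
        .map_err(|e| e.to_string())?
        .next()
        .ok_or_else(|| format!("{host}:{port} did not resolve"))?;
    TcpStream::connect_timeout(&addr, timeout)
        .map(|_stream| ()) // stream dropped here, closing the connection
        .map_err(|e| e.to_string())
}
```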
**Why RabbitMQ first among services:** RabbitMQ is the transport for all inter-node communication, including log event routing. If RabbitMQ is unreachable, the node cannot communicate with the rest of the cluster at all. It cannot send logs to admin, receive work events, or return results. Validating it before other services establishes that the backbone is up.
**Environment-aware failure handling:**
- `production`: unreachable broker is fatal — the node cannot function
- all other environments: unreachable broker is a warning — IPL continues so developers can work on other components without a running broker
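The fatal-vs-warning decision can be sketched as a single helper (the name `gate` is hypothetical — the real code matches on the validation result inline):

```rust
/// Convert a validation failure into a fatal error in production,
/// or a logged warning (and continued IPL) in every other environment.
pub fn gate(env_name: &str, service: &str, result: Result<(), String>) -> Result<(), String> {
    match result {
        Ok(()) => Ok(()),
        Err(e) if env_name == "production" => {
            // Fatal: the node cannot function without this service.
            Err(format!("{service} unreachable: {e}"))
        }
        Err(e) => {
            // Non-fatal: warn and let the developer keep working.
            eprintln!("WARN {service} unreachable (non-fatal in {env_name}): {e}");
            Ok(())
        }
    }
}
```

The same gate applies to Steps 3, 4, and 5 — only the service name changes.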
### Step 4: Validate MongoDB
```rust
match mongo::validate_all(&cfg.rec_services) { ... }
```
Opens a TCP connection to each configured MongoDB node. One entry per BEDS service role (app_server, admin, segundo, tercero) that has a `rec_services` config entry.
**Why MongoDB before MariaDB:** MongoDB is the log persistence store. On the admin node, it is where `msLogs` lives. On appServer, it is the primary document store for high-throughput collections. It is typically more critical to the core data path than MariaDB, which tends to hold relational reference data.
**Environment-aware failure handling:** Same pattern as RabbitMQ — fatal in production, warning in development.
### Step 5: Validate MariaDB
```rust
match mariadb::validate_all(&cfg.rel_services) { ... }
```
Opens a TCP connection to the master instance of each configured MariaDB node. The secondary (read replica) is also checked, but secondary failure is always non-fatal — BEDS logs a warning and operates in master-only mode.
**Why secondary failure is always non-fatal:** A missing or unreachable read replica is a degraded state, not a broken state. The node can still serve all operations through the master. Failing hard on a missing replica would cause unnecessary outages during replica maintenance windows.
**Environment-aware failure handling:** Master failure is fatal in production, warning in development. Secondary failure is a warning in all environments.
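The master/secondary asymmetry can be sketched as follows (illustrative types and names — the real `RelNodeConfig` / `RelInstanceConfig` structs live in `src/config/structs.rs`):

```rust
/// One relational node: master required, secondary optional.
pub struct RelNode {
    pub master: (String, u16),
    pub secondary: Option<(String, u16)>,
}

/// Returns Err only when the master is unreachable. A missing or down
/// secondary yields Ok(false): master-only mode, warning logged by the caller.
pub fn validate_rel(
    node: &RelNode,
    reachable: impl Fn(&str, u16) -> bool,
) -> Result<bool, String> {
    let (host, port) = &node.master;
    if !reachable(host, *port) {
        return Err(format!("master unreachable at {host}:{port}"));
    }
    let secondary_ok = node
        .secondary
        .as_ref()
        .map(|(h, p)| reachable(h, *p))
        .unwrap_or(false);
    Ok(secondary_ok)
}
```

Injecting the `reachable` closure keeps the policy testable without opening real sockets.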
### Step N (not yet implemented): Shared Filesystem Validation
Validates that the configured shared filesystem path (`/dev/shm` or equivalent) exists and is writable. Used for inter-process communication and temporary file operations.
### Step N+1 (not yet implemented): Node Self-Identification
The node writes its identity record — role, capabilities, env, timestamp — to the `msNodes` collection. This enables topology visibility for operations tooling. It is not a dependency for the core data path.
### Final Step: Node Green
```
tracing::info!("BEDS IPL complete — node green");
```
All required services are reachable. The node enters its operational state and begins processing AMQP events.
## IPL Failure Handling
### In Production
Any required service failure is fatal:
```
[BEDS] [FATAL] [IPL] RabbitMQ unreachable at broker_services.app_server (localhost:5672): connection refused
```
Process exits with code 1. The supervisor (systemd, Docker, whatever manages the process) should restart with backoff.
### In Development
Service failures are warnings. IPL completes regardless:
```
WARN RabbitMQ unreachable (non-fatal in development): connection refused
```
This allows a developer to work on, say, the MariaDB adapter without needing a running RabbitMQ instance. The tradeoff is that a dev node may start in a degraded state — the developer is expected to notice the warnings.
## The `ipl()` Function
`ipl()` lives in `src/main.rs`. It returns `Result<(), String>`. Errors are plain strings — the IPL failure message is written directly to stderr with `eprintln!` before `process::exit(1)`, because at the point of a fatal IPL failure, the logging system may not be fully operational.
`main()` is intentionally minimal:
```rust
fn main() {
if let Err(e) = ipl() {
eprintln!("[BEDS] [FATAL] [IPL] {}", e);
std::process::exit(1);
}
}
```
All logic is in `ipl()`. `main()` exists only to handle the fatal exit path.
## Future IPL Steps
As BEDS matures, the IPL sequence will grow. Expected additions in order:
1. Shared filesystem validation
2. Node role determination (which services are `is_local`)
3. Broker pool startup (spawn Tokio tasks per broker type)
4. Queue and exchange declaration (assert topology on RabbitMQ)
5. Node self-identification (write identity record to MongoDB)
6. Signal handler registration (SIGTERM, SIGINT for graceful shutdown)
7. Node green — begin processing events

wiki/05-configuration.md Normal file
# Configuration System
## Design Philosophy
BEDS configuration follows two rules:
1. **The base file is always safe to commit.** It contains structure and production-safe defaults. No real passwords, no real hostnames. It is the documentation of what the config looks like.
2. **The env file is never committed.** It contains the real values for a specific environment. It lives on the server and is gitignored. If the env file is lost, you rebuild it — you never recover it from git history.
This is the same pattern used in the PHP Namaste framework from day one. The `namaste.xml` base file was committed. The `env.xml` override was not. The habit was intentional: committing credentials to source control, even a private repo, is a category of mistake that ends careers.
## File Locations
```
config/
├── beds.toml ← committed — base config, safe defaults
├── env_dev.toml ← gitignored — development overrides
├── env_qa.toml ← gitignored — QA / staging overrides
└── env_prod.toml ← gitignored — production overrides
```
## Environment Selection
The `BEDS_ENV` environment variable selects which override file to load:
```bash
BEDS_ENV=dev cargo run # loads env_dev.toml (default if unset)
BEDS_ENV=qa ./rustybeds # loads env_qa.toml
BEDS_ENV=prod ./rustybeds # loads env_prod.toml
```
If `BEDS_ENV` is unset, `dev` is assumed. This means a freshly cloned repo with no env file runs in dev mode — which is the correct safe default.
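The selection logic is small enough to sketch (hypothetical helper names; the real loader lives in `src/config/mod.rs`):

```rust
/// Resolve the environment name from the BEDS_ENV variable's value;
/// "dev" is the safe default when the variable is unset.
pub fn resolve_env(var: Option<String>) -> String {
    var.unwrap_or_else(|| "dev".to_string())
}

/// The override file merged on top of config/beds.toml.
pub fn env_file(env: &str) -> String {
    format!("config/env_{env}.toml")
}

// Wiring at startup: resolve_env(std::env::var("BEDS_ENV").ok())
```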
## How Layering Works
The `config` crate performs a deep merge. Only keys present in the env file override the base. Everything else inherits from `beds.toml`.
Example — overriding only the broker password and env name:
```toml
# env_dev.toml — only what differs from beds.toml
[id]
env_name = "development"
[broker_services.app_server]
pass = "my-dev-rabbitmq-password"
```
Every other value — host, port, vhost, instance counts — is inherited from `beds.toml`. You do not need to repeat the full config in the env file.
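A minimal model of the deep merge, over a toy value type (a sketch — the real loader delegates merging to the `config` crate):

```rust
use std::collections::BTreeMap;

/// Toy config value: a string leaf or a nested table.
#[derive(Debug, Clone, PartialEq)]
pub enum Val {
    Str(String),
    Table(BTreeMap<String, Val>),
}

/// Deep merge: env values win, everything else inherits from base.
pub fn merge(base: &Val, over: &Val) -> Val {
    match (base, over) {
        (Val::Table(b), Val::Table(o)) => {
            let mut out = b.clone();
            for (k, v) in o {
                let merged = match out.get(k) {
                    Some(existing) => merge(existing, v), // recurse into tables
                    None => v.clone(),
                };
                out.insert(k.clone(), merged);
            }
            Val::Table(out)
        }
        // Scalars (and type mismatches): the override wins outright.
        _ => over.clone(),
    }
}
```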
## Configuration Sections
### Root Flags
```toml
debug = false # debug-level log output
syslog = true # route logs to journald
syslog_mirror_console = true # also echo to console when syslog=true
audit_on = false # global auditing master switch
journal_on = false # global journaling master switch
```
### `[id]` — Node Identity
```toml
[id]
env_name = "production" # development | qa | production
version = "1.0" # match your git release tag
wbid = "ms" # 2-char corporate identifier — prefixed to all collection names
```
The `wbid` is permanent. Once your MongoDB collections are created with a given `wbid`, changing it means renaming every collection and updating every template. Choose carefully.
### `[broker_services]` — RabbitMQ
```toml
[broker_services]
queue_tag = "prod_" # prefixed to every queue name — isolates envs
vhost = "prod" # RabbitMQ virtual host
timer_violation = 3000 # ms before a broker round-trip is a slow query warning
records_per_xfer = 5000 # max records per broker response — circuit breaker
keepalive = true # TCP keepalive — always true in production
heartbeat = 60 # AMQP heartbeat interval in seconds
use_ssl = false # TLS for broker connections
cert_path = "/etc/rabbitmq"
```
Per-node broker config:
```toml
[broker_services.app_server]
host = "localhost"
port = 5672
api_port = 15672 # management UI
user = "beds"
pass = "changeme"
rpi = 50 # records per interval — broker fetch throttle
[broker_services.app_server.instances]
r_broker = 2 # read brokers
w_broker = 2 # write brokers
m_broker = 0 # migration brokers (0 = disabled)
```
### `[rec_services]` — MongoDB
One entry per BEDS service role that needs MongoDB. Key is the service name (`app_server`, `admin`, `segundo`, `tercero`):
```toml
[rec_services.app_server]
host = "localhost"
port = 27017
user = "beds"
pass = "changeme"
database = "beds_app"
use_ssl = false
```
See `beds.toml` for topology examples (standalone, replica set, sharded cluster).
### `[rel_services]` — MariaDB
One entry per service role, with master (required) and secondary (optional):
```toml
[rel_services.app_server.master]
host = "localhost"
port = 3306
user = "beds"
pass = "changeme"
database = "beds_app"
[rel_services.app_server.secondary] # optional read replica
host = "replica.internal"
port = 3306
user = "beds_readonly"
pass = "changeme"
database = "beds_app"
```
Secondary failure is always non-fatal — BEDS logs a warning and operates master-only.
## The Test Fixture
All tests load from `tests/fixtures/beds_test.toml` instead of the live config. This file is committed — it contains only localhost addresses and placeholder credentials. Tests never read `config/beds.toml`.
The fixture is loaded via `config::load_from()`:
```rust
// in unit tests (inside source files)
fn test_cfg() -> BedsConfig {
load_from("tests/fixtures/beds_test.toml", "")
.expect("test fixture failed to load")
}
// in integration tests (under tests/)
let cfg = common::load_test_config();
```
If a test needs a different value, mutate the loaded struct — do not create a separate fixture file for every test variation. The fixture is the baseline; tests modify what they need.
## Adding a New Config Section
1. Add the TOML section to `beds.toml` with safe defaults and full commentary
2. Add the same section to `tests/fixtures/beds_test.toml` with test-safe values
3. Add the corresponding struct(s) to `src/config/structs.rs`
4. Export the new type from `src/config/mod.rs`
5. Add the field to `BedsConfig`
6. Update this wiki page

wiki/06-queue-topology.md Normal file
# Queue Topology
## Overview
BEDS uses a single RabbitMQ topic exchange for all data events. Topic exchanges route messages based on a dotted routing key — this gives BEDS fine-grained control over which brokers receive which events without the overhead of managing multiple exchanges.
## The Exchange
```
Exchange name: beds.events
Exchange type: topic
Durable: true
```
A single exchange handles all event types. Routing keys determine where messages go.
## Routing Key Convention
```
{store_type}.{operation}
```
| Routing Key | Description |
|---|---|
| `rec.read` | MongoDB non-destructive fetch |
| `rec.write` | MongoDB create / update / delete |
| `rec.obj` | MongoDB bulk / migration / object operations |
| `rel.read` | MariaDB non-destructive fetch |
| `rel.write` | MariaDB create / update / delete |
| `rel.obj` | MariaDB bulk / migration / object operations |
| `log` | Log events — routed to admin node |
| `adm` | Administrative events — node management, config |
| `mig` | Migration and warehouse operations — segundo node |
## Queue Naming Convention
Queue names follow the pattern:
```
{queue_tag}{routing_key}
```
Example with `queue_tag = "prod_"`:
```
prod_rec.read
prod_rec.write
prod_rec.obj
prod_rel.read
prod_rel.write
prod_rel.obj
prod_log
prod_adm
prod_mig
```
The `queue_tag` from `beds.toml` ensures queues from different environments (`prod_`, `qa_`, `dev_`) can coexist on a shared RabbitMQ instance without collision.
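The naming rule is a one-liner (sketch; the function name is illustrative):

```rust
/// Queue name = queue_tag + routing key, e.g. "prod_" + "rec.read".
pub fn queue_name(queue_tag: &str, routing_key: &str) -> String {
    format!("{queue_tag}{routing_key}")
}
```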
## Broker-to-Queue Binding
Each broker type binds to one queue and processes events from it:
| Broker Type | Queue Binding | Node |
|---|---|---|
| `rBroker` | `{tag}rec.read`, `{tag}rel.read` | appServer |
| `wBroker` | `{tag}rec.write`, `{tag}rel.write` | appServer |
| `mBroker` | `{tag}rec.obj`, `{tag}rel.obj` | appServer |
| `adminBrokerIn` | `{tag}adm` | admin |
| `adminBrokerOut` | `{tag}adm` | admin |
| `adminLogsBroker` | `{tag}log` | admin |
| `adminSyslogBroker` | `{tag}log` | admin |
| `adminGraphBroker` | `{tag}log` | admin |
| `whBroker` | `{tag}mig` | segundo |
| `cBroker` | `{tag}mig` | segundo |
| `uBroker` | `{tag}rec.read`, `{tag}rec.write` | tercero |
| `sBroker` | `{tag}rec.read`, `{tag}rec.write` | tercero |
## Log Event Routing
Log events deserve special attention because they are cross-cutting — every node emits them, but only admin consumes them.
```
Any node
    │ routing key: log
    ▼
beds.events exchange
    │ binding: log → prod_log queue
    ▼
prod_log queue
    │ consumer: adminLogsBroker (admin node only)
    ▼
admin node
    ▼
msLogs collection (MongoDB)
```
Non-admin nodes never write to MongoDB directly for logging. They publish to the `log` routing key and trust the admin node to persist the record. If admin is slow, log events queue. If admin is down, log events queue until the RabbitMQ queue limit is reached. Nothing is lost until the queue fills.
This is by design. The log queue is the most important queue in the cluster from an operations standpoint — it should be sized generously.
## Why Topic Exchange Over Direct Exchange
A direct exchange routes based on exact routing key match. A topic exchange supports wildcards:
```
# matches zero or more words
* matches exactly one word
```
This gives BEDS the option to bind a single consumer to multiple routing keys without multiple queue declarations:
```
rec.* matches rec.read, rec.write, rec.obj
*.read matches rec.read, rel.read
```
In the current implementation, brokers bind to specific queues. As the framework grows, the topic exchange flexibility will be used for cross-cutting concerns (audit, metrics) that need visibility across multiple event types without duplicating event payloads.
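The wildcard semantics can be sketched in a few lines (illustrative only — RabbitMQ itself performs this matching; BEDS never reimplements it):

```rust
/// Does a topic pattern ("rec.*", "#", "*.read") match a concrete routing key?
pub fn topic_matches(pattern: &str, key: &str) -> bool {
    fn step(p: &[&str], k: &[&str]) -> bool {
        match (p.split_first(), k.split_first()) {
            (None, None) => true,
            // '#' matches zero or more words: skip it, or consume one key word.
            (Some((&"#", rest_p)), _) => {
                step(rest_p, k) || (!k.is_empty() && step(p, &k[1..]))
            }
            // '*' matches exactly one word.
            (Some((&"*", rest_p)), Some((_, rest_k))) => step(rest_p, rest_k),
            // A literal word must match exactly.
            (Some((pw, rest_p)), Some((kw, rest_k))) => pw == kw && step(rest_p, rest_k),
            _ => false,
        }
    }
    let p: Vec<&str> = pattern.split('.').collect();
    let k: Vec<&str> = key.split('.').collect();
    step(&p, &k)
}
```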
## Queue Durability and Persistence
All BEDS queues are:
- **Durable** — survive RabbitMQ restarts
- **Persistent messages** — messages survive broker restart (written to disk)
This is non-negotiable for a production framework. The performance cost of persistence (disk write per message) is acceptable given the correctness guarantee.
## The `vhost` Isolation Model
Each environment gets its own RabbitMQ virtual host. A vhost is a completely isolated namespace — queues, exchanges, and bindings in one vhost are invisible to another. A RabbitMQ user is granted access to specific vhosts.
```
vhost: prod ← production traffic
vhost: qa ← QA / staging traffic
vhost: dev ← development traffic
```
Even if all three environments share one RabbitMQ instance, they are fully isolated. A message published to `prod` cannot be consumed by a `dev` consumer.
This was the operational pattern in the Namaste homelab — one RabbitMQ instance, three vhosts, multiple concurrent dev sessions running without interfering with each other.

wiki/08-template-system.md Normal file
# Template System
## What Is a Template?
A BEDS template is a TOML configuration file that defines a data collection — its schema, indexes, behavioural flags, access controls, and lifecycle policy. The template is the contract between the DBA who owns the data and the framework that serves it.
Templates live in `templates/`. Filename convention: `{wbid}{CollectionName}_{schema}.toml`
```
templates/
├── example_rec.toml ← canonical self-documenting REC template
└── mst_logger_rec.toml ← logger collection template
```
## Two Template Types
| Type | Schema | Database | Use Case |
|---|---|---|---|
| REC | `rec` | MongoDB | Document store — logs, events, user profiles, audit records |
| REL | `rel` | MariaDB | Relational store — anything needing SQL joins or transactions |
## The TLA Convention
Every collection has a **three-letter abbreviation (TLA)** declared in the template as `extension`. The TLA is:
1. Never part of the collection name — the collection is simply `msUsers`
2. Appended to **every field name** in the collection: `email_usr`, `status_usr`, `created_usr`
```toml
extension = "_usr" # the TLA for this collection
[fields]
email_usr = "string"
status_usr = "string"
created = "integer" # system fields like created don't carry the TLA
```
**Why TLA?** In a complex query that joins or aggregates across multiple collections, field names collide. `status` from users and `status` from orders are the same name. `status_usr` and `status_ord` never collide. In log output, `status_usr=active` immediately tells you which collection the field came from without having to trace the query.
This convention was enforced in the PHP Namaste framework from the beginning. BEDS validates TLA compliance at template load time — templates that violate the naming convention are rejected.
## The `wbid` Convention
The `wbid` (white-box identifier, 2 characters) is prepended to every collection name:
```
wbid "ms" + template_class "Users" → collection "msUsers"
wbid "ms" + template_class "Logs"  → collection "msLogs"
```
The `wbid` namespaces your data within a shared MongoDB instance. If you ever run multiple BEDS deployments against the same MongoDB, different `wbid` values keep their collections from colliding.
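Taken together, the `wbid` and TLA conventions can be sketched as two small helpers. This is an illustrative sketch, not the framework's validator — `SYSTEM_FIELDS` here is an assumed list drawn from the examples above:

```rust
/// Assumed set of system fields that are exempt from the TLA convention
/// (drawn from the template examples; the real list lives in the validator).
const SYSTEM_FIELDS: &[&str] = &["_id", "db_token", "event_guid", "status", "created", "accessed"];

/// Collection name = wbid + template_class, e.g. "ms" + "Users" = "msUsers".
fn collection_name(wbid: &str, template_class: &str) -> String {
    format!("{wbid}{template_class}")
}

/// A data field complies with the TLA convention when it ends with the
/// template's `extension`; system fields are exempt.
fn field_is_compliant(field: &str, extension: &str) -> bool {
    SYSTEM_FIELDS.contains(&field) || field.ends_with(extension)
}
```

A template-load-time check would run `field_is_compliant` over every declared field and reject the template on the first failure.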
## REC Template Anatomy
See `templates/example_rec.toml` for the full self-documenting example. Key sections:
### Identity
```toml
version = 1 # increment when schema changes
service = "app_server" # which node owns this collection
schema = "rec" # always "rec" for MongoDB
template_class = "Users" # human-readable class name
collection = "msUsers" # MongoDB collection name
extension = "_usr" # TLA — appended to all field names
wh_template = "" # warehousing destination template (empty = none)
```
### Behavioural Flags
```toml
closed_class = true # internal only vs. externally accessible
hard_deletes = false # soft delete (status=inactive) vs. permanent removal
updates_enabled = true # can records be updated after insert?
auditing = "full" # disabled | destructive | nondestructive | full
journaling = true # record journal entries for destructive ops?
record_history = false # maintain full version history per record?
record_locking = false # optimistic locking for concurrent writes?
query_timers = true # record query execution times as metric events?
primary_key = "token" # "token" (BEDS GUID) or "mongo_id" (native _id)
tokens = true # generate BEDS GUID token for every record?
cache_ttl = 300 # seconds in in-process cache (0 = disabled)
is_internal = false # exclude from public REST API catalog?
```
### Field Types
```toml
[fields]
_id = "object" # MongoDB native ID — never returned to clients
db_token = "string" # BEDS GUID — the externally-exposed primary key
status = "string"
created = "integer"
accessed = "integer"
email_usr = "string"
score_usr = "double"
count_usr = "integer"
flag_usr = "boolean"
tags_usr = "array"
meta_usr = "object"
```
Valid types: `string`, `integer`, `double`, `boolean`, `object`, `array`, `date`
### The `db_token` Field
Every BEDS record has a `db_token` field — a BEDS-generated GUID that serves as the externally-exposed primary key. The MongoDB native `_id` is never returned to clients.
This is intentional security-through-obscurity. Every script-kiddie who finds an exposed API searches for `id` in the response payload. `db_token` is non-intuitive. It is also structurally meaningful: it is the BEDS-controlled identifier, distinct from the database's internal identifier, which means BEDS controls its format, generation, and uniqueness guarantees.
### Protected Fields
Fields that clients cannot modify:
```toml
protected_fields = ["_id", "db_token", "event_guid", "created", "accessed"]
```
Attempts to update protected fields are silently dropped.
### Index Declarations
BEDS validates all incoming queries against declared indexes at submission time. A query that cannot be satisfied by a declared index is rejected before execution — no full collection scans in production.
```toml
# which fields participate in any index
index_fields = ["db_token", "status", "created", "email_usr"]
# single-field indexes
[single_field_indexes]
db_token = 1
status = -1
created = -1
email_usr = 1
# compound indexes (named — must appear in index_name_list)
index_name_list = ["cIdx1Usr"]
[compound_indexes]
cIdx1Usr = [["status", 1], ["created", -1]]
# unique constraints
[unique_indexes]
db_token = 1
email_usr = 1
# TTL — automatic record expiry
[ttl_indexes]
# accessed = 86400 # expire records not accessed in 24 hours
```
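The submission-time check can be sketched as a set-and-prefix test. This is a deliberately simplified model (the real rules will be richer — sort orders, covered queries, and unique indexes all matter), assuming a query is acceptable when its leading filter field is single-indexed or its filter fields form a prefix of a declared compound index:

```rust
use std::collections::HashSet;

/// Simplified sketch of submission-time index validation: accept a query
/// only if its first filter field has a single-field index, or its filter
/// fields are a prefix of some declared compound index.
fn query_is_indexed(
    filter_fields: &[&str],
    single: &HashSet<&str>,
    compound: &[Vec<&str>],
) -> bool {
    match filter_fields.first() {
        None => false, // unfiltered collection scans are rejected outright
        Some(first) => {
            single.contains(*first)
                || compound.iter().any(|idx| {
                    filter_fields.len() <= idx.len()
                        && filter_fields.iter().zip(idx.iter()).all(|(f, i)| f == i)
                })
        }
    }
}
```

Under this model, a filter on `status, created` rides the compound index `cIdx1Usr`, while a filter on an undeclared field like `score_usr` is rejected before it reaches MongoDB.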
### Cache Map
The cache map controls field name translation between internal schema names and external client-facing names. Schema column names are never exposed to clients.
```toml
[cache_map]
db_token = "id"
status = "status"
created = "createdDate"
email_usr = "email"
```
A client sees `{ "id": "...", "email": "..." }`. The internal `db_token` and `email_usr` names never appear in API responses.
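The outbound half of the translation can be sketched as a map lookup. One assumption is made explicit here: a field absent from the cache map is dropped from the response rather than passed through, which is consistent with internal names never reaching clients:

```rust
use std::collections::HashMap;

/// Sketch of outbound field translation via the cache map. Internal schema
/// names are rewritten to client-facing names; fields absent from the map
/// are dropped (assumption: unmapped fields are never exposed).
fn translate_outbound(
    record: &HashMap<String, String>,
    cache_map: &HashMap<&str, &str>,
) -> HashMap<String, String> {
    record
        .iter()
        .filter_map(|(k, v)| {
            cache_map
                .get(k.as_str())
                .map(|external| (external.to_string(), v.clone()))
        })
        .collect()
}
```

The inbound direction would invert the map, translating client names like `email` back to `email_usr` before query validation.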
### Warehousing
```toml
[warehouse]
supported = true
automated = true
interval = "M" # D=daily, M=monthly, Q=quarterly, Y=yearly
delete = "H" # H=hard delete source after warehousing, S=soft delete
[warehouse.qualifier]
created = { operand = "null", operator = "lt", value = "" } # caller supplies cutoff date
status = { operand = "null", operator = "eq", value = "active" }
logical_op = "and"
```
## The Logger Template
`templates/mst_logger_rec.toml` is the canonical example of an internal system collection. Key differences from a standard REC template:
- `updates_enabled = false` — log records are immutable
- `auditing = "disabled"` — never audit the logger (infinite recursion)
- `journaling = false` — never journal the logger (same reason)
- `cache_ttl = 0` — no caching — log data is never stale-read
- `is_internal = true` — excluded from the public REST API catalog
- `hard_deletes = true` — log pruning permanently removes records
- TTL index on `created` — automatic log expiry (30 days default)
These choices are documented in the template file with explanations. When in doubt about why a flag is set a certain way on the logger template, the template is the authoritative source.

wiki/09-event-lineage.md Normal file

@@ -0,0 +1,144 @@
# Event Lineage
## The Problem
A single client request to a BEDS application does not result in a single database operation. It fans out. A request to update a user record might trigger:
- The primary record update (REL write)
- An audit record insert (REC write)
- A journal entry (REC write)
- Three log events (published to admin)
- A cache invalidation event (ADM event)
That is six database operations from one client request. In production at Giving Assistant, the fanout was often into dozens of concurrent operations per request.
When something goes wrong, the question is: **what did request X actually cause?** Without event lineage, the answer requires correlating timestamps across multiple collections and hoping nothing else happened at the same moment.
## The Solution: Compound Event IDs
Every BEDS event carries three lineage fields:
```
event_id = "{wbid}.{env}.{guid}"
parent_id = "" # empty string if root event
depth = 0 # integer — levels from root
```
### `event_id`
A compound identifier unique across the entire cluster:
```
ms.production.a1b2c3d4-e5f6-7890-abcd-ef1234567890
│ │ │
│ │ └── UUID v4 — unique within this event
│ └── environment name from config
└── wbid — identifies the cluster
```
The compound format means two events with the same UUID from different clusters or environments never collide. This matters when you are aggregating logs from multiple environments.
### `parent_id`
The `event_id` of the event that spawned this one. Empty string for root events (direct client requests). All derived events (audit records, log entries, journal entries, cache events) carry the root event's `event_id` as their `parent_id`.
### `depth`
How many levels from the root event:
```
depth=0 root event (client request)
depth=1 direct children (first-generation derived events)
depth=2 grandchildren (events spawned by depth=1 events)
```
Depth is capped in practice — a correctly-designed BEDS application should not need depth beyond 3 or 4. Deep recursion is a design smell.
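Constructing the three lineage fields can be sketched as follows. GUID generation is elided (the real framework would generate a UUID v4); here the caller supplies it so the example stays dependency-free:

```rust
/// The three lineage fields carried by every BEDS event.
struct Lineage {
    event_id: String,  // "{wbid}.{env}.{guid}"
    parent_id: String, // empty string for root events
    depth: u32,        // levels from the root
}

/// A root event: a direct client request with no parent.
fn root_event(wbid: &str, env: &str, guid: &str) -> Lineage {
    Lineage {
        event_id: format!("{wbid}.{env}.{guid}"),
        parent_id: String::new(), // empty string marks a root event
        depth: 0,
    }
}

/// A derived event: gets its own event_id, records the spawning event's
/// event_id as parent_id, and sits one level deeper.
fn derived_event(parent: &Lineage, wbid: &str, env: &str, guid: &str) -> Lineage {
    Lineage {
        event_id: format!("{wbid}.{env}.{guid}"),
        parent_id: parent.event_id.clone(),
        depth: parent.depth + 1,
    }
}
```

For first-generation derived events (the common case), the spawning event is the root, so `parent_id` carries the root's `event_id` exactly as described above.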
## Querying Event Trees
With these three fields, you can reconstruct the full tree of operations triggered by any event:
**Find the root event:**
```
event_id = "ms.production.a1b2c3d4..."
depth = 0
```
**Find all direct children:**
```
parent_id = "ms.production.a1b2c3d4..."
depth = 1
```
**Find the full subtree:**
```
parent_id = "ms.production.a1b2c3d4..." (all depths)
```
**Reconstruct the full tree:**
```
event_id = "ms.production.a1b2c3d4..." (root)
+ parent_id = "ms.production.a1b2c3d4..." (all children at any depth)
```
Both `event_id` and `parent_id` are indexed on the `msLogs` collection. The compound index `cIdx1Log = [event_id ASC, depth ASC]` is specifically designed for full tree traversal.
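Once the matching records are fetched, assembling them into a tree is a single pass. This is a sketch over bare `(event_id, parent_id)` pairs rather than full `msLogs` documents:

```rust
use std::collections::HashMap;

/// Sketch: group a flat slice of (event_id, parent_id) rows into a
/// children-by-parent map, from which the event tree can be walked
/// top-down starting at the root.
fn children_by_parent<'a>(
    rows: &[(&'a str, &'a str)],
) -> HashMap<&'a str, Vec<&'a str>> {
    let mut map: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    for &(event_id, parent_id) in rows {
        if !parent_id.is_empty() {
            // root events (empty parent_id) are entry points, not children
            map.entry(parent_id).or_default().push(event_id);
        }
    }
    map
}
```

A recursive walk from the root `event_id` over this map reproduces the full operation tree for display or analysis.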
## Why Not Distributed Tracing?
Systems like Jaeger, Zipkin, and OpenTelemetry solve the same problem. BEDS does not use them. The reasons are deliberate:
1. **BEDS already has a structured event store.** MongoDB `msLogs` is queryable, indexed, and retains data as long as the TTL allows. A separate tracing system would duplicate this data.
2. **Simplicity.** Adding a distributed tracing system adds operational complexity — another service to run, monitor, and maintain. BEDS event lineage is built into the data model and requires no additional infrastructure.
3. **Self-sufficiency.** BEDS is designed to run in environments that may not have cloud infrastructure available. A homelab running BEDS should be able to answer "what happened?" without an external observability platform.
The tradeoff is that BEDS event lineage is specific to BEDS events. It does not cover external HTTP calls or third-party service interactions. If those are important to observe, a lightweight OpenTelemetry integration could be added to the adapter layer without changing the lineage model.
## The `msLogs` Collection
Log events written by the admin node carry full lineage. The logger template (`mst_logger_rec.toml`) defines the schema:
| Field | Type | Purpose |
|---|---|---|
| `event_id` | string | compound event ID of this log event |
| `parent_id` | string | parent event ID — empty for root events |
| `depth` | integer | levels from root |
| `level_log` | string | debug \| data \| info \| error \| warning \| fatal \| timer \| event |
| `level_val` | integer | -1 through 7 — enables range queries by severity |
| `resource` | string | 4-char component tag (e.g. LOGR, AMQP, CNFG) |
| `service_log` | string | node role that issued the event |
| `env_log` | string | environment |
| `node_log` | string | node name from config |
| `file_log` | string | source file |
| `method_log` | string | calling function name |
| `line_log` | integer | source line number |
| `message_log` | string | the log message |
| `trace_log` | array | stack trace — empty unless trace=true |
| `created` | integer | epoch timestamp |
The `level_val` integer enables range queries that are impossible with string level names:
```
level_val >= 4 # warning and above
level_val == 6 # fatal only
level_val <= 1 # debug and data
```
## Console Output Format
For local console output (before AMQP is up, or when `syslog=false`), BEDS follows the format established in the PHP `consoleLog` function:
```
[dd/mm/yy@HH:MM:SS] [LVL]RESRC: message
```
Example:
```
[04/04/26@14:23:01] [ I]BEDS: BEDS IPL starting, node=ms env=production
[04/04/26@14:23:01] [ I]BEDS: Configuration loaded
[04/04/26@14:23:01] [ I]AMQP: RabbitMQ reachable
[04/04/26@14:23:01] [ W]MNGO: MongoDB unreachable (non-fatal in development): connection refused
```
The level tag is right-aligned in a two-character field inside brackets. The resource tag is 4 characters. This format was chosen because it is immediately scannable — level and source are visible without reading the message text.
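The line layout can be sketched as a single format call. The timestamp is passed in already formatted (`dd/mm/yy@HH:MM:SS`) to keep the example dependency-free; the real logger would derive it from the clock:

```rust
/// Sketch of the console log line format established by the PHP consoleLog
/// function: "[dd/mm/yy@HH:MM:SS] [LVL]RESRC: message".
fn console_line(ts: &str, level: char, resource: &str, message: &str) -> String {
    // level letter right-aligned in a 2-character field: 'I' -> "[ I]"
    format!("[{ts}] [{level:>2}]{resource}: {message}")
}
```

Fatal (`F`), warning (`W`), and the rest each occupy the same two-character slot, so columns stay aligned across lines regardless of level.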

wiki/Home.md Normal file

@@ -0,0 +1,39 @@
# BEDS — Back End Data System
## Developer Wiki
Welcome to the BEDS developer wiki. This is a living document. It grows with the codebase and should be updated whenever a design decision is made, a pattern is established, or a component is implemented.
If you are reading this as a new contributor, start here and read in order. The origin story is not fluff — it explains *why* BEDS is built the way it is, and understanding the *why* is the difference between extending the framework correctly and breaking it subtly.
---
## Table of Contents
### Foundation
- [Origin Story](01-origin-story.md) — Where BEDS came from and why it was built
- [Architecture Overview](02-architecture.md) — The full system design and its principles
- [The Four Nodes](03-nodes.md) — appServer, admin, segundo, tercero — roles and responsibilities
### Operations
- [IPL — Initial Program Load](04-ipl.md) — The bootstrap sequence, step by step, and why order matters
- [Configuration System](05-configuration.md) — Layered TOML, environment files, topology options
### Messaging
- [Queue Topology](06-queue-topology.md) — AMQP exchanges, queues, routing keys, and the broker model
- [Broker Calls](07-broker-calls.md) — Every broker event type documented
### Data
- [Template System](08-template-system.md) — REC and REL templates, the TLA convention, schema-as-contract
- [Event Lineage](09-event-lineage.md) — Compound event IDs, parent/child relationships, depth tracking
### Reference
- [Glossary](glossary.md) — Terms, abbreviations, and conventions used throughout BEDS
---
## Contributing to This Wiki
- Write for the programmer who inherits this code after a two-week handoff with no knowledge transfer
- Document decisions, not just mechanics — *why* matters more than *what*
- Dated history entries belong in source code comments, not here — the wiki covers concepts, not changelogs
- When you change the system, update the wiki in the same commit

wiki/glossary.md Normal file

@@ -0,0 +1,36 @@
# Glossary
| Term | Definition |
|---|---|
| **AMQP** | Advanced Message Queuing Protocol. The wire protocol used by RabbitMQ. BEDS uses AMQP 0-9-1 exclusively — no vendor-specific extensions. |
| **appServer** | The primary BEDS node role. Handles all client-facing CRUD operations. |
| **admin** | The administrative BEDS node role. Handles logging, auditing, metrics, and administrative events. |
| **BEDS** | Back End Data System. The framework. |
| **broker** | A Tokio async task that listens on one AMQP queue, processes one event type, and routes results back. |
| **broker pool** | The collection of broker tasks running on a node. Pool size is configured per broker type in `beds.toml`. |
| **CALGON** | Async ticket pattern. Client submits a request, receives a GUID immediately, polls for the result later. Used for long-running operations. |
| **closed_class** | Template flag. When true, only internal BEDS services can instantiate the template. When false, external partners may also access it. |
| **COOL storage** | Warehoused data. Maintains full schema and indexing. Queryable but not on the live production data path. |
| **COLD storage** | Archived data. Reformatted (typically CSV). Not directly queryable by BEDS. |
| **depth** | Event lineage field. Integer counting levels from the root event. Root events have depth=0. |
| **event_id** | Compound event identifier: `{wbid}.{env}.{guid}`. Unique across the entire cluster. |
| **factory** | The dispatch layer. Maps a template name to the correct database adapter at runtime. |
| **HOT storage** | Live production data. The primary MongoDB and MariaDB instances. |
| **IPL** | Initial Program Load. The BEDS bootstrap sequence. Term borrowed from IBM mainframe terminology. |
| **is_local** | Config flag per service. Declares that this service runs on the current physical machine. Brokers are only started for local services. |
| **journaling** | Recording a journal entry for every destructive operation on a collection. Distinct from database-level journaling. |
| **NamasteCore** | The unified CRUD trait. Every database adapter and every template implements it. The application layer only calls NamasteCore methods. |
| **Namaste** | The internal codename for the PHP implementation of BEDS. The Rust rewrite is named rustybeds. |
| **parent_id** | Event lineage field. The `event_id` of the event that spawned this one. Empty string for root events. |
| **REC** | Record. MongoDB document store. Template type for document collections. |
| **REL** | Relational. MariaDB relational store. Template type for relational collections. |
| **rpi** | Records Per Interval. Throttle applied to a broker's fetch rate. Prevents a single broker from overwhelming the database. |
| **segundo** | The warehousing BEDS node role. Manages data lifecycle — moving records from HOT to COOL storage. Spanish for "second." |
| **TLA** | Three-Letter Abbreviation. Appended to every field name in a collection to namespace fields and eliminate ambiguity in multi-collection queries. |
| **template** | A TOML configuration file defining a data collection — schema, indexes, behavioural flags, access controls, lifecycle policy. |
| **template_class** | Human-readable collection name declared in the template. Used in logging, admin UI, and the REST API catalog. |
| **tercero** | The user and session management BEDS node role. Spanish for "third." |
| **WARM storage** | Data being restored from COLD back to HOT. Transitional state. |
| **wbid** | White-box identifier. 2-character corporate prefix prepended to every MongoDB collection name. Declared in `[id]` in `beds.toml`. |
| **WORM** | Write Once Read Many. The append-only pattern used for log collections. Log records are immutable after insert. |