# Queue Topology
## Current Runtime Topology (2026-04)

The active Rust runtime currently uses a unified dispatcher model:

1. A single primary events queue exists per environment tag (`{queue_tag}events`).
2. A unified dispatcher pool consumes from that queue.
3. The queue binding uses the topic wildcard (`#`) on `beds.events` (see the sketch after this list).
4. The `shutdown` operation is handled as a broker command that triggers a coordinated global shutdown.
5. Legacy `rec.read` and `rec.write` broker files remain in the repository for history/reference, but the runtime path is centered on unified dispatcher workers.
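For illustration, this is roughly what the single-queue binding looks like with the `lapin` crate. It is a hedged sketch, not the BEDS dispatcher startup code; the helper name is an assumption, and it assumes `beds.events` has already been declared at IPL.

```rust
use lapin::{
    options::{QueueBindOptions, QueueDeclareOptions},
    types::FieldTable,
    Channel, Result,
};

/// Hypothetical helper: declare the unified events queue for one environment
/// tag and bind it to `beds.events` with the `#` wildcard, so every routing
/// key lands in the single dispatcher queue.
async fn bind_unified_events_queue(channel: &Channel, queue_tag: &str) -> Result<()> {
    let queue = format!("{queue_tag}events"); // e.g. "prod_events"

    // Idempotent: succeeds if the queue already exists with matching parameters.
    channel
        .queue_declare(
            &queue,
            QueueDeclareOptions { durable: true, ..Default::default() },
            FieldTable::default(),
        )
        .await?;

    // `#` matches every routing key published to the topic exchange.
    channel
        .queue_bind(&queue, "beds.events", "#", QueueBindOptions::default(), FieldTable::default())
        .await?;

    Ok(())
}
```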
The broader broker matrix below remains useful as historical and planned topology context.
## Overview

BEDS uses a single RabbitMQ topic exchange for all data events. Topic exchanges route messages based on a dotted routing key — this gives BEDS fine-grained control over which brokers receive which events without the overhead of managing multiple exchanges.
## The Exchange

```
Exchange name: beds.events
Exchange type: topic
Durable: true
```

A single exchange handles all event types. Routing keys determine where messages go.
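In `lapin` terms (an assumption about the AMQP client, not confirmed BEDS source), the declaration is a single call:

```rust
use lapin::{options::ExchangeDeclareOptions, types::FieldTable, Channel, ExchangeKind, Result};

/// Sketch of the one-time declaration of the `beds.events` topic exchange.
/// Re-running it against an existing exchange with the same parameters succeeds.
async fn declare_events_exchange(channel: &Channel) -> Result<()> {
    channel
        .exchange_declare(
            "beds.events",
            ExchangeKind::Topic,
            ExchangeDeclareOptions { durable: true, ..Default::default() },
            FieldTable::default(),
        )
        .await
}
```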
## Routing Key Convention

```
{store_type}.{operation}
```

| Routing Key | Description |
|---|---|
| `rec.read` | MongoDB non-destructive fetch |
| `rec.write` | MongoDB create / update / delete |
| `rec.obj` | MongoDB bulk / migration / object operations |
| `rel.read` | MariaDB non-destructive fetch |
| `rel.write` | MariaDB create / update / delete |
| `rel.obj` | MariaDB bulk / migration / object operations |
| `log` | Log events — routed to admin node |
| `adm` | Administrative events — node management, config |
| `mig` | Migration and warehouse operations — segundo node |
## Queue Naming Convention

Queue names follow the pattern:

```
{queue_tag}{routing_key}
```

Example with `queue_tag = "prod_"`:

```
prod_rec.read
prod_rec.write
prod_rec.obj
prod_rel.read
prod_rel.write
prod_rel.obj
prod_log
prod_adm
prod_mig
```

The `queue_tag` from `beds.toml` ensures queues from different environments (`prod_`, `qa_`, `dev_`) can coexist on a shared RabbitMQ instance without collision.
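The naming rule is simple enough to capture in two helpers. This is an illustrative sketch, not code from the BEDS repository; the function names are assumptions.

```rust
/// Build a routing key from the `{store_type}.{operation}` convention,
/// e.g. ("rec", "read") -> "rec.read".
fn routing_key(store_type: &str, operation: &str) -> String {
    format!("{store_type}.{operation}")
}

/// Build a queue name by prefixing the environment's queue tag,
/// e.g. ("prod_", "rec.read") -> "prod_rec.read".
fn queue_name(queue_tag: &str, routing_key: &str) -> String {
    format!("{queue_tag}{routing_key}")
}

fn main() {
    let key = routing_key("rec", "read");
    assert_eq!(queue_name("prod_", &key), "prod_rec.read");
}
```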
## Broker-to-Queue Binding

Each broker type binds to one or more queues and processes events from them:

| Broker Type | Queue Binding | Node |
|---|---|---|
| `rBroker` | `{tag}rec.read`, `{tag}rel.read` | appServer |
| `wBroker` | `{tag}rec.write`, `{tag}rel.write` | appServer |
| `mBroker` | `{tag}rec.obj`, `{tag}rel.obj` | appServer |
| `adminBrokerIn` | `{tag}adm` | admin |
| `adminBrokerOut` | `{tag}adm` | admin |
| `adminLogsBroker` | `{tag}log` | admin |
| `adminSyslogBroker` | `{tag}log` | admin |
| `adminGraphBroker` | `{tag}log` | admin |
| `whBroker` | `{tag}mig` | segundo |
| `cBroker` | `{tag}mig` | segundo |
| `uBroker` | `{tag}rec.read`, `{tag}rec.write` | tercero |
| `sBroker` | `{tag}rec.read`, `{tag}rec.write` | tercero |
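One way to picture the matrix is as static binding data that a node walks at startup, declaring and binding only the queues for the brokers it actually runs. The constant and its shape are illustrative assumptions, not the BEDS source.

```rust
/// Illustrative binding matrix: broker type -> routing keys it consumes.
/// Queue names are derived per environment by prefixing the queue tag.
const BROKER_BINDINGS: &[(&str, &[&str])] = &[
    ("rBroker", &["rec.read", "rel.read"]),
    ("wBroker", &["rec.write", "rel.write"]),
    ("mBroker", &["rec.obj", "rel.obj"]),
    ("adminBrokerIn", &["adm"]),
    ("adminBrokerOut", &["adm"]),
    ("adminLogsBroker", &["log"]),
    ("adminSyslogBroker", &["log"]),
    ("adminGraphBroker", &["log"]),
    ("whBroker", &["mig"]),
    ("cBroker", &["mig"]),
    ("uBroker", &["rec.read", "rec.write"]),
    ("sBroker", &["rec.read", "rec.write"]),
];

fn main() {
    let queue_tag = "prod_";
    for (broker, keys) in BROKER_BINDINGS {
        for key in *keys {
            // e.g. rBroker -> queue "prod_rec.read" bound with routing key "rec.read"
            println!("{broker} -> queue {queue_tag}{key} bound with routing key {key}");
        }
    }
}
```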
## Log Event Routing

Log events deserve special attention because they are cross-cutting — every node emits them, but only admin consumes them.

```
Any node
    │
    │ routing key: log
    ▼
beds.events exchange
    │
    │ binding: log → prod_log queue
    ▼
prod_log queue
    │
    │ consumer: adminLogsBroker (admin node only)
    ▼
admin node
    │
    ▼
msLogs collection (MongoDB)
```

Non-admin nodes never write to MongoDB directly for logging. They publish to the `log` routing key and trust the admin node to persist the record. If admin is slow, log events queue. If admin is down, log events queue until the RabbitMQ queue limit is reached. Nothing is lost until the queue fills.

This is by design. The log queue is the most important queue in the cluster from an operations standpoint — it should be sized generously.
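A publisher-side sketch of that path, using `lapin` and `serde_json`. The payload fields and helper name are hypothetical; the only point is "publish to the `log` routing key and let admin persist it".

```rust
use lapin::{options::BasicPublishOptions, BasicProperties, Channel, Result};
use serde_json::json;

/// Hypothetical log publisher: any node emits a log event to `beds.events`
/// with routing key "log"; only the admin node consumes and persists it.
async fn publish_log_event(channel: &Channel, node: &str, msg: &str) -> Result<()> {
    // Payload shape is illustrative, not the actual BEDS log schema.
    let payload = json!({ "node": node, "level": "info", "message": msg }).to_string();

    channel
        .basic_publish(
            "beds.events",
            "log",
            BasicPublishOptions::default(),
            payload.as_bytes(),
            BasicProperties::default().with_delivery_mode(2), // persistent message
        )
        .await?; // publisher-confirm handle ignored in this sketch

    Ok(())
}
```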
## Why Topic Exchange Over Direct Exchange

A direct exchange routes based on exact routing key match. A topic exchange supports wildcards:

```
# matches zero or more words
* matches exactly one word
```

This gives BEDS the option to bind a single consumer to multiple routing keys without multiple queue declarations:

```
rec.*   matches rec.read, rec.write, rec.obj
*.read  matches rec.read, rel.read
```

In the current implementation, brokers bind to specific queues. As the framework grows, the topic exchange flexibility will be used for cross-cutting concerns (audit, metrics) that need visibility across multiple event types without duplicating event payloads.
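As an example of that flexibility, the sketch below binds a single hypothetical audit queue to every `rec.*` routing key. Neither the queue nor the binding exists in BEDS today; this only illustrates the wildcard binding, written against the `lapin` crate.

```rust
use lapin::{options::QueueBindOptions, types::FieldTable, Channel, Result};

/// Bind an already-declared queue to every MongoDB event via the `rec.*` wildcard.
/// The audit queue itself is hypothetical; the binding pattern is the point.
async fn bind_audit_wildcard(channel: &Channel, audit_queue: &str) -> Result<()> {
    channel
        .queue_bind(
            audit_queue,
            "beds.events",
            "rec.*", // matches rec.read, rec.write, rec.obj
            QueueBindOptions::default(),
            FieldTable::default(),
        )
        .await
}
```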
## Queue Declaration Lifecycle

The `beds.events` exchange is declared during IPL (Step 3b), before any broker task starts. This ensures the routing infrastructure exists before anyone tries to publish to it.

**Queues are not declared during IPL.** Each broker task declares its own queue when it starts. This is a deliberate design choice:

- **Queue presence = service ready.** A queue's existence on the broker signals that the task consuming it is alive and ready to process messages. A queue declared at IPL before the consumer starts would be misleading — messages could arrive before the consumer is ready, or worse, before it is confirmed the consumer will start at all.
- **No reserved global topology.** There is no fixed set of queues that must exist for the cluster to function. The topology emerges from the services that are actually running. An appServer with only rBroker and wBroker running has exactly those two queues — not the full topology diagram.
- **Clean restarts.** When a broker task restarts, queue declaration is idempotent — RabbitMQ returns success if the queue already exists with matching parameters. Messages queued during the restart interval are waiting for the consumer when it comes back up (see the startup sketch after this list).
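A sketch of that per-broker startup sequence, assuming the `lapin` and `futures-lite` crates. The queue name, consumer tag, and handler are illustrative assumptions; the point is that the queue only exists once the consumer that owns it is running.

```rust
use futures_lite::stream::StreamExt;
use lapin::{
    options::{BasicAckOptions, BasicConsumeOptions, QueueBindOptions, QueueDeclareOptions},
    types::FieldTable,
    Channel, Result,
};

/// Hypothetical broker task startup: declare the queue (idempotent), bind it,
/// then start consuming. The queue appears only when this task is actually up.
async fn run_broker(channel: &Channel, queue: &str, routing_key: &str) -> Result<()> {
    channel
        .queue_declare(
            queue,
            QueueDeclareOptions { durable: true, ..Default::default() },
            FieldTable::default(),
        )
        .await?;

    channel
        .queue_bind(queue, "beds.events", routing_key, QueueBindOptions::default(), FieldTable::default())
        .await?;

    let mut consumer = channel
        .basic_consume(queue, "broker-task", BasicConsumeOptions::default(), FieldTable::default())
        .await?;

    // Messages that queued up while the task was down are delivered here on restart.
    while let Some(delivery) = consumer.next().await {
        let delivery = delivery?;
        // ... dispatch the event to the broker's handler ...
        channel.basic_ack(delivery.delivery_tag, BasicAckOptions::default()).await?;
    }

    Ok(())
}
```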
## Queue Durability and Persistence

All BEDS queues are:

- **Durable** — survive RabbitMQ restarts
- **Persistent messages** — messages survive broker restart (written to disk)

This is non-negotiable for a production framework. The performance cost of persistence (disk write per message) is acceptable given the correctness guarantee.
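In `lapin` terms (an assumption about the client library), those two guarantees map to one flag on the queue declaration and one property on each publish:

```rust
use lapin::{options::QueueDeclareOptions, BasicProperties};

/// The two settings behind the durability guarantee.
fn durability_settings() -> (QueueDeclareOptions, BasicProperties) {
    // Durable queue: the queue definition survives a RabbitMQ restart.
    let queue_opts = QueueDeclareOptions { durable: true, ..Default::default() };

    // Persistent message: delivery_mode 2 asks the broker to write it to disk.
    let props = BasicProperties::default().with_delivery_mode(2);

    (queue_opts, props)
}
```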
## DLQ and Retry Topology (Implemented)

For the active POC queues (`rec.read`, `rec.write`), BEDS now provisions:

- Primary queue: `{tag}rec.read` / `{tag}rec.write`
- Retry queue: `{tag}rec.read.retry` / `{tag}rec.write.retry`
- Dead-letter queue: `{tag}rec.read.dlq` / `{tag}rec.write.dlq`

Dead-letter flow:

- Primary queues are configured with dead-letter exchange `beds.dlx`.
- Non-retryable failures (`nack requeue=false`) route to `*.dlq` via routing keys `rec.read.dlq` and `rec.write.dlq`.

Retry flow:

- Retryable failures are republished to `*.retry` queues.
- Retry queues apply TTL backoff and dead-letter back to `beds.events` using the original routing keys (`rec.read` / `rec.write`).

This avoids tight immediate requeue loops and creates deterministic failure lanes.
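A sketch of how those three queues could be declared with `lapin`. The 30-second TTL and the exact argument wiring are assumptions for illustration; the actual BEDS provisioning code and backoff interval may differ.

```rust
use lapin::{
    options::QueueDeclareOptions,
    types::{AMQPValue, FieldTable},
    Channel, Result,
};

/// Declare the primary / retry / dlq lane for one routing key, e.g. "rec.read".
async fn declare_failure_lanes(channel: &Channel, tag: &str, key: &str) -> Result<()> {
    let durable = || QueueDeclareOptions { durable: true, ..Default::default() };

    // Primary queue: rejected messages (nack requeue=false) go to beds.dlx,
    // which routes them on "<key>.dlq" to the dead-letter queue.
    let mut primary_args = FieldTable::default();
    primary_args.insert("x-dead-letter-exchange".into(), AMQPValue::LongString("beds.dlx".into()));
    primary_args.insert("x-dead-letter-routing-key".into(), AMQPValue::LongString(format!("{key}.dlq").into()));
    channel.queue_declare(&format!("{tag}{key}"), durable(), primary_args).await?;

    // Retry queue: messages sit for the TTL (illustrative 30s backoff), then
    // dead-letter back to beds.events on the original routing key.
    let mut retry_args = FieldTable::default();
    retry_args.insert("x-message-ttl".into(), AMQPValue::LongInt(30_000));
    retry_args.insert("x-dead-letter-exchange".into(), AMQPValue::LongString("beds.events".into()));
    retry_args.insert("x-dead-letter-routing-key".into(), AMQPValue::LongString(key.into()));
    channel.queue_declare(&format!("{tag}{key}.retry"), durable(), retry_args).await?;

    // Dead-letter queue: terminal parking lot for non-retryable failures
    // (bound to beds.dlx elsewhere).
    channel.queue_declare(&format!("{tag}{key}.dlq"), durable(), FieldTable::default()).await?;

    Ok(())
}
```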
## The `vhost` Isolation Model

Each environment gets its own RabbitMQ virtual host. A vhost is a completely isolated namespace — queues, exchanges, and bindings in one vhost are invisible to another. A RabbitMQ user is granted access to specific vhosts.

```
vhost: prod   ← production traffic
vhost: qa     ← QA / staging traffic
vhost: dev    ← development traffic
```

Even if all three environments share one RabbitMQ instance, they are fully isolated. A message published to `prod` cannot be consumed by a `dev` consumer.

This was the operational pattern in the Namaste homelab — one RabbitMQ instance, three vhosts, multiple concurrent dev sessions running without interfering with each other.
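The vhost is selected by the path component of the AMQP connection URI, so environment isolation is just a different connect string. A minimal sketch with `lapin`; the host and credentials are placeholders, not real BEDS values.

```rust
use lapin::{Connection, ConnectionProperties, Result};

/// Connect to the RabbitMQ vhost for one environment ("prod", "qa", or "dev").
/// The URI path component selects the vhost.
async fn connect_to_env(vhost: &str) -> Result<Connection> {
    // Placeholder credentials and host for illustration only.
    let uri = format!("amqp://beds:secret@rabbit.local:5672/{vhost}");
    Connection::connect(&uri, ConnectionProperties::default()).await
}
```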
## POC Verification

Current proof-of-concept verification for the two active appServer brokers is covered by integration tests:

- `tests/broker_pool_test.rs` validates that configured rBroker/wBroker pool instances spawn.
- `tests/broker_message_flow_test.rs` validates end-to-end message flow by publishing `ping` events to `rec.read` and `rec.write` and asserting broker replies.
- `tests/broker_message_flow_test.rs` also validates the logger sequence round-trip by publishing a `write` event to `rec.write` with `template="Logger"`, then fetching via `fetch` on `rec.read` and asserting that the newly written log message is returned.

The current logger sequence in the POC now writes and reads through MongoDB (`msLogs`) via the `Logger` template path (`rec.write` for write, `rec.read` for fetch), using credentials from the active environment config.

These tests provide a lightweight deployment confidence check while the framework is still in the "POC before guardrails" phase.