Template System

What Is a Template?

A BEDS template is a TOML configuration file that defines a data collection — its schema, indexes, behavioural flags, access controls, and lifecycle policy. The template is the contract between the DBA who owns the data and the framework that serves it.

Templates live in templates/. Filename convention: {wbid}{CollectionName}_{schema}.toml

templates/
├── example_rec.toml       ← canonical self-documenting REC template
└── mst_logger_rec.toml    ← logger collection template

Runtime Template Registry (Implemented)

At IPL, BEDS now loads REC templates from templates/ into a typed runtime registry and validates them before service connectivity checks.

Current runtime validation enforces:

  • Any field listed in protected_fields must exist in [fields].
  • Any field listed in index_fields must exist in [fields].
  • Any key in [single_field_indexes], [unique_indexes], and [ttl_indexes] must exist in [fields].
  • Every field used in [compound_indexes] and partial_indexes qualifiers must exist in [fields].
  • Every field used in regex_pattern_indexing and [cache_map] must exist in [fields].
  • Every compound_indexes index name must be present in index_name_list.

In production, template registry validation failures are fatal during IPL. In non-production environments, failures are warning-only to preserve local POC workflows.
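The rules above can be sketched in a few lines. This is an illustrative Python sketch, not the BEDS runtime: the function names validate_template and enforce are hypothetical, the template is assumed to be already parsed into a plain dict, and the partial_indexes and regex_pattern_indexing checks are omitted because their shape is not shown on this page.

```python
import warnings


def validate_template(tpl: dict) -> list[str]:
    """Return a list of validation problems; an empty list means the template passed."""
    fields = set(tpl.get("fields", {}))
    problems = []

    def require(names, where):
        # Every referenced name must be declared in [fields].
        for name in names:
            if name not in fields:
                problems.append(f"{where}: '{name}' is not declared in [fields]")

    require(tpl.get("protected_fields", []), "protected_fields")
    require(tpl.get("index_fields", []), "index_fields")
    for table in ("single_field_indexes", "unique_indexes", "ttl_indexes", "cache_map"):
        require(tpl.get(table, {}), table)

    declared_names = set(tpl.get("index_name_list", []))
    for index_name, parts in tpl.get("compound_indexes", {}).items():
        require([field for field, _direction in parts], f"compound_indexes.{index_name}")
        if index_name not in declared_names:
            problems.append(f"compound_indexes: '{index_name}' missing from index_name_list")
    return problems


def enforce(problems: list[str], production: bool) -> None:
    """Fatal in production, warning-only elsewhere (mirrors the policy described above)."""
    if not problems:
        return
    if production:
        raise ValueError("template validation failed: " + "; ".join(problems))
    for problem in problems:
        warnings.warn(problem)
```

Collecting every problem before deciding whether to fail means a non-production run reports all issues in one pass rather than stopping at the first.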

Two Template Types

| Type | Schema | Database | Use Case |
|------|--------|----------|----------|
| REC  | rec    | MongoDB  | Document store: logs, events, user profiles, audit records |
| REL  | rel    | MariaDB  | Relational store: anything needing SQL joins or transactions |

The TLA Convention

Every collection has a three-letter abbreviation (TLA) declared in the template as extension. The TLA is:

  1. Not part of the collection name: the collection stays msUsers, never msUsers_usr.
  2. Appended to every collection-specific field name: email_usr, status_usr. System fields such as created do not carry it.

extension = "_usr"   # the TLA for this collection

[fields]
email_usr  = "string"
status_usr = "string"
created    = "integer"   # system fields like created don't carry the TLA

Why TLA? In a complex query that joins or aggregates across multiple collections, field names collide. status from users and status from orders are the same name. status_usr and status_ord never collide. In log output, status_usr=active immediately tells you which collection the field came from without having to trace the query.

This convention was enforced in the PHP Namaste framework from the beginning. BEDS validates TLA compliance at template load time — templates that violate the naming convention are rejected.
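A load-time TLA check could look roughly like the sketch below. It is written in Python for illustration only. The function name tla_violations is hypothetical, and the set of exempt system fields is an assumption based on the unsuffixed names that appear in the examples on this page; the real list lives in BEDS itself.

```python
# Assumption: system fields that are exempt from the TLA suffix.
SYSTEM_FIELDS = {"_id", "db_token", "status", "created", "accessed"}


def tla_violations(extension: str, fields: dict, system_fields=SYSTEM_FIELDS) -> list[str]:
    """Return the field names that neither are system fields nor end with the TLA."""
    return sorted(
        name
        for name in fields
        if name not in system_fields and not name.endswith(extension)
    )
```

A template whose result list is non-empty would be rejected under the rule described above.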

The wbid Convention

The wbid (white-box identifier, 2 characters) is prepended to every collection name:

wbid = "ms"   + CollectionName "Users"   = collection "msUsers"
wbid = "ms"   + CollectionName "Logs"    = collection "msLogs"

The wbid namespaces your data within a shared MongoDB instance. If you ever run multiple BEDS deployments against the same MongoDB, different wbid values keep their collections from colliding.

REC Template Anatomy

See templates/example_rec.toml for the full self-documenting example. Key sections:

Identity

version        = 1                  # increment when schema changes
service        = "app_server"       # which node owns this collection
schema         = "rec"              # always "rec" for MongoDB
template_class = "Users"            # human-readable class name
collection     = "msUsers"          # MongoDB collection name
extension      = "_usr"             # TLA — appended to all field names
wh_template    = ""                 # warehousing destination template (empty = none)

Behavioural Flags

closed_class    = true      # internal only vs. externally accessible
hard_deletes    = false     # soft delete (status=inactive) vs. permanent removal
updates_enabled = true      # can records be updated after insert?
auditing        = "full"    # disabled | destructive | nondestructive | full
journaling      = true      # record journal entries for destructive ops?
record_history  = false     # maintain full version history per record?
record_locking  = false     # optimistic locking for concurrent writes?
query_timers    = true      # record query execution times as metric events?
primary_key     = "token"   # "token" (BEDS GUID) or "mongo_id" (native _id)
tokens          = true      # generate BEDS GUID token for every record?
cache_ttl       = 300       # seconds in in-process cache (0 = disabled)
is_internal     = false     # exclude from public REST API catalog?
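Several of these flags accept only a fixed set of values. A minimal Python sketch of a sanity check follows, assuming the allowed values are exactly those listed in the comments above. The function name flag_problems is hypothetical, and BEDS may enforce more or different constraints.

```python
AUDITING_LEVELS = {"disabled", "destructive", "nondestructive", "full"}
PRIMARY_KEYS = {"token", "mongo_id"}


def flag_problems(tpl: dict) -> list[str]:
    """Check the enumerated and numeric flags; return human-readable problems."""
    problems = []
    if tpl.get("auditing") not in AUDITING_LEVELS:
        problems.append(f"auditing must be one of {sorted(AUDITING_LEVELS)}")
    if tpl.get("primary_key") not in PRIMARY_KEYS:
        problems.append(f"primary_key must be one of {sorted(PRIMARY_KEYS)}")
    ttl = tpl.get("cache_ttl")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(ttl, int) or isinstance(ttl, bool) or ttl < 0:
        problems.append("cache_ttl must be a non-negative integer (0 disables caching)")
    return problems
```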

Field Types

[fields]
_id        = "object"    # MongoDB native ID — never returned to clients
db_token   = "string"    # BEDS GUID — the externally-exposed primary key
status     = "string"
created    = "integer"
accessed   = "integer"
email_usr  = "string"
score_usr  = "double"
count_usr  = "integer"
flag_usr   = "boolean"
tags_usr   = "array"
meta_usr   = "object"

Valid types: string, integer, double, boolean, object, array, date
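Checking a [fields] table against this list is straightforward. The Python sketch below is illustrative only; the helper name invalid_field_types is hypothetical.

```python
VALID_TYPES = {"string", "integer", "double", "boolean", "object", "array", "date"}


def invalid_field_types(fields: dict) -> dict:
    """Return the subset of fields whose declared type is not a valid BEDS type."""
    return {name: declared for name, declared in fields.items() if declared not in VALID_TYPES}
```

For example, a field declared as "float" would be reported, since the list above only allows "double" for floating-point values.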

The db_token Field

Every BEDS record has a db_token field — a BEDS-generated GUID that serves as the externally-exposed primary key. The MongoDB native _id is never returned to clients.

This is intentional security-through-obscurity. Every script-kiddie who finds an exposed API searches for id in the response payload. db_token is non-intuitive. It is also structurally meaningful: it is the BEDS-controlled identifier, distinct from the database's internal identifier, which means BEDS controls its format, generation, and uniqueness guarantees.

Protected Fields

Fields that clients cannot modify:

protected_fields = ["_id", "db_token", "event_guid", "created", "accessed"]

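Attempts to update protected fields are silently dropped. Under the registry validation rules above, every name listed here must also be declared in [fields]; event_guid is not shown in the Field Types excerpt.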
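One way to picture "silently dropped" is a filter applied to the client's update before it reaches the database. This Python sketch assumes the rest of the update still goes through and no error is raised; the function name strip_protected is hypothetical.

```python
def strip_protected(update: dict, protected_fields: list[str]) -> dict:
    """Return a copy of the update without any protected keys."""
    blocked = set(protected_fields)
    return {key: value for key, value in update.items() if key not in blocked}
```

With the protected_fields list above, an update of {"db_token": "x", "email_usr": "new@example.com"} would reduce to {"email_usr": "new@example.com"}.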

Index Declarations

BEDS validates all incoming queries against declared indexes at submission time. A query that cannot be satisfied by a declared index is rejected before execution — no full collection scans in production.

# which fields participate in any index
index_fields = ["db_token", "status", "created", "email_usr"]

# names of compound indexes; top-level keys must come before the first [table]
# header, otherwise TOML assigns them to that table
index_name_list = ["cIdx1Usr"]

# single-field indexes
[single_field_indexes]
db_token  = 1
status    = -1
created   = -1
email_usr = 1

# compound indexes (named; each must appear in index_name_list)
[compound_indexes]
cIdx1Usr = [["status", 1], ["created", -1]]

# unique constraints
[unique_indexes]
db_token  = 1
email_usr = 1

# TTL — automatic record expiry
[ttl_indexes]
# accessed = 86400   # expire records not accessed in 24 hours
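The exact rule BEDS uses to decide whether a query "can be satisfied by a declared index" is not spelled out here. As a rough Python sketch, assume a query is accepted when at least one of its filter fields is either a single-field index key or the leading field of a compound index. This is an assumption modelled on how MongoDB uses index prefixes, and the function name query_is_indexed is hypothetical.

```python
def query_is_indexed(query_fields, single_field_indexes: dict, compound_indexes: dict) -> bool:
    """True if any filter field can drive a declared index (assumed acceptance rule)."""
    leading = set(single_field_indexes)
    # Only the first field of a compound index can be used on its own.
    leading.update(parts[0][0] for parts in compound_indexes.values() if parts)
    return any(field in leading for field in query_fields)
```

Under this rule a filter on score_usr alone would be rejected for the example template, while a filter on status plus score_usr would be accepted because status is indexed.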

Cache Map

The cache map controls field name translation between internal schema names and external client-facing names. Schema column names are never exposed to clients.

[cache_map]
db_token  = "id"
status    = "status"
created   = "createdDate"
email_usr = "email"

A client sees { "id": "...", "email": "..." }. The internal db_token and email_usr names never appear in API responses.
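The translation can be sketched as a pair of dictionary rewrites. This Python example assumes that any field without a cache map entry is withheld from the client and ignored on input; the names to_client and from_client are hypothetical.

```python
def to_client(record: dict, cache_map: dict) -> dict:
    """Rename internal fields to client-facing names; unmapped fields are withheld."""
    return {cache_map[name]: value for name, value in record.items() if name in cache_map}


def from_client(payload: dict, cache_map: dict) -> dict:
    """Reverse translation for incoming payloads; unknown client keys are ignored."""
    reverse = {external: internal for internal, external in cache_map.items()}
    return {reverse[name]: value for name, value in payload.items() if name in reverse}
```

Because _id has no entry in the example cache map, this sketch never emits it, which is consistent with the rule that the native MongoDB ID is not returned to clients.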

Warehousing

[warehouse]
supported  = true
automated  = true
interval   = "M"    # D=daily, M=monthly, Q=quarterly, Y=yearly
delete     = "H"    # H=hard delete source after warehousing, S=soft delete

[warehouse.qualifier]
created = { operand = "null", operator = "lt", value = "" }   # caller supplies cutoff date
status  = { operand = "null", operator = "eq", value = "active" }
logical_op = "and"
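To make the qualifier concrete, the Python sketch below turns it into a MongoDB-style filter document. Several points are assumptions: an empty value means the caller supplies it at run time, the operator names map directly onto MongoDB's $lt and $eq, and the operand key is ignored because its meaning is not described here. The function name build_filter is hypothetical.

```python
# Only the two operators shown in the example are mapped; others are assumed analogous.
MONGO_OPERATORS = {"lt": "$lt", "eq": "$eq"}


def build_filter(qualifier: dict, supplied: dict) -> dict:
    """Build a MongoDB-style filter from a [warehouse.qualifier] table."""
    logical = "$" + qualifier.get("logical_op", "and")
    clauses = []
    for field, spec in qualifier.items():
        if field == "logical_op":
            continue
        # Assumption: an empty value is a placeholder the caller fills in.
        value = spec["value"] if spec["value"] != "" else supplied[field]
        clauses.append({field: {MONGO_OPERATORS[spec["operator"]]: value}})
    return {logical: clauses}
```

For the qualifier above and a caller-supplied cutoff, the result has the shape {"$and": [{"created": {"$lt": cutoff}}, {"status": {"$eq": "active"}}]}.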

The Logger Template

templates/mst_logger_rec.toml is the canonical example of an internal system collection. Key differences from a standard REC template:

  • updates_enabled = false — log records are immutable
  • auditing = "disabled" — never audit the logger (infinite recursion)
  • journaling = false — never journal the logger (same reason)
  • cache_ttl = 0 — no caching — log data is never stale-read
  • is_internal = true — excluded from the public REST API catalog
  • hard_deletes = true — log pruning permanently removes records
  • TTL index on created — automatic log expiry (30 days default)

These choices are documented in the template file with explanations. When in doubt about why a flag is set a certain way on the logger template, the template is the authoritative source.