docs: replace README screenshot asset (v1.7.8)

docs(todo): add model/preset preflight validation item (v1.7.7)
docs: add v1.6/v1.7 release notes and developer wiki (v1.7.6)
2026-04-28 09:14:54 -07:00 · 2026-04-28 09:08:36 -07:00 · 2026-04-28 08:53:54 -07:00 · 2026-04-28 08:49:19 -07:00 · 2026-04-28 08:44:22 -07:00 · 2026-04-28 08:31:01 -07:00
19 changed files with 4261 additions and 1464 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -3,3 +3,4 @@
 *.py-
 __pycache__/
 venv/
+readme.md-
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,74 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Running the App
+
+```bash
+# Development
+./venv/bin/uvicorn app:app --host 0.0.0.0 --port 8080 --reload
+
+# Production (via systemd)
+sudo systemctl restart jarvischat
+
+# Direct run
+./venv/bin/python app.py
+```
+
+## Dependencies
+
+```bash
+./venv/bin/pip install -r requirements.txt
+./venv/bin/pip install psutil jinja2 python-multipart  # required, but not listed in requirements.txt
+```
+
+## Architecture
+
+Single-file FastAPI backend (`app.py`) + single-template frontend (`templates/index.html`). No build step. SQLite database auto-created at `jarvischat.db` on first run.
+
+### Request Flow: `/api/chat`
+
+1. User message saved to DB → conversation created if new
+2. `build_system_prompt()` assembles: profile + FTS5 memory search results + preset prompt
+3. Streamed to Ollama (`/api/chat`, `stream: true`, `logprobs: true`) via SSE
+4. **Auto web search trigger**: if perplexity > 15.0 OR response matches `REFUSAL_PATTERNS`, re-queries Ollama with SearXNG results prepended to system prompt
+5. Final response saved to DB; SSE `done` event sent with perplexity + tokens/sec
+
+### Request Flow: `/api/search` (explicit search)
+
+Bypasses perplexity/refusal detection entirely — queries SearXNG directly then asks Ollama to summarize with results as system context.
+
+### Memory System
+
+FTS5 virtual table (`memories`) in SQLite. `search_memories()` uses BM25 ranking. `process_remember_command()` intercepts "remember that..." / "forget about..." before the message reaches Ollama and returns a confirmation string. Topic auto-detection via keyword matching in `detect_topic()`.
+
+### Key Constants (top of `app.py`)
+
+- `OLLAMA_BASE` — `http://localhost:11434`
+- `SEARXNG_BASE` — `http://localhost:8888`
+- `PERPLEXITY_THRESHOLD` — `15.0` (controls auto-search sensitivity)
+- `DEFAULT_MODEL` — `llama3.1:latest`
+
+### External Services
+
+- **Ollama** — required, must be running on port 11434
+- **SearXNG** — optional, port 8888; `GET /api/search/status` probes availability
+- **wttr.in** — weather shortcut in `query_searxng()`, bypasses SearXNG for weather queries
+- **rocm-smi** — AMD GPU stats via subprocess; gracefully degrades if not available
+
+### Database
+
+`get_db()` opens a new connection per request (no connection pool). `init_db()` runs at startup via the FastAPI `lifespan` handler. The `profile` table uses a singleton row (`id = 1`). Default settings are seeded but never overwritten by `init_db()`.
+
+### SSE Protocol
+
+All streaming endpoints yield `data: {json}\n\n`. Key event shapes:
+- `{token, conversation_id}` — streaming token
+- `{searching: true}` — web search triggered
+- `{search_results: N}` — N results retrieved
+- `{done: true, perplexity, tokens_per_sec, searched?}` — terminal event
+- `{error: "..."}` — error event
+
+### Deployment
+
+Runs as a systemd service under user `jarvischat`, with working directory `/opt/jarvischat`. Logs go to syslog (view with `journalctl -u jarvischat`).
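The auto web search trigger in step 4 of the `/api/chat` flow above can be sketched as follows. This is a minimal illustration, not the code in `app.py` (whose diff is not shown here): it assumes perplexity is computed as the exponential of the negative mean token log-probability, and the function names `perplexity` and `should_auto_search` are hypothetical. The three refusal phrases are taken from the v1.5.0 release notes; the real `REFUSAL_PATTERNS` list may differ.

```python
import math
import re

PERPLEXITY_THRESHOLD = 15.0  # value documented under "Key Constants"

# Illustrative subset only; the actual REFUSAL_PATTERNS in app.py is not shown in this diff.
REFUSAL_PATTERNS = [
    re.compile(r"as an ai model", re.IGNORECASE),
    re.compile(r"based on my training data", re.IGNORECASE),
    re.compile(r"i don't have the capability", re.IGNORECASE),
]


def perplexity(token_logprobs):
    """Return exp(-mean(logprob)) over the streamed tokens.

    Assumption: an empty list (no logprobs returned) is treated as 0.0,
    meaning "not uncertain", so it never triggers a search by itself.
    """
    if not token_logprobs:
        return 0.0
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def should_auto_search(response_text, token_logprobs):
    """True if perplexity exceeds the threshold OR the text looks like a refusal."""
    if perplexity(token_logprobs) > PERPLEXITY_THRESHOLD:
        return True
    return any(p.search(response_text) for p in REFUSAL_PATTERNS)
```

Under these assumptions, a confident answer with log-probabilities near zero stays below the threshold, while either a high-perplexity stream or a matching refusal phrase would cause the re-query with SearXNG results.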
--- a/app.py
+++ b/app.py
--- a/docs/copilot-context-loss-incident-2026-04-21.md
+++ b/docs/copilot-context-loss-incident-2026-04-21.md
@@ -0,0 +1,51 @@
+# Copilot Chat Incident Report: Context Loss After Project Context Change
+
+Date observed: 2026-04-21
+Reporter: Michael Shallop (Gramps)
+Environment: VS Code on Linux, GitHub Copilot Chat extension present
+
+## Summary
+Switching or loading a project in the VS Code project window reset the Copilot Chat conversation context. The reset discarded recently generated conclusions and plan data that were meant to be implemented immediately after the new project loaded.
+
+## Impact
+- Lost actionable conclusions from the active design/planning thread.
+- Interrupted workflow at a critical handoff point (planning -> implementation).
+- Forced reconstruction from memory instead of exact prior content.
+- Increased risk of omissions and rework.
+
+## Reproduction Steps
+1. Have an active Copilot Chat conversation containing planning/conclusion details.
+2. Load or switch project context in the current project window.
+3. Return to Copilot Chat and continue the thread.
+4. Observe that prior context is no longer available in-chat as expected.
+
+## Expected Behavior
+- Prior active conversation context should remain available across project switches.
+- If context must be dropped, the user should be prompted before the destructive operation.
+- A recovery path should be obvious and reliable.
+
+## Actual Behavior
+- Current chat context was effectively reset.
+- The previously concluded upgrade notes were not recoverable from active context.
+- Local transcript/debug artifacts did not provide the full prior thread needed.
+
+## Severity
+High (workflow-breaking for planning-heavy sessions)
+
+## User-visible Failure Mode
+The user lost conclusion data that was intended for immediate implementation once the new project loaded.
+
+## Suggested Fixes
+1. Preserve active chat state across workspace/project context changes by default.
+2. Show a blocking warning before any action that can drop active conversation state.
+3. Add one-click export/snapshot of current conversation before context switch.
+4. Improve transcript durability and discoverability for immediate recovery.
+5. Add explicit session continuity indicator so users can verify state retention.
+
+## Notes
+- This incident occurred in a real implementation workflow and caused direct productivity loss.
+- Regression tests should include workspace switch/load scenarios with active chat state.
+
+## Escalation Constraint
+- Current product constraints prevented the assistant from directly self-reporting this incident to the Copilot/VS Code dev team from within the chat runtime.
+- User feedback to include verbatim: "it is idiotic to keep you from self-reporting issues like this."
--- a/docs/images/screenshot.png
+++ b/docs/images/screenshot.png
--- a/docs/wiki/Developer-Architecture.md
+++ b/docs/wiki/Developer-Architecture.md
@@ -0,0 +1,165 @@
+# Developer Architecture Guide
+
+This document explains how JarvisChat is structured, why key guardrails exist, and what the test suite validates.
+
+## 1. System Overview
+
+JarvisChat is a single-process FastAPI service with a Jinja2 frontend and SQLite persistence.
+
+Primary files:
+
+- `app.py`: API, middleware, streaming/chat logic, auth, memory, skills, and DB bootstrap
+- `templates/index.html`: main WebUX, settings panels, auth flow, streaming UI handlers
+- `jarvischat.db`: runtime SQLite database created and migrated at startup
+
+Core runtime integrations:
+
+- Ollama for chat/model interaction
+- SearXNG for web search (optional)
+- wttr.in for weather shortcut queries
+- rocm-smi for GPU stats when available
+
+## 2. Request/Response Architecture
+
+### 2.1 Chat Pipeline (`/api/chat`)
+
+1. Validate session, role, origin, rate, and payload limits in middleware
+2. Persist user message and conversation metadata
+3. Build system prompt from enabled profile, memory context, and active skills metadata
+4. Stream model response over SSE token-by-token
+5. Evaluate uncertainty/refusal; if needed, trigger search augmentation and stream augmented result
+6. Persist final assistant message and emit terminal SSE event
+
+### 2.2 Explicit Search Pipeline (`/api/search`)
+
+1. Persist search-as-message into the target/new conversation
+2. Emit `searching` SSE event
+3. Pull web results from SearXNG
+4. Summarize with Ollama via SSE stream
+5. Persist summary and emit `done` event (plus raw results payload)
+
+### 2.3 Settings/Control Surface
+
+- Profile, presets, memory, conversation management, and settings APIs
+- Skills APIs for phase-1 registry and enable/disable controls
+- Auth/session APIs for guest/admin role handling and keepalive
+
+## 3. Data Model (SQLite)
+
+Key tables:
+
+- `conversations`: conversation headers and timestamps
+- `messages`: ordered chat history entries
+- `profile`: singleton row for injected profile prompt
+- `settings`: runtime toggles and selected defaults
+- `system_presets`: named reusable system prompts
+- `skills`: per-skill enabled state and timestamp
+- `memories` (FTS5 virtual table): searchable user memory facts
+
+Design notes:
+
+- Startup is idempotent: tables are created if missing and defaults seeded only when absent
+- No connection pool: each request opens a short-lived SQLite connection
+
+## 4. Security Implementations
+
+This section documents explicit controls currently in code.
+
+### 4.1 Auth Model
+
+- Guest session is default for conversational access
+- Admin unlock uses 4-digit PIN and creates admin-capable session
+- Admin required for write/destructive routes
+- Session heartbeat/timeout and explicit logout/revoke flow
+
+### 4.2 PIN and Session Hardening
+
+- Admin PIN hashed with PBKDF2-HMAC-SHA256 + salt
+- Failed PIN attempts tracked per client IP
+- Lockout window enforced after max failed attempts
+
+### 4.3 Browser and API Abuse Controls
+
+- Origin checks on state-changing requests
+- Rate limiting by endpoint category and identity (IP/session)
+- Payload size limits per route class
+- Settings key allowlist to block arbitrary configuration injection
+- IP allowlist/CIDR gate with optional trusted proxy forwarding mode
+
+### 4.4 Output and Error Safety
+
+- Search result URLs sanitized to `http`/`https` only
+- Client-safe error envelopes with incident key correlation
+- Full stack traces and diagnostic metadata logged server-side only
+
+### 4.5 Operational Auditability
+
+- Structured audit events for auth actions, admin operations, and guardrail denials
+- Incident logs include event type, key, path/method context, and runtime metadata
+
+## 5. Skills Framework (Phase 1)
+
+Goal: introduce a governed skills control plane inside the local JarvisChat sandbox.
+
+Current behavior:
+
+- Built-in skill registry defined server-side
+- Per-skill enable/disable persisted in DB
+- Global `skills_enabled` master toggle in settings
+- Active skills injected into system prompt with bounded text budget
+- API endpoints to list skills, list active skills, and toggle skill state
+- WebUX settings panel to control master/per-skill toggles
+
+Non-goals in phase 1:
+
+- No unrestricted shell/tool execution
+- No external connector execution (filesystem, Gmail, etc.)
+
+## 6. Testing Strategy and Validation Intent
+
+The test suite validates both behavior and guardrail assumptions.
+
+### 6.1 What We Test
+
+- Auth capability separation (guest vs admin)
+- URL sanitization safety for outbound links
+- Rate and payload guardrails
+- IP allowlist behavior
+- Safe error envelope behavior and SSE error leakage prevention
+- Streaming chat/search and memory command paths
+- Skills framework toggles and prompt-injection behavior
+
+### 6.2 Why These Tests Matter
+
+- Confirms security controls are active and regression-resistant
+- Ensures streaming UX protocol remains stable (`token`, `searching`, `done`, `error`)
+- Verifies policy intent: dangerous actions require admin capability
+- Validates new features preserve prior guarantees
+
+### 6.3 Internal Process Validation
+
+For substantive changes, Definition of Done includes:
+
+1. Implement code change
+2. Add/adjust tests proving behavior and guardrail intent
+3. Update README release notes for user-facing impact
+4. Update wiki architecture/security/testing docs for maintainers
+5. Validate with targeted test runs before merge/deploy
+
+This process is intentionally explicit so design decisions remain auditable over time.
+
+## 7. Deployment and Operations Notes
+
+- Primary deployment target: local/homelab systemd service
+- Required dependency: Ollama
+- Optional dependency: SearXNG
+- Recommended log review path: system journal for startup, guardrail denials, and incidents
+
+## 8. Contribution Guidance
+
+When adding a feature:
+
+1. Define security posture first (who can execute, what can fail, and failure mode)
+2. Implement smallest safe slice with clear limits
+3. Add tests that prove both happy path and guardrail path
+4. Update this wiki and README in the same change
--- a/docs/wiki/Home.md
+++ b/docs/wiki/Home.md
@@ -0,0 +1,23 @@
+# JarvisChat Developer Wiki
+
+This wiki is the developer-facing architecture and process reference for JarvisChat.
+
+## Audience
+
+- Contributors maintaining backend, frontend, security posture, and deployment process
+- Operators validating local or homelab deployments
+
+## Start Here
+
+- Architecture and components: [Developer-Architecture.md](Developer-Architecture.md)
+- Active implementation backlog: [current-wip.md](current-wip.md)
+
+## Scope and Support Model
+
+JarvisChat is designed for local and trusted-LAN operation.
+
+The code may technically function against external or commercial endpoints, but this deployment mode is not a supported target in this project.
+
+## Wiki Maintenance Rule
+
+When architecture, security behavior, or test policy changes, update this wiki in the same change set as code and tests.
--- a/docs/wiki/current-wip.md
+++ b/docs/wiki/current-wip.md
@@ -0,0 +1,84 @@
+# JarvisChat Current WiP Backlog
+
+Last updated: 2026-04-27
+Owner: Gramps + Copilot
+Scope: issues, bugs, security exposures, and feature enhancements.
+
+Total identified items: 27
+
+## Priority Definitions
+- P0: Critical risk or data-loss/security exposure; do first.
+- P1: High impact reliability/correctness work.
+- P2: Important feature/UX improvements.
+- P3: Nice-to-have polish.
+
+## Top 10 (Urgency Order)
+1. [P0][DONE] Add authentication/authorization for all write and admin endpoints.
+2. [P0][DONE] Add CSRF/origin protection for browser-initiated state-changing requests.
+3. [P0][DONE] Block unsafe URL schemes in rendered search-result links (e.g., javascript:).
+4. [P0][DONE] Add rate limiting and request body size limits for chat/search/profile APIs.
+5. [P1][DONE] Restrict settings updates to an allowlist of valid keys.
+6. [P1] Add pagination + hard caps on list endpoints (memories, conversations, message history).
+7. [P1][DONE] Stop returning raw exception text to clients; use safe error envelopes.
+8. [P1][DONE] Add automated tests for chat streaming, auto-search trigger, and memory command paths.
+9. [P2][DONE] Implement skills/tool-call framework (MCP-style) with per-skill enable controls.
+10. [P2] Implement heartbeat/check-in pipeline with scheduler + summary endpoint.
+
+## Item 1 Executive Summary (Scope + Security)
+
+- Status: Complete. Guest/admin capability split implemented with admin-only write enforcement, origin checks on state-changing requests, audit logging, and endpoint capability tests.
+
+- Decision: JarvisChat is local-first by design. Primary mode is same-host Ollama; optional mode allows RFC1918 LAN endpoints only.
+- Constraint: Public Internet AI endpoints are out of scope unless explicitly enabled in a future advanced mode.
+- Risk: Even on LAN, unauthenticated write/admin endpoints permit unauthorized data tampering and deletion.
+- Requirement: Add mandatory admin authentication for all POST/PUT/DELETE routes and destructive actions.
+- Authentication shape (scope-locked): two capability tiers only: guest (chat-only) and admin (4-digit PIN unlock).
+- Scope guardrail: Avoid full RBAC. Keep capability split minimal: conversational chat for guest, advanced/destructive actions for admin.
+- Definition of done:
+	1. Auth required on all state-changing endpoints.
+	2. Destructive actions require admin authorization.
+	3. Endpoint configuration rejects non-local/non-RFC1918 AI backends by default.
+	4. Strong rate limiting + lockout controls in place for PIN attempts.
+	5. Security events logged for failed and successful admin actions.
+
+## Full Backlog (Sorted by Priority)
+
+### P0 Critical
+1. Add auth for write/admin endpoints (`POST/PUT/DELETE` routes, mass delete, profile/settings changes).
+2. Add CSRF or strict origin checks for browser session protection.
+3. Validate/sanitize outbound href URLs before rendering in HTML (allow http/https only).
+4. Add per-IP rate limiting on `/api/chat`, `/api/search`, `/api/profile`, `/api/settings`.
+5. Enforce request size limits (message/profile text and JSON body) to prevent memory abuse.
+
+### P1 High
+6. Add settings key allowlist in `/api/settings` to prevent arbitrary key injection.
+7. Add pagination (`limit`, `offset`) with enforced maximums for list APIs.
+8. Add DB indexes and query hygiene for scalability (`messages.conversation_id`, timestamps).
+9. Replace raw exception leakage to clients with generic safe error messages + server-side logs.
+10. Add request/response timeout and retry policy consistency across external calls.
+11. Add endpoint-level audit logging for destructive operations.
+12. Add unit/integration tests for: remember/forget parsing, refusal detection, search fallback, SSE done/error shape.
+13. Add conversation title sanitization and length constraints.
+14. Ensure default preset semantics are correct (currently all seeded presets are marked default).
+15. Add preflight validation for required model/preset selection and block send with clear user guidance instead of timing out.
+
+### P2 Important Features
+16. Skills system: load markdown skill files with YAML frontmatter from skills directory.
+17. Skills registry API: list/enable/disable skills and expose active skills to UI.
+18. Inject active skill instructions into system prompt with bounded token budget.
+19. Tool execution guardrails: allowlist, confirmation mode, and execution logs.
+20. Heartbeat scheduler (cron/systemd timer) for daily check-ins.
+21. Heartbeat endpoint for generated briefings and anomaly summaries.
+22. Model info UI panel (description, updated date, best-use purpose).
+23. Default model selection improvements and persistence validation.
+24. Hidden model list support (exclude models from dropdown).
+25. Model update action from UI (trigger controlled model pull).
+
+### P3 Nice to Have
+26. Conversation search/filter and export tooling.
+27. Keyboard shortcuts, retry button, and source-link polish.
+
+## Maintenance Rules
+- Keep this file as the single source of truth.
+- Update item priority/status whenever work starts or completes.
+- Mirror the Top 10 summary in README and keep counts aligned.
--- a/readme.md
+++ b/readme.md
@@ -1,263 +1,355 @@
-# ⚡ JarvisChat
+# ⚡ JarvisChat v1.7.8
+
 ![screenshot](docs/images/screenshot.png)
-**A lightweight Ollama coding companion that runs on Python 3.13**

-![Version](https://img.shields.io/badge/version-1.3.1-blue)
-![Python](https://img.shields.io/badge/python-3.13-green)
-![License](https://img.shields.io/badge/license-MIT-orange)
+**A lightweight Ollama coding companion with persistent memory, web search, and real-time system monitoring.**

-JarvisChat is a single-file FastAPI application that provides a clean, responsive web interface for Ollama. It features persistent memory, automatic web search when the model is uncertain, and real-time token tracking.
+Built with FastAPI + SQLite + Jinja2. Runs on Python 3.13. No Docker required.
+
+Developer wiki: [docs/wiki/Home.md](docs/wiki/Home.md)
+
+Core architecture deep-dive: [docs/wiki/Developer-Architecture.md](docs/wiki/Developer-Architecture.md)
+
+## Security Scope Disclaimer
+
+JarvisChat is designed for local and home-lab use (same host or trusted LAN).
+
+JarvisChat may technically work with frontier or commercial AI endpoints, but the author does not recommend or support that usage.
+
+Supported deployments are contained local/home-lab environments.
+
+By default, API access is limited to loopback + private LAN CIDRs. You can override with `JARVISCHAT_ALLOWED_CIDRS` (comma-separated CIDRs) and optionally trust reverse-proxy forwarding with `JARVISCHAT_TRUST_X_FORWARDED_FOR=true`.
+
+If you deploy outside a trusted local subnet, your risk profile changes significantly and the default protections here may be insufficient.
+
+Use at your own risk. No warranty is provided for Internet-exposed deployments.
+
+## What's New in v1.7.x
+
+- **Security hardening suite completed** - request rate limits, payload caps, settings allowlist, safe error envelopes, and LAN CIDR gate controls
+- **Customer-safe incident handling** - client-facing errors include support-friendly incident keys while full traces remain in server logs
+- **Streaming and regression test expansion** - automated coverage for SSE chat/search paths, memory remember/forget command handling, and auth/guardrail behavior
+- **Skills framework (Phase 1)** - built-in local skill registry with per-skill enable controls, API endpoints, and bounded prompt injection
+- **Skills WebUX controls** - Settings modal now includes a master skills toggle and per-skill toggles for admin users
+
+## What's New in v1.6.x
+
+- **Guest/admin capability split** - guest chat by default with 4-digit admin PIN for advanced or destructive operations
+- **Session + lockout controls** - session lifecycle endpoints, heartbeat, logout/revoke behavior, failed PIN lockout protections, and auth audit events
+- **Browser request protections** - strict origin checks for state-changing requests and admin-only write enforcement
+- **Unsafe link protection** - outbound search links sanitized to allow only http/https absolute URLs
+- **Operational stability fixes** - safer first-boot PIN policy handling and memory-search tokenization fix for punctuation/FTS edge cases
+
+## What's New in v1.5.0
+
+- **Explicit Web Search Button** — 🔍 button next to SEND forces a web search, bypassing model uncertainty detection
+- **Orange Search Styling** — Search results, WEB badge, and search button share consistent orange color scheme
+- **Expanded Refusal Patterns** — Added "As an AI model", "based on my training data", "I don't have the capability"
+- **Code cleanup** — Removed unused `JSONResponse` import and dead `raw_results_md` variable
+- **Bug fixes** — Replaced bare `except` clauses with `except Exception`; corrected `add_memory()` return type to `int | None`; updated `TemplateResponse` call to Starlette's current API signature
+
+## What's New in v1.4.0
+
+- **FTS5 Memory System**: Say "remember that..." to store facts — they're automatically retrieved by relevance and injected into context
+- **Forget Command**: Say "forget about..." to remove memories
+- **Memory Toggle**: Enable/disable memory injection from topbar or settings
+- **Multi-file Structure**: Backend and frontend separated for easier maintenance

 ## Features

- **Persistent Profile/Memory** — Your context is injected into every conversation automatically
- **System Prompt Presets** — Switch between coding assistant, sysadmin, general, or custom modes
- **Streaming Chat** — Real-time token streaming with conversation history
- **Model Switching** — Hot-swap between all installed Ollama models
- **Web Search Integration** — SearXNG kicks in automatically when the model is uncertain (perplexity-based)
- **Weather Queries** — Direct wttr.in integration for weather questions
- **Token Thermometer** — Visual context usage bar with live updates as you type
- **Perplexity & Speed Badges** — See model confidence (PPL) and tokens/sec on each response
- **Copy-to-Clipboard** — One-click copy on all code blocks
- **Dark Theme** — Easy on the eyes for long coding sessions
+- **Persistent Memory** — SQLite FTS5 full-text search for fast, relevant memory retrieval
+- **Web Search** — SearXNG integration for automatic web lookups when the model is uncertain
+- **Explicit Search** — 🔍 button to force web search without waiting for model uncertainty
+- **Profile Injection** — Custom system prompt injected into every conversation
+- **System Presets** — Save and switch between different system prompts
+- **Real-time Stats** — CPU, RAM, GPU, VRAM monitoring in sidebar
+- **Token Thermometer** — Visual context window usage indicator
+- **Streaming Responses** — Server-sent events for real-time token display
+- **Conversation History** — SQLite-backed chat persistence with mass-delete option
+- **Model Switching** — Change Ollama models on the fly

-## Architecture
+## Current WiP (Prioritized)
+
+Canonical backlog: [docs/wiki/current-wip.md](docs/wiki/current-wip.md)
+
+Scope boundary: local-first (same-host Ollama), optional RFC1918 LAN endpoints, no public Internet AI endpoints by default.
+
+Total identified items: 27
+
+Top 10 (brief):
+
+1. P0 [DONE]: Add auth for write/admin endpoints
+2. P0 [DONE]: Add CSRF/origin protection for state-changing requests
+3. P0 [DONE]: Block unsafe URL schemes in rendered links
+4. P0 [DONE]: Add rate limiting and request size limits
+5. P1 [DONE]: Restrict `/api/settings` updates to allowlisted keys
+6. P1: Add pagination + hard caps for list APIs
+7. P1 [DONE]: Replace raw exception leakage with safe client errors
+8. P1 [DONE]: Add automated tests for streaming/search/memory paths
+9. P2 [DONE]: Implement MCP-style skills/tool-call framework
+10. P2: Implement heartbeat/check-in scheduler + summary endpoint
+
+Item 1 executive summary: keep guest mode for conversational chat, require 4-digit admin PIN for advanced/destructive actions, and enforce local/LAN-only backend policy by default.
+
+Implementation status: complete (guest session by default + admin unlock + admin-only write enforcement + origin checks + safe-link sanitization + audit logging + rate/payload guardrails + capability tests).
+
+## TODO
+
+1. ~~Verify SearXNG and Docker services persist across reboots~~
+2. Conversation search/filter by keyword
+3. Export conversation to markdown/text
+4. Keyboard shortcuts (Ctrl+N new chat, Ctrl+Enter send)
+5. Retry button on assistant messages
+6. Source links — clickable links when search used
+7. Allow conversation renaming
+8. Multiple profiles — coding/sysadmin/general
+9. Auto-generate conversation tags (client-side KWIC, top 5, filterable badges)
+10. Image input support — pull vision model, file input/drag-drop, base64 encode, pass `images` array to Ollama `/api/chat`
+11. Split-screen option for btop display
+12. Skills as markdown files — `/opt/jarvischat/skills/`, YAML frontmatter + instructions, injected into context for tool calls
+13. Heartbeats / proactive check-ins — cron + endpoint for daily briefings, HA anomaly alerts
+14. Model info button — (i) icon next to Model dropdown, shows div with model description, last updated date, best-use purpose
+15. Set default model — toggle any model as the default selection
+16. Hide/remove model from list — exclude models from dropdown
+17. Update model function — trigger `ollama pull` for selected model from UI
+18. Add mouseover tooltip to SEND button
+19. Add preflight validation for required model/preset selection and show a clear warning before send to prevent avoidable timeout loops
+
+## File Structure

 ```
-Browser ◄──► app.py (FastAPI) ◄──► Ollama (LLM)
-                    │
-                    ▼ (when uncertain)
-               SearXNG (web search)
+/opt/jarvischat/
+├── app.py              # FastAPI backend
+├── jarvischat.db       # SQLite database (auto-created)
+├── static/
+│   └── logo.png        # Logo image (optional)
+└── templates/
+    └── index.html      # Frontend
 ```

-JarvisChat acts as middleware between your browser and Ollama. When the model's perplexity exceeds a threshold (default 15.0) or it refuses to answer, JarvisChat automatically queries SearXNG, injects the results, and re-prompts the model.
-
-**This is NOT training** — SearXNG is only used at runtime as a fallback for uncertain responses.
-
 ## Requirements

 - Python 3.11+ (tested on 3.13)
- Ollama running locally (default: `localhost:11434`)
- SearXNG (optional, for web search — default: `localhost:8888`)
- ROCm (optional, for AMD GPU stats — `rocm-smi` must be in PATH)
+- Ollama running locally or on network
+- SearXNG (optional, for web search)

 ## Installation

-```bash
-# Clone or download app.py
-git clone https://github.com/llamachileshop-code/313_webui.git
-cd 313_webui
+### Fresh Install

-# Create virtual environment (recommended)
+```bash
+# Create directory and venv
+sudo mkdir -p /opt/jarvischat
+sudo chown $USER:$USER /opt/jarvischat
+cd /opt/jarvischat
 python3 -m venv venv
-source venv/bin/activate

 # Install dependencies
-pip install fastapi httpx uvicorn psutil
+./venv/bin/pip install fastapi uvicorn httpx psutil jinja2 python-multipart

-# Run
-python app.py
-# or
-uvicorn app:app --host 0.0.0.0 --port 8080
+# Set admin PIN before first startup (4 digits)
+export JARVISCHAT_ADMIN_PIN=4827
+
+# Create subdirectories
+mkdir -p templates static
+
+# Copy files
+# (copy app.py to /opt/jarvischat/)
+# (copy index.html to /opt/jarvischat/templates/)
+# (copy logo.png to /opt/jarvischat/static/ — optional)
 ```

-Open `http://localhost:8080` in your browser.
+WARNING: Do not use `1234` as your admin PIN unless you accept weak local security.
+
+NOTE: First boot now requires `JARVISCHAT_ADMIN_PIN` unless you explicitly opt into insecure fallback with `JARVISCHAT_ALLOW_DEFAULT_PIN=true`.
+
+### Upgrading from v1.4.x

-**Note:** If running as a systemd service with a venv, install dependencies using the venv pip directly:
 ```bash
-/opt/jarvischat/venv/bin/pip install fastapi httpx uvicorn psutil
+cd /opt/jarvischat
+
+# Backup
+cp app.py app.py.bak
+cp templates/index.html templates/index.html.bak
+
+# Copy new files
+# (copy app.py, replacing old version)
+# (copy index.html to templates/)
+
+# Restart
+sudo systemctl restart jarvischat
 ```

-## Running as a Service
-
-**Important:** Although JarvisChat is a single-file Python application, it's designed to run as a persistent service alongside Ollama — not as a one-off script. Both services should start on boot.
-
-### systemd Service (recommended)
+## Systemd Service

 Create `/etc/systemd/system/jarvischat.service`:

 ```ini
 [Unit]
-Description=JarvisChat - Ollama Web UI
-After=network.target ollama.service
-Wants=ollama.service
+Description=JarvisChat - Local Ollama Web Interface
+After=network.target

 [Service]
 Type=simple
-User=your-username
-WorkingDirectory=/path/to/313_webui
-ExecStart=/usr/bin/python3 app.py
-Restart=on-failure
+User=jarvischat
+Group=jarvischat
+WorkingDirectory=/opt/jarvischat
+ExecStart=/opt/jarvischat/venv/bin/uvicorn app:app --host 0.0.0.0 --port 8080
+Restart=always
 RestartSec=5

 [Install]
 WantedBy=multi-user.target
 ```

-Then enable and start:
-
 ```bash
 sudo systemctl daemon-reload
 sudo systemctl enable jarvischat
 sudo systemctl start jarvischat
 ```

-### Verify Both Services
+## Memory Commands

-```bash
-# Check Ollama
-systemctl status ollama
+In chat, natural language triggers memory operations:

-# Check JarvisChat
-systemctl status jarvischat
+| You say | What happens |
+|---------|--------------|
+| "remember that I prefer Rust over Go" | Stores as `preference` |
+| "remember that JarvisChat runs on port 8080" | Stores as `infrastructure` |
+| "note that the deadline is Friday" | Stores as `general` |
+| "forget about the deadline" | Removes matching memories |

-# View JarvisChat logs
-journalctl -t jarvischat -f
-```
+Memories are automatically searched based on your message content and injected into the system prompt when relevant.

-## Configuration
+### Memory Topics

-Edit these constants at the top of `app.py`:
-
-```python
-VERSION = "1.3.1"
-OLLAMA_BASE = "http://localhost:11434"
-SEARXNG_BASE = "http://localhost:8888"
-DEFAULT_MODEL = "deepseek-coder:6.7b"
-PERPLEXITY_THRESHOLD = 15.0  # Higher = less likely to trigger search
-```
-
-## Database
-
-JarvisChat uses SQLite (`jarvischat.db` in the same directory as `app.py`):
-
-| Table | Purpose |
-|-------|---------|
-| conversations | Chat sessions with model and timestamps |
-| messages | Individual messages with role and content |
-| system_presets | Saved system prompt presets |
-| profile | Your persistent memory/context |
-| settings | App settings (search/profile toggles, default model) |
-
-## Logging
-
-JarvisChat logs to syslog via journald:
-
-```bash
-# Follow live logs
-journalctl -t jarvischat -f
-
-# View last 100 entries
-journalctl -t jarvischat -n 100
-```
-
-## Token Thermometer
-
-The vertical bar next to the input shows your context usage in real-time:
-
- **Green** — Plenty of room
- **Yellow** — 70%+ used
- **Red** — 90%+ used (approaching limit)
-
-The count includes: profile + preset + conversation history + current input. Context size is fetched from Ollama when you switch models.
-
-## Search Flow
-
-1. User sends message → Ollama streams response with logprobs
-2. JarvisChat calculates perplexity from logprobs
-3. If perplexity > 15.0 OR refusal patterns detected:
-   - Yield `{searching: True}` to show spinner
-   - Query SearXNG (or wttr.in for weather)
-   - Inject results into context
-   - Re-prompt Ollama
-4. If model still refuses, format raw search results directly
-5. Clean hedging phrases from response
-6. Yield final response with PPL and t/s badges
+Memories are auto-categorized:
+- `preference` — likes, dislikes, choices
+- `project` — active work, repos, tasks
+- `infrastructure` — servers, services, configs
+- `personal` — name, location, background
+- `general` — everything else

 ## API Endpoints

-| Endpoint | Method | Description |
-|----------|--------|-------------|
-| `/` | GET | Web UI |
-| `/api/models` | GET | List Ollama models |
-| `/api/ps` | GET | Running models |
-| `/api/show` | POST | Model info (context size) |
-| `/api/stats` | GET | System stats (CPU, memory, GPU, VRAM) |
-| `/api/chat` | POST | Stream chat (SSE) |
-| `/api/conversations` | GET/DELETE | List/delete all conversations |
-| `/api/conversations/{id}` | GET/DELETE | Get/delete conversation |
-| `/api/profile` | GET/PUT | Get/update profile |
-| `/api/presets` | GET/POST | List/create presets |
-| `/api/presets/{id}` | PUT/DELETE | Update/delete preset |
-| `/api/settings` | GET/PUT | App settings |
-| `/api/search/status` | GET | SearXNG availability |
+### Memory

-## Screenshots
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| GET | `/api/memories` | List all memories |
+| POST | `/api/memories` | Add memory `{"fact": "...", "topic": "general"}` |
+| DELETE | `/api/memories/{rowid}` | Delete memory by ID |
+| GET | `/api/memories/search?q=term` | Search memories |
+| GET | `/api/memories/stats` | Get counts by topic |

-*(Add your own screenshot here)*
+### Chat & Models

-## TODO
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| GET | `/api/models` | List available Ollama models |
+| POST | `/api/chat` | Send message (streaming SSE) |
+| POST | `/api/search` | Explicit web search (streaming SSE) |
+| POST | `/api/show` | Get model info (context size) |
+| GET | `/api/ps` | Get running models |

-### Active
+### Settings & Profile

-1. ~~**Mass-delete conversation history**~~ ✓ (v1.3.0)
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| GET | `/api/profile` | Get profile content |
+| PUT | `/api/profile` | Update profile |
+| GET | `/api/profile/default` | Get default profile |
+| GET | `/api/settings` | Get settings |
+| PUT | `/api/settings` | Update settings |

-2. **Verify SearXNG and Docker services persist across reboots**
-   - Expand refusal patterns: "As an AI model", "based on my training data", "I don't have the capability"
+### Conversations

-3. **Input trigger: `search+` prefix**
-   - Strip prefix, query SearXNG directly, Ollama summarizes
-   - Raw results in expandable div (not tooltip)
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| GET | `/api/conversations` | List conversations |
+| GET | `/api/conversations/{id}` | Get conversation with messages |
+| DELETE | `/api/conversations/{id}` | Delete conversation |
+| DELETE | `/api/conversations` | Delete ALL conversations |

-4. **Add `profile.example.md`**
-   - Recommended default profile with anti-bullshit rules (no "As an AI", no OpenAI mentions)
+### Presets

-### Backlog
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| GET | `/api/presets` | List presets |
+| POST | `/api/presets` | Create preset |
+| PUT | `/api/presets/{id}` | Update preset |
+| DELETE | `/api/presets/{id}` | Delete preset |

-5. Conversation search/filter by keyword
-6. Export conversation to markdown/text
-7. Keyboard shortcuts (Ctrl+N new chat, Ctrl+Enter send)
-8. ~~Token count estimate before sending~~ ✓ (v1.2.9)
-9. Model info display — context length, VRAM usage from Ollama `/api/ps`
-10. Retry button on assistant messages
-11. Source links — clickable links when search used
-12. Allow conversation renaming
-13. Multiple profiles — coding/sysadmin/general
-14. Auto-generate conversation tags (client-side KWIC, top 5, filterable badges)
-15. **Image input support**
-    - Pull vision model (llava, llama3.2-vision, etc.)
-    - Frontend: file input / drag-drop, base64 encode
-    - Backend: pass `images` array to Ollama `/api/chat`
+### System

-## Version History
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| GET | `/api/stats` | CPU, RAM, GPU, VRAM stats |
+| GET | `/api/search/status` | SearXNG availability |

-| Version | Changes |
-|---------|---------|
-| 1.3.1 | System stats panel (CPU, memory, GPU, VRAM) in sidebar |
-| 1.3.0 | Delete all conversations button |
-| 1.2.9 | Token thermometer with live context tracking |
-| 1.2.8 | Logo in sidebar, llama emoji tagline |
-| 1.2.7 | Tokens per second (t/s) badge on responses |
-| 1.2.6 | wttr.in weather integration, improved search extraction |
-| 1.2.5 | SearXNG infoboxes/answers, smarter query building |
-| 1.2.4 | Perplexity badges, hedging cleanup |
-| 1.2.3 | SearXNG integration with perplexity-based triggering |
-| 1.2.0 | System prompt presets, settings persistence |
-| 1.1.0 | Profile memory, model switching |
-| 1.0.0 | Initial release |
+## Configuration
+
+Settings are stored in the `settings` table and include:
+
+- `profile_enabled` — Inject profile into chats (true/false)
+- `search_enabled` — Auto web search (true/false)
+- `memory_enabled` — Memory injection (true/false)
+- `default_model` — Default Ollama model
+- `searxng_url` — SearXNG instance URL (default: `http://localhost:8888`)
+
+## Testing Memory
+
+```bash
+# Add a memory via API
+curl -X POST http://jarvis:8080/api/memories \
+  -H "Content-Type: application/json" \
+  -d '{"fact": "User prefers native installs over Docker", "topic": "preference"}'
+
+# Search memories
+curl "http://jarvis:8080/api/memories/search?q=docker"
+
+# List all memories
+curl http://jarvis:8080/api/memories
+
+# Get stats
+curl http://jarvis:8080/api/memories/stats
+```
+
+Or in chat:
+1. Say "remember that I hate YAML"
+2. Later ask "what markup languages should I avoid?"
+3. JarvisChat will inject the YAML preference into context
+
+## Troubleshooting
+
+### Service won't start
+
+Check logs:
+```bash
+journalctl -u jarvischat -n 50 --no-pager
+```
+
+Common issues:
+- Missing `jinja2`: `./venv/bin/pip install jinja2`
+- Missing `templates/` directory
+- Wrong permissions on `/opt/jarvischat`
+
+### Memory not working
+
+1. Check memory is enabled (🧠 MEM ON in topbar)
+2. Verify memories exist: `curl http://jarvis:8080/api/memories`
+3. Check FTS5 table: `sqlite3 jarvischat.db "SELECT * FROM memories_fts;"`
+
+### Web search not working
+
+1. Verify SearXNG is running: `curl "http://localhost:8888/search?q=test&format=json"`
+2. Check search status: `curl http://jarvis:8080/api/search/status`
+3. Ensure JSON format is enabled in SearXNG settings

 ## License

 MIT

---
+## Repository

-## A Note from Gramps
-
-I named my AI machine "jarvis" after the AI assistant in *Iron Man* (2008) — because it's an awesome name. When I started building a local coding companion to talk to it, "JarvisChat" just made sense.
-
-This project is in active development. Eventually it'll get packaged up as a Docker thing, but for now while I'm iterating fast, a single-file Python service does the job.
-
---
-
-*Built with 🦙 by Gramps at the Llama Chile Shop*
+Gitea: `ssh://gitea@llgit.llamachile.tube:1319/gramps/jarvisChat.git`
--- a/static/logo.jpg
+++ b/static/logo.jpg
--- a/templates/index.html
+++ b/templates/index.html
--- a/tests/test_auth_capabilities.py
+++ b/tests/test_auth_capabilities.py
@@ -0,0 +1,78 @@
+import os
+from pathlib import Path
+
+from fastapi.testclient import TestClient
+
+import app as app_module
+
+
+def make_client(tmp_path: Path) -> TestClient:
+    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
+    app_module.DB_PATH = tmp_path / "jarvischat-test.db"
+    app_module.SESSIONS.clear()
+    app_module.PIN_ATTEMPTS.clear()
+    app_module.init_db()
+    return TestClient(app_module.app)
+
+
+def test_guest_read_only_admin_write_blocked(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        guest = client.post("/api/auth/guest", headers={"Origin": "http://testserver"})
+        assert guest.status_code == 200
+        sid = guest.json()["session_id"]
+        headers = {"X-Session-ID": sid}
+
+        read_resp = client.get("/api/memories", headers=headers)
+        assert read_resp.status_code == 200
+
+        write_resp = client.post(
+            "/api/memories",
+            json={"fact": "guest write should fail", "topic": "general"},
+            headers={**headers, "Origin": "http://testserver"},
+        )
+        assert write_resp.status_code == 403
+
+
+def test_admin_can_write_and_delete_memory(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        login = client.post(
+            "/api/auth/login",
+            json={"pin": "1234"},
+            headers={"Origin": "http://testserver"},
+        )
+        assert login.status_code == 200
+        sid = login.json()["session_id"]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+
+        create_resp = client.post(
+            "/api/memories",
+            json={"fact": "admin write ok", "topic": "general"},
+            headers=headers,
+        )
+        assert create_resp.status_code == 200
+        rowid = create_resp.json()["rowid"]
+
+        delete_resp = client.delete(f"/api/memories/{rowid}", headers=headers)
+        assert delete_resp.status_code == 200
+
+
+def test_origin_check_blocks_cross_site_writes(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        denied = client.post("/api/auth/guest", headers={"Origin": "http://evil.example"})
+        assert denied.status_code == 403
+
+        allowed = client.post("/api/auth/guest", headers={"Origin": "http://testserver"})
+        assert allowed.status_code == 200
+
+
+def test_logout_revokes_session(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        guest = client.post("/api/auth/guest", headers={"Origin": "http://testserver"})
+        sid = guest.json()["session_id"]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+
+        logout = client.post("/api/auth/logout", headers=headers)
+        assert logout.status_code == 200
+
+        after = client.get("/api/memories", headers={"X-Session-ID": sid})
+        assert after.status_code == 401
--- a/tests/test_chat_streaming_and_memory_paths.py
+++ b/tests/test_chat_streaming_and_memory_paths.py
@@ -0,0 +1,188 @@
+import json
+import os
+from pathlib import Path
+
+from fastapi.testclient import TestClient
+
+import app as app_module
+
+
+def make_client(tmp_path: Path) -> TestClient:
+    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
+    app_module.DB_PATH = tmp_path / "jarvischat-streaming.db"
+    app_module.SESSIONS.clear()
+    app_module.PIN_ATTEMPTS.clear()
+    app_module.RATE_EVENTS.clear()
+    app_module.init_db()
+    return TestClient(app_module.app, raise_server_exceptions=False)
+
+
+def parse_sse_payloads(body: str) -> list[dict]:
+    payloads: list[dict] = []
+    for chunk in body.split("\n\n"):
+        chunk = chunk.strip()
+        if not chunk.startswith("data: "):
+            continue
+        raw = chunk[len("data: ") :]
+        payloads.append(json.loads(raw))
+    return payloads
+
+
+class _MockStreamResponse:
+    def __init__(self, lines: list[str]):
+        self._lines = lines
+
+    async def __aenter__(self):
+        return self
+
+    async def __aexit__(self, exc_type, exc, tb):
+        return False
+
+    async def aiter_lines(self):
+        for line in self._lines:
+            yield line
+
+
+def _stream_json_lines(events: list[dict]) -> list[str]:
+    return [json.dumps(event) for event in events]
+
+
+def test_chat_stream_emits_tokens_and_done(tmp_path: Path, monkeypatch):
+    with make_client(tmp_path) as client:
+        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
+            "session_id"
+        ]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+
+        events = _stream_json_lines(
+            [
+                {"message": {"content": "Hel"}, "logprobs": [{"logprob": -0.01}]},
+                {"message": {"content": "lo"}, "logprobs": [{"logprob": -0.01}]},
+                {"done": True, "eval_count": 2, "eval_duration": 1000000000},
+            ]
+        )
+
+        def stream_stub(self, method, url, json=None, timeout=None):
+            return _MockStreamResponse(events)
+
+        monkeypatch.setattr(app_module.httpx.AsyncClient, "stream", stream_stub)
+
+        resp = client.post(
+            "/api/chat",
+            json={"message": "hello", "model": app_module.DEFAULT_MODEL},
+            headers=headers,
+        )
+        assert resp.status_code == 200
+        payloads = parse_sse_payloads(resp.text)
+
+        token_text = "".join(p.get("token", "") for p in payloads if "token" in p)
+        assert token_text == "Hello"
+        done_events = [p for p in payloads if p.get("done")]
+        assert done_events
+        assert "searched" not in done_events[-1]
+
+
+def test_chat_auto_search_trigger_emits_search_events(tmp_path: Path, monkeypatch):
+    with make_client(tmp_path) as client:
+        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
+            "session_id"
+        ]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+
+        first_stream = _stream_json_lines(
+            [
+                {
+                    "message": {"content": "I am uncertain."},
+                    "logprobs": [{"logprob": -5.0}],
+                },
+                {"done": True, "eval_count": 2, "eval_duration": 1000000000},
+            ]
+        )
+        second_stream = _stream_json_lines(
+            [
+                {"message": {"content": "Based on current data: 42."}},
+                {"done": True},
+            ]
+        )
+        stream_batches = [first_stream, second_stream]
+
+        def stream_stub(self, method, url, json=None, timeout=None):
+            return _MockStreamResponse(stream_batches.pop(0))
+
+        async def search_stub(query: str, max_results: int = 5):
+            return [
+                {
+                    "title": "Answer",
+                    "url": "https://example.com",
+                    "content": "The value is 42.",
+                }
+            ]
+
+        monkeypatch.setattr(app_module.httpx.AsyncClient, "stream", stream_stub)
+        monkeypatch.setattr(app_module, "query_searxng", search_stub)
+
+        resp = client.post(
+            "/api/chat",
+            json={"message": "what is the latest value", "model": app_module.DEFAULT_MODEL},
+            headers=headers,
+        )
+        assert resp.status_code == 200
+        payloads = parse_sse_payloads(resp.text)
+
+        assert any(p.get("searching") is True for p in payloads)
+        assert any("search_results" in p for p in payloads)
+        assert any(p.get("augmented") is True for p in payloads)
+        done_events = [p for p in payloads if p.get("done")]
+        assert done_events and done_events[-1].get("searched") is True
+
+
+def test_memory_command_paths_remember_and_forget(tmp_path: Path, monkeypatch):
+    with make_client(tmp_path) as client:
+        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
+            "session_id"
+        ]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+
+        base_stream = _stream_json_lines(
+            [
+                {"message": {"content": "ok"}, "logprobs": [{"logprob": -0.01}]},
+                {"done": True, "eval_count": 1, "eval_duration": 1000000000},
+            ]
+        )
+
+        def stream_stub(self, method, url, json=None, timeout=None):
+            return _MockStreamResponse(base_stream)
+
+        monkeypatch.setattr(app_module.httpx.AsyncClient, "stream", stream_stub)
+
+        remember_resp = client.post(
+            "/api/chat",
+            json={
+                "message": "remember that my favorite language is rust",
+                "model": app_module.DEFAULT_MODEL,
+            },
+            headers=headers,
+        )
+        assert remember_resp.status_code == 200
+        remember_events = parse_sse_payloads(remember_resp.text)
+        assert any("Remembered" in p.get("token", "") for p in remember_events)
+
+        memories_after_add = client.get("/api/memories", headers={"X-Session-ID": sid})
+        assert memories_after_add.status_code == 200
+        assert memories_after_add.json().get("count", 0) >= 1
+
+        forget_resp = client.post(
+            "/api/chat",
+            json={
+                "message": "forget about my favorite language",
+                "model": app_module.DEFAULT_MODEL,
+            },
+            headers=headers,
+        )
+        assert forget_resp.status_code == 200
+        forget_events = parse_sse_payloads(forget_resp.text)
+        assert any("Forgot" in p.get("token", "") for p in forget_events)
+
+        memories_after_forget = client.get("/api/memories", headers={"X-Session-ID": sid})
+        assert memories_after_forget.status_code == 200
+        assert memories_after_forget.json().get("count", 0) == 0
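Review note: the two streaming tests above rely on the perplexity trigger described in CLAUDE.md (auto search when perplexity > 15.0). The calculation in `app.py` is not part of this diff, so the following is only a minimal sketch that assumes the standard definition, the exponential of the negative mean token log-probability. The names `perplexity` and `should_search` are hypothetical, and returning 1.0 for an empty list is an assumed convention.

```python
import math

# Threshold quoted in CLAUDE.md; the constant name here is an assumption.
PERPLEXITY_THRESHOLD = 15.0


def perplexity(logprobs: list[float]) -> float:
    """Return exp(-mean(logprobs)); 1.0 for an empty list (assumed convention)."""
    if not logprobs:
        return 1.0
    return math.exp(-sum(logprobs) / len(logprobs))


def should_search(logprobs: list[float]) -> bool:
    """True when the response looks uncertain enough to justify a web search."""
    return perplexity(logprobs) > PERPLEXITY_THRESHOLD
```

Under this definition the mocked logprob of -5.0 in `test_chat_auto_search_trigger_emits_search_events` gives a perplexity of about 148, well above the threshold, while the -0.01 values in the other tests give about 1.01, so no search is expected there.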
--- a/tests/test_error_envelopes.py
+++ b/tests/test_error_envelopes.py
@@ -0,0 +1,72 @@
+import os
+from pathlib import Path
+
+from fastapi.testclient import TestClient
+
+import app as app_module
+
+
+def make_client(tmp_path: Path) -> TestClient:
+    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
+    app_module.DB_PATH = tmp_path / "jarvischat-errors.db"
+    app_module.SESSIONS.clear()
+    app_module.PIN_ATTEMPTS.clear()
+    app_module.RATE_EVENTS.clear()
+    app_module.init_db()
+    return TestClient(app_module.app, raise_server_exceptions=False)
+
+
+def test_unhandled_api_exception_returns_friendly_error_with_incident_key(
+    tmp_path: Path, monkeypatch
+):
+    with make_client(tmp_path) as client:
+        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
+            "session_id"
+        ]
+        headers = {"X-Session-ID": sid}
+
+        def boom(_topic=None):
+            raise RuntimeError("super secret db internals")
+
+        monkeypatch.setattr(app_module, "get_all_memories", boom)
+
+        resp = client.get("/api/memories", headers=headers)
+        assert resp.status_code == 500
+        payload = resp.json()
+        assert payload.get("error_key", "").startswith("INC-")
+        assert "support lookup" in payload.get("detail", "").lower()
+        assert "super secret db internals" not in resp.text
+
+
+def test_chat_stream_error_hides_internal_exception_and_emits_incident_key(
+    tmp_path: Path, monkeypatch
+):
+    with make_client(tmp_path) as client:
+        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
+            "session_id"
+        ]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+
+        class BrokenStreamContext:
+            async def __aenter__(self):
+                raise RuntimeError("ultra secret model transport failure")
+
+            async def __aexit__(self, exc_type, exc, tb):
+                return False
+
+        def broken_stream(*args, **kwargs):
+            return BrokenStreamContext()
+
+        monkeypatch.setattr(app_module.httpx.AsyncClient, "stream", broken_stream)
+
+        resp = client.post(
+            "/api/chat",
+            json={"message": "hello", "model": app_module.DEFAULT_MODEL},
+            headers=headers,
+        )
+
+        assert resp.status_code == 200
+        body = resp.text
+        assert "ultra secret model transport failure" not in body
+        assert "error_key" in body
+        assert "support lookup" in body.lower()
--- a/tests/test_ip_allowlist.py
+++ b/tests/test_ip_allowlist.py
@@ -0,0 +1,50 @@
+import os
+from pathlib import Path
+
+from fastapi.testclient import TestClient
+
+import app as app_module
+
+
+def make_client(tmp_path: Path) -> TestClient:
+    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
+    app_module.DB_PATH = tmp_path / "jarvischat-ip.db"
+    app_module.SESSIONS.clear()
+    app_module.PIN_ATTEMPTS.clear()
+    app_module.RATE_EVENTS.clear()
+    app_module.init_db()
+    return TestClient(app_module.app)
+
+
+def test_ip_helper_allows_local_defaults():
+    assert app_module.is_ip_allowed("127.0.0.1")
+    assert app_module.is_ip_allowed("192.168.1.10")
+    assert app_module.is_ip_allowed("10.0.0.42")
+    assert app_module.is_ip_allowed("172.16.1.2")
+    assert app_module.is_ip_allowed("testclient")
+
+
+def test_ip_helper_blocks_public_ip():
+    assert not app_module.is_ip_allowed("8.8.8.8")
+
+
+def test_middleware_blocks_disallowed_ip(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        original_get_client_ip = app_module.get_client_ip
+        try:
+            app_module.get_client_ip = lambda _req: "8.8.8.8"
+            resp = client.post("/api/auth/guest")
+            assert resp.status_code == 403
+        finally:
+            app_module.get_client_ip = original_get_client_ip
+
+
+def test_middleware_allows_local_ip(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        original_get_client_ip = app_module.get_client_ip
+        try:
+            app_module.get_client_ip = lambda _req: "192.168.50.109"
+            resp = client.post("/api/auth/guest")
+            assert resp.status_code == 200
+        finally:
+            app_module.get_client_ip = original_get_client_ip
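Review note: the helper tests above pin down what `is_ip_allowed` must accept and reject, but its body is not shown in this diff. A minimal sketch that would satisfy those assertions, assuming the allowlist is loopback plus the RFC 1918 private ranges and that the literal host name `testclient` (the client host Starlette's `TestClient` reports) is special-cased. `ALLOWED_NETWORKS` and `ALLOWED_HOSTNAMES` are hypothetical names.

```python
import ipaddress

# Assumed allowlist: loopback plus the RFC 1918 private ranges.
ALLOWED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
# Non-IP client identifiers accepted as-is (assumption, for the test client).
ALLOWED_HOSTNAMES = {"testclient"}


def is_ip_allowed(client: str) -> bool:
    """Return True if the client host is an allowed name or falls in an allowed network."""
    if client in ALLOWED_HOSTNAMES:
        return True
    try:
        addr = ipaddress.ip_address(client)
    except ValueError:
        # Not a parseable IP address and not an allowed name: reject.
        return False
    return any(addr in net for net in ALLOWED_NETWORKS)
```

Rejecting unparseable values by default keeps the check fail-closed, which matches the intent of an ingress guardrail.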
--- a/tests/test_rate_and_payload_guardrails.py
+++ b/tests/test_rate_and_payload_guardrails.py
@@ -0,0 +1,76 @@
+import json
+import os
+from pathlib import Path
+
+from fastapi.testclient import TestClient
+
+import app as app_module
+
+
+def make_client(tmp_path: Path) -> TestClient:
+    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
+    app_module.DB_PATH = tmp_path / "jarvischat-rate.db"
+    app_module.SESSIONS.clear()
+    app_module.PIN_ATTEMPTS.clear()
+    app_module.RATE_EVENTS.clear()
+    app_module.init_db()
+    return TestClient(app_module.app)
+
+
+def test_stats_rate_limit_hits_429(tmp_path: Path):
+    old_limit = app_module.RL_STATS_PER_WINDOW
+    old_window = app_module.RATE_WINDOW_SECONDS
+    app_module.RL_STATS_PER_WINDOW = 2
+    app_module.RATE_WINDOW_SECONDS = 60
+    try:
+        with make_client(tmp_path) as client:
+            sid = client.post("/api/auth/guest").json()["session_id"]
+            headers = {"X-Session-ID": sid}
+
+            r1 = client.get("/api/stats", headers=headers)
+            r2 = client.get("/api/stats", headers=headers)
+            r3 = client.get("/api/stats", headers=headers)
+
+            assert r1.status_code == 200
+            assert r2.status_code == 200
+            assert r3.status_code == 429
+    finally:
+        app_module.RL_STATS_PER_WINDOW = old_limit
+        app_module.RATE_WINDOW_SECONDS = old_window
+
+
+def test_large_login_payload_rejected_413(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        huge_pin = "1" * (app_module.BODY_LIMIT_DEFAULT_BYTES + 100)
+        resp = client.post(
+            "/api/auth/login",
+            data=json.dumps({"pin": huge_pin}),
+            headers={"Content-Type": "application/json"},
+        )
+        assert resp.status_code == 413
+
+
+def test_chat_message_length_rejected_413(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        sid = client.post("/api/auth/guest").json()["session_id"]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+        message = "x" * (app_module.MAX_CHAT_MESSAGE_CHARS + 1)
+        resp = client.post(
+            "/api/chat",
+            json={"message": message, "model": app_module.DEFAULT_MODEL},
+            headers=headers,
+        )
+        assert resp.status_code == 413
+
+
+def test_search_query_length_rejected_413(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        sid = client.post("/api/auth/guest").json()["session_id"]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+        query = "q" * (app_module.MAX_SEARCH_QUERY_CHARS + 1)
+        resp = client.post(
+            "/api/search",
+            json={"query": query, "model": app_module.DEFAULT_MODEL},
+            headers=headers,
+        )
+        assert resp.status_code == 413
--- a/tests/test_search_url_sanitization.py
+++ b/tests/test_search_url_sanitization.py
@@ -0,0 +1,17 @@
+import app as app_module
+
+
+def test_sanitize_outbound_url_allows_http_https():
+    assert app_module.sanitize_outbound_url("https://example.com/path") == "https://example.com/path"
+    assert app_module.sanitize_outbound_url("http://example.com") == "http://example.com"
+
+
+def test_sanitize_outbound_url_blocks_unsafe_schemes():
+    assert app_module.sanitize_outbound_url("javascript:alert(1)") == ""
+    assert app_module.sanitize_outbound_url("data:text/html,evil") == ""
+    assert app_module.sanitize_outbound_url("file:///etc/passwd") == ""
+
+
+def test_sanitize_outbound_url_blocks_relative_and_empty():
+    assert app_module.sanitize_outbound_url("/relative/path") == ""
+    assert app_module.sanitize_outbound_url("") == ""
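Review note: these three tests fully describe the observable contract of `sanitize_outbound_url`: absolute `http`/`https` URLs pass through unchanged, everything else becomes an empty string. The real function lives in `app.py` and is not shown here, so the following is only a sketch of one way to meet that contract, assuming it is built on `urllib.parse.urlsplit`.

```python
from urllib.parse import urlsplit


def sanitize_outbound_url(url: str) -> str:
    """Return the URL if it is an absolute http(s) URL, otherwise an empty string."""
    candidate = url.strip()
    try:
        parts = urlsplit(candidate)
    except ValueError:
        # urlsplit raises ValueError on some malformed inputs (e.g. bad IPv6 brackets).
        return ""
    # Require an explicit safe scheme and a host; this rejects javascript:, data:,
    # file:, relative paths, and the empty string.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return ""
    return candidate
```

Allowlisting schemes rather than blocklisting known-bad ones means a scheme nobody thought of is rejected by default.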
--- a/tests/test_settings_allowlist.py
+++ b/tests/test_settings_allowlist.py
@@ -0,0 +1,57 @@
+import os
+from pathlib import Path
+
+from fastapi.testclient import TestClient
+
+import app as app_module
+
+
+def make_admin_client(tmp_path: Path) -> tuple[TestClient, dict[str, str]]:
+    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
+    app_module.DB_PATH = tmp_path / "jarvischat-settings.db"
+    app_module.SESSIONS.clear()
+    app_module.PIN_ATTEMPTS.clear()
+    app_module.init_db()
+
+    client = TestClient(app_module.app)
+    login = client.post(
+        "/api/auth/login",
+        json={"pin": "1234"},
+        headers={"Origin": "http://testserver"},
+    )
+    assert login.status_code == 200
+    sid = login.json()["session_id"]
+    headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+    return client, headers
+
+
+def test_settings_allow_known_keys(tmp_path: Path):
+    client, headers = make_admin_client(tmp_path)
+    try:
+        resp = client.put(
+            "/api/settings",
+            json={
+                "profile_enabled": "false",
+                "search_enabled": "true",
+                "memory_enabled": "false",
+                "default_model": "llama3.1:latest",
+            },
+            headers=headers,
+        )
+        assert resp.status_code == 200
+    finally:
+        client.close()
+
+
+def test_settings_reject_unknown_keys(tmp_path: Path):
+    client, headers = make_admin_client(tmp_path)
+    try:
+        resp = client.put(
+            "/api/settings",
+            json={"admin_pin_hash": "oops"},
+            headers=headers,
+        )
+        assert resp.status_code == 400
+        assert "Unknown setting key" in resp.json().get("detail", "")
+    finally:
+        client.close()
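Review note: the settings tests above expect known keys to be accepted and an unknown key such as `admin_pin_hash` to be rejected with a message containing "Unknown setting key". The actual allowlist in `app.py` is not visible in this diff. The sketch below assumes it contains at least the four keys the test sends; `ALLOWED_SETTING_KEYS` and `validate_settings` are hypothetical names, and the real handler presumably maps the failure to an HTTP 400.

```python
# Assumed allowlist: only the keys exercised by test_settings_allow_known_keys.
ALLOWED_SETTING_KEYS = {
    "profile_enabled",
    "search_enabled",
    "memory_enabled",
    "default_model",
}


def validate_settings(payload: dict[str, str]) -> dict[str, str]:
    """Return a copy of the payload, or raise ValueError on the first unknown key."""
    unknown = sorted(set(payload) - ALLOWED_SETTING_KEYS)
    if unknown:
        raise ValueError(f"Unknown setting key: {unknown[0]}")
    return dict(payload)
```

Rejecting the whole request, instead of silently dropping unknown keys, makes a typo or a probing attempt visible to the caller.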
--- a/tests/test_skills_framework.py
+++ b/tests/test_skills_framework.py
@@ -0,0 +1,93 @@
+import os
+from pathlib import Path
+
+from fastapi.testclient import TestClient
+
+import app as app_module
+
+
+def make_client(tmp_path: Path) -> TestClient:
+    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
+    app_module.DB_PATH = tmp_path / "jarvischat-skills.db"
+    app_module.SESSIONS.clear()
+    app_module.PIN_ATTEMPTS.clear()
+    app_module.RATE_EVENTS.clear()
+    app_module.init_db()
+    return TestClient(app_module.app, raise_server_exceptions=False)
+
+
+def test_guest_can_list_skills(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
+            "session_id"
+        ]
+        resp = client.get("/api/skills", headers={"X-Session-ID": sid})
+        assert resp.status_code == 200
+        payload = resp.json()
+        assert payload["count"] >= 1
+        assert any(skill["key"] == "memory.search" for skill in payload["skills"])
+
+
+def test_admin_can_toggle_skill_enabled_state(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        login = client.post(
+            "/api/auth/login",
+            json={"pin": "1234"},
+            headers={"Origin": "http://testserver"},
+        )
+        sid = login.json()["session_id"]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+
+        disable = client.put(
+            "/api/skills/search.web",
+            json={"enabled": False},
+            headers=headers,
+        )
+        assert disable.status_code == 200
+        assert disable.json()["skill"]["enabled"] is False
+
+        active = client.get("/api/skills/active", headers={"X-Session-ID": sid})
+        assert active.status_code == 200
+        assert all(skill["key"] != "search.web" for skill in active.json()["skills"])
+
+
+def test_unknown_skill_update_is_rejected(tmp_path: Path):
+    with make_client(tmp_path) as client:
+        login = client.post(
+            "/api/auth/login",
+            json={"pin": "1234"},
+            headers={"Origin": "http://testserver"},
+        )
+        sid = login.json()["session_id"]
+        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
+
+        resp = client.put(
+            "/api/skills/nope.unknown",
+            json={"enabled": True},
+            headers=headers,
+        )
+        assert resp.status_code == 404
+
+
+def test_prompt_injection_respects_skills_enabled_setting(tmp_path: Path):
+    with make_client(tmp_path):
+        db = app_module.get_db()
+        try:
+            db.execute(
+                "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
+                ("skills_enabled", "false"),
+            )
+            db.commit()
+            without_skills = app_module.build_system_prompt(db, "", "hello")
+            assert "## Active Skills" not in without_skills
+
+            db.execute(
+                "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
+                ("skills_enabled", "true"),
+            )
+            db.commit()
+            with_skills = app_module.build_system_prompt(db, "", "hello")
+            assert "## Active Skills" in with_skills
+            assert "memory.search" in with_skills
+        finally:
+            db.close()
Author	SHA1	Message	Date
gramps	18bca027de	docs: replace README screenshot asset (v1.7.8)	2026-04-28 09:14:54 -07:00
gramps	36bca94840	docs(todo): add model/preset preflight validation item (v1.7.7)	2026-04-28 09:08:36 -07:00
gramps	71b48d940f	docs: add v1.6/v1.7 release notes and developer wiki (v1.7.6)	2026-04-28 08:53:54 -07:00
gramps	58945a4324	feat(ui): add phase-1 skills toggles in settings (v1.7.5)	2026-04-28 08:49:19 -07:00
gramps	4d1541412b	feat(skills): add phase-1 skill registry and toggles (v1.7.4)	2026-04-28 08:44:22 -07:00
gramps	250fec1f06	test(streaming): cover chat/search/memory paths (v1.7.3)	2026-04-28 08:31:01 -07:00
gramps	12188f3ad2	feat(errors): incident-key safe error envelopes (v1.7.2)	2026-04-27 16:56:17 -07:00
gramps	9589141521	feat(settings): allowlist /api/settings keys (v1.7.1)	2026-04-27 16:48:19 -07:00
gramps	c88e52e0ef	chore(release): bump version to v1.7.0	2026-04-27 16:44:33 -07:00
gramps	76e4461b38	feat(security): add LAN IP allowlist and ingress guardrails	2026-04-27 16:43:21 -07:00
gramps	28aa40c42a	release: v1.6.1 link sanitization and backlog updates	2026-04-27 16:25:35 -07:00
gramps	d9eba53926	fix(memory): sanitize FTS query tokens to handle punctuation	2026-04-27 10:23:42 -07:00
gramps	091a851064	chore(release): bump version to v1.6.0	2026-04-27 10:14:24 -07:00
gramps	81319f83d4	feat(auth): add guest/admin PIN security model and hardening	2026-04-27 10:09:53 -07:00
Llama Chile Shop	fc11b73319	Update readme.md marked #1 as completed	2026-04-08 05:02:30 +00:00
gramps	46f1d6bf4e	Add CLAUDE.md with architecture and development guidance Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-30 09:12:39 -07:00
gramps	6f410e29d2	Fix type errors and bare except clauses in app.py; update readme Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-29 16:09:13 -07:00
gramps	7a151b7d50	Remove unused imports and dead code; update readme - Drop unused JSONResponse import from fastapi.responses - Remove never-used raw_results_md variable in explicit_search stream - Note cleanup in v1.5.0 changelog Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-29 15:53:48 -07:00
gramps	6988997144	added readme for 1.5	2026-03-15 18:06:00 -07:00
gramps	c798f1220c	updated readme with new todos, minor css tweak	2026-03-15 17:51:27 -07:00
gramps	dc55d0a8c9	add jarvischat logo 2.0	2026-03-15 17:47:23 -07:00
gramps	3d1ede26ca	v.1.5.0: Explicit web search button, orange search styling	2026-03-15 17:12:20 -07:00
gramps	d57f009b10	Fix default model to llama3.1:latest	2026-03-15 15:57:33 -07:00
gramps	1c91c336a9	docs: update readme for v1.4.0, fix venv instructions	2026-03-15 15:27:35 -07:00
gramps	757f26669a	stupid error fix for the logo	2026-03-15 14:56:47 -07:00
gramps	7fccb926db	fix: logo extension jpg to png	2026-03-15 14:54:18 -07:00
gramps	47850efd2a	merge: resolve readme conflict, keep remote header with local content	2026-03-15 14:25:07 -07:00
gramps	4c7610a554	feat(memory): add FTS5 memory system, refactor to multi-file structure	2026-03-15 14:17:15 -07:00