Compare commits

...

28 Commits

Author SHA1 Message Date
18bca027de docs: replace README screenshot asset (v1.7.8) 2026-04-28 09:14:54 -07:00
36bca94840 docs(todo): add model/preset preflight validation item (v1.7.7) 2026-04-28 09:08:36 -07:00
71b48d940f docs: add v1.6/v1.7 release notes and developer wiki (v1.7.6) 2026-04-28 08:53:54 -07:00
58945a4324 feat(ui): add phase-1 skills toggles in settings (v1.7.5) 2026-04-28 08:49:19 -07:00
4d1541412b feat(skills): add phase-1 skill registry and toggles (v1.7.4) 2026-04-28 08:44:22 -07:00
250fec1f06 test(streaming): cover chat/search/memory paths (v1.7.3) 2026-04-28 08:31:01 -07:00
12188f3ad2 feat(errors): incident-key safe error envelopes (v1.7.2) 2026-04-27 16:56:17 -07:00
9589141521 feat(settings): allowlist /api/settings keys (v1.7.1) 2026-04-27 16:48:19 -07:00
c88e52e0ef chore(release): bump version to v1.7.0 2026-04-27 16:44:33 -07:00
76e4461b38 feat(security): add LAN IP allowlist and ingress guardrails 2026-04-27 16:43:21 -07:00
28aa40c42a release: v1.6.1 link sanitization and backlog updates 2026-04-27 16:25:35 -07:00
d9eba53926 fix(memory): sanitize FTS query tokens to handle punctuation 2026-04-27 10:23:42 -07:00
091a851064 chore(release): bump version to v1.6.0 2026-04-27 10:14:24 -07:00
81319f83d4 feat(auth): add guest/admin PIN security model and hardening 2026-04-27 10:09:53 -07:00
fc11b73319 Update readme.md
marked #1 as completed
2026-04-08 05:02:30 +00:00
46f1d6bf4e Add CLAUDE.md with architecture and development guidance
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 09:12:39 -07:00
6f410e29d2 Fix type errors and bare except clauses in app.py; update readme
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 16:09:13 -07:00
7a151b7d50 Remove unused imports and dead code; update readme
- Drop unused JSONResponse import from fastapi.responses
- Remove never-used raw_results_md variable in explicit_search stream
- Note cleanup in v1.5.0 changelog

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 15:53:48 -07:00
6988997144 added readme for 1.5 2026-03-15 18:06:00 -07:00
c798f1220c updated readme with new todos, minor css tweak 2026-03-15 17:51:27 -07:00
dc55d0a8c9 add jarvischat logo 2.0 2026-03-15 17:47:23 -07:00
3d1ede26ca v.1.5.0: Explicit web search button, orange search styling 2026-03-15 17:12:20 -07:00
d57f009b10 Fix default model to llama3.1:latest 2026-03-15 15:57:33 -07:00
1c91c336a9 docs: update readme for v1.4.0, fix venv instructions 2026-03-15 15:27:35 -07:00
757f26669a stupid error fix for the logo 2026-03-15 14:56:47 -07:00
7fccb926db fix: logo extension jpg to png 2026-03-15 14:54:18 -07:00
47850efd2a merge: resolve readme conflict, keep remote header with local content 2026-03-15 14:25:07 -07:00
4c7610a554 feat(memory): add FTS5 memory system, refactor to multi-file structure 2026-03-15 14:17:15 -07:00
19 changed files with 4261 additions and 1464 deletions

1
.gitignore vendored

@@ -3,3 +3,4 @@
 *.py-
 __pycache__/
 venv/
+readme.md-

74
CLAUDE.md Normal file

@@ -0,0 +1,74 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Running the App
```bash
# Development
./venv/bin/uvicorn app:app --host 0.0.0.0 --port 8080 --reload
# Production (via systemd)
sudo systemctl restart jarvischat
# Direct run
./venv/bin/python app.py
```
## Dependencies
```bash
./venv/bin/pip install -r requirements.txt
# Also requires: psutil jinja2 python-multipart (not in requirements.txt)
```
## Architecture
Single-file FastAPI backend (`app.py`) + single-template frontend (`templates/index.html`). No build step. SQLite database auto-created at `jarvischat.db` on first run.
### Request Flow: `/api/chat`
1. User message saved to DB → conversation created if new
2. `build_system_prompt()` assembles: profile + FTS5 memory search results + preset prompt
3. Streamed to Ollama (`/api/chat`, `stream: true`, `logprobs: true`) via SSE
4. **Auto web search trigger**: if perplexity > 15.0 OR response matches `REFUSAL_PATTERNS`, re-queries Ollama with SearXNG results prepended to system prompt
5. Final response saved to DB; SSE `done` event sent with perplexity + tokens/sec
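The uncertainty check in step 4 can be sketched as follows. This is a simplified illustration; the helper names and the pattern list are assumptions, not code from `app.py`:

```python
import math
import re

PERPLEXITY_THRESHOLD = 15.0
# Illustrative subset; the real REFUSAL_PATTERNS list lives in app.py
REFUSAL_PATTERNS = [r"as an ai (?:language )?model", r"i don't have the capability"]

def perplexity(logprobs: list[float]) -> float:
    """Perplexity is exp of the mean negative log-probability per token."""
    if not logprobs:
        return 0.0
    return math.exp(-sum(logprobs) / len(logprobs))

def should_auto_search(logprobs: list[float], response_text: str) -> bool:
    """Trigger the SearXNG fallback on high perplexity OR a refusal-style reply."""
    if perplexity(logprobs) > PERPLEXITY_THRESHOLD:
        return True
    return any(re.search(p, response_text, re.IGNORECASE) for p in REFUSAL_PATTERNS)
```

A confident answer (low perplexity, no refusal phrasing) streams straight through; anything else triggers the re-query with search context.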
### Request Flow: `/api/search` (explicit search)
Bypasses perplexity/refusal detection entirely — queries SearXNG directly then asks Ollama to summarize with results as system context.
### Memory System
FTS5 virtual table (`memories`) in SQLite. `search_memories()` uses BM25 ranking. `process_remember_command()` intercepts "remember that..." / "forget about..." before the message reaches Ollama and returns a confirmation string. Topic auto-detection via keyword matching in `detect_topic()`.
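A minimal sketch of that retrieval path, including the punctuation sanitization added in v1.6.1 (illustrative only; the real schema and signatures in `app.py` may differ):

```python
import sqlite3

def sanitize_fts_query(text: str) -> str:
    # Replace punctuation with spaces so FTS5 MATCH doesn't parse it as query
    # syntax; this is the class of bug the v1.6.1 tokenization fix addresses.
    return " ".join("".join(ch if ch.isalnum() else " " for ch in text).split())

def search_memories(db: sqlite3.Connection, query: str, limit: int = 5):
    # bm25() returns lower scores for better matches, so sort ascending
    return db.execute(
        "SELECT content FROM memories WHERE memories MATCH ? "
        "ORDER BY bm25(memories) LIMIT ?",
        (sanitize_fts_query(query), limit),
    ).fetchall()

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE memories USING fts5(content)")
db.executemany(
    "INSERT INTO memories VALUES (?)",
    [("JarvisChat runs on port 8080",), ("the user prefers Rust over Go",)],
)
rows = search_memories(db, "port 8080?")
```

Without the sanitize step, a raw query like `port 8080?` would raise an FTS5 syntax error instead of matching.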
### Key Constants (top of `app.py`)
- `OLLAMA_BASE`: `http://localhost:11434`
- `SEARXNG_BASE`: `http://localhost:8888`
- `PERPLEXITY_THRESHOLD`: `15.0` (controls auto-search sensitivity)
- `DEFAULT_MODEL`: `llama3.1:latest`
### External Services
- **Ollama** — required, must be running on port 11434
- **SearXNG** — optional, port 8888; `GET /api/search/status` probes availability
- **wttr.in** — weather shortcut in `query_searxng()`, bypasses SearXNG for weather queries
- **rocm-smi** — AMD GPU stats via subprocess; gracefully degrades if not available
### Database
`get_db()` opens a new connection per request (no connection pool). `init_db()` runs at startup via the FastAPI `lifespan` handler. The `profile` table uses a singleton row (`id = 1`). Default settings are seeded but never overwritten by `init_db()`.
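That per-request connection pattern looks roughly like this (a sketch; the real `get_db()` in `app.py` may differ in details):

```python
import sqlite3

def get_db(path: str = "jarvischat.db") -> sqlite3.Connection:
    # New short-lived connection per request; nothing to pool or share
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows addressable by column name
    return conn

db = get_db(":memory:")  # in-memory DB here just for demonstration
row = db.execute("SELECT 1 AS one").fetchone()
```

Opening per request trades a little latency for zero cross-request state, which suits SQLite's single-writer model in a low-traffic homelab deployment.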
### SSE Protocol
All streaming endpoints yield `data: {json}\n\n`. Key event shapes:
- `{token, conversation_id}` — streaming token
- `{searching: true}` — web search triggered
- `{search_results: N}` — N results retrieved
- `{done: true, perplexity, tokens_per_sec, searched?}` — terminal event
- `{error: "..."}` — error event
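Serializing one of these frames is a one-liner; the sketch below shows the `data: {json}\n\n` shape with example event payloads (field values are illustrative):

```python
import json

def sse_frame(event: dict) -> str:
    # Every streaming endpoint yields frames in this exact shape
    return f"data: {json.dumps(event)}\n\n"

token_frame = sse_frame({"token": "Hel", "conversation_id": 7})
done_frame = sse_frame({"done": True, "perplexity": 4.2, "tokens_per_sec": 38.5})
```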
### Deployment
Runs as systemd service under user `jarvischat`, working directory `/opt/jarvischat`. Logs via syslog (`journalctl -u jarvischat`).

2941
app.py

File diff suppressed because one or more lines are too long


@@ -0,0 +1,51 @@
# Copilot Chat Incident Report: Context Loss After Project Context Change
Date observed: 2026-04-21
Reporter: Michael Shallop (Gramps)
Environment: VS Code on Linux, GitHub Copilot Chat extension present
## Summary
Switching/loading project context in the VS Code project window caused Copilot Chat conversational context to reset. This resulted in loss of recently generated conclusion/plan data that was intended to be implemented immediately after loading the new project.
## Impact
- Lost actionable conclusions from the active design/planning thread.
- Interrupted workflow at a critical handoff point (planning -> implementation).
- Forced reconstruction from memory instead of exact prior content.
- Increased risk of omissions and rework.
## Reproduction Steps
1. Have an active Copilot Chat conversation containing planning/conclusion details.
2. Load or switch project context in the current project window.
3. Return to Copilot Chat and continue the thread.
4. Observe that prior context is no longer available in-chat as expected.
## Expected Behavior
- Prior active conversation context should remain available, or
- The user should be prompted before context-destructive operations, and
- Recovery path should be obvious and reliable.
## Actual Behavior
- Current chat context was effectively reset.
- The previously concluded upgrade notes were not recoverable from active context.
- Local transcript/debug artifacts did not provide the full prior thread needed.
## Severity
High (workflow-breaking for planning-heavy sessions)
## User-visible Failure Mode
The user lost conclusion data that was intended for immediate implementation once the new project loaded.
## Suggested Fixes
1. Preserve active chat state across workspace/project context changes by default.
2. Show a blocking warning before any action that can drop active conversation state.
3. Add one-click export/snapshot of current conversation before context switch.
4. Improve transcript durability and discoverability for immediate recovery.
5. Add explicit session continuity indicator so users can verify state retention.
## Notes
- This incident occurred in a real implementation workflow and caused direct productivity loss.
- Regression tests should include workspace switch/load scenarios with active chat state.
## Escalation Constraint
- Current product constraints prevented the assistant from directly self-reporting this incident to the Copilot/VS Code dev team from within the chat runtime.
- User feedback to include verbatim: "it is idiotic to keep you from self-reporting issues like this."

Binary image file not shown (322 KiB before, 219 KiB after).

165
docs/wiki/Developer-Architecture.md Normal file

@@ -0,0 +1,165 @@
# Developer Architecture Guide
This document explains how JarvisChat is structured, why key guardrails exist, and what the test suite validates.
## 1. System Overview
JarvisChat is a single-process FastAPI service with a Jinja2 frontend and SQLite persistence.
Primary files:
- `app.py`: API, middleware, streaming/chat logic, auth, memory, skills, and DB bootstrap
- `templates/index.html`: main WebUX, settings panels, auth flow, streaming UI handlers
- `jarvischat.db`: runtime SQLite database created and migrated at startup
Core runtime integrations:
- Ollama for chat/model interaction
- SearXNG for web search (optional)
- wttr.in for weather shortcut queries
- rocm-smi for GPU stats when available
## 2. Request/Response Architecture
### 2.1 Chat Pipeline (`/api/chat`)
1. Validate session, role, origin, rate, and payload limits in middleware
2. Persist user message and conversation metadata
3. Build system prompt from enabled profile, memory context, and active skills metadata
4. Stream model response over SSE token-by-token
5. Evaluate uncertainty/refusal; if needed, trigger search augmentation and stream augmented result
6. Persist final assistant message and emit terminal SSE event
### 2.2 Explicit Search Pipeline (`/api/search`)
1. Persist search-as-message into the target/new conversation
2. Emit `searching` SSE event
3. Pull web results from SearXNG
4. Summarize with Ollama via SSE stream
5. Persist summary and emit `done` event (plus raw results payload)
### 2.3 Settings/Control Surface
- Profile, presets, memory, conversation management, and settings APIs
- Skills APIs for phase-1 registry and enable/disable controls
- Auth/session APIs for guest/admin role handling and keepalive
## 3. Data Model (SQLite)
Key tables:
- `conversations`: conversation headers and timestamps
- `messages`: ordered chat history entries
- `profile`: singleton row for injected profile prompt
- `settings`: runtime toggles and selected defaults
- `system_presets`: named reusable system prompts
- `skills`: per-skill enabled state and timestamp
- `memories` (FTS5 virtual table): searchable user memory facts
Design notes:
- Startup is idempotent: tables are created if missing and defaults seeded only when absent
- No connection pool: each request opens a short-lived SQLite connection
## 4. Security Implementations
This section documents explicit controls currently in code.
### 4.1 Auth Model
- Guest session is default for conversational access
- Admin unlock uses 4-digit PIN and creates admin-capable session
- Admin required for write/destructive routes
- Session heartbeat/timeout and explicit logout/revoke flow
### 4.2 PIN and Session Hardening
- Admin PIN hashed with PBKDF2-HMAC-SHA256 + salt
- Failed PIN attempts tracked per client IP
- Lockout window enforced after max failed attempts
### 4.3 Browser and API Abuse Controls
- Origin checks on state-changing requests
- Rate limiting by endpoint category and identity (IP/session)
- Payload size limits per route class
- Settings key allowlist to block arbitrary configuration injection
- IP allowlist/CIDR gate with optional trusted proxy forwarding mode
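The CIDR gate can be sketched like this, using the standard private ranges as defaults (the function name and default list are assumptions; the real implementation also handles the trusted-proxy forwarding mode):

```python
import ipaddress

ALLOWED_NETWORKS = [
    ipaddress.ip_network(c)
    for c in ("127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def ip_allowed(client_ip: str) -> bool:
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable address: fail closed
    return any(addr in net for net in ALLOWED_NETWORKS)
```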
### 4.4 Output and Error Safety
- Search result URLs sanitized to `http`/`https` only
- Client-safe error envelopes with incident key correlation
- Full stack traces and diagnostic metadata logged server-side only
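The incident-key correlation pattern works roughly like this (a sketch; names and message text are hypothetical):

```python
import logging
import secrets

logger = logging.getLogger("jarvischat")

def error_envelope(exc: Exception) -> dict:
    incident_key = secrets.token_hex(4)  # short key the user can quote to support
    # Full exception detail stays server-side only
    logger.error("incident %s", incident_key, exc_info=exc)
    return {
        "error": "Something went wrong. Please report this incident key.",
        "incident_key": incident_key,
    }
```

The client never sees the exception text, but an operator can grep the journal for the key and recover the full traceback.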
### 4.5 Operational Auditability
- Structured audit events for auth actions, admin operations, and guardrail denials
- Incident logs include event type, key, path/method context, and runtime metadata
## 5. Skills Framework (Phase 1)
Goal: introduce a governed skills control plane inside the local JarvisChat sandbox.
Current behavior:
- Built-in skill registry defined server-side
- Per-skill enable/disable persisted in DB
- Global `skills_enabled` master toggle in settings
- Active skills injected into system prompt with bounded text budget
- API endpoints to list skills, list active skills, and toggle skill state
- WebUX settings panel to control master/per-skill toggles
Non-goals in phase 1:
- No unrestricted shell/tool execution
- No external connector execution (filesystem, Gmail, etc.)
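The bounded prompt injection described above can be sketched as follows (field names and the character-based budget are hypothetical, not the actual registry schema):

```python
def inject_skills(base_prompt: str, skills: list[dict], budget_chars: int = 2000) -> str:
    # Append enabled skills in order until the budget would be exceeded
    parts, used = [base_prompt], 0
    for skill in skills:
        if not skill.get("enabled"):
            continue
        text = f"## Skill: {skill['name']}\n{skill['instructions']}"
        if used + len(text) > budget_chars:
            break  # bounded injection: stop rather than blow up the prompt
        parts.append(text)
        used += len(text)
    return "\n\n".join(parts)
```

Stopping at the budget keeps the system prompt from crowding out conversation history when many skills are enabled.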
## 6. Testing Strategy and Validation Intent
The test suite validates both behavior and guardrail assumptions.
### 6.1 What We Test
- Auth capability separation (guest vs admin)
- URL sanitization safety for outbound links
- Rate and payload guardrails
- IP allowlist behavior
- Safe error envelope behavior and SSE error leakage prevention
- Streaming chat/search and memory command paths
- Skills framework toggles and prompt-injection behavior
### 6.2 Why These Tests Matter
- Confirms security controls are active and regression-resistant
- Ensures streaming UX protocol remains stable (`token`, `searching`, `done`, `error`)
- Verifies policy intent: dangerous actions require admin capability
- Validates new features preserve prior guarantees
### 6.3 Internal Process Validation
For substantive changes, Definition of Done includes:
1. Implement code change
2. Add/adjust tests proving behavior and guardrail intent
3. Update README release notes for user-facing impact
4. Update wiki architecture/security/testing docs for maintainers
5. Validate with targeted test runs before merge/deploy
This process is intentionally explicit so design decisions remain auditable over time.
## 7. Deployment and Operations Notes
- Primary deployment target: local/homelab systemd service
- Required dependency: Ollama
- Optional dependency: SearXNG
- Recommended log review path: system journal for startup, guardrail denials, and incidents
## 8. Contribution Guidance
When adding a feature:
1. Define security posture first (who can execute, what can fail, and failure mode)
2. Implement smallest safe slice with clear limits
3. Add tests that prove both happy path and guardrail path
4. Update this wiki and README in the same change

23
docs/wiki/Home.md Normal file

@@ -0,0 +1,23 @@
# JarvisChat Developer Wiki
This wiki is the developer-facing architecture and process reference for JarvisChat.
## Audience
- Contributors maintaining backend, frontend, security posture, and deployment process
- Operators validating local or homelab deployments
## Start Here
- Architecture and components: [Developer-Architecture.md](Developer-Architecture.md)
- Active implementation backlog: [current-wip.md](current-wip.md)
## Scope and Support Model
JarvisChat is designed for local and trusted-LAN operation.
The code may technically function against external or commercial endpoints, but this deployment mode is not a supported target in this project.
## Wiki Maintenance Rule
When architecture, security behavior, or test policy changes, update this wiki in the same change set as code and tests.

84
docs/wiki/current-wip.md Normal file

@@ -0,0 +1,84 @@
# JarvisChat Current WiP Backlog
Last updated: 2026-04-27
Owner: Gramps + Copilot
Scope: issues, bugs, security exposures, and feature enhancements.
Total identified items: 27
## Priority Definitions
- P0: Critical risk or data-loss/security exposure; do first.
- P1: High impact reliability/correctness work.
- P2: Important feature/UX improvements.
- P3: Nice-to-have polish.
## Top 10 (Urgency Order)
1. [P0][DONE] Add authentication/authorization for all write and admin endpoints.
2. [P0][DONE] Add CSRF/origin protection for browser-initiated state-changing requests.
3. [P0][DONE] Block unsafe URL schemes in rendered search-result links (e.g., javascript:).
4. [P0][DONE] Add rate limiting and request body size limits for chat/search/profile APIs.
5. [P1][DONE] Restrict settings updates to an allowlist of valid keys.
6. [P1] Add pagination + hard caps on list endpoints (memories, conversations, message history).
7. [P1][DONE] Stop returning raw exception text to clients; use safe error envelopes.
8. [P1][DONE] Add automated tests for chat streaming, auto-search trigger, and memory command paths.
9. [P2][DONE] Implement skills/tool-call framework (MCP-style) with per-skill enable controls.
10. [P2] Implement heartbeat/check-in pipeline with scheduler + summary endpoint.
## Item 1 Executive Summary (Scope + Security)
- Status: Complete. Guest/admin capability split implemented with admin-only write enforcement, origin checks on state-changing requests, audit logging, and endpoint capability tests.
- Decision: JarvisChat is local-first by design. Primary mode is same-host Ollama; optional mode allows RFC1918 LAN endpoints only.
- Constraint: Public Internet AI endpoints are out of scope unless explicitly enabled in a future advanced mode.
- Risk: Even on LAN, unauthenticated write/admin endpoints permit unauthorized data tampering and deletion.
- Requirement: Add mandatory admin authentication for all POST/PUT/DELETE routes and destructive actions.
- Authentication shape (scope-locked): two capability tiers only: guest (chat-only) and admin (4-digit PIN unlock).
- Scope guardrail: Avoid full RBAC. Keep capability split minimal: conversational chat for guest, advanced/destructive actions for admin.
- Definition of done:
1. Auth required on all state-changing endpoints.
2. Destructive actions require admin authorization.
3. Endpoint configuration rejects non-local/non-RFC1918 AI backends by default.
4. Strong rate limiting + lockout controls in place for PIN attempts.
5. Security events logged for failed and successful admin actions.
## Full Backlog (Sorted by Priority)
### P0 Critical
1. Add auth for write/admin endpoints (`POST/PUT/DELETE` routes, mass delete, profile/settings changes).
2. Add CSRF or strict origin checks for browser session protection.
3. Validate/sanitize outbound href URLs before rendering in HTML (allow http/https only).
4. Add per-IP rate limiting on `/api/chat`, `/api/search`, `/api/profile`, `/api/settings`.
5. Enforce request size limits (message/profile text and JSON body) to prevent memory abuse.
### P1 High
6. Add settings key allowlist in `/api/settings` to prevent arbitrary key injection.
7. Add pagination (`limit`, `offset`) with enforced maximums for list APIs.
8. Add DB indexes and query hygiene for scalability (`messages.conversation_id`, timestamps).
9. Replace raw exception leakage to clients with generic safe error messages + server-side logs.
10. Add request/response timeout and retry policy consistency across external calls.
11. Add endpoint-level audit logging for destructive operations.
12. Add unit/integration tests for: remember/forget parsing, refusal detection, search fallback, SSE done/error shape.
13. Add conversation title sanitization and length constraints.
14. Ensure default preset semantics are correct (currently all seeded presets are marked default).
15. Add preflight validation for required model/preset selection and block send with clear user guidance instead of timing out.
### P2 Important Features
16. Skills system: load markdown skill files with YAML frontmatter from skills directory.
17. Skills registry API: list/enable/disable skills and expose active skills to UI.
18. Inject active skill instructions into system prompt with bounded token budget.
19. Tool execution guardrails: allowlist, confirmation mode, and execution logs.
20. Heartbeat scheduler (cron/systemd timer) for daily check-ins.
21. Heartbeat endpoint for generated briefings and anomaly summaries.
22. Model info UI panel (description, updated date, best-use purpose).
23. Default model selection improvements and persistence validation.
24. Hidden model list support (exclude models from dropdown).
25. Model update action from UI (trigger controlled model pull).
### P3 Nice to Have
26. Conversation search/filter and export tooling.
27. Keyboard shortcuts, retry button, and source-link polish.
## Maintenance Rules
- Keep this file as the single source of truth.
- Update item priority/status whenever work starts or completes.
- Mirror the Top 10 summary in README and keep counts aligned.

470
readme.md

@@ -1,263 +1,355 @@
# ⚡ JarvisChat # ⚡ JarvisChat v1.7.8
![screenshot](docs/images/screenshot.png) ![screenshot](docs/images/screenshot.png)
**A lightweight Ollama coding companion that runs on Python 3.13**
![Version](https://img.shields.io/badge/version-1.3.1-blue) **A lightweight Ollama coding companion with persistent memory, web search, and real-time system monitoring.**
![Python](https://img.shields.io/badge/python-3.13-green)
![License](https://img.shields.io/badge/license-MIT-orange)
JarvisChat is a single-file FastAPI application that provides a clean, responsive web interface for Ollama. It features persistent memory, automatic web search when the model is uncertain, and real-time token tracking. Built with FastAPI + SQLite + Jinja2. Runs on Python 3.13. No Docker required.
Developer wiki: [docs/wiki/Home.md](docs/wiki/Home.md)
Core architecture deep-dive: [docs/wiki/Developer-Architecture.md](docs/wiki/Developer-Architecture.md)
## Security Scope Disclaimer
JarvisChat is designed for local and home-lab use (same host or trusted LAN).
JarvisChat may technically work with frontier or commercial AI endpoints, but the author does not recommend or support that usage.
Supported deployments are contained local/home-lab environments.
By default, API access is limited to loopback + private LAN CIDRs. You can override with `JARVISCHAT_ALLOWED_CIDRS` (comma-separated CIDRs) and optionally trust reverse-proxy forwarding with `JARVISCHAT_TRUST_X_FORWARDED_FOR=true`.
If you deploy outside a trusted local subnet, your risk profile changes significantly and the default protections here may be insufficient.
Use at your own risk. No warranty is provided for Internet-exposed deployments.
## What's New in v1.7.x
- **Security hardening suite completed** - request rate limits, payload caps, settings allowlist, safe error envelopes, and LAN CIDR gate controls
- **Customer-safe incident handling** - client-facing errors include support-friendly incident keys while full traces remain in server logs
- **Streaming and regression test expansion** - automated coverage for SSE chat/search paths, memory remember/forget command handling, and auth/guardrail behavior
- **Skills framework (Phase 1)** - built-in local skill registry with per-skill enable controls, API endpoints, and bounded prompt injection
- **Skills WebUX controls** - Settings modal now includes a master skills toggle and per-skill toggles for admin users
## What's New in v1.6.x
- **Guest/admin capability split** - guest chat by default with 4-digit admin PIN for advanced or destructive operations
- **Session + lockout controls** - session lifecycle endpoints, heartbeat, logout/revoke behavior, failed PIN lockout protections, and auth audit events
- **Browser request protections** - strict origin checks for state-changing requests and admin-only write enforcement
- **Unsafe link protection** - outbound search links sanitized to allow only http/https absolute URLs
- **Operational stability fixes** - safer first-boot PIN policy handling and memory-search tokenization fix for punctuation/FTS edge cases
## What's New in v1.5.0
- **Explicit Web Search Button** — 🔍 button next to SEND forces a web search, bypassing model uncertainty detection
- **Orange Search Styling** — Search results, WEB badge, and search button share consistent orange color scheme
- **Expanded Refusal Patterns** — Added "As an AI model", "based on my training data", "I don't have the capability"
- **Code cleanup** — Removed unused `JSONResponse` import and dead `raw_results_md` variable
- **Bug fixes** — Replaced bare `except` clauses with `except Exception`; corrected `add_memory()` return type to `int | None`; updated `TemplateResponse` call to Starlette's current API signature
## What's New in v1.4.0
- **FTS5 Memory System**: Say "remember that..." to store facts — they're automatically retrieved by relevance and injected into context
- **Forget Command**: Say "forget about..." to remove memories
- **Memory Toggle**: Enable/disable memory injection from topbar or settings
- **Multi-file Structure**: Backend and frontend separated for easier maintenance
## Features ## Features
- **Persistent Profile/Memory** — Your context is injected into every conversation automatically - **Persistent Memory** — SQLite FTS5 full-text search for fast, relevant memory retrieval
- **System Prompt Presets** — Switch between coding assistant, sysadmin, general, or custom modes - **Web Search** — SearXNG integration for automatic web lookups when the model is uncertain
- **Streaming Chat** — Real-time token streaming with conversation history - **Explicit Search** — 🔍 button to force web search without waiting for model uncertainty
- **Model Switching** — Hot-swap between all installed Ollama models - **Profile Injection** — Custom system prompt injected into every conversation
- **Web Search Integration** — SearXNG kicks in automatically when the model is uncertain (perplexity-based) - **System Presets** — Save and switch between different system prompts
- **Weather Queries** — Direct wttr.in integration for weather questions - **Real-time Stats** — CPU, RAM, GPU, VRAM monitoring in sidebar
- **Token Thermometer** — Visual context usage bar with live updates as you type - **Token Thermometer** — Visual context window usage indicator
- **Perplexity & Speed Badges** — See model confidence (PPL) and tokens/sec on each response - **Streaming Responses** — Server-sent events for real-time token display
- **Copy-to-Clipboard** — One-click copy on all code blocks - **Conversation History** — SQLite-backed chat persistence with mass-delete option
- **Dark Theme** — Easy on the eyes for long coding sessions - **Model Switching** — Change Ollama models on the fly
## Architecture ## Current WiP (Prioritized)
Canonical backlog: [docs/wiki/current-wip.md](docs/wiki/current-wip.md)
Scope boundary: local-first (same-host Ollama), optional RFC1918 LAN endpoints, no public Internet AI endpoints by default.
Total identified items: 27
Top 10 (brief):
1. P0 [DONE]: Add auth for write/admin endpoints
2. P0 [DONE]: Add CSRF/origin protection for state-changing requests
3. P0 [DONE]: Block unsafe URL schemes in rendered links
4. P0 [DONE]: Add rate limiting and request size limits
5. P1 [DONE]: Restrict `/api/settings` updates to allowlisted keys
6. P1: Add pagination + hard caps for list APIs
7. P1 [DONE]: Replace raw exception leakage with safe client errors
8. P1 [DONE]: Add automated tests for streaming/search/memory paths
9. P2 [DONE]: Implement MCP-style skills/tool-call framework
10. P2: Implement heartbeat/check-in scheduler + summary endpoint
Item 1 executive summary: keep guest mode for conversational chat, require 4-digit admin PIN for advanced/destructive actions, and enforce local/LAN-only backend policy by default.
Implementation status: complete (guest session by default + admin unlock + admin-only write enforcement + origin checks + safe-link sanitization + audit logging + rate/payload guardrails + capability tests).
## TODO
1. ~~Verify SearXNG and Docker services persist across reboots~~
2. Conversation search/filter by keyword
3. Export conversation to markdown/text
4. Keyboard shortcuts (Ctrl+N new chat, Ctrl+Enter send)
5. Retry button on assistant messages
6. Source links — clickable links when search used
7. Allow conversation renaming
8. Multiple profiles — coding/sysadmin/general
9. Auto-generate conversation tags (client-side KWIC, top 5, filterable badges)
10. Image input support — pull vision model, file input/drag-drop, base64 encode, pass `images` array to Ollama `/api/chat`
11. Split-screen option for btop display
12. Skills as markdown files — `/opt/jarvischat/skills/`, YAML frontmatter + instructions, injected into context for tool calls
13. Heartbeats / proactive check-ins — cron + endpoint for daily briefings, HA anomaly alerts
14. Model info button — (i) icon next to Model dropdown, shows div with model description, last updated date, best-use purpose
15. Set default model — toggle any model as the default selection
16. Hide/remove model from list — exclude models from dropdown
17. Update model function — trigger `ollama pull` for selected model from UI
18. Add mouseover tooltip to SEND button
19. Add preflight validation for required model/preset selection and show a clear warning before send to prevent avoidable timeout loops
## File Structure
``` ```
Browser ◄──► app.py (FastAPI) ◄──► Ollama (LLM) /opt/jarvischat/
├── app.py # FastAPI backend
▼ (when uncertain) ├── jarvischat.db # SQLite database (auto-created)
SearXNG (web search) ├── static/
│ └── logo.png # Logo image (optional)
└── templates/
└── index.html # Frontend
``` ```
JarvisChat acts as middleware between your browser and Ollama. When the model's perplexity exceeds a threshold (default 15.0) or it refuses to answer, JarvisChat automatically queries SearXNG, injects the results, and re-prompts the model.
**This is NOT training** — SearXNG is only used at runtime as a fallback for uncertain responses.
## Requirements ## Requirements
- Python 3.11+ (tested on 3.13) - Python 3.11+ (tested on 3.13)
- Ollama running locally (default: `localhost:11434`) - Ollama running locally or on network
- SearXNG (optional, for web search — default: `localhost:8888`) - SearXNG (optional, for web search)
- ROCm (optional, for AMD GPU stats — `rocm-smi` must be in PATH)
## Installation ## Installation
```bash ### Fresh Install
# Clone or download app.py
git clone https://github.com/llamachileshop-code/313_webui.git
cd 313_webui
# Create virtual environment (recommended) ```bash
# Create directory and venv
sudo mkdir -p /opt/jarvischat
sudo chown $USER:$USER /opt/jarvischat
cd /opt/jarvischat
python3 -m venv venv python3 -m venv venv
source venv/bin/activate
# Install dependencies # Install dependencies
pip install fastapi httpx uvicorn psutil ./venv/bin/pip install fastapi uvicorn httpx psutil jinja2 python-multipart
# Run # Set admin PIN before first startup (4 digits)
python app.py export JARVISCHAT_ADMIN_PIN=4827
# or
uvicorn app:app --host 0.0.0.0 --port 8080 # Create subdirectories
mkdir -p templates static
# Copy files
# (copy app.py to /opt/jarvischat/)
# (copy index.html to /opt/jarvischat/templates/)
# (copy logo.png to /opt/jarvischat/static/ — optional)
``` ```
Open `http://localhost:8080` in your browser. WARNING: Do not use `1234` as your admin PIN unless you accept weak local security.
NOTE: First boot now requires `JARVISCHAT_ADMIN_PIN` unless you explicitly opt into insecure fallback with `JARVISCHAT_ALLOW_DEFAULT_PIN=true`.
### Upgrading from v1.4.x
**Note:** If running as a systemd service with a venv, install dependencies using the venv pip directly:
```bash ```bash
/opt/jarvischat/venv/bin/pip install fastapi httpx uvicorn psutil cd /opt/jarvischat
# Backup
cp app.py app.py.bak
cp templates/index.html templates/index.html.bak
# Copy new files
# (copy app.py, replacing old version)
# (copy index.html to templates/)
# Restart
sudo systemctl restart jarvischat
```

## Systemd Service

Create `/etc/systemd/system/jarvischat.service`:

```ini
[Unit]
Description=JarvisChat - Local Ollama Web Interface
After=network.target
[Service]
Type=simple
User=jarvischat
Group=jarvischat
WorkingDirectory=/opt/jarvischat
ExecStart=/opt/jarvischat/venv/bin/uvicorn app:app --host 0.0.0.0 --port 8080
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```
Then enable and start:

```bash
sudo systemctl daemon-reload
sudo systemctl enable jarvischat
sudo systemctl start jarvischat
```

## Memory Commands

In chat, natural language triggers memory operations:

| You say | What happens |
|---------|--------------|
| "remember that I prefer Rust over Go" | Stores as `preference` |
| "remember that JarvisChat runs on port 8080" | Stores as `infrastructure` |
| "note that the deadline is Friday" | Stores as `general` |
| "forget about the deadline" | Removes matching memories |

Memories are automatically searched based on your message content and injected into the system prompt when relevant.

### Memory Topics

Memories are auto-categorized:
- `preference` — likes, dislikes, choices
- `project` — active work, repos, tasks
- `infrastructure` — servers, services, configs
- `personal` — name, location, background
- `general` — everything else
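Auto-categorization can be pictured as simple keyword matching on the stored fact. A sketch of the idea (the keyword lists here are illustrative only; the real rules live in `app.py` and may differ):

```python
# Hypothetical keyword table, for illustration; not the actual app.py rules.
TOPIC_KEYWORDS = {
    "preference": ("prefer", "like", "hate", "favorite"),
    "project": ("repo", "project", "task", "branch"),
    "infrastructure": ("server", "port", "service", "config"),
    "personal": ("my name", "i live", "i work"),
}

def categorize(fact: str) -> str:
    # First topic whose keyword appears in the fact wins; else "general".
    lowered = fact.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return "general"
```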

## API Endpoints

### Memory

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/memories` | List all memories |
| POST | `/api/memories` | Add memory `{"fact": "...", "topic": "general"}` |
| DELETE | `/api/memories/{rowid}` | Delete memory by ID |
| GET | `/api/memories/search?q=term` | Search memories |
| GET | `/api/memories/stats` | Get counts by topic |

### Chat & Models

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/models` | List available Ollama models |
| POST | `/api/chat` | Send message (streaming SSE) |
| POST | `/api/search` | Explicit web search (streaming SSE) |
| POST | `/api/show` | Get model info (context size) |
| GET | `/api/ps` | Get running models |
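Both streaming endpoints (`/api/chat`, `/api/search`) emit Server-Sent Events, one JSON payload per `data:` frame. A small parser, mirroring the helper used in the test suite below, with hedged streaming usage (the host, model name, and `httpx` client are assumptions, not requirements):

```python
import json

def parse_sse_payloads(body: str) -> list[dict]:
    # Each SSE event is a "data: {...}" block; blank lines separate events.
    payloads: list[dict] = []
    for chunk in body.split("\n\n"):
        chunk = chunk.strip()
        if chunk.startswith("data: "):
            payloads.append(json.loads(chunk[len("data: "):]))
    return payloads

# Against a live instance (e.g. with httpx), stream and parse as frames arrive:
#   with httpx.stream("POST", "http://localhost:8080/api/chat",
#                     json={"message": "hi", "model": "llama3.1:latest"},
#                     headers={"X-Session-ID": sid}) as resp:
#       for line in resp.iter_lines():
#           ...  # accumulate frames and feed them through parse_sse_payloads
```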

### Settings & Profile

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/profile` | Get profile content |
| PUT | `/api/profile` | Update profile |
| GET | `/api/profile/default` | Get default profile |
| GET | `/api/settings` | Get settings |
| PUT | `/api/settings` | Update settings |

### Conversations

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/conversations` | List conversations |
| GET | `/api/conversations/{id}` | Get conversation with messages |
| DELETE | `/api/conversations/{id}` | Delete conversation |
| DELETE | `/api/conversations` | Delete ALL conversations |

### Presets

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/presets` | List presets |
| POST | `/api/presets` | Create preset |
| PUT | `/api/presets/{id}` | Update preset |
| DELETE | `/api/presets/{id}` | Delete preset |

### System

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/stats` | CPU, RAM, GPU, VRAM stats |
| GET | `/api/search/status` | SearXNG availability |

## Configuration

Settings are stored in the `settings` table and include:
- `profile_enabled` — Inject profile into chats (true/false)
- `search_enabled` — Auto web search (true/false)
- `memory_enabled` — Memory injection (true/false)
- `default_model` — Default Ollama model
- `searxng_url` — SearXNG instance URL (default: `http://localhost:8888`)
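A settings update is a single `PUT /api/settings` whose keys must come from the allowlist above (unknown keys are rejected with HTTP 400, per the test suite below). A stdlib sketch; `BASE` and the example value are assumptions:

```python
import json
import urllib.request

BASE = "http://localhost:8080"  # adjust to your instance

def put_settings(session_id: str, changes: dict) -> urllib.request.Request:
    # Build an admin PUT /api/settings request; only allowlisted keys pass.
    return urllib.request.Request(
        f"{BASE}/api/settings",
        data=json.dumps(changes).encode(),
        headers={"Content-Type": "application/json", "X-Session-ID": session_id},
        method="PUT",
    )

# Example: turn off auto web search (send with urllib.request.urlopen)
# urllib.request.urlopen(put_settings(sid, {"search_enabled": "false"}))
```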

## Testing Memory

```bash
# Add a memory via API
curl -X POST http://jarvis:8080/api/memories \
-H "Content-Type: application/json" \
-d '{"fact": "User prefers native installs over Docker", "topic": "preference"}'
# Search memories
curl "http://jarvis:8080/api/memories/search?q=docker"
# List all memories
curl http://jarvis:8080/api/memories
# Get stats
curl http://jarvis:8080/api/memories/stats
```
Or in chat:
1. Say "remember that I hate YAML"
2. Later ask "what markup languages should I avoid?"
3. JarvisChat will inject the YAML preference into context
## Troubleshooting
### Service won't start
Check logs:
```bash
journalctl -u jarvischat -n 50 --no-pager
```
Common issues:
- Missing `jinja2`: `./venv/bin/pip install jinja2`
- Missing `templates/` directory
- Wrong permissions on `/opt/jarvischat`
### Memory not working
1. Check memory is enabled (🧠 MEM ON in topbar)
2. Verify memories exist: `curl http://jarvis:8080/api/memories`
3. Check FTS5 table: `sqlite3 jarvischat.db "SELECT * FROM memories_fts;"`
### Web search not working
1. Verify SearXNG is running: `curl "http://localhost:8888/search?q=test&format=json"`
2. Check search status: `curl http://jarvis:8080/api/search/status`
3. Ensure JSON format is enabled in SearXNG settings

## License

MIT

## Repository

Gitea: `ssh://gitea@llgit.llamachile.tube:1319/gramps/jarvisChat.git`

static/logo.jpg: new binary file (206 KiB), not shown.

templates/index.html: new file (1251 lines), diff suppressed because it is too large.


@@ -0,0 +1,78 @@
import os
from pathlib import Path

from fastapi.testclient import TestClient

import app as app_module


def make_client(tmp_path: Path) -> TestClient:
    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
    app_module.DB_PATH = tmp_path / "jarvischat-test.db"
    app_module.SESSIONS.clear()
    app_module.PIN_ATTEMPTS.clear()
    app_module.init_db()
    return TestClient(app_module.app)


def test_guest_read_only_admin_write_blocked(tmp_path: Path):
    with make_client(tmp_path) as client:
        guest = client.post("/api/auth/guest", headers={"Origin": "http://testserver"})
        assert guest.status_code == 200
        sid = guest.json()["session_id"]
        headers = {"X-Session-ID": sid}
        read_resp = client.get("/api/memories", headers=headers)
        assert read_resp.status_code == 200
        write_resp = client.post(
            "/api/memories",
            json={"fact": "guest write should fail", "topic": "general"},
            headers={**headers, "Origin": "http://testserver"},
        )
        assert write_resp.status_code == 403


def test_admin_can_write_and_delete_memory(tmp_path: Path):
    with make_client(tmp_path) as client:
        login = client.post(
            "/api/auth/login",
            json={"pin": "1234"},
            headers={"Origin": "http://testserver"},
        )
        assert login.status_code == 200
        sid = login.json()["session_id"]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        create_resp = client.post(
            "/api/memories",
            json={"fact": "admin write ok", "topic": "general"},
            headers=headers,
        )
        assert create_resp.status_code == 200
        rowid = create_resp.json()["rowid"]
        delete_resp = client.delete(f"/api/memories/{rowid}", headers=headers)
        assert delete_resp.status_code == 200


def test_origin_check_blocks_cross_site_writes(tmp_path: Path):
    with make_client(tmp_path) as client:
        denied = client.post("/api/auth/guest", headers={"Origin": "http://evil.example"})
        assert denied.status_code == 403
        allowed = client.post("/api/auth/guest", headers={"Origin": "http://testserver"})
        assert allowed.status_code == 200


def test_logout_revokes_session(tmp_path: Path):
    with make_client(tmp_path) as client:
        guest = client.post("/api/auth/guest", headers={"Origin": "http://testserver"})
        sid = guest.json()["session_id"]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        logout = client.post("/api/auth/logout", headers=headers)
        assert logout.status_code == 200
        after = client.get("/api/memories", headers={"X-Session-ID": sid})
        assert after.status_code == 401


@@ -0,0 +1,188 @@
import json
import os
from pathlib import Path

from fastapi.testclient import TestClient

import app as app_module


def make_client(tmp_path: Path) -> TestClient:
    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
    app_module.DB_PATH = tmp_path / "jarvischat-streaming.db"
    app_module.SESSIONS.clear()
    app_module.PIN_ATTEMPTS.clear()
    app_module.RATE_EVENTS.clear()
    app_module.init_db()
    return TestClient(app_module.app, raise_server_exceptions=False)


def parse_sse_payloads(body: str) -> list[dict]:
    payloads: list[dict] = []
    for chunk in body.split("\n\n"):
        chunk = chunk.strip()
        if not chunk.startswith("data: "):
            continue
        raw = chunk[len("data: ") :]
        payloads.append(json.loads(raw))
    return payloads


class _MockStreamResponse:
    def __init__(self, lines: list[str]):
        self._lines = lines

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False

    async def aiter_lines(self):
        for line in self._lines:
            yield line


def _stream_json_lines(events: list[dict]) -> list[str]:
    return [json.dumps(event) for event in events]


def test_chat_stream_emits_tokens_and_done(tmp_path: Path, monkeypatch):
    with make_client(tmp_path) as client:
        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
            "session_id"
        ]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        events = _stream_json_lines(
            [
                {"message": {"content": "Hel"}, "logprobs": [{"logprob": -0.01}]},
                {"message": {"content": "lo"}, "logprobs": [{"logprob": -0.01}]},
                {"done": True, "eval_count": 2, "eval_duration": 1000000000},
            ]
        )

        def stream_stub(self, method, url, json=None, timeout=None):
            return _MockStreamResponse(events)

        monkeypatch.setattr(app_module.httpx.AsyncClient, "stream", stream_stub)
        resp = client.post(
            "/api/chat",
            json={"message": "hello", "model": app_module.DEFAULT_MODEL},
            headers=headers,
        )
        assert resp.status_code == 200
        payloads = parse_sse_payloads(resp.text)
        token_text = "".join(p.get("token", "") for p in payloads if "token" in p)
        assert token_text == "Hello"
        done_events = [p for p in payloads if p.get("done")]
        assert done_events
        assert "searched" not in done_events[-1]


def test_chat_auto_search_trigger_emits_search_events(tmp_path: Path, monkeypatch):
    with make_client(tmp_path) as client:
        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
            "session_id"
        ]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        first_stream = _stream_json_lines(
            [
                {
                    "message": {"content": "I am uncertain."},
                    "logprobs": [{"logprob": -5.0}],
                },
                {"done": True, "eval_count": 2, "eval_duration": 1000000000},
            ]
        )
        second_stream = _stream_json_lines(
            [
                {"message": {"content": "Based on current data: 42."}},
                {"done": True},
            ]
        )
        stream_batches = [first_stream, second_stream]

        def stream_stub(self, method, url, json=None, timeout=None):
            return _MockStreamResponse(stream_batches.pop(0))

        async def search_stub(query: str, max_results: int = 5):
            return [
                {
                    "title": "Answer",
                    "url": "https://example.com",
                    "content": "The value is 42.",
                }
            ]

        monkeypatch.setattr(app_module.httpx.AsyncClient, "stream", stream_stub)
        monkeypatch.setattr(app_module, "query_searxng", search_stub)
        resp = client.post(
            "/api/chat",
            json={"message": "what is the latest value", "model": app_module.DEFAULT_MODEL},
            headers=headers,
        )
        assert resp.status_code == 200
        payloads = parse_sse_payloads(resp.text)
        assert any(p.get("searching") is True for p in payloads)
        assert any("search_results" in p for p in payloads)
        assert any(p.get("augmented") is True for p in payloads)
        done_events = [p for p in payloads if p.get("done")]
        assert done_events and done_events[-1].get("searched") is True


def test_memory_command_paths_remember_and_forget(tmp_path: Path, monkeypatch):
    with make_client(tmp_path) as client:
        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
            "session_id"
        ]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        base_stream = _stream_json_lines(
            [
                {"message": {"content": "ok"}, "logprobs": [{"logprob": -0.01}]},
                {"done": True, "eval_count": 1, "eval_duration": 1000000000},
            ]
        )

        def stream_stub(self, method, url, json=None, timeout=None):
            return _MockStreamResponse(base_stream)

        monkeypatch.setattr(app_module.httpx.AsyncClient, "stream", stream_stub)
        remember_resp = client.post(
            "/api/chat",
            json={
                "message": "remember that my favorite language is rust",
                "model": app_module.DEFAULT_MODEL,
            },
            headers=headers,
        )
        assert remember_resp.status_code == 200
        remember_events = parse_sse_payloads(remember_resp.text)
        assert any("Remembered" in p.get("token", "") for p in remember_events)
        memories_after_add = client.get("/api/memories", headers={"X-Session-ID": sid})
        assert memories_after_add.status_code == 200
        assert memories_after_add.json().get("count", 0) >= 1
        forget_resp = client.post(
            "/api/chat",
            json={
                "message": "forget about my favorite language",
                "model": app_module.DEFAULT_MODEL,
            },
            headers=headers,
        )
        assert forget_resp.status_code == 200
        forget_events = parse_sse_payloads(forget_resp.text)
        assert any("Forgot" in p.get("token", "") for p in forget_events)
        memories_after_forget = client.get("/api/memories", headers={"X-Session-ID": sid})
        assert memories_after_forget.status_code == 200
        assert memories_after_forget.json().get("count", 0) == 0


@@ -0,0 +1,72 @@
import os
from pathlib import Path

from fastapi.testclient import TestClient

import app as app_module


def make_client(tmp_path: Path) -> TestClient:
    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
    app_module.DB_PATH = tmp_path / "jarvischat-errors.db"
    app_module.SESSIONS.clear()
    app_module.PIN_ATTEMPTS.clear()
    app_module.RATE_EVENTS.clear()
    app_module.init_db()
    return TestClient(app_module.app, raise_server_exceptions=False)


def test_unhandled_api_exception_returns_friendly_error_with_incident_key(
    tmp_path: Path, monkeypatch
):
    with make_client(tmp_path) as client:
        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
            "session_id"
        ]
        headers = {"X-Session-ID": sid}

        def boom(_topic=None):
            raise RuntimeError("super secret db internals")

        monkeypatch.setattr(app_module, "get_all_memories", boom)
        resp = client.get("/api/memories", headers=headers)
        assert resp.status_code == 500
        payload = resp.json()
        assert payload.get("error_key", "").startswith("INC-")
        assert "support lookup" in payload.get("detail", "").lower()
        assert "super secret db internals" not in resp.text


def test_chat_stream_error_hides_internal_exception_and_emits_incident_key(
    tmp_path: Path, monkeypatch
):
    with make_client(tmp_path) as client:
        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
            "session_id"
        ]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}

        class BrokenStreamContext:
            async def __aenter__(self):
                raise RuntimeError("ultra secret model transport failure")

            async def __aexit__(self, exc_type, exc, tb):
                return False

        def broken_stream(*args, **kwargs):
            return BrokenStreamContext()

        monkeypatch.setattr(app_module.httpx.AsyncClient, "stream", broken_stream)
        resp = client.post(
            "/api/chat",
            json={"message": "hello", "model": app_module.DEFAULT_MODEL},
            headers=headers,
        )
        assert resp.status_code == 200
        body = resp.text
        assert "ultra secret model transport failure" not in body
        assert "error_key" in body
        assert "support lookup" in body.lower()


@@ -0,0 +1,50 @@
import os
from pathlib import Path

from fastapi.testclient import TestClient

import app as app_module


def make_client(tmp_path: Path) -> TestClient:
    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
    app_module.DB_PATH = tmp_path / "jarvischat-ip.db"
    app_module.SESSIONS.clear()
    app_module.PIN_ATTEMPTS.clear()
    app_module.RATE_EVENTS.clear()
    app_module.init_db()
    return TestClient(app_module.app)


def test_ip_helper_allows_local_defaults():
    assert app_module.is_ip_allowed("127.0.0.1")
    assert app_module.is_ip_allowed("192.168.1.10")
    assert app_module.is_ip_allowed("10.0.0.42")
    assert app_module.is_ip_allowed("172.16.1.2")
    assert app_module.is_ip_allowed("testclient")


def test_ip_helper_blocks_public_ip():
    assert not app_module.is_ip_allowed("8.8.8.8")


def test_middleware_blocks_disallowed_ip(tmp_path: Path):
    with make_client(tmp_path) as client:
        original_get_client_ip = app_module.get_client_ip
        try:
            app_module.get_client_ip = lambda _req: "8.8.8.8"
            resp = client.post("/api/auth/guest")
            assert resp.status_code == 403
        finally:
            app_module.get_client_ip = original_get_client_ip


def test_middleware_allows_local_ip(tmp_path: Path):
    with make_client(tmp_path) as client:
        original_get_client_ip = app_module.get_client_ip
        try:
            app_module.get_client_ip = lambda _req: "192.168.50.109"
            resp = client.post("/api/auth/guest")
            assert resp.status_code == 200
        finally:
            app_module.get_client_ip = original_get_client_ip


@@ -0,0 +1,76 @@
import json
import os
from pathlib import Path

from fastapi.testclient import TestClient

import app as app_module


def make_client(tmp_path: Path) -> TestClient:
    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
    app_module.DB_PATH = tmp_path / "jarvischat-rate.db"
    app_module.SESSIONS.clear()
    app_module.PIN_ATTEMPTS.clear()
    app_module.RATE_EVENTS.clear()
    app_module.init_db()
    return TestClient(app_module.app)


def test_stats_rate_limit_hits_429(tmp_path: Path):
    old_limit = app_module.RL_STATS_PER_WINDOW
    old_window = app_module.RATE_WINDOW_SECONDS
    app_module.RL_STATS_PER_WINDOW = 2
    app_module.RATE_WINDOW_SECONDS = 60
    try:
        with make_client(tmp_path) as client:
            sid = client.post("/api/auth/guest").json()["session_id"]
            headers = {"X-Session-ID": sid}
            r1 = client.get("/api/stats", headers=headers)
            r2 = client.get("/api/stats", headers=headers)
            r3 = client.get("/api/stats", headers=headers)
            assert r1.status_code == 200
            assert r2.status_code == 200
            assert r3.status_code == 429
    finally:
        app_module.RL_STATS_PER_WINDOW = old_limit
        app_module.RATE_WINDOW_SECONDS = old_window


def test_large_login_payload_rejected_413(tmp_path: Path):
    with make_client(tmp_path) as client:
        huge_pin = "1" * (app_module.BODY_LIMIT_DEFAULT_BYTES + 100)
        resp = client.post(
            "/api/auth/login",
            data=json.dumps({"pin": huge_pin}),
            headers={"Content-Type": "application/json"},
        )
        assert resp.status_code == 413


def test_chat_message_length_rejected_413(tmp_path: Path):
    with make_client(tmp_path) as client:
        sid = client.post("/api/auth/guest").json()["session_id"]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        message = "x" * (app_module.MAX_CHAT_MESSAGE_CHARS + 1)
        resp = client.post(
            "/api/chat",
            json={"message": message, "model": app_module.DEFAULT_MODEL},
            headers=headers,
        )
        assert resp.status_code == 413


def test_search_query_length_rejected_413(tmp_path: Path):
    with make_client(tmp_path) as client:
        sid = client.post("/api/auth/guest").json()["session_id"]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        query = "q" * (app_module.MAX_SEARCH_QUERY_CHARS + 1)
        resp = client.post(
            "/api/search",
            json={"query": query, "model": app_module.DEFAULT_MODEL},
            headers=headers,
        )
        assert resp.status_code == 413


@@ -0,0 +1,17 @@
import app as app_module


def test_sanitize_outbound_url_allows_http_https():
    assert app_module.sanitize_outbound_url("https://example.com/path") == "https://example.com/path"
    assert app_module.sanitize_outbound_url("http://example.com") == "http://example.com"


def test_sanitize_outbound_url_blocks_unsafe_schemes():
    assert app_module.sanitize_outbound_url("javascript:alert(1)") == ""
    assert app_module.sanitize_outbound_url("data:text/html,evil") == ""
    assert app_module.sanitize_outbound_url("file:///etc/passwd") == ""


def test_sanitize_outbound_url_blocks_relative_and_empty():
    assert app_module.sanitize_outbound_url("/relative/path") == ""
    assert app_module.sanitize_outbound_url("") == ""


@@ -0,0 +1,57 @@
import os
from pathlib import Path

from fastapi.testclient import TestClient

import app as app_module


def make_admin_client(tmp_path: Path) -> tuple[TestClient, dict[str, str]]:
    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
    app_module.DB_PATH = tmp_path / "jarvischat-settings.db"
    app_module.SESSIONS.clear()
    app_module.PIN_ATTEMPTS.clear()
    app_module.init_db()
    client = TestClient(app_module.app)
    login = client.post(
        "/api/auth/login",
        json={"pin": "1234"},
        headers={"Origin": "http://testserver"},
    )
    assert login.status_code == 200
    sid = login.json()["session_id"]
    headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
    return client, headers


def test_settings_allow_known_keys(tmp_path: Path):
    client, headers = make_admin_client(tmp_path)
    try:
        resp = client.put(
            "/api/settings",
            json={
                "profile_enabled": "false",
                "search_enabled": "true",
                "memory_enabled": "false",
                "default_model": "llama3.1:latest",
            },
            headers=headers,
        )
        assert resp.status_code == 200
    finally:
        client.close()


def test_settings_reject_unknown_keys(tmp_path: Path):
    client, headers = make_admin_client(tmp_path)
    try:
        resp = client.put(
            "/api/settings",
            json={"admin_pin_hash": "oops"},
            headers=headers,
        )
        assert resp.status_code == 400
        assert "Unknown setting key" in resp.json().get("detail", "")
    finally:
        client.close()


@@ -0,0 +1,93 @@
import os
from pathlib import Path

from fastapi.testclient import TestClient

import app as app_module


def make_client(tmp_path: Path) -> TestClient:
    os.environ["JARVISCHAT_ADMIN_PIN"] = "1234"
    app_module.DB_PATH = tmp_path / "jarvischat-skills.db"
    app_module.SESSIONS.clear()
    app_module.PIN_ATTEMPTS.clear()
    app_module.RATE_EVENTS.clear()
    app_module.init_db()
    return TestClient(app_module.app, raise_server_exceptions=False)


def test_guest_can_list_skills(tmp_path: Path):
    with make_client(tmp_path) as client:
        sid = client.post("/api/auth/guest", headers={"Origin": "http://testserver"}).json()[
            "session_id"
        ]
        resp = client.get("/api/skills", headers={"X-Session-ID": sid})
        assert resp.status_code == 200
        payload = resp.json()
        assert payload["count"] >= 1
        assert any(skill["key"] == "memory.search" for skill in payload["skills"])


def test_admin_can_toggle_skill_enabled_state(tmp_path: Path):
    with make_client(tmp_path) as client:
        login = client.post(
            "/api/auth/login",
            json={"pin": "1234"},
            headers={"Origin": "http://testserver"},
        )
        sid = login.json()["session_id"]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        disable = client.put(
            "/api/skills/search.web",
            json={"enabled": False},
            headers=headers,
        )
        assert disable.status_code == 200
        assert disable.json()["skill"]["enabled"] is False
        active = client.get("/api/skills/active", headers={"X-Session-ID": sid})
        assert active.status_code == 200
        assert all(skill["key"] != "search.web" for skill in active.json()["skills"])


def test_unknown_skill_update_is_rejected(tmp_path: Path):
    with make_client(tmp_path) as client:
        login = client.post(
            "/api/auth/login",
            json={"pin": "1234"},
            headers={"Origin": "http://testserver"},
        )
        sid = login.json()["session_id"]
        headers = {"X-Session-ID": sid, "Origin": "http://testserver"}
        resp = client.put(
            "/api/skills/nope.unknown",
            json={"enabled": True},
            headers=headers,
        )
        assert resp.status_code == 404


def test_prompt_injection_respects_skills_enabled_setting(tmp_path: Path):
    with make_client(tmp_path):
        db = app_module.get_db()
        try:
            db.execute(
                "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
                ("skills_enabled", "false"),
            )
            db.commit()
            without_skills = app_module.build_system_prompt(db, "", "hello")
            assert "## Active Skills" not in without_skills
            db.execute(
                "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
                ("skills_enabled", "true"),
            )
            db.commit()
            with_skills = app_module.build_system_prompt(db, "", "hello")
            assert "## Active Skills" in with_skills
            assert "memory.search" in with_skills
        finally:
            db.close()