feat(memory): add FTS5 memory system, refactor to multi-file structure
# JarvisChat v1.4.0

**Lightweight Ollama coding companion with FTS5 memory system. Runs on Python 3.13.**
## New in v1.4.0

- **FTS5 Memory System**: Say "remember that..." to store facts; they're automatically retrieved by relevance
- **Forget command**: Say "forget about..." to remove memories
- **Memory toggle**: Enable/disable memory injection from the topbar
- **Refactored structure**: Frontend separated from backend for maintainability
JarvisChat is a FastAPI application that provides a clean, responsive web interface for Ollama. It features persistent memory, automatic web search when the model is uncertain, and real-time token tracking.
## Features

- **Persistent Profile/Memory** — Your context is injected into every conversation automatically
- **System Prompt Presets** — Switch between coding assistant, sysadmin, general, or custom modes
- **Streaming Chat** — Real-time token streaming with conversation history
- **Model Switching** — Hot-swap between all installed Ollama models
- **Web Search Integration** — SearXNG kicks in automatically when the model is uncertain (perplexity-based)
- **Weather Queries** — Direct wttr.in integration for weather questions
- **Token Thermometer** — Visual context usage bar with live updates as you type
- **Perplexity & Speed Badges** — See model confidence (PPL) and tokens/sec on each response
- **Copy-to-Clipboard** — One-click copy on all code blocks
- **Dark Theme** — Easy on the eyes for long coding sessions
## Architecture

```
Browser ◄──► app.py (FastAPI) ◄──► Ollama (LLM)
                    │
                    ▼ (when uncertain)
             SearXNG (web search)
```

## File Structure

```
/opt/jarvischat/
├── app.py            # FastAPI backend (~600 lines)
├── jarvischat.db     # SQLite database (auto-created)
├── static/
│   └── logo.jpg      # Your logo (optional)
└── templates/
    └── index.html    # Frontend
```
JarvisChat acts as middleware between your browser and Ollama. When the model's perplexity exceeds a threshold (default 15.0) or it refuses to answer, JarvisChat automatically queries SearXNG, injects the results, and re-prompts the model.

**This is NOT training** — SearXNG is only used at runtime as a fallback for uncertain responses.
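The perplexity check uses the standard formula (exp of the negative mean token log-probability). A minimal sketch, assuming the default threshold from `app.py`; the exact logprob handling in the backend may differ:

```python
import math

def perplexity(logprobs):
    # Perplexity = exp(-mean token log-probability); higher values
    # mean the model is less certain about its own output.
    return math.exp(-sum(logprobs) / len(logprobs))

PERPLEXITY_THRESHOLD = 15.0  # same default as app.py

confident = [-0.1, -0.2, -0.05]  # near-certain tokens
print(round(perplexity(confident), 2))               # ≈ 1.12, well under 15
print(perplexity(confident) > PERPLEXITY_THRESHOLD)  # False → no search
```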
## Requirements

- Python 3.11+ (tested on 3.13)
- Ollama running locally (default: `localhost:11434`)
- SearXNG (optional, for web search — default: `localhost:8888`)
- ROCm (optional, for AMD GPU stats — `rocm-smi` must be in PATH)
## Installation

```bash
# Clone or download
git clone https://github.com/llamachileshop-code/313_webui.git
cd 313_webui

# Upgrading an existing install? Back it up first
cd /opt/jarvischat
cp app.py app.py.bak

# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install fastapi httpx uvicorn psutil

# Create directories and copy the new files
# (from wherever you downloaded them)
mkdir -p templates static
cp /path/to/new/app.py .
cp /path/to/new/templates/index.html templates/

# Run
python app.py
# or
uvicorn app:app --host 0.0.0.0 --port 8080

# No logo? The frontend fails gracefully:
# onerror="this.style.display='none'"

# If running as a systemd service, restart it
sudo systemctl restart jarvischat
```

Open `http://localhost:8080` in your browser.
**Note:** If running as a systemd service with a venv, install dependencies using the venv pip directly:

```bash
/opt/jarvischat/venv/bin/pip install fastapi httpx uvicorn psutil
```

## Memory Commands

In chat, you can say:

- "remember that I prefer Rust over Go" → stores as preference
- "remember that JarvisChat runs on port 8080" → stores as infrastructure
- "note that the deadline is Friday" → stores as general
- "forget about the deadline" → removes matching memories
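Under the hood, the memory store is built on SQLite's FTS5 extension. A minimal sketch of how facts might be stored and retrieved by relevance — the table and column names here are assumptions; see `app.py` for the real schema:

```python
import sqlite3

# Minimal FTS5 memory store sketch; the real table/columns in
# jarvischat.db may differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE memories USING fts5(fact, topic)")
db.execute("INSERT INTO memories (fact, topic) VALUES (?, ?)",
           ("User prefers Rust over Go", "preference"))

# bm25() ranks matches by relevance; the default unicode61 tokenizer
# folds case, so 'rust' matches 'Rust'.
rows = db.execute(
    "SELECT rowid, fact FROM memories "
    "WHERE memories MATCH ? ORDER BY bm25(memories)",
    ("rust",),
).fetchall()
print(rows)
```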
## Running as a Service

**Important:** JarvisChat is designed to run as a persistent service alongside Ollama — not as a one-off script. Both services should start on boot.

### systemd Service (recommended)

Create `/etc/systemd/system/jarvischat.service`:
```ini
[Unit]
Description=JarvisChat - Ollama Web UI
After=network.target ollama.service
Wants=ollama.service

[Service]
Type=simple
User=your-username
WorkingDirectory=/path/to/313_webui
ExecStart=/usr/bin/python3 app.py
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```
Then enable and start:

```bash
sudo systemctl daemon-reload
sudo systemctl enable jarvischat
sudo systemctl start jarvischat
```
### Verify Both Services

```bash
# Check Ollama
systemctl status ollama

# Check JarvisChat
systemctl status jarvischat

# View JarvisChat logs
journalctl -t jarvischat -f
```
## Configuration

Edit these constants at the top of `app.py`:

```python
VERSION = "1.4.0"
OLLAMA_BASE = "http://localhost:11434"
SEARXNG_BASE = "http://localhost:8888"
DEFAULT_MODEL = "deepseek-coder:6.7b"
PERPLEXITY_THRESHOLD = 15.0  # Higher = less likely to trigger search
```
## Database

JarvisChat uses SQLite (`jarvischat.db` in the same directory as `app.py`):

| Table | Purpose |
|-------|---------|
| conversations | Chat sessions with model and timestamps |
| messages | Individual messages with role and content |
| system_presets | Saved system prompt presets |
| profile | Your persistent memory/context |
| settings | App settings (search/profile toggles, default model) |
| memories | FTS5 full-text index of stored facts (new in v1.4.0) |
## Logging

JarvisChat logs to syslog via journald:

```bash
# Follow live logs
journalctl -t jarvischat -f

# View last 100 entries
journalctl -t jarvischat -n 100
```
## Token Thermometer

The vertical bar next to the input shows your context usage in real time:

- **Green** — Plenty of room
- **Yellow** — 70%+ used
- **Red** — 90%+ used (approaching limit)

The count includes: profile + preset + conversation history + current input. Context size is fetched from Ollama when you switch models.
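The thresholds above amount to simple bucketing. A sketch of the mapping — the actual logic lives in the frontend (`index.html`) and may differ:

```python
def thermometer_color(used_tokens: int, context_size: int) -> str:
    # Map context usage to the thermometer colors described above.
    frac = used_tokens / context_size
    if frac >= 0.90:
        return "red"     # approaching limit
    if frac >= 0.70:
        return "yellow"  # getting full
    return "green"       # plenty of room

print(thermometer_color(1000, 8192))  # green
print(thermometer_color(6000, 8192))  # yellow (≈73%)
print(thermometer_color(7500, 8192))  # red (≈92%)
```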
## Search Flow

1. User sends message → Ollama streams response with logprobs
2. JarvisChat calculates perplexity from logprobs
3. If perplexity > 15.0 OR refusal patterns detected:
   - Yield `{searching: True}` to show spinner
   - Query SearXNG (or wttr.in for weather)
   - Inject results into context
   - Re-prompt Ollama
4. If model still refuses, format raw search results directly
5. Clean hedging phrases from response
6. Yield final response with PPL and t/s badges

Memories are automatically searched and injected based on your message content.
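The `/api/chat` stream is delivered as Server-Sent Events. A sketch of the frame format implied by steps 3 and 6 above — the payload field names here are assumptions, not the exact `app.py` schema:

```python
import json

def sse_event(payload: dict) -> str:
    # One SSE frame: a "data:" line followed by a blank line.
    return f"data: {json.dumps(payload)}\n\n"

# Step 3: tell the frontend a search is in progress (spinner on)
print(sse_event({"searching": True}), end="")
# Step 6: a final chunk could carry the PPL and t/s badge values
print(sse_event({"done": True, "ppl": 8.2, "tps": 41.5}), end="")
```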
## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Web UI |
| `/api/models` | GET | List Ollama models |
| `/api/ps` | GET | Running models |
| `/api/show` | POST | Model info (context size) |
| `/api/stats` | GET | System stats (CPU, memory, GPU, VRAM) |
| `/api/chat` | POST | Stream chat (SSE) |
| `/api/conversations` | GET/DELETE | List/delete all conversations |
| `/api/conversations/{id}` | GET/DELETE | Get/delete conversation |
| `/api/profile` | GET/PUT | Get/update profile |
| `/api/presets` | GET/POST | List/create presets |
| `/api/presets/{id}` | PUT/DELETE | Update/delete preset |
| `/api/settings` | GET/PUT | App settings |
| `/api/search/status` | GET | SearXNG availability |

### Memory (new in v1.4.0)

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/memories` | GET/POST | List/add memories (`{"fact": "...", "topic": "general"}`) |
| `/api/memories/{rowid}` | DELETE | Delete memory |
| `/api/memories/search?q=rust` | GET | Search memories |
| `/api/memories/stats` | GET | Get counts by topic |
## Screenshots

*(Add your own screenshot here)*
## Dependencies

```bash
pip install fastapi uvicorn httpx psutil jinja2 python-multipart --break-system-packages
```

## Testing Memory

```bash
# Add a memory via API
curl -X POST http://jarvis:8080/api/memories \
  -H "Content-Type: application/json" \
  -d '{"fact": "User prefers native installs over Docker", "topic": "preference"}'

# Search memories
curl "http://jarvis:8080/api/memories/search?q=docker"

# Or in chat, just say:
# "remember that I hate yaml"
# Then ask: "what markup languages should I avoid?"
```

## TODO

### Active

1. ~~**Mass-delete conversation history**~~ ✓ (v1.3.0)
2. **Verify SearXNG and Docker services persist across reboots**
   - Expand refusal patterns: "As an AI model", "based on my training data", "I don't have the capability"
3. **Input trigger: `search+` prefix**
   - Strip prefix, query SearXNG directly, Ollama summarizes
   - Raw results in expandable div (not tooltip)
4. **Add `profile.example.md`**
   - Recommended default profile with anti-bullshit rules (no "As an AI", no OpenAI mentions)

### Backlog

5. Conversation search/filter by keyword
6. Export conversation to markdown/text
7. Keyboard shortcuts (Ctrl+N new chat, Ctrl+Enter send)
8. ~~Token count estimate before sending~~ ✓ (v1.2.9)
9. Model info display — context length, VRAM usage from Ollama `/api/ps`
10. Retry button on assistant messages
11. Source links — clickable links when search used
12. Allow conversation renaming
13. Multiple profiles — coding/sysadmin/general
14. Auto-generate conversation tags (client-side KWIC, top 5, filterable badges)
15. **Image input support**
    - Pull vision model (llava, llama3.2-vision, etc.)
    - Frontend: file input / drag-drop, base64 encode
    - Backend: pass `images` array to Ollama `/api/chat`
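For backlog item 15: Ollama's `/api/chat` accepts a per-message `images` list of base64-encoded files when a vision model is loaded. A sketch of the backend side — the function name is illustrative, not part of `app.py`:

```python
import base64

def build_vision_message(prompt: str, image_bytes: bytes) -> dict:
    # Ollama chat messages carry images as base64 strings in an
    # `images` array alongside the text content.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {"role": "user", "content": prompt, "images": [b64]}

# Placeholder bytes stand in for a real image file read from disk
msg = build_vision_message("What is in this image?", b"\x89PNG...")
print(sorted(msg.keys()))  # ['content', 'images', 'role']
```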
## Version History

| Version | Changes |
|---------|---------|
| 1.4.0 | FTS5 memory system, multi-file refactor |
| 1.3.1 | System stats panel (CPU, memory, GPU, VRAM) in sidebar |
| 1.3.0 | Delete all conversations button |
| 1.2.9 | Token thermometer with live context tracking |
| 1.2.8 | Logo in sidebar, llama emoji tagline |
| 1.2.7 | Tokens per second (t/s) badge on responses |
| 1.2.6 | wttr.in weather integration, improved search extraction |
| 1.2.5 | SearXNG infoboxes/answers, smarter query building |
| 1.2.4 | Perplexity badges, hedging cleanup |
| 1.2.3 | SearXNG integration with perplexity-based triggering |
| 1.2.0 | System prompt presets, settings persistence |
| 1.1.0 | Profile memory, model switching |
| 1.0.0 | Initial release |
## License

MIT

---
## A Note from Gramps

I named my AI machine "jarvis" after the AI assistant in *Iron Man* (2008) — because it's an awesome name. When I started building a local coding companion to talk to it, "JarvisChat" just made sense.

This project is in active development. Eventually it'll get packaged up as a Docker image, but for now, while I'm iterating fast, a plain Python service does the job.

---

*Built with 🦙 by Gramps at the Llama Chile Shop*