# ⚡ JarvisChat

A lightweight Ollama coding companion that runs on Python 3.13
JarvisChat is a single-file FastAPI application that provides a clean, responsive web interface for Ollama. It features persistent memory, automatic web search when the model is uncertain, and real-time token tracking.
## Features
- Persistent Profile/Memory — Your context is injected into every conversation automatically
- System Prompt Presets — Switch between coding assistant, sysadmin, general, or custom modes
- Streaming Chat — Real-time token streaming with conversation history
- Model Switching — Hot-swap between all installed Ollama models
- Web Search Integration — SearXNG kicks in automatically when the model is uncertain (perplexity-based)
- Weather Queries — Direct wttr.in integration for weather questions
- Token Thermometer — Visual context usage bar with live updates as you type
- Perplexity & Speed Badges — See model confidence (PPL) and tokens/sec on each response
- Copy-to-Clipboard — One-click copy on all code blocks
- Dark Theme — Easy on the eyes for long coding sessions
## Architecture

```
Browser ◄──► app.py (FastAPI) ◄──► Ollama (LLM)
                  │
                  ▼ (when uncertain)
            SearXNG (web search)
```
JarvisChat acts as middleware between your browser and Ollama. When the model's perplexity exceeds a threshold (default 15.0) or it refuses to answer, JarvisChat automatically queries SearXNG, injects the results, and re-prompts the model.
This is NOT training — SearXNG is only used at runtime as a fallback for uncertain responses.
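The trigger metric is standard token-level perplexity. The exact calculation inside app.py isn't reproduced here, but the usual formula (the exponential of the mean negative log-probability per token) can be sketched as:

```python
import math

def perplexity(logprobs: list[float]) -> float:
    """Perplexity = exp(mean negative log-probability per token).
    Low values mean the model was confident; high values mean uncertainty."""
    if not logprobs:
        return 0.0
    return math.exp(-sum(logprobs) / len(logprobs))

# Confident tokens (logprobs near 0) stay well under the 15.0 threshold;
# uncertain tokens (very negative logprobs) blow past it.
confident = perplexity([-0.1, -0.2, -0.05])   # ~1.12
uncertain = perplexity([-3.0, -4.2, -2.8])    # ~28.0
```

With the default threshold of 15.0, the first response would stream through untouched while the second would trigger a SearXNG lookup.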
## Requirements

- Python 3.11+ (tested on 3.13)
- Ollama running locally (default: `localhost:11434`)
- SearXNG (optional, for web search — default: `localhost:8888`)
## Installation

```bash
# Clone or download app.py
git clone https://llgit.llamachile.shop/gramps/jarvischat.git
cd jarvischat

# Install dependencies
pip install fastapi httpx uvicorn

# Run
python app.py
# or
uvicorn app:app --host 0.0.0.0 --port 8080
```

Open http://localhost:8080 in your browser.
## Running as a Service

**Important:** Although JarvisChat is a single-file Python application, it's designed to run as a persistent service alongside Ollama — not as a one-off script. Both services should start on boot.
### systemd Service (recommended)

Create `/etc/systemd/system/jarvischat.service`:
```ini
[Unit]
Description=JarvisChat - Ollama Web UI
After=network.target ollama.service
Wants=ollama.service

[Service]
Type=simple
User=jarvischat
WorkingDirectory=/opt/jarvischat
ExecStart=/usr/bin/python3 app.py
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```
Then enable and start:

```bash
sudo systemctl daemon-reload
sudo systemctl enable jarvischat
sudo systemctl start jarvischat
```
### Verify Both Services

```bash
# Check Ollama
systemctl status ollama

# Check JarvisChat
systemctl status jarvischat

# View JarvisChat logs
journalctl -t jarvischat -f
```
## Configuration

Edit these constants at the top of `app.py`:

```python
VERSION = "1.3.0"
OLLAMA_BASE = "http://localhost:11434"
SEARXNG_BASE = "http://localhost:8888"
DEFAULT_MODEL = "deepseek-coder:6.7b"
PERPLEXITY_THRESHOLD = 15.0  # Higher = less likely to trigger search
```
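app.py hard-codes these values, so changing them means editing the file. If you run under systemd, one pattern (a sketch, not a built-in feature of JarvisChat) is to fall back to environment variables so the unit file can override them without edits:

```python
import os

# Hypothetical environment-variable fallbacks; app.py itself hard-codes
# these constants, so this is only one way you might make them overridable.
OLLAMA_BASE = os.environ.get("OLLAMA_BASE", "http://localhost:11434")
SEARXNG_BASE = os.environ.get("SEARXNG_BASE", "http://localhost:8888")
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "deepseek-coder:6.7b")
PERPLEXITY_THRESHOLD = float(os.environ.get("PERPLEXITY_THRESHOLD", "15.0"))
```

This pairs naturally with an `Environment=PERPLEXITY_THRESHOLD=20.0` line in the unit's `[Service]` section.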
## Database

JarvisChat uses SQLite (`jarvischat.db` in the same directory as `app.py`):
| Table | Purpose |
|---|---|
| `conversations` | Chat sessions with model and timestamps |
| `messages` | Individual messages with role and content |
| `system_presets` | Saved system prompt presets |
| `profile` | Your persistent memory/context |
| `settings` | App settings (search/profile toggles, default model) |
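The exact columns live in app.py; a minimal sketch of the two core tables (column names here are illustrative assumptions, not the real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # app.py opens jarvischat.db on disk
conn.executescript("""
CREATE TABLE IF NOT EXISTS conversations (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    model      TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS messages (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    conversation_id INTEGER NOT NULL REFERENCES conversations(id),
    role            TEXT NOT NULL,   -- 'user' | 'assistant' | 'system'
    content         TEXT NOT NULL
);
""")
conn.execute("INSERT INTO conversations (model) VALUES (?)",
             ("deepseek-coder:6.7b",))
conn.execute("INSERT INTO messages (conversation_id, role, content) "
             "VALUES (1, 'user', 'hi')")
rows = conn.execute("SELECT role, content FROM messages").fetchall()
```

Because everything is plain SQLite, backing up your history is just copying `jarvischat.db`.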
## Logging

JarvisChat logs to syslog via journald:

```bash
# Follow live logs
journalctl -t jarvischat -f

# View last 100 entries
journalctl -t jarvischat -n 100
```
## Token Thermometer
The vertical bar next to the input shows your context usage in real-time:
- Green — Plenty of room
- Yellow — 70%+ used
- Red — 90%+ used (approaching limit)
The count includes: profile + preset + conversation history + current input. Context size is fetched from Ollama when you switch models.
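Exact token counts depend on the model's tokenizer, so a client-side thermometer necessarily relies on an estimate. A sketch of the kind of calculation involved (the real logic in app.py may differ):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text and code.
    return max(1, len(text) // 4)

def context_usage(profile: str, preset: str, history: list[str],
                  current: str, context_size: int) -> float:
    """Fraction of the model's context window in use (0.0 to 1.0)."""
    used = sum(estimate_tokens(t) for t in (profile, preset, *history, current))
    return min(1.0, used / context_size)

# A value >= 0.7 would render the bar yellow, >= 0.9 red.
usage = context_usage("a" * 4000, "", [], "x" * 400, 2048)
```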
## Search Flow

1. User sends message → Ollama streams response with logprobs
2. JarvisChat calculates perplexity from the logprobs
3. If perplexity > 15.0 OR refusal patterns are detected:
   - Yield `{searching: True}` to show the spinner
   - Query SearXNG (or wttr.in for weather)
   - Inject results into context
   - Re-prompt Ollama and yield the new response
4. If the model still refuses, format the raw search results directly
5. Clean hedging phrases from the response
6. Yield the final response with PPL and t/s badges
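The refusal patterns app.py actually matches aren't listed here, so the ones below are illustrative; the trigger condition itself follows step 3 of the flow above:

```python
import re

# Illustrative refusal patterns; the actual list in app.py may differ.
REFUSAL_PATTERNS = [
    r"as an ai(?: language)? model",
    r"based on my training data",
    r"i (?:don't|do not) have the capability",
]

def looks_like_refusal(text: str) -> bool:
    lowered = text.lower()
    return any(re.search(p, lowered) for p in REFUSAL_PATTERNS)

def should_search(ppl: float, text: str, threshold: float = 15.0) -> bool:
    # High perplexity OR a refusal phrase triggers the SearXNG fallback.
    return ppl > threshold or looks_like_refusal(text)
```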
## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Web UI |
| `/api/models` | GET | List Ollama models |
| `/api/ps` | GET | Running models |
| `/api/show` | POST | Model info (context size) |
| `/api/chat` | POST | Stream chat (SSE) |
| `/api/conversations` | GET | List conversations |
| `/api/conversations/{id}` | GET/DELETE | Get/delete conversation |
| `/api/profile` | GET/PUT | Get/update profile |
| `/api/presets` | GET/POST | List/create presets |
| `/api/presets/{id}` | PUT/DELETE | Update/delete preset |
| `/api/settings` | GET/PUT | App settings |
| `/api/search/status` | GET | SearXNG availability |
## Screenshots
(Add your own screenshot here)
## TODO

### Active

- [x] Mass-delete conversation history ✓ (v1.3.0)
- [ ] Verify SearXNG and Docker services persist across reboots
- [ ] Expand refusal patterns: "As an AI model", "based on my training data", "I don't have the capability"
- [ ] Input trigger: `search+` prefix. Strip the prefix, query SearXNG directly, Ollama summarizes
  - Raw results in an expandable div (not a tooltip)
- [ ] Add `profile.example.md`: recommended default profile with anti-bullshit rules (no "As an AI", no OpenAI mentions)
### Backlog

- [ ] Conversation search/filter by keyword
- [ ] Export conversation to markdown/text
- [ ] Keyboard shortcuts (Ctrl+N new chat, Ctrl+Enter send)
- [x] Token count estimate before sending ✓ (v1.2.9)
- [ ] Model info display: context length, VRAM usage from Ollama `/api/ps`
- [ ] Retry button on assistant messages
- [ ] Source links: clickable links when search is used
- [ ] Allow conversation renaming
- [ ] Multiple profiles: coding/sysadmin/general
- [ ] Auto-generate conversation tags (client-side KWIC, top 5, filterable badges)
- [ ] Image input support
  - Pull a vision model (llava, llama3.2-vision, etc.)
  - Frontend: file input / drag-drop, base64 encode
  - Backend: pass `images` array to Ollama `/api/chat`
## Version History
| Version | Changes |
|---|---|
| 1.3.0 | Delete all conversations button |
| 1.2.9 | Token thermometer with live context tracking |
| 1.2.8 | Logo in sidebar, llama emoji tagline |
| 1.2.7 | Tokens per second (t/s) badge on responses |
| 1.2.6 | wttr.in weather integration, improved search extraction |
| 1.2.5 | SearXNG infoboxes/answers, smarter query building |
| 1.2.4 | Perplexity badges, hedging cleanup |
| 1.2.3 | SearXNG integration with perplexity-based triggering |
| 1.2.0 | System prompt presets, settings persistence |
| 1.1.0 | Profile memory, model switching |
| 1.0.0 | Initial release |
## License
MIT
## A Note from Gramps
I named my AI machine "jarvis" after the AI assistant in Iron Man (2008) — because it's an awesome name. When I started building a local coding companion to talk to it, "JarvisChat" just made sense.
This project is in active development. Eventually it'll get packaged up as a Docker thing, but for now while I'm iterating fast, a single-file Python service does the job.
Built with 🦙 by Gramps at the Llama Chile Shop