A Slack bot that receives coding tasks via @mentions, dispatches them to Claude Code (or a local LM Studio model), and streams progress + results back into the Slack thread. Supports multiple repos and multiple models (Anthropic cloud + local models on LM Studio).
@bot fix the failing login tests in PMSS
@bot model=qwen investigate this bug
@bot add admin filter to the user list
The bot:
- Extracts the model (if specified, otherwise uses default)
- Sends an acknowledgment in the thread
- Launches Claude Code against the correct repo (determined by the agent from repos.json)
- Streams progress updates in-place
- Posts a final summary with what was done
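The model-extraction step in the list above can be sketched as a small parser. This is a minimal illustration, not the bot's actual handler: the function name `extract_model`, the regexes, and the hard-coded default are assumptions made for the example.

```python
import re

# Hypothetical default; the real bot reads DEFAULT_MODEL from its settings.
DEFAULT_MODEL = "sonnet"

# Matches a Slack user mention such as "<@U012ABCDEF>".
MENTION_RE = re.compile(r"<@[A-Z0-9]+>")
# Matches a "model=<alias>" token anywhere in the message.
MODEL_RE = re.compile(r"\bmodel=([\w.\-]+)")


def extract_model(text: str, default: str = DEFAULT_MODEL) -> tuple[str, str]:
    """Return (model_alias, task_text) for a raw Slack message."""
    # Drop the first mention, which is assumed to be the bot itself.
    text = MENTION_RE.sub("", text, count=1)
    match = MODEL_RE.search(text)
    model = match.group(1) if match else default
    if match:
        # Remove the model token so it does not leak into the task prompt.
        text = text[: match.start()] + text[match.end():]
    # Collapse whitespace left behind by the removed tokens.
    return model, " ".join(text.split())
```

With this sketch, a message without a `model=` token falls back to the default alias, and the task text is passed on without the mention or the model token.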
- Python 3.11+
- uv
- Claude Code CLI logged in (uses your Claude plan, no API key needed)
- A Slack workspace with a bot app configured (see Slack Setup)
# Install dependencies
uv sync --all-extras
# Copy and fill in your Slack tokens
cp .env.example .env
# Edit .env with your SLACK_BOT_TOKEN and SLACK_APP_TOKEN
# Scan your repos to build the registry
uv run bot-cli scan-repos
# Start the bot
uv run bot-cli run

| Variable | Required | Default | Description |
|---|---|---|---|
| SLACK_BOT_TOKEN | Yes | - | Bot User OAuth Token (xoxb-...) |
| SLACK_APP_TOKEN | Yes | - | App-Level Token (xapp-...) for Socket Mode |
| REPOS_BASE_DIR | No | /Users/Shared/github | Directory containing all your repos |
| OLLA_URL | Yes | - | Olla LLM router endpoint (e.g. http://127.0.0.1:40114/olla/proxy/v1). See /Users/Shared/github/llmhosting/. |
| DEFAULT_MODEL | No | sonnet | Default model when none specified |
| TASK_TIMEOUT_SECONDS | No | 600 | Max time per task in seconds |
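To make the required/default split concrete, here is a stdlib-only sketch of loading these variables. The project itself uses Pydantic settings (config/settings.py), so the `Settings` dataclass and `load_settings` function below are illustrative assumptions, not the real code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Required variables have no default.
    slack_bot_token: str
    slack_app_token: str
    olla_url: str
    # Optional variables carry the documented defaults.
    repos_base_dir: str = "/Users/Shared/github"
    default_model: str = "sonnet"
    task_timeout_seconds: int = 600


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    required = ("SLACK_BOT_TOKEN", "SLACK_APP_TOKEN", "OLLA_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    return Settings(
        slack_bot_token=env["SLACK_BOT_TOKEN"],
        slack_app_token=env["SLACK_APP_TOKEN"],
        olla_url=env["OLLA_URL"],
        repos_base_dir=env.get("REPOS_BASE_DIR", "/Users/Shared/github"),
        default_model=env.get("DEFAULT_MODEL", "sonnet"),
        task_timeout_seconds=int(env.get("TASK_TIMEOUT_SECONDS", 600)),
    )
```

Failing fast on missing required variables gives a clearer error at startup than a Slack connection failure later on.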
Maps model aliases to providers. Add new models by adding a line:
{
"sonnet": {"provider": "anthropic", "model_id": "claude-sonnet-4-20250514"},
"opus": {"provider": "anthropic", "model_id": "claude-opus-4-6"},
"haiku": {"provider": "anthropic", "model_id": "claude-haiku-4-5-20251001"},
"qwen": {"provider": "local", "model_id": "qwen/qwen3.6-35b-a3b"}
}

- anthropic provider: uses Claude Code CLI directly (your logged-in plan)
- local provider: talks to a local LLM via the Olla router (OLLA_URL) and runs a built-in tool loop (Read, Write, Edit, Bash). model_id is whatever Olla routes: either an alias from Olla's model_aliases config, or a real backend model name.
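As a rough illustration of how an alias could be turned into a provider and model ID, the sketch below looks an alias up in a registry shaped like config/models.json. The function name `resolve_model` and its error messages are assumptions for the example; the bot's real lookup may differ.

```python
import json

# A two-entry excerpt in the same shape as config/models.json.
MODELS_JSON = """
{
  "sonnet": {"provider": "anthropic", "model_id": "claude-sonnet-4-20250514"},
  "qwen": {"provider": "local", "model_id": "qwen/qwen3.6-35b-a3b"}
}
"""

KNOWN_PROVIDERS = ("anthropic", "local")


def resolve_model(alias: str, registry: dict) -> tuple[str, str]:
    """Return (provider, model_id) for an alias, or raise ValueError."""
    try:
        entry = registry[alias]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(f"Unknown model {alias!r}; known: {known}") from None
    if entry["provider"] not in KNOWN_PROVIDERS:
        raise ValueError(f"Unknown provider {entry['provider']!r} for {alias!r}")
    return entry["provider"], entry["model_id"]


registry = json.loads(MODELS_JSON)
```

A caller would then branch on the provider: `anthropic` goes to the Claude Code path, `local` goes to the Olla-backed runner.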
Generated by scan-repos, then hand-editable. Maps repo names to aliases, keywords, and frameworks:
{
"predictmystepscore": {
"path": "predictmystepscore",
"aliases": ["predictmystepscore", "pmss"],
"keywords": ["usmle", "score", "step"],
"framework": "django"
}
}

The agent uses this to figure out which repo you're referring to.
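The bot leaves repo selection to the agent, so the following is only an illustration of how the aliases and keywords fields could be used, for example as a deterministic pre-filter. The function `rank_repos` and its scoring weights are hypothetical.

```python
import re

# Same shape as an entry in config/repos.json.
REPOS = {
    "predictmystepscore": {
        "path": "predictmystepscore",
        "aliases": ["predictmystepscore", "pmss"],
        "keywords": ["usmle", "score", "step"],
        "framework": "django",
    }
}


def rank_repos(task: str, repos: dict) -> list[str]:
    """Return repo names mentioned in the task, best match first."""
    words = set(re.findall(r"[a-z0-9_\-]+", task.lower()))
    scored = []
    for name, entry in repos.items():
        alias_hits = sum(1 for a in entry.get("aliases", []) if a.lower() in words)
        keyword_hits = sum(1 for k in entry.get("keywords", []) if k.lower() in words)
        # An alias match is treated as much stronger than a keyword match.
        score = 10 * alias_hits + keyword_hits
        if score:
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically.
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

An empty result would mean the registry gives no hint, which is exactly the case where handing the decision to the agent is most useful.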
# Scan REPOS_BASE_DIR for git repos and generate config/repos.json
uv run bot-cli scan-repos
# Start the bot
uv run bot-cli run
# Check if a model is configured
uv run bot-cli test-model sonnet

# Basic: uses default model
@bot fix the login tests
# Specify a model
@bot model=opus refactor the auth middleware
# The bot figures out the repo from context
@bot add stripe webhook handling to PMSS
The bot is thread-aware in two ways:
Follow-ups: After the bot completes a task, @mention it in the same thread to continue the conversation. The agent keeps its full context (files, git state, reasoning) across turns — it's the same session, not a fresh start.
@bot fix the failing login tests in PMSS
→ Bot: Fixed 3 tests. Created branch. Want me to push and create a PR?
@bot yes please push it
→ Bot: Pushed and created PR #42
Jumping into existing threads: You can discuss something with teammates in a thread, then drop the bot in mid-conversation. It fetches the full thread history and sees everything that was said.
@John: I think we should refactor the auth middleware to use JWT
@Eric: agreed, here's the file that needs changing: auth.py lines 40-60
@bot make the change suggested above for PMSS
→ Bot sees the full conversation and knows exactly what to do
Note: the bot only responds when @mentioned — it won't react to casual messages in threads.
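One way to hand an existing thread to the agent is to flatten the fetched messages into a plain transcript. The sketch below assumes each message is a dict with `user` and `text` keys, as in Slack's conversations.replies response; the function name `format_thread` and the output format are hypothetical.

```python
def format_thread(messages: list[dict], bot_user_id: str) -> str:
    """Render Slack thread messages as 'speaker: text' lines, oldest first."""
    lines = []
    for msg in messages:
        text = msg.get("text", "").strip()
        if not text:
            # Skip messages with no text (for example, file-only posts).
            continue
        if msg.get("user") == bot_user_id:
            speaker = "bot"
        else:
            speaker = msg.get("user", "unknown")
        lines.append(f"{speaker}: {text}")
    return "\n".join(lines)
```

The transcript can then be prepended to the new task so the agent sees what teammates said before the bot was mentioned.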
- ACK: "Got it, working on this using sonnet..."
- Progress: a single message updated in-place as the agent works
- Summary: what was done, with cost if available
Local models are served through Olla (config in /Users/Shared/github/llmhosting/), which routes requests across our Tailscale fleet of Ollama / LM Studio / vLLM / llama.cpp boxes. Set OLLA_URL in .env to point at the Olla instance, then add an entry in config/models.json:
"qwen": {"provider": "local", "model_id": "qwen/qwen3.6-35b-a3b"}

Then in Slack: @bot model=qwen investigate this bug.
The bot calls Olla over its OpenAI-compatible endpoint and runs its own tool-execution loop. We tried fronting LM Studio with the LiteLLM Anthropic-bridge so claude-agent-sdk could talk to it transparently, but the bridge mangled tool-call arguments — so non-Anthropic models go through a separate, simpler runner instead.
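The tool-execution half of such a loop can be sketched with the standard library alone. In OpenAI-compatible responses, each tool call carries its arguments as a JSON string, which the runner must parse and dispatch. The argument names used here (`path`, `content`, `old`, `new`, `command`, `timeout`) and the function `run_tool` are assumptions for illustration, not the actual schema in local_runner.py.

```python
import json
import subprocess
from pathlib import Path


def run_tool(name: str, arguments: str, cwd: Path) -> str:
    """Execute one tool call and return its result as text for the model."""
    try:
        args = json.loads(arguments)
    except json.JSONDecodeError as exc:
        # Report malformed arguments back to the model instead of crashing.
        return f"error: arguments are not valid JSON ({exc})"
    try:
        if name == "Read":
            return (cwd / args["path"]).read_text()
        if name == "Write":
            (cwd / args["path"]).write_text(args["content"])
            return "ok"
        if name == "Edit":
            target = cwd / args["path"]
            text = target.read_text()
            if args["old"] not in text:
                return "error: old text not found"
            # Replace only the first occurrence to keep edits predictable.
            target.write_text(text.replace(args["old"], args["new"], 1))
            return "ok"
        if name == "Bash":
            proc = subprocess.run(
                args["command"], shell=True, cwd=cwd,
                capture_output=True, text=True,
                timeout=args.get("timeout", 60),
            )
            return proc.stdout + proc.stderr
        return f"error: unknown tool {name}"
    except (KeyError, TypeError, OSError, subprocess.TimeoutExpired) as exc:
        return f"error: {exc!r}"
```

Returning errors as strings lets the model see what went wrong and retry with corrected arguments, which matters when a model emits malformed tool calls. A real runner would also need to confine paths to the repo and restrict shell commands; this sketch does neither.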
src/bot/
  config/settings.py    # Pydantic settings from .env
  slack/
    handler.py          # Slack event handling, model extraction
    feedback.py         # Thread updates (ACK, progress, summary)
  executor/
    worker.py           # Claude Agent SDK execution (anthropic provider)
    local_runner.py     # OpenAI SDK + tool loop (local provider, talks to Olla)
  app.py                # Slack Bolt + Socket Mode entrypoint
  cli.py                # CLI commands
config/
  models.json           # Model registry
  repos.json            # Repo registry (generated)
# Install with dev dependencies
uv sync --all-extras
# Run tests
uv run pytest tests/ -v