Fix production timeouts: stop uv re-syncing deps at container startup #54
The image CMD used 'uv run' without --no-sync, so every container start re-resolved the environment — installing the dev group (ruff, black, pyright) from PyPI and re-bytecode-compiling ~3900 files. This burned 30-60s of full-core CPU before the server bound port 8000, which in production caused liveness-probe kills during startup, CPU starvation of sibling pods (their /healthz exceeded the 3s probe timeout), HPA scale-ups from the compile CPU spike, and dropped in-flight MCP requests on every kill. With --no-sync, uv runs the entrypoint from the venv baked at build time (uv sync --frozen --no-dev), so cold start is ~1s and the runtime no longer depends on PyPI availability.
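A check along these lines could guard against the flag being dropped again. It is only a sketch: the two Dockerfile snippets are hypothetical (the image's exact CMD form is not shown here, exec form is assumed), and the helper names cmd_tokens and uv_run_skips_sync are made up for illustration.

```python
import json
import shlex

# Hypothetical Dockerfile fragments; the real file may differ.
BEFORE = 'CMD ["uv", "run", "mcp-server-appwrite"]'
AFTER = 'CMD ["uv", "run", "--no-sync", "mcp-server-appwrite"]'


def cmd_tokens(dockerfile_text: str) -> list[str]:
    """Return the argv of the last single-line CMD instruction, or [] if none."""
    tokens: list[str] = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if not stripped.upper().startswith("CMD "):
            continue
        body = stripped[4:].strip()
        if body.startswith("["):
            tokens = json.loads(body)   # exec form: CMD ["uv", "run", ...]
        else:
            tokens = shlex.split(body)  # shell form: CMD uv run ...
    return tokens


def uv_run_skips_sync(tokens: list[str]) -> bool:
    """True unless the command is `uv run` without --no-sync.

    Simplification: the flag is accepted anywhere after `uv run`, so a
    --no-sync passed to the program itself would also satisfy the check.
    """
    if tokens[:2] != ["uv", "run"]:
        return True  # not a uv run entrypoint, nothing to check
    return "--no-sync" in tokens[2:]


if __name__ == "__main__":
    for name, text in (("before", BEFORE), ("after", AFTER)):
        print(name, uv_run_skips_sync(cmd_tokens(text)))
```

Run against the real Dockerfile in CI, a check like this would fail the build if the entrypoint went back to syncing at startup. It does not handle CMD instructions split across lines.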
The console 'all' scope already covers project and organization access, so advertising the finer-grained aliases is redundant.
Greptile Summary
This PR fixes production container restart storms by adding
Confidence Score: 4/5
Safe to merge for the container startup fix; the scope removal in constants.py is unexplained and warrants a quick confirmation before shipping.
The Dockerfile change is correct and well-motivated — src/mcp_server_appwrite/constants.py — the
Important Files Changed
Problem
Production (mcp.appwrite.io) was timing out on tool requests. The pods on do-fra1-assets-fra1-prod were stuck in a restart storm after the 0.8.2 rollout.
Root cause: the image CMD is 'uv run mcp-server-appwrite' without --no-sync, so every container start re-resolves the environment: downloading the dev group (ruff, black, pyright) from PyPI, rebuilding the package, and bytecode-compiling ~3900 files.
That burns 30–60s of full-core CPU before port 8000 binds. In the cluster this cascaded:
- The liveness probe (period=10s, timeout=3s, failureThreshold=3, no startupProbe) killed pods mid-startup → restart → recompile → loop.
- [...] assets nodes, starving the healthy sibling pod so its /healthz exceeded the 3s probe timeout → kubelet killed it too ('context deadline exceeded' events).
- [...] ('ASGI callable returned without completing response') → client-side tool-call timeouts.
Fix
Add --no-sync so the entrypoint runs from the venv baked at build time (uv sync --frozen --no-dev). Verified locally: container cold start drops to ~1s with no network access, and /healthz responds immediately. Also removes the runtime dependency on PyPI availability.
Follow-ups (Helm chart, separate repo)
- Add a startupProbe so slow starts can never be liveness-killed.
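To make the probe arithmetic concrete, here is a rough sketch of why the old startup could not survive the liveness settings quoted above, and how much headroom a startupProbe would give. The model is simplified (probes fire at fixed intervals, kubelet jitter is ignored), initialDelaySeconds is assumed to be 0 because it is not stated here, and the startupProbe values are purely illustrative, not a proposal for the chart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    period_s: int
    timeout_s: int
    failure_threshold: int
    initial_delay_s: int = 0  # assumption: not stated in this PR


def liveness_kill_after_s(p: Probe) -> int:
    """Rough upper estimate of seconds from container start to the kill.

    Probes fire at initial_delay, then every period; the last of
    failure_threshold consecutive failures may take up to timeout_s.
    """
    return p.initial_delay_s + (p.failure_threshold - 1) * p.period_s + p.timeout_s


def startup_budget_s(p: Probe) -> int:
    """Time a startupProbe allows for startup: failureThreshold * periodSeconds."""
    return p.initial_delay_s + p.failure_threshold * p.period_s


# Liveness settings quoted in the Problem section.
LIVENESS = Probe(period_s=10, timeout_s=3, failure_threshold=3)
# Hypothetical startupProbe values, for illustration only.
STARTUP = Probe(period_s=5, timeout_s=3, failure_threshold=30)

# Observed startup cost before the fix, per the description above.
OLD_STARTUP_RANGE_S = (30, 60)

if __name__ == "__main__":
    print("liveness kill after ~", liveness_kill_after_s(LIVENESS), "s")
    print("startupProbe budget  ", startup_budget_s(STARTUP), "s")
```

Under these assumptions the liveness probe gives up after roughly 23s, which is below even the 30s low end of the old startup time, so every start was killed. While a startupProbe is running, liveness checks are held off, so a budget above 60s would have covered the slow path as well.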