thrum_llm_client

Multi-provider LLM endpoint client for Go. One unified API across chat, image generation, video generation, and embeddings — backed by OpenAI-compatible providers (including vLLM, DeepSeek, Groq, OpenRouter), Anthropic, Z.ai, and Ollama. Auto-detects provider from URL, handles retries, supports streaming, writes generated artifacts to disk on request, and surfaces typed errors for control flow.

Module path

github.com/leonletto/thrum_llm_client

Package

import "github.com/leonletto/thrum_llm_client/endpoint"

Installation

go get github.com/leonletto/thrum_llm_client@latest

Provider support

Provider	Chat	Stream	Image	Video	Embed	Notes
OpenAI	✓	✓	✓	✓	✓	`gpt-image-1` / `dall-e-3` for image; `sora-2` / `sora-2-pro` for video
OpenRouter	✓	✓	✓	✓	✓	One `EndpointURL = "https://openrouter.ai/api"` covers all modalities. Image: Gemini, FLUX, etc. Video: Veo, Sora, Seedance, Wan, Kling
Z.ai	✓ (reasoning)	✓	✓	✓	—	GLM-5.1 chat with `ReasoningMode` control. CogView-4 / GLM-Image. Vidu2 family + CogVideoX-3
Anthropic	✓	✓	—	—	—	Claude family. Image/video not exposed by provider
Ollama	✓	✓	—	—	✓	Local-first; chat + embeddings
vLLM / DeepSeek / Groq	✓	✓	—	—	✓	OpenAI-compatible adapter; embeddings where the provider supports them

Calling a modality that a provider does not support returns endpoint.ErrCapabilityNotSupported.

Reasoning-mode policy (Z.ai)

Reasoning-capable providers (currently Z.ai) emit a wire-level "thinking" control on each request. ModelProfile.ReasoningMode controls the wire shape per model:

off (default): emit explicit {"type":"disabled"} when EnableThinking=false. Safe for reasoning models like GLM-5.1.
auto: omit the field when disabled. Use only for non-reasoning models that reject explicit disabled.
on: force {"type":"enabled"} regardless of the caller's EnableThinking setting. Intended for always-reasoning models.

YAML:

profiles:
  - canonical_id: glm-5.1
    provider_models: { zai: glm-5.1 }
    reasoning_mode: off  # explicit, but the default
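
The mapping from the three modes to the request body can be sketched as follows. This is a minimal illustration, not the library's code: the ReasoningMode type, the thinkingField and buildBody helpers, and the "thinking" JSON key are assumptions made for the example, as is the choice to emit {"type":"enabled"} whenever the caller asks for thinking under off or auto.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ReasoningMode mirrors the three policy values described above.
// The type and constant names are illustrative, not the library's.
type ReasoningMode string

const (
	ReasoningOff  ReasoningMode = "off"
	ReasoningAuto ReasoningMode = "auto"
	ReasoningOn   ReasoningMode = "on"
)

// thinkingField returns the value to send for the thinking control and
// whether the field should appear in the request at all.
func thinkingField(mode ReasoningMode, enableThinking bool) (map[string]string, bool) {
	switch {
	case mode == ReasoningOn || enableThinking:
		// "on" forces enabled; assumed: a caller opt-in also enables it.
		return map[string]string{"type": "enabled"}, true
	case mode == ReasoningAuto:
		// "auto" omits the field when thinking is disabled.
		return nil, false
	default:
		// "off" (and the zero value) sends an explicit disabled marker.
		return map[string]string{"type": "disabled"}, true
	}
}

// buildBody assembles a minimal request body. The "thinking" key is assumed.
func buildBody(model string, mode ReasoningMode, enableThinking bool) string {
	body := map[string]any{"model": model}
	if v, ok := thinkingField(mode, enableThinking); ok {
		body["thinking"] = v
	}
	b, _ := json.Marshal(body) // maps of strings always marshal cleanly
	return string(b)
}

func main() {
	for _, m := range []ReasoningMode{ReasoningOff, ReasoningAuto, ReasoningOn} {
		fmt.Println(m, buildBody("glm-5.1", m, false))
	}
}
```

Keeping the decision in one small function makes the per-model policy easy to unit test without touching the network.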

Typed errors

All four provider adapters return an *endpoint.EndpointError that wraps a typed sentinel whenever the provider responds with an HTTP error status. Match on the sentinel with errors.Is:

_, err := client.Chat(ctx, model, msgs, nil)
switch {
case errors.Is(err, endpoint.ErrAuthenticationRequired): // 401
case errors.Is(err, endpoint.ErrRateLimited):            // 429 — back off
case errors.Is(err, endpoint.ErrBadRequest):             // 400, 422
case errors.Is(err, endpoint.ErrForbidden):              // 403
case errors.Is(err, endpoint.ErrNotFound):               // 404
case errors.Is(err, endpoint.ErrTimeout):                // 408, 504
case errors.Is(err, endpoint.ErrServiceUnavailable):     // 503 — retry with backoff
case errors.Is(err, endpoint.ErrServerError):            // 5xx catch-all
}

The provider-supplied human-readable message (when present) is preserved in the wrapped error's text for logs, while errors.Is matches the typed sentinel for control flow.
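
For readers unfamiliar with the pattern, here is a minimal sketch of how a wrapper type can carry the provider's message while still matching a sentinel through errors.Is. Everything in it is a stand-in: StatusError, classify, advice, and the locally declared sentinels are hypothetical and only mirror the status codes listed above; the library's actual EndpointError may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the endpoint.Err* sentinels.
var (
	ErrAuthenticationRequired = errors.New("authentication required")
	ErrRateLimited            = errors.New("rate limited")
	ErrBadRequest             = errors.New("bad request")
	ErrForbidden              = errors.New("forbidden")
	ErrNotFound               = errors.New("not found")
	ErrTimeout                = errors.New("timeout")
	ErrServiceUnavailable     = errors.New("service unavailable")
	ErrServerError            = errors.New("server error")
)

// StatusError keeps the provider's message for logs and exposes the
// sentinel through Unwrap so that errors.Is can match it.
type StatusError struct {
	Status  int
	Message string
	Kind    error
}

func (e *StatusError) Error() string {
	return fmt.Sprintf("HTTP %d: %v: %s", e.Status, e.Kind, e.Message)
}

func (e *StatusError) Unwrap() error { return e.Kind }

// classify maps an HTTP status to a sentinel. Specific codes come first so
// that 503 and 504 are not swallowed by the 5xx catch-all.
func classify(status int, message string) error {
	var kind error
	switch {
	case status < 400:
		return nil
	case status == 401:
		kind = ErrAuthenticationRequired
	case status == 429:
		kind = ErrRateLimited
	case status == 400 || status == 422:
		kind = ErrBadRequest
	case status == 403:
		kind = ErrForbidden
	case status == 404:
		kind = ErrNotFound
	case status == 408 || status == 504:
		kind = ErrTimeout
	case status == 503:
		kind = ErrServiceUnavailable
	case status >= 500:
		kind = ErrServerError
	default:
		return fmt.Errorf("unexpected HTTP status %d: %s", status, message)
	}
	return &StatusError{Status: status, Message: message, Kind: kind}
}

// advice shows control flow driven purely by sentinels.
func advice(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrRateLimited), errors.Is(err, ErrServiceUnavailable):
		return "retry with backoff"
	case errors.Is(err, ErrAuthenticationRequired), errors.Is(err, ErrForbidden):
		return "check credentials"
	default:
		return "give up"
	}
}

func main() {
	err := classify(429, "quota exceeded for this minute")
	fmt.Println(err)         // message preserved for logs
	fmt.Println(advice(err)) // decision made on the sentinel
}
```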

Image generation

Three providers are supported: Z.ai (CogView-4, GLM-Image), OpenAI (gpt-image-1, dall-e-3), and OpenRouter (Gemini 2.5 Flash Image / "Nano Banana", FLUX, etc.).

For OpenRouter image and video, use the same base URL — https://openrouter.ai/api — across all modalities (chat, image, video). The library handles the per-modality path internally.

client, err := endpoint.NewImageClient(endpoint.ImageClientConfig{
    EndpointURL: "https://api.z.ai",
    APIKey:      os.Getenv("ZAI_API_KEY"),
})
if err != nil {
    return err
}
res, err := client.GenerateImage(ctx, endpoint.ImageOptions{
    Model:  "cogview-4-250304",
    Prompt: "A kitten on a sunny windowsill",
    Size:   "1024x1024",
})
if err != nil {
    return err
}
url := res.Images[0].URL  // 30-day expiry on Z.ai; ~1h on OpenAI dall-e

Errors use the same errors.Is sentinel set as chat:

if errors.Is(err, endpoint.ErrRateLimited) {
    // back off
}
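
What "back off" might look like on the caller's side is sketched below. It is an illustration only: retryOnRateLimit, demo, and the local ErrRateLimited sentinel are hypothetical, and the library's own retry handling may already cover this case, in which case a wrapper like this is unnecessary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrRateLimited is a local stand-in for endpoint.ErrRateLimited.
var ErrRateLimited = errors.New("rate limited")

// retryOnRateLimit runs call up to attempts times (attempts must be >= 1),
// doubling the delay after each rate-limited failure. Any other error, or
// success, is returned immediately.
func retryOnRateLimit(ctx context.Context, attempts int, base time.Duration, call func(context.Context) error) error {
	delay := base
	var err error
	for i := 0; i < attempts; i++ {
		err = call(ctx)
		if err == nil || !errors.Is(err, ErrRateLimited) {
			return err
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
		}
		delay *= 2
	}
	return err
}

// demo simulates a call that is rate limited `failures` times before
// succeeding, and reports how many calls were made.
func demo(failures, attempts int) (int, error) {
	calls := 0
	call := func(context.Context) error {
		calls++
		if calls <= failures {
			return fmt.Errorf("HTTP 429: %w", ErrRateLimited)
		}
		return nil
	}
	err := retryOnRateLimit(context.Background(), attempts, time.Millisecond, call)
	return calls, err
}

func main() {
	calls, err := demo(2, 5)
	fmt.Printf("calls=%d err=%v\n", calls, err)
}
```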

Video generation

Three providers are supported: Z.ai (Vidu2 family, CogVideoX-3), OpenAI (Sora-2, Sora-2-Pro), and OpenRouter (Sora 2 Pro / Veo 3.1 / Seedance / Wan / Kling via their normalized API).

client, err := endpoint.NewVideoClient(endpoint.VideoClientConfig{
    EndpointURL: "https://api.openai.com",
    APIKey:      os.Getenv("OPENAI_API_KEY"),
})
if err != nil {
    return err
}
job, err := client.SubmitVideo(ctx, endpoint.VideoOptions{
    Model:    "sora-2-pro",
    Prompt:   "A cat surfing a wave at sunset",
    Size:     "1280x720",
    Duration: 16,
})
if err != nil {
    return err
}
// Poll until the job reaches a terminal state or MaxWait elapses.
job, err = client.WaitVideo(ctx, job.ID, endpoint.PollOptions{
    Interval: 10 * time.Second,
    MaxWait:  15 * time.Minute,
})
if err != nil {
    return err
}
if job.Status != endpoint.JobStatusCompleted {
    return fmt.Errorf("job ended in %s: %s", job.Status, job.Error)
}
rc, err := job.Videos[0].OpenContent(ctx)
if err != nil {
    return err
}
defer rc.Close()
// file is an io.Writer opened by the caller.
if _, err := io.Copy(file, rc); err != nil {
    return err
}

OpenContent works uniformly across providers: for Z.ai and OpenRouter it issues an HTTP GET against the embedded URL, and for OpenAI it streams from a separate content endpoint. Callers never branch on provider.

WaitVideo returns endpoint.ErrPollTimeout when PollOptions.MaxWait is exceeded — distinct from the HTTP-level ErrTimeout (408/504). Context cancellation propagates the wrapped context.Canceled / context.DeadlineExceeded unchanged.
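
The distinction between a poll timeout and context cancellation can be illustrated with a small stand-alone polling loop. This is a sketch under assumptions, not the library's WaitVideo: waitUntilDone, classify, the run* helpers, and the local ErrPollTimeout sentinel are all hypothetical names introduced for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrPollTimeout is a local stand-in for endpoint.ErrPollTimeout.
var ErrPollTimeout = errors.New("poll timeout")

// waitUntilDone checks done immediately and then once per interval. It
// returns ErrPollTimeout when maxWait elapses, and the context's own error,
// unchanged, when ctx ends first. interval must be positive.
func waitUntilDone(ctx context.Context, interval, maxWait time.Duration, done func() bool) error {
	deadline := time.NewTimer(maxWait)
	defer deadline.Stop()
	tick := time.NewTicker(interval)
	defer tick.Stop()
	for {
		if done() {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-deadline.C:
			return ErrPollTimeout
		case <-tick.C:
		}
	}
}

// classify names the outcomes a caller can tell apart with errors.Is.
func classify(err error) string {
	switch {
	case err == nil:
		return "completed"
	case errors.Is(err, ErrPollTimeout):
		return "poll timeout"
	case errors.Is(err, context.Canceled):
		return "canceled"
	case errors.Is(err, context.DeadlineExceeded):
		return "context deadline"
	default:
		return "other"
	}
}

func never() bool { return false }

// runTimeout polls a job that never finishes, so maxWait elapses.
func runTimeout() error {
	return waitUntilDone(context.Background(), 5*time.Millisecond, 30*time.Millisecond, never)
}

// runCanceled polls with a context that is already canceled.
func runCanceled() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return waitUntilDone(ctx, 5*time.Millisecond, time.Second, never)
}

// runCompleted polls a job that reports done on the third check.
func runCompleted() error {
	checks := 0
	return waitUntilDone(context.Background(), time.Millisecond, time.Second, func() bool {
		checks++
		return checks >= 3
	})
}

func main() {
	fmt.Println(classify(runTimeout()))
	fmt.Println(classify(runCanceled()))
	fmt.Println(classify(runCompleted()))
}
```

Keeping the poll budget separate from the context lets a caller retry a slow job without confusing it with a user-initiated cancel.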

Saving generated artifacts to disk

Both image and video generation support an opt-in download to a caller-supplied directory. When OutputDir is set the library writes each artifact under that directory with a predictable, versioned filename — no overwrites, no surprises:

imgClient, _ := endpoint.NewImageClient(cfg)
res, err := imgClient.GenerateImage(ctx, endpoint.ImageOptions{
    Model:     "cogview-4-250304",
    Prompt:    "a red cat at dusk",
    OutputDir: "/var/lib/myapp/images",
    // CreateOutputDir: true,  // uncomment to auto-mkdir
    OnProgress: func(e endpoint.ProgressEvent) {
        log.Printf("phase=%s percent=%d", e.Phase, e.Percent)
    },
})
// res.Images[0].LocalPath -> "/var/lib/myapp/images/a-red-cat-at-dusk-v1.png"

Filenames use {prompt-slug}-v{N}.{ext}, with the version number N chosen so that it never collides with an existing file. Image batches (more than one image per request) use {prompt-slug}-v{N}-{idx}.{ext}. Partial downloads are unlinked on failure. When the OutputDir directory does not exist and CreateOutputDir is unset, the call returns an error satisfying errors.Is(err, fs.ErrNotExist).
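
One way to get collision-free versioned names is sketched below. It is an assumption-laden illustration rather than the library's implementation: slugify and createVersioned are hypothetical helpers, the slug rules (lowercase ASCII letters and digits joined by hyphens) are guessed from the example filename above, and the use of O_EXCL to reserve each name is a design choice made for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// slugify lowercases the prompt and joins runs of ASCII letters and digits
// with single hyphens, dropping everything else.
func slugify(prompt string) string {
	var b strings.Builder
	pendingHyphen := false
	for _, r := range strings.ToLower(prompt) {
		isAlnum := (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9')
		if !isAlnum {
			pendingHyphen = b.Len() > 0 // never start the slug with a hyphen
			continue
		}
		if pendingHyphen {
			b.WriteByte('-')
			pendingHyphen = false
		}
		b.WriteRune(r)
	}
	return b.String()
}

// createVersioned creates dir/{slug}-v{N}.{ext} for the first N >= 1 that
// is free. O_EXCL makes the existence check and the creation one atomic
// step, so two concurrent writers cannot claim the same name. A missing
// dir surfaces as an error matching fs.ErrNotExist.
func createVersioned(dir, slug, ext string) (*os.File, error) {
	for n := 1; ; n++ {
		p := filepath.Join(dir, fmt.Sprintf("%s-v%d.%s", slug, n, ext))
		f, err := os.OpenFile(p, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
		if errors.Is(err, fs.ErrExist) {
			continue // name taken, try the next version
		}
		return f, err
	}
}

// demo creates two files for the same prompt in a temporary directory and
// then tries a directory that does not exist.
func demo() ([]string, bool, error) {
	dir, err := os.MkdirTemp("", "artifacts")
	if err != nil {
		return nil, false, err
	}
	defer os.RemoveAll(dir)

	slug := slugify("a red cat at dusk")
	var names []string
	for i := 0; i < 2; i++ {
		f, err := createVersioned(dir, slug, "png")
		if err != nil {
			return nil, false, err
		}
		names = append(names, filepath.Base(f.Name()))
		f.Close()
	}
	_, err = createVersioned(filepath.Join(dir, "missing"), slug, "png")
	return names, errors.Is(err, fs.ErrNotExist), nil
}

func main() {
	names, missingDetected, err := demo()
	fmt.Println(names, missingDetected, err)
}
```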

For video, UnifiedVideoClient.GenerateVideo composes Submit + Wait + Download in one call. OnProgress fires for both polling (ProgressPolling with the latest JobStatus) and download (ProgressDownloading / ProgressComplete). The lazy GeneratedVideo.OpenContent closure remains populated even when OutputDir is set, so stream-through callers can choose between path-based and stream-based access. Callers using the three-phase Submit / Poll / Wait API directly can still get library-managed downloads via endpoint.DownloadVideo(ctx, job, opts).

Documentation

User-facing guides live in docs/:

Getting started — first call for chat, image, and video.
Providers — per-provider setup, capability matrix, Z.ai reasoning mode.
Errors and retries — typed sentinels and RetryPolicy customization.
Downloads — OutputDir, OnProgress, slug-based filenames.
Registries — YAML provider/model registries for canonical-name resolution.
Testing — unit tests, the live-API smoke suite, and adding tests.

Contributing? See CONTRIBUTING.md. Security? See SECURITY.md.
