Skip to content

[Bug]: OpenAI Responses response service_tier=default is billed as priority when requested priority #31837

Description

@silencedoctor

What happened?

OpenAI Responses cost tracking can bill a request at priority rates even when the provider response says the request was served as default.

The problematic case is:

  • request asks for service_tier="priority"
  • OpenAI Responses response contains service_tier="default"
  • LiteLLM cost calculation currently prefers the request-level tier before the response tier

That means cost tracking can select input_cost_per_token_priority / output_cost_per_token_priority even though the served tier reported by the provider was default.

Expected behavior: when a provider reports the served tier on the response (or usage), cost tracking should prefer that concrete served tier over the request preference. A response-level default should map to standard pricing.

Steps to reproduce

import litellm
from litellm import completion_cost
from litellm.types.llms.openai import ResponsesAPIResponse

model = "test-openai-responses-default-tier-cost-model"
litellm.register_model(
    model_cost={
        model: {
            "input_cost_per_token": 0.001,
            "output_cost_per_token": 0.002,
            "input_cost_per_token_priority": 0.002,
            "output_cost_per_token_priority": 0.004,
            "litellm_provider": "openai",
            "max_tokens": 8192,
            "mode": "responses",
        }
    }
)

usage = {"input_tokens": 1000, "output_tokens": 500, "total_tokens": 1500}
response = ResponsesAPIResponse(
    id="resp_default_tier",
    created_at=0,
    model=model,
    object="response",
    output=[],
    usage=usage,
    service_tier="default",
)

cost = completion_cost(
    completion_response=response,
    model=model,
    custom_llm_provider="openai",
    optional_params={"service_tier": "priority"},
    service_tier="priority",
)

Observed on current main: cost uses priority pricing.

Expected: cost uses standard/default pricing because the response says the served tier was default.

Relevant log output

In the reproduction above:

standard cost: 2.0
priority cost: 4.0
current cost with request=priority and response=default: 4.0
expected cost: 2.0

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on?

Reproduced on current main (88e03e548716a45284597edf2b7f47a7e6a66d5f).

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions