What happened?
OpenAI Responses cost tracking can bill a request at priority rates even when the provider response says the request was served as default.
The problematic case is:
- request asks for
service_tier="priority"
- OpenAI Responses response contains
service_tier="default"
- LiteLLM cost calculation currently prefers the request-level tier before the response tier
That means cost tracking can select input_cost_per_token_priority / output_cost_per_token_priority even though the served tier reported by the provider was default.
Expected behavior: when a provider reports the served tier on the response (or usage), cost tracking should prefer that concrete served tier over the request preference. A response-level default should map to standard pricing.
Steps to reproduce
import litellm
from litellm import completion_cost
from litellm.types.llms.openai import ResponsesAPIResponse
model = "test-openai-responses-default-tier-cost-model"
litellm.register_model(
model_cost={
model: {
"input_cost_per_token": 0.001,
"output_cost_per_token": 0.002,
"input_cost_per_token_priority": 0.002,
"output_cost_per_token_priority": 0.004,
"litellm_provider": "openai",
"max_tokens": 8192,
"mode": "responses",
}
}
)
usage = {"input_tokens": 1000, "output_tokens": 500, "total_tokens": 1500}
response = ResponsesAPIResponse(
id="resp_default_tier",
created_at=0,
model=model,
object="response",
output=[],
usage=usage,
service_tier="default",
)
cost = completion_cost(
completion_response=response,
model=model,
custom_llm_provider="openai",
optional_params={"service_tier": "priority"},
service_tier="priority",
)
Observed on current main: cost uses priority pricing.
Expected: cost uses standard/default pricing because the response says the served tier was default.
Relevant log output
In the reproduction above:
standard cost: 2.0
priority cost: 4.0
current cost with request=priority and response=default: 4.0
expected cost: 2.0
What part of LiteLLM is this about?
SDK (litellm Python package)
What LiteLLM version are you on?
Reproduced on current main (88e03e548716a45284597edf2b7f47a7e6a66d5f).
What happened?
OpenAI Responses cost tracking can bill a request at priority rates even when the provider response says the request was served as
default.The problematic case is:
service_tier="priority"service_tier="default"That means cost tracking can select
input_cost_per_token_priority/output_cost_per_token_priorityeven though the served tier reported by the provider wasdefault.Expected behavior: when a provider reports the served tier on the response (or usage), cost tracking should prefer that concrete served tier over the request preference. A response-level
defaultshould map to standard pricing.Steps to reproduce
Observed on current
main:costuses priority pricing.Expected:
costuses standard/default pricing because the response says the served tier wasdefault.Relevant log output
In the reproduction above:
What part of LiteLLM is this about?
SDK (litellm Python package)
What LiteLLM version are you on?
Reproduced on current
main(88e03e548716a45284597edf2b7f47a7e6a66d5f).