Expose the resolved File Search retrieved chunks (not only cited ones) on the file_search_result step #2651

Description

@pullely-samuel

Environment: google-genai 2.10.0 · Python 3.13 · macOS · Gemini API (not Vertex) · model gemini-3.5-flash with the file_search tool.

Summary

When using File Search via the Interactions API, there is no way to inspect the full set of chunks the retriever returned and injected into the model's context. The retrieved chunk text is observable only indirectly, via the file_citation.source annotations on model_output, i.e. only for chunks the model ends up citing. This blocks reliability/eval work and makes it hard to debug "what context did the model actually see."

What's observable today

  • file_citation annotations expose source (chunk text), document_uri, file_name, custom_metadata, start_index/end_index. Only for cited chunks. (Side note: the source field appears undocumented — the public File Search docs still describe annotations as metadata-only.)
  • The file_search_result step contains only call_id, type, and an opaque base64 signature (~16KB, encrypted) — no readable retrieved content.
  • file_search_stores.documents.get/list return metadata only; download_media on a document returns 403 PERMISSION_DENIED.
  • Token usage doesn't help: total_input_tokens stays flat and total_tool_use_tokens is ~constant regardless of top_k, so the retrieved-context size isn't reflected anywhere.

Why it matters

Rigorous RAG faithfulness/groundedness evaluation must judge an answer against the actual retrieved context, not the full source document. Today we can only approximate it with the cited chunks. Empirically, at the default top_k the model cites every retrieved chunk (cited count tracks top_k 1:1 up to ~5), so cited ≈ retrieved in that regime — but this is inferred, not guaranteed, and it breaks down at higher top_k (e.g. top_k=10 → 8 cited, top_k=20 → 16 cited), where retrieved-but-uncited chunks become invisible. (Same concern raised re: server-side context being a black box: https://x.com/_philschmid/status/2069458803074986376)
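The gap described above can be put in numbers with a small sketch. It assumes each cited chunk is available as a plain dict carrying the document_uri and source fields listed earlier; the helper names (uncited_upper_bound, fake_citations) are made up for illustration. Since the retriever may return fewer than top_k chunks, the result is only an upper bound on how many chunks are invisible.

```python
def uncited_upper_bound(top_k: int, cited_chunks: list[dict]) -> int:
    """Upper bound on retrieved-but-uncited chunks for one request.

    Assumes at most top_k chunks were retrieved and that a chunk is
    identified by its (document_uri, source) pair. Both are assumptions,
    not documented guarantees.
    """
    distinct = {(c.get("document_uri"), c.get("source")) for c in cited_chunks}
    return max(top_k - len(distinct), 0)


def fake_citations(n: int) -> list[dict]:
    """Build n distinct placeholder citations (illustration only)."""
    return [{"document_uri": "doc", "source": f"chunk {i}"} for i in range(n)]


# (top_k, cited count) pairs reported in this issue.
for top_k, cited in [(5, 5), (10, 8), (20, 16)]:
    hidden = uncited_upper_bound(top_k, fake_citations(cited))
    print(f"top_k={top_k}: up to {hidden} retrieved chunk(s) not observable")
```

With the counts reported here, up to 2 chunks are hidden at top_k=10 and up to 4 at top_k=20.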

Repro

from google import genai

client = genai.Client()  # picks up the API key from the environment

store = "fileSearchStores/<your-store>"
it = client.interactions.create(
    model="gemini-3.5-flash",
    input="<a question your store can answer>",
    tools=[{"type": "file_search", "file_search_store_names": [store], "top_k": 20}],
    store=True,
)
full = client.interactions.get(it.id)
# file_search_result step: only call_id / type / signature, no retrieved chunks.
# Retrieved chunk text is reachable only via model_output file_citation.source,
# i.e. only the CITED chunks; with a large top_k the cited count is < top_k, so
# the retrieved-but-uncited chunks are invisible.
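For counting what is visible today, here is a minimal sketch that collects the cited chunks from an interaction that has already been converted to plain dicts and lists. It deliberately avoids assuming where annotations sit in the response: it walks the whole structure and picks up any dict whose type is "file_citation". The sample_dump layout below is invented for illustration and is not the documented response schema; how the SDK object is converted to a dict (for example a model_dump() call, if it is a pydantic model) is also an assumption.

```python
from typing import Any, Iterator


def iter_file_citations(node: Any) -> Iterator[dict]:
    """Yield every dict tagged type == "file_citation", at any nesting depth."""
    if isinstance(node, dict):
        if node.get("type") == "file_citation":
            yield node
        for value in node.values():
            yield from iter_file_citations(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from iter_file_citations(item)


def cited_chunks(interaction: dict) -> list[dict]:
    """Return citations de-duplicated by (document_uri, source)."""
    seen: set[tuple] = set()
    unique: list[dict] = []
    for ann in iter_file_citations(interaction):
        key = (ann.get("document_uri"), ann.get("source"))
        if key not in seen:
            seen.add(key)
            unique.append(ann)
    return unique


# Hypothetical dump: the nesting is made up, only the field names come from the issue.
sample_dump = {
    "steps": [
        {"type": "file_search_result", "call_id": "c1", "signature": "<opaque>"},
        {"type": "model_output", "content": [{"text": "answer", "annotations": [
            {"type": "file_citation", "document_uri": "doc-a", "source": "chunk A"},
            {"type": "file_citation", "document_uri": "doc-a", "source": "chunk B"},
            {"type": "file_citation", "document_uri": "doc-a", "source": "chunk A"},
        ]}]},
    ]
}
print(len(cited_chunks(sample_dump)))  # 2
```

Comparing this count against top_k is how the cited-versus-retrieved figures in this issue can be reproduced.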

Request

Surface the resolved File Search retrieval on the file_search_result step (or via an opt-in flag): the full set of retrieved chunks — text + document/source ref + relevance score, including uncited ones. This makes the retrieved context inspectable for evaluation and debugging, and lets developers trim/manage it as suggested in the GA guidance.
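To make the request concrete, here is a purely hypothetical sketch of how a consumer might use such a payload if it existed. RetrievedChunk, its fields, and split_for_eval are illustrative names only; nothing here reflects an existing or planned API shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedChunk:
    """Hypothetical shape for one retrieved chunk (not a real SDK type)."""
    text: str
    document_uri: str
    score: float   # relevance score, as asked for in this request
    cited: bool    # whether the model cited this chunk in its answer


def split_for_eval(
    chunks: list[RetrievedChunk], min_score: float
) -> tuple[list[RetrievedChunk], list[RetrievedChunk]]:
    """Return (chunks at or above min_score, chunks the model did not cite)."""
    kept = [c for c in chunks if c.score >= min_score]
    uncited = [c for c in chunks if not c.cited]
    return kept, uncited


# Made-up values, for illustration only.
chunks = [
    RetrievedChunk("chunk A", "doc-a", 0.91, True),
    RetrievedChunk("chunk B", "doc-a", 0.74, True),
    RetrievedChunk("chunk C", "doc-b", 0.32, False),
]
kept, uncited = split_for_eval(chunks, min_score=0.5)
print(len(kept), len(uncited))  # 2 1
```

A groundedness judge would score the answer against all of chunks, while the uncited list is exactly the part that cannot be seen today.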

Metadata

Labels

priority: p2 · type: bug
