Skip to content

Conversation

@PetrMc
Copy link
Contributor

@PetrMc PetrMc commented Jan 14, 2026

What this PR does / why we need it:

Enables FIELD accessor support for peer metadata to allow waypoint proxies to export source workload tags (istio.source_workload, istio.source_namespace, etc.) in traces. Fixes istio/istio#58348

Stores WorkloadMetadataObject directly in filter state instead of wrapping in CelState. This enables formatters to use %FILTER_STATE(downstream_peer:FIELD:workload)% syntax while maintaining CEL compatibility via serializeAsProto().

Special notes for your reviewer:
this requires envoyproxy/envoy#41697 that is included in Envoy 1.37 (released on Jan 13, 2026) and included in [release-1.29] of Istio

@PetrMc PetrMc requested a review from a team as a code owner January 14, 2026 23:01
@istio-testing istio-testing added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 14, 2026
@zirain
Copy link
Member

zirain commented Jan 14, 2026

can we hold this until we make an agreement of #6748?

Copy link
Contributor

@keithmattix keithmattix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're no longer storing a string in filter state but instead a serializable object; how does that change the contract for when we read this out of filter state? Where's the caller and what does it have to change?

Signed-off-by: Petr McAllister <[email protected]>
Signed-off-by: Petr McAllister <[email protected]>
@istio-testing istio-testing added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 15, 2026
@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 15, 2026

So we're no longer storing a string in filter state but instead a serializable object; how does that change the contract for when we read this out of filter state? Where's the caller and what does it have to change?

Thanks @keithmattix good catch - istio_stats.cc was reading it as CelState and deserializing back to WorkloadMetadataObject.

Updated it to read WorkloadMetadataObject directly with CelState fallback for backward compat. Eliminates the round-trip in the new path bbbf405

does it resolve your concern?

@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 15, 2026

can we hold this until we make an agreement of #6748?

thanks @zirain - #6748 looks like a significant restructure that will need careful review. This PR is a targeted fix for a specific issue - happy to help resolve any conflicts if they come up when #6748 merges.

Signed-off-by: Petr McAllister <[email protected]>
@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 15, 2026

/retest

@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 15, 2026

/retest

@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 15, 2026

/retest

@PetrMc PetrMc added the do-not-merge Block automatic merging of a PR. label Jan 15, 2026
@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 16, 2026

@keithmattix - had to rework this PR to preserve CelState contract. Now WorkloadMetadataObject stored under downstream_peer_obj (new, for FIELD accessor). Adjusted istio_stats.cc now reads from downstream_peer_obj first, falls back to downstream_peer.

@zirain - would you consider giving it a green light?

Thank you!

@krinkinmu
Copy link
Contributor

can we hold this until we make an agreement of #6748?

@zirain do you think it would be difficult to sync back the changes from this repo to Envoy if we agree to move the code to Envoy, is that your concern?

FWIW, from the WG meeting discussion about #6748 there was a concern about keeping the code in one place and tests in another place. I don't think that we've reached a consensus on how to treat that concern (whether it's blocker or not and how to proceed with that in general).

Taking a stance that no PRs can be merged until we agree is a bit too strict, and if there are no major concerns with syncing the repos once we do reach consensus, then it also seems somewhat unnecessary constraint.

Copy link
Contributor

@krinkinmu krinkinmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. I'm not sure if downstream_peer_obj and upstream_peer_obj names for filter state keys is the best, but I also don't have any particularly good alternatives either.

Might be worth bringing this PR up in WG to:

  1. Figure out if everyone is on board with the filter state key names (and if somebody could come up with better names)
  2. Make progress on whether we should block proxy PRs while we are still discussing migration to Envoy.

@PetrMc PetrMc added cherrypick/release-1.29 Set this label on a PR to auto-merge it to the release-1.29 branch and removed do-not-merge Block automatic merging of a PR. labels Jan 20, 2026
@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 20, 2026

/retest

@istio-testing istio-testing merged commit 7176bfe into istio:master Jan 20, 2026
7 checks passed
@istio-testing
Copy link
Collaborator

In response to a cherrypick label: new pull request created: #6775

keithmattix pushed a commit to keithmattix/proxy that referenced this pull request Jan 21, 2026
* Add FIELD accessor support for peer metadata in tracing

Signed-off-by: Petr McAllister <[email protected]>

* lint

Signed-off-by: Petr McAllister <[email protected]>

* address PR comments, fix tests

Signed-off-by: Petr McAllister <[email protected]>

* fix test when labels are not present

Signed-off-by: Petr McAllister <[email protected]>

* test fixes

Signed-off-by: Petr McAllister <[email protected]>

* Store both CelState and WorkloadMetadataObject for CEL and FIELD accessor support

Signed-off-by: Petr McAllister <[email protected]>

* Use separate keys for CelState and WorkloadMetadataObject storage

Signed-off-by: Petr McAllister <[email protected]>

* Update istio_stats to use new peer metadata object keys

Signed-off-by: Petr McAllister <[email protected]>

* Changed peerInfoRead to check the *PeerObj

Signed-off-by: Petr McAllister <[email protected]>

---------

Signed-off-by: Petr McAllister <[email protected]>
@zirain
Copy link
Member

zirain commented Jan 21, 2026

@PetrMc what is the diff between the new formatter with %CEL(filter_state['upstream_peer'].workload)%?

@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 21, 2026

| @PetrMc what is the diff between the new formatter with %CEL(filter_state['upstream_peer'].workload)%?

Thanks @zirain - FIELD accessor calls getField() directly on the C++ object - just a string lookup. CEL has to deserialize the protobuf and evaluate the expression each time. Should be faster for per-request tracing tags, though haven't benchmarked it yet.

@zirain
Copy link
Member

zirain commented Jan 21, 2026

will this cost more memory in theoretically?

@zirain
Copy link
Member

zirain commented Jan 21, 2026

What I want to say is that, we need a bechmark to help us make a tradeoff between memory usage increase(the FILTER_STATE way) and cpu usage(the CEL way).

@PetrMc
Copy link
Contributor Author

PetrMc commented Jan 21, 2026

makes sense @zirain - the two-key approach does add memory overhead.

looking at envoy that has substitution_formatter_speed_test.cc:
https://github.com/envoyproxy/envoy/blob/main/test/common/formatter/substitution_formatter_speed_test.cc

we could add similar benchmarks comparing FIELD accessor vs CEL. open to other approaches if you have preferred benchmarking method.

@zirain
Copy link
Member

zirain commented Jan 21, 2026

makes sense @zirain - the two-key approach does add memory overhead.

looking at envoy that has substitution_formatter_speed_test.cc: https://github.com/envoyproxy/envoy/blob/main/test/common/formatter/substitution_formatter_speed_test.cc

we could add similar benchmarks comparing FIELD accessor vs CEL. open to other approaches if you have preferred benchmarking method.

substitution_formatter_speed_test sounds good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cherrypick/release-1.29 Set this label on a PR to auto-merge it to the release-1.29 branch size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add source workload identification to waypoint proxy trace

5 participants