-
Notifications
You must be signed in to change notification settings - Fork 615
Split helm UX into dedicated charts for Envoy based kgateway and agentgateway #13062
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR splits the monolithic kgateway Helm chart into dedicated charts for Envoy-based kgateway and agentgateway deployments. This architectural change enables users to install either implementation independently, improving deployment flexibility and reducing coupling between the two gateway types.
Key changes include:
- Introduction of new
agentgatewayandagentgateway-crdsHelm charts with dedicated templates and values - Creation of
AgentgatewayParametersCRD for agentgateway-specific configuration - Routing logic in test infrastructure and deployer to support both chart types
- Removal of agentgateway-specific configuration from the kgateway chart
Reviewed changes
Copilot reviewed 86 out of 87 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
install/helm/agentgateway/* |
New Helm chart for agentgateway control plane deployment |
install/helm/agentgateway-crds/* |
New CRD chart for agentgateway resources |
install/helm/kgateway/* |
Removed agentgateway-specific configuration and RBAC rules |
pkg/kgateway/deployer/agentgateway_parameters.go |
New parameter handler for AgentgatewayParameters CRD |
pkg/kgateway/deployer/gateway_parameters.go |
Updated routing logic to distinguish between kgateway and agentgateway |
pkg/deployer/strategicpatch/* |
New strategic merge patch implementation for overlay support |
test/e2e/* |
Test infrastructure updates to support multiple chart types |
test/deployer/testdata/* |
Extensive test data for agentgateway scenarios |
pkg/utils/helmutils/constants.go |
Added constants for agentgateway chart names and URIs |
api/v1alpha1/agentgateway/* |
New AgentgatewayParameters type definitions and generated code |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| - list | ||
| - patch | ||
| - watch | ||
| - apiGroups: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So are we going to keep something like #13045 to handle upgrades? Or what's the upgrade path for the new helm chart split?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could I need to try it out, there certainly is not a smooth upgrade path and I would argue that is fine becasue the Group name changed too. Maybe we should include the old group name in here for now
| DirectResponses: krttest.GetMockCollection[*v1alpha1.DirectResponse](mock), | ||
| GatewayExtensions: krttest.GetMockCollection[*v1alpha1.GatewayExtension](mock), | ||
| ControllerName: wellknown.DefaultAgwControllerName, | ||
| SystemNamespace: "kgateway-system", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the system namespace still be kgateway-system for both charts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its not hard coded anywhere then tests so I think this is up to documentation
| # -- Enable Inference Extension. If enabled, agentgateway.enabled should also be set to true. Enabling inference extension without agentgateway is deprecated in v2.1 and will not be supported in v2.2. | ||
| enabled: false | ||
|
|
||
| # -- List of namespace selectors (OR'ed): each entry can use 'matchLabels' or 'matchExpressions' (AND'ed within each entry if used together). Kgateway includes the selected namespaces in config discovery. For more information, see the docs https://kgateway.dev/docs/latest/install/advanced/#namespace-discovery. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are we handling inference extensions in kgateway? Should we also remove
| inferenceExtension: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah it should be removed, I removed it from the new agentgateway one as its not relevant, I thought we already deprecated it in the main install
ce60e64 to
d307800
Compare
697fcb7 to
a5a50f0
Compare
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
ae091ab to
5f5033c
Compare
Signed-off-by: Joshua Pritchard <[email protected]>
| // Only create GatewayExtensions collection if Envoy is enabled | ||
| // This CRD is specific to Envoy and not used by agentgateway | ||
| var gwExts krt.Collection[ir.GatewayExtension] | ||
| if settings.EnableEnvoy { | ||
| gwExts = krtcollections.NewGatewayExtensionsCollection(ctx, client, krtOptions) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are several other CRDs/collections that are specific to the Envoy translator,
why is this necessary for GwExt only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Others are in the plugins so those get skipped at a high level but this is in common collections and it needed for agw though these are not leveraged, this should likely get moved from common collections but works for now
| // Treat permission errors as "CRD doesn't exist" rather than failing | ||
| // This allows controllers to start even without RBAC for all API groups | ||
| // Kind of a hack because kgateway and agentgateway share controllers so some of these don't exist | ||
| if errors.IsNotFound(err) || errors.IsForbidden(err) || errors.IsUnauthorized(err) || discovery.IsGroupDiscoveryFailedError(err) || meta.IsNoMatchError(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the functional impact here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really just hides this error that was causing issues when it shouldn't because a collection was never initialized on these anyways, you would still get an error later if you did that. idk why we can't just add the scheme and not worry about if the CRD exists or not because I don't think it hurts anything. In other controllers we have we just do that but maybe I am missing some context.
276d478 to
aab792e
Compare
| // kgateway chart: only Envoy enabled -> lease is "kgateway-envoy" | ||
| // agentgateway chart: only Agentgateway enabled -> lease is "kgateway-agentgateway" | ||
| // both controllers enabled (not typical in tests): lease is "kgateway" | ||
| leaseID := s.getLeaderElectionID() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this test only testing the default installation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is but we could add agentgateway and it would work now and it did need to get the envoy appended to it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind though because its the same code its testing either way
| Backends: krt.NewInformer[*agentgateway.AgentgatewayBackend](commoncol.Client), | ||
| // Only watch agentgateway-specific CRDs if agentgateway is enabled | ||
| // This prevents permission errors when only Envoy is enabled | ||
| if commoncol.Settings.EnableAgentgateway { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if agent gateway is disabled, why do we need any of these collections initialized?
i.e. why not return nil early in the function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good callout, these low level gates are a source of some of the test failures, just changed this around
| gatewayExtensions := krt.NewCollection(commoncol.GatewayExtensions, defaultExtBuilder) | ||
| var gatewayExtensions krt.Collection[TrafficPolicyGatewayExtensionIR] | ||
| if commoncol.GatewayExtensions != nil { | ||
| gatewayExtensions = krt.NewCollection(commoncol.GatewayExtensions, defaultExtBuilder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we error if GatewayExtensions is nil? i.e. can this happen normally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch this is gated at a higher level and this won't run at all if Envoy is not enabled. I will remove this
| if s.globalSettings.EnableEnvoy && !s.globalSettings.EnableAgentgateway { | ||
| // Envoy-only controller (kgateway chart) | ||
| leaderElectionID = s.leaderElectionID + "-envoy" | ||
| } else if !s.globalSettings.EnableEnvoy && s.globalSettings.EnableAgentgateway { | ||
| // Agentgateway-only controller (agentgateway chart) | ||
| leaderElectionID = s.leaderElectionID + "-agentgateway" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
controller names are different right? i.e. if they both write a status on the same httproute, will that work ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah it should be right because its its by parent https://gateway-api.sigs.k8s.io/api-types/httproute/#parents same as if was just a different GW API implementation
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
| func (p *Provider) getChartLabelSelector() string { | ||
| chartType := p.installContext.GetChartType() | ||
| if chartType == "agentgateway" { | ||
| return "app.kubernetes.io/name=agentgateway" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] can we make consts for this?
pkg/utils/helmutils/constants.go
Outdated
| DefaultAgentGatewayChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway" | ||
| DefaultAgentGatewayCRDChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway-crds" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] or Agw here
| DefaultAgentGatewayChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway" | |
| DefaultAgentGatewayCRDChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway-crds" | |
| DefaultAgentgatewayChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway" | |
| DefaultAgentgatewayCRDChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway-crds" |
pkg/utils/helmutils/constants.go
Outdated
| AgentGatewayChartName = "agentgateway" | ||
| AgentGatewayCRDChartName = "agentgateway-crds" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit]
| AgentGatewayChartName = "agentgateway" | |
| AgentGatewayCRDChartName = "agentgateway-crds" | |
| AgentgatewayChartName = "agentgateway" | |
| AgentgatewayCRDChartName = "agentgateway-crds" |
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
902a26e to
2e89d43
Compare
Signed-off-by: Joshua Pritchard <[email protected]>
8d4cb1e to
497837d
Compare
Signed-off-by: Joshua Pritchard <[email protected]>
497837d to
d21afc0
Compare
|
/re-test |
|
/kick-ci |
|
/retest |
| # December 4, 2025: ~9 minutes | ||
| - cluster-name: 'api-validation' # TODO: rename to cluster-seven | ||
| go-test-args: '-timeout=15m' | ||
| go-test-run-regex: '^TestAPIValidation|^TestKgateway$$/^OAuth$$|^TestCustomGWP$$' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is TestCustomGWP envoy-dataplane specific now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh no they both can be available in all the regular clusters, this didn't actually need to move I think I just left this accidentally when I was reshuffling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general there is a bit of test cleanup to do here that we should followup on now with the chart split there were some workarounds I needed to apply and I don't love how the install is handled for the tests now in context of there being 2 charts, I think we want to change it around anyways to be able to re-use the same install between tests as well.
|
/merge |
Description
Still WIP and depends/includes changes from #13018 but wanted to run CI and get up for review.
Change Type
/kind breaking_change
Changelog
Additional Notes
Fast Follow #13093
Resolves #13024