Skip to content

Conversation

@josh-pritchard
Copy link
Contributor

@josh-pritchard josh-pritchard commented Dec 5, 2025

Description

Still WIP and depends/includes changes from #13018 but wanted to run CI and get up for review.

Change Type

/kind breaking_change

Changelog

Split helm UX into dedicated charts for Envoy based kgateway and agentgateway

Additional Notes

Fast Follow #13093
Resolves #13024

Copilot AI review requested due to automatic review settings December 5, 2025 04:40
@gateway-bot gateway-bot added do-not-merge/release-note-invalid Indicates that a PR should not merge because it's missing one of the release note labels. kind/breaking_change labels Dec 5, 2025
@gateway-bot gateway-bot added release-note and removed do-not-merge/release-note-invalid Indicates that a PR should not merge because it's missing one of the release note labels. labels Dec 5, 2025
@josh-pritchard josh-pritchard marked this pull request as draft December 5, 2025 04:44
@josh-pritchard
Copy link
Contributor Author

/retest

@gateway-bot gateway-bot added do-not-merge/release-note-invalid Indicates that a PR should not merge because it's missing one of the release note labels. and removed release-note labels Dec 5, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR splits the monolithic kgateway Helm chart into dedicated charts for Envoy-based kgateway and agentgateway deployments. This architectural change enables users to install either implementation independently, improving deployment flexibility and reducing coupling between the two gateway types.

Key changes include:

  • Introduction of new agentgateway and agentgateway-crds Helm charts with dedicated templates and values
  • Creation of AgentgatewayParameters CRD for agentgateway-specific configuration
  • Routing logic in test infrastructure and deployer to support both chart types
  • Removal of agentgateway-specific configuration from the kgateway chart

Reviewed changes

Copilot reviewed 86 out of 87 changed files in this pull request and generated no comments.

Show a summary per file
File Description
install/helm/agentgateway/* New Helm chart for agentgateway control plane deployment
install/helm/agentgateway-crds/* New CRD chart for agentgateway resources
install/helm/kgateway/* Removed agentgateway-specific configuration and RBAC rules
pkg/kgateway/deployer/agentgateway_parameters.go New parameter handler for AgentgatewayParameters CRD
pkg/kgateway/deployer/gateway_parameters.go Updated routing logic to distinguish between kgateway and agentgateway
pkg/deployer/strategicpatch/* New strategic merge patch implementation for overlay support
test/e2e/* Test infrastructure updates to support multiple chart types
test/deployer/testdata/* Extensive test data for agentgateway scenarios
pkg/utils/helmutils/constants.go Added constants for agentgateway chart names and URIs
api/v1alpha1/agentgateway/* New AgentgatewayParameters type definitions and generated code

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@gateway-bot gateway-bot added release-note and removed do-not-merge/release-note-invalid Indicates that a PR should not merge because it's missing one of the release note labels. labels Dec 5, 2025
- list
- patch
- watch
- apiGroups:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we going to keep something like #13045 to handle upgrades? Or what's the upgrade path for the new helm chart split?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could I need to try it out, there certainly is not a smooth upgrade path and I would argue that is fine becasue the Group name changed too. Maybe we should include the old group name in here for now

DirectResponses: krttest.GetMockCollection[*v1alpha1.DirectResponse](mock),
GatewayExtensions: krttest.GetMockCollection[*v1alpha1.GatewayExtension](mock),
ControllerName: wellknown.DefaultAgwControllerName,
SystemNamespace: "kgateway-system",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the system namespace still be kgateway-system for both charts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not hard coded anywhere then tests so I think this is up to documentation

# -- Enable Inference Extension. If enabled, agentgateway.enabled should also be set to true. Enabling inference extension without agentgateway is deprecated in v2.1 and will not be supported in v2.2.
enabled: false

# -- List of namespace selectors (OR'ed): each entry can use 'matchLabels' or 'matchExpressions' (AND'ed within each entry if used together). Kgateway includes the selected namespaces in config discovery. For more information, see the docs https://kgateway.dev/docs/latest/install/advanced/#namespace-discovery.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we handling inference extensions in kgateway? Should we also remove

inferenceExtension:

Copy link
Contributor Author

@josh-pritchard josh-pritchard Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it should be removed, I removed it from the new agentgateway one as its not relevant, I thought we already deprecated it in the main install

@josh-pritchard josh-pritchard marked this pull request as ready for review December 9, 2025 19:54
@josh-pritchard josh-pritchard force-pushed the josh/split-helm branch 3 times, most recently from 697fcb7 to a5a50f0 Compare December 10, 2025 07:18
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
@yuval-k yuval-k self-assigned this Dec 10, 2025
Signed-off-by: Joshua Pritchard <[email protected]>
Comment on lines +152 to +157
// Only create GatewayExtensions collection if Envoy is enabled
// This CRD is specific to Envoy and not used by agentgateway
var gwExts krt.Collection[ir.GatewayExtension]
if settings.EnableEnvoy {
gwExts = krtcollections.NewGatewayExtensionsCollection(ctx, client, krtOptions)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are several other CRDs/collections that are specific to the Envoy translator,
why is this necessary for GwExt only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others are in the plugins so those get skipped at a high level but this is in common collections and it needed for agw though these are not leveraged, this should likely get moved from common collections but works for now

Comment on lines +65 to +68
// Treat permission errors as "CRD doesn't exist" rather than failing
// This allows controllers to start even without RBAC for all API groups
// Kind of a hack because kgateway and agentgateway share controllers so some of these don't exist
if errors.IsNotFound(err) || errors.IsForbidden(err) || errors.IsUnauthorized(err) || discovery.IsGroupDiscoveryFailedError(err) || meta.IsNoMatchError(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the functional impact here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really just hides this error that was causing issues when it shouldn't because a collection was never initialized on these anyways, you would still get an error later if you did that. idk why we can't just add the scheme and not worry about if the CRD exists or not because I don't think it hurts anything. In other controllers we have we just do that but maybe I am missing some context.

// kgateway chart: only Envoy enabled -> lease is "kgateway-envoy"
// agentgateway chart: only Agentgateway enabled -> lease is "kgateway-agentgateway"
// both controllers enabled (not typical in tests): lease is "kgateway"
leaseID := s.getLeaderElectionID()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this test only testing the default installation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is but we could add agentgateway and it would work now and it did need to get the envoy appended to it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind though because its the same code its testing either way

Backends: krt.NewInformer[*agentgateway.AgentgatewayBackend](commoncol.Client),
// Only watch agentgateway-specific CRDs if agentgateway is enabled
// This prevents permission errors when only Envoy is enabled
if commoncol.Settings.EnableAgentgateway {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if agent gateway is disabled, why do we need any of these collections initialized?
i.e. why not return nil early in the function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout, these low level gates are a source of some of the test failures, just changed this around

gatewayExtensions := krt.NewCollection(commoncol.GatewayExtensions, defaultExtBuilder)
var gatewayExtensions krt.Collection[TrafficPolicyGatewayExtensionIR]
if commoncol.GatewayExtensions != nil {
gatewayExtensions = krt.NewCollection(commoncol.GatewayExtensions, defaultExtBuilder)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we error if GatewayExtensions is nil? i.e. can this happen normally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch this is gated at a higher level and this won't run at all if Envoy is not enabled. I will remove this

Comment on lines +289 to +295
if s.globalSettings.EnableEnvoy && !s.globalSettings.EnableAgentgateway {
// Envoy-only controller (kgateway chart)
leaderElectionID = s.leaderElectionID + "-envoy"
} else if !s.globalSettings.EnableEnvoy && s.globalSettings.EnableAgentgateway {
// Agentgateway-only controller (agentgateway chart)
leaderElectionID = s.leaderElectionID + "-agentgateway"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

controller names are different right? i.e. if they both write a status on the same httproute, will that work ok?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it should be right because its its by parent https://gateway-api.sigs.k8s.io/api-types/httproute/#parents same as if was just a different GW API implementation

Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
func (p *Provider) getChartLabelSelector() string {
chartType := p.installContext.GetChartType()
if chartType == "agentgateway" {
return "app.kubernetes.io/name=agentgateway"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] can we make consts for this?

Comment on lines 11 to 12
DefaultAgentGatewayChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway"
DefaultAgentGatewayCRDChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway-crds"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] or Agw here

Suggested change
DefaultAgentGatewayChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway"
DefaultAgentGatewayCRDChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway-crds"
DefaultAgentgatewayChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway"
DefaultAgentgatewayCRDChartUri = "oci://ghcr.io/kgateway-dev/charts/agentgateway-crds"

Comment on lines 6 to 7
AgentGatewayChartName = "agentgateway"
AgentGatewayCRDChartName = "agentgateway-crds"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]

Suggested change
AgentGatewayChartName = "agentgateway"
AgentGatewayCRDChartName = "agentgateway-crds"
AgentgatewayChartName = "agentgateway"
AgentgatewayCRDChartName = "agentgateway-crds"

Signed-off-by: Joshua Pritchard <[email protected]>
Signed-off-by: Joshua Pritchard <[email protected]>
@josh-pritchard
Copy link
Contributor Author

/re-test

@josh-pritchard
Copy link
Contributor Author

/kick-ci

@josh-pritchard
Copy link
Contributor Author

/retest

# December 4, 2025: ~9 minutes
- cluster-name: 'api-validation' # TODO: rename to cluster-seven
go-test-args: '-timeout=15m'
go-test-run-regex: '^TestAPIValidation|^TestKgateway$$/^OAuth$$|^TestCustomGWP$$'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is TestCustomGWP envoy-dataplane specific now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh no they both can be available in all the regular clusters, this didn't actually need to move I think I just left this accidentally when I was reshuffling

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general there is a bit of test cleanup to do here that we should followup on now with the chart split there were some workarounds I needed to apply and I don't love how the install is handled for the tests now in context of there being 2 charts, I think we want to change it around anyways to be able to re-use the same install between tests as well.

@josh-pritchard
Copy link
Contributor Author

/merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Split kgateway and agentgateway helm charts

6 participants