Commit 55c5185

feat(o11y): add Traefik Gateway API migration
1 parent 2c79713 commit 55c5185

6 files changed, with 284 additions and 79 deletions


o11y/README.md

Lines changed: 73 additions & 19 deletions
@@ -1,7 +1,23 @@
 # Observability Stack Setup
 
 This document outlines the steps to set up an observability stack using Loki,
-Prometheus, and Grafana.
+Prometheus, and Grafana with Traefik Gateway API.
+
+## Architecture Notes
+
+**Ingress Controller:** Traefik with Kubernetes Gateway API (migrated from ingress-nginx in Nov 2025)
+
+**Why Gateway API?**
+- ingress-nginx retired (EOL: March 2026)
+- Gateway API is Kubernetes standard for traffic routing
+- Better separation of concerns (infrastructure vs application routing)
+
+**Key Components:**
+- **Gateway API v1.4.0**: Standard CRDs for traffic routing
+- **Traefik v3.6+**: Gateway API controller
+- **Gateway**: Defines HTTPS listener on port 8443 (internal), exposed as 443 externally
+- **HTTPRoutes**: Route `/grafana` and `/loki` to respective services
+- **Middlewares**: Request buffering (50MB limit), security headers
 
 ## 1. Prerequisites & Initial Setup
 
@@ -36,7 +52,7 @@ Prometheus, and Grafana.
 Install and update necessary Helm chart repositories:
 
 ```bash
-helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
+helm repo add traefik https://traefik.github.io/charts
 helm repo add grafana https://grafana.github.io/helm-charts
 helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
 helm repo update
@@ -78,20 +94,41 @@ helm repo update
    kubectl get deployment metrics-server -n kube-system
    ```
 
-4. **NGINX Ingress Controller** _Create Namespace (if not present, though
-   usually handled by Helm):_
+4. **Gateway API & Traefik**
+
+   _Install Gateway API CRDs:_
    ```bash
-   kubectl create namespace ingress-nginx # Optional, Helm might create it
+   kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/standard-install.yaml
    ```
-   _Install:_
+
+   _Verify CRDs:_
    ```bash
-   helm upgrade ingress-nginx ingress-nginx/ingress-nginx --namespace ingress-nginx --install -f charts/ingress-nginx/values.yaml
+   kubectl get crd gateways.gateway.networking.k8s.io
    ```
-   _Verify:_
+
+   _Install Traefik with Gateway API support:_
+   ```bash
+   helm upgrade traefik traefik/traefik \
+   --namespace o11y \
+   --install \
+   --values charts/traefik/values.yaml \
+   --reuse-values=false
+   ```
+
+   _Deploy Gateway and HTTPRoutes:_
    ```bash
-   kubectl get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx
+   kubectl apply -f k8s/gateway/gateway.yaml
+   kubectl apply -f k8s/gateway/httproutes.yaml
    ```
 
+   _Verify Gateway status:_
+   ```bash
+   kubectl get gateway -n o11y o11y-gateway
+   kubectl get httproute -n o11y
+   ```
+
+   Expected: Gateway PROGRAMMED = True, 2 HTTPRoutes (grafana-route, loki-route)
+
 ## 4. Observability Stack Deployment
 
 1. **Loki Deployment (Log Aggregation)** _Install:_
@@ -139,27 +176,44 @@ helm repo update
    [Kubernetes cluster monitoring (via Prometheus) - ID 315](https://grafana.com/grafana/dashboards/315-kubernetes-cluster-monitoring-via-prometheus/)
    or create your own.
 
-## 5. Ingress & DNS Configuration
+## 5. Gateway & DNS Configuration
+
+> **Note:** Gateway and HTTPRoutes are deployed in Step 3.4 above. This section covers DNS configuration only.
 
-1. **Ingress Configuration (Grafana/Loki)** _Apply:_
+1. **Verify Gateway Configuration**
 
+   _Check Gateway status:_
    ```bash
-   kubectl apply -f k8s/ingress/o11y-ingress.yaml -n o11y
+   kubectl get gateway -n o11y o11y-gateway
    ```
+   Expected: `PROGRAMMED = True`
 
-   _Verify:_
-
+   _Check HTTPRoutes:_
    ```bash
-   kubectl get ingress -n o11y
+   kubectl get httproute -n o11y
    ```
+   Expected: 2 routes (grafana-route, loki-route)
 
 2. **DNS Configuration (Manual)**
-   1. Get Load Balancer IP for the NGINX Ingress controller:
+
+   1. Get Load Balancer IP for Traefik:
       ```bash
-      kubectl get svc -n ingress-nginx ingress-nginx-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}' | cat
+      kubectl get svc -n o11y traefik -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
+      echo
+      ```
+
+   2. Update A record in Cloudflare for `o11y.freecodecamp.net` to point to the Traefik LoadBalancer IP.
+
+   3. Test endpoints before DNS update (optional):
+      ```bash
+      TRAEFIK_IP=$(kubectl get svc -n o11y traefik -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
+
+      # Test Grafana
+      curl -k -H "Host: o11y.freecodecamp.net" https://$TRAEFIK_IP/grafana/api/health
+
+      # Test Loki
+      curl -k -H "Host: o11y.freecodecamp.net" https://$TRAEFIK_IP/loki/api/v1/status/buildinfo
       ```
-   2. Create an A record in Cloudflare for `o11y.freecodecamp.net` pointing to
-      the retrieved IP.
 
 ## 6. Final Verification
 
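
The README verifies the Gateway by reading the `PROGRAMMED` column of `kubectl get gateway` by eye. For scripted checks, the same condition can be read directly from the Gateway status. The following is a minimal sketch, not part of the commit: the function names are hypothetical, and the defaults assume the `o11y-gateway` name and `o11y` namespace used in this diff.

```bash
#!/usr/bin/env bash
# Sketch only: function names and defaults are illustrative assumptions.

# Print the status of the Gateway's "Programmed" condition
# ("True", "False", or "Unknown"; empty if the condition is not set yet).
gateway_programmed_status() {
  local name="${1:-o11y-gateway}" namespace="${2:-o11y}"
  kubectl get gateway "$name" -n "$namespace" \
    -o jsonpath='{.status.conditions[?(@.type=="Programmed")].status}'
}

# Succeed only when the Gateway reports Programmed=True.
gateway_is_programmed() {
  [ "$(gateway_programmed_status "$@")" = "True" ]
}
```

A deploy script could call `gateway_is_programmed && echo "Gateway ready"` after applying `k8s/gateway/gateway.yaml`. Where blocking is preferable to polling, `kubectl wait --for=condition=Programmed gateway/o11y-gateway -n o11y --timeout=120s` should serve the same purpose.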

o11y/charts/ingress-nginx/values.yaml

Lines changed: 0 additions & 18 deletions
This file was deleted.

o11y/charts/traefik/values.yaml

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+# Traefik Gateway API configuration for o11y stack
+# Resource limits match the previous ingress-nginx configuration
+
+deployment:
+  replicas: 2
+
+nodeSelector:
+  kubernetes.io/os: linux
+
+resources:
+  requests:
+    cpu: 50m
+    memory: 128Mi
+  limits:
+    cpu: 100m
+    memory: 256Mi
+
+# Enable Gateway API provider
+providers:
+  kubernetesGateway:
+    enabled: true
+    # Keep experimental-channel Gateway API features disabled (standard CRDs only)
+    experimentalChannel: false
+
+# Disable Helm chart's built-in Gateway creation - we create our own
+gateway:
+  enabled: false
+
+# Create a LoadBalancer service (DigitalOcean will provision one)
+service:
+  enabled: true
+  type: LoadBalancer
+  annotations: {}
+
+# EntryPoint timeout configuration (replaces nginx proxy timeouts)
+# Using high ports internally, LoadBalancer maps to 80/443 externally
+ports:
+  web:
+    port: 8000
+    exposedPort: 80
+    protocol: TCP
+  websecure:
+    port: 8443
+    exposedPort: 443
+    protocol: TCP
+    tls:
+      enabled: true
+    transport:
+      respondingTimeouts:
+        readTimeout: "0s" # Unlimited - for large log uploads to Loki
+        writeTimeout: "0s" # Unlimited - for long-running Grafana queries
+        idleTimeout: "180s" # 3 minutes - keep connections alive
+
+# Access logs for debugging
+logs:
+  general:
+    level: INFO
+  access:
+    enabled: true
+
+# Prometheus metrics for o11y integration
+metrics:
+  prometheus:
+    enabled: true
+
+# Container security context
+securityContext:
+  capabilities:
+    drop: [ALL]
+    add: [NET_BIND_SERVICE]
+  readOnlyRootFilesystem: true
+  runAsNonRoot: true
+  runAsUser: 65532
+
+# Pod security context
+podSecurityContext:
+  fsGroup: 65532
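
The `ports` block maps the internal entry points (8000, 8443) to the externally exposed ports (80, 443). One way to confirm that the rendered Service actually exposes those ports is to list them from the cluster. This is a rough sketch under assumptions: the helper names are hypothetical, and the Service is assumed to be named `traefik` in the `o11y` namespace, as in the README commands.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of the commit.

# Print each Service port as "<name> <port> <targetPort>", one per line.
traefik_service_ports() {
  local name="${1:-traefik}" namespace="${2:-o11y}"
  kubectl get svc "$name" -n "$namespace" \
    -o jsonpath='{range .spec.ports[*]}{.name}{" "}{.port}{" "}{.targetPort}{"\n"}{end}'
}

# Succeed when the Service exposes the given external port (for example 443).
service_exposes_port() {
  local port="$1"
  shift
  traefik_service_ports "$@" |
    awk -v p="$port" '$2 == p { found = 1 } END { exit !found }'
}
```

For example, `service_exposes_port 443 || echo "websecure is not exposed on 443"` would flag a mismatch between `exposedPort` and the provisioned load balancer.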

o11y/k8s/gateway/gateway.yaml

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+  name: o11y-gateway
+  namespace: o11y
+spec:
+  gatewayClassName: traefik
+  listeners:
+    - name: websecure
+      protocol: HTTPS
+      port: 8443
+      hostname: o11y.freecodecamp.net
+      tls:
+        mode: Terminate
+        certificateRefs:
+          - name: o11y-secret-cloudflare-origin-cert
+      allowedRoutes:
+        namespaces:
+          from: Same
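
The listener terminates TLS with the Secret named in `certificateRefs`. Because no namespace is given there, the Secret is looked up in the Gateway's own namespace (`o11y`), and controllers typically expect it to be of type `kubernetes.io/tls`. A minimal pre-flight check might look like the sketch below; the function name is hypothetical and the defaults are taken from this manifest.

```bash
#!/usr/bin/env bash
# Sketch only: the function name is an illustrative assumption.

# Succeed when the referenced Secret exists and has type kubernetes.io/tls.
tls_secret_ok() {
  local name="${1:-o11y-secret-cloudflare-origin-cert}" namespace="${2:-o11y}"
  local secret_type
  secret_type="$(kubectl get secret "$name" -n "$namespace" -o jsonpath='{.type}')" || return 1
  [ "$secret_type" = "kubernetes.io/tls" ]
}
```

Running `tls_secret_ok || echo "TLS secret missing or wrong type"` before applying the Gateway can surface a certificate problem earlier than a Gateway that never becomes programmed.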

o11y/k8s/gateway/httproutes.yaml

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
+---
+# Buffering Middleware - Request body size limit and retry logic
+# Replaces nginx.ingress.kubernetes.io/proxy-body-size annotation
+apiVersion: traefik.io/v1alpha1
+kind: Middleware
+metadata:
+  name: buffering
+  namespace: o11y
+spec:
+  buffering:
+    maxRequestBodyBytes: 52428800 # 50MB max request body
+    memRequestBodyBytes: 10485760 # 10MB threshold for disk buffering
+    retryExpression: "IsNetworkError() && Attempts() < 2"
+
+---
+# Headers Middleware - Security and proxy headers
+# Replaces custom header annotations from ingress-nginx
+apiVersion: traefik.io/v1alpha1
+kind: Middleware
+metadata:
+  name: secure-headers
+  namespace: o11y
+spec:
+  headers:
+    customRequestHeaders:
+      X-Forwarded-Proto: "https"
+
+---
+# Grafana HTTPRoute - Handles both / and /grafana paths
+# Replaces previous Ingress resource
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  name: grafana-route
+  namespace: o11y
+spec:
+  parentRefs:
+    - name: o11y-gateway
+      namespace: o11y
+  hostnames:
+    - o11y.freecodecamp.net
+  rules:
+    # Root path to Grafana
+    - matches:
+        - path:
+            type: PathPrefix
+            value: /
+      filters:
+        # Apply middlewares via ExtensionRef (Gateway API standard pattern)
+        - type: ExtensionRef
+          extensionRef:
+            group: traefik.io
+            kind: Middleware
+            name: buffering
+        - type: ExtensionRef
+          extensionRef:
+            group: traefik.io
+            kind: Middleware
+            name: secure-headers
+      backendRefs:
+        - name: grafana
+          port: 80
+    # Explicit /grafana path
+    - matches:
+        - path:
+            type: PathPrefix
+            value: /grafana
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: traefik.io
+            kind: Middleware
+            name: buffering
+        - type: ExtensionRef
+          extensionRef:
+            group: traefik.io
+            kind: Middleware
+            name: secure-headers
+      backendRefs:
+        - name: grafana
+          port: 80
+
+---
+# Loki HTTPRoute - Handles /loki path for log ingestion
+# Replaces previous Ingress resource
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  name: loki-route
+  namespace: o11y
+spec:
+  parentRefs:
+    - name: o11y-gateway
+      namespace: o11y
+  hostnames:
+    - o11y.freecodecamp.net
+  rules:
+    - matches:
+        - path:
+            type: PathPrefix
+            value: /loki
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: traefik.io
+            kind: Middleware
+            name: buffering
+        - type: ExtensionRef
+          extensionRef:
+            group: traefik.io
+            kind: Middleware
+            name: secure-headers
+      backendRefs:
+        - name: loki-gateway
+          port: 80
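
The buffering limits are written as raw byte counts, and the comments label them "50MB" and "10MB". The values are binary multiples (50 MiB and 10 MiB), which the small helper below makes explicit. It is only an illustration; `mib_to_bytes` is a hypothetical name, not something the commit defines.

```bash
#!/usr/bin/env bash
# Sketch only: mib_to_bytes is an illustrative helper.

# Convert a whole number of mebibytes to bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 50 # prints 52428800 (maxRequestBodyBytes)
mib_to_bytes 10 # prints 10485760 (memRequestBodyBytes)
```

Keeping the arithmetic next to the manifest makes it easier to adjust the limit later without miscounting digits.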
