Zero-Downtime EKS Service Migration Using Weighted ALB Routing & Cross-Namespace Proxies
Introduction
When you need to migrate a critical service from one namespace to another in Kubernetes while maintaining zero downtime, traditional blue-green deployments fall short. The challenge becomes even more complex when your services live in different namespaces and you're using AWS Application Load Balancer (ALB) for ingress.
In this post, I'll walk you through a battle-tested approach we used to migrate our auth service from v1 to v2 across different namespaces with zero downtime and gradual traffic shifting.
The Challenge
Our scenario involved migrating one of our services, auth-service, from the auth-service-v1-dev namespace to the auth-service-v2-dev namespace. The constraints were:
Zero downtime during migration
Weighted traffic distribution (canary deployments)
Both services fronted by the same AWS ALB
ALB controller limitation: can only see services within the same namespace as the ingress
Considerations
We considered three options:
Path-based migration: route only certain paths to v2 and slowly migrate the rest of the paths
Weighted: route only a certain percentage of traffic to v2 (90/10, 80/20, 60/40, ...)
Hybrid of path & weighted: a combination of 1 & 2
After planning and communicating with engineering teams, we decided that weighted migration would be the best path forward for the team.
Assumptions
Before proceeding, we had a few things already set up:
ALB as the ingress
ExternalDNS handling DNS management / the DNS-01 challenge
cert-manager handling certificate management
For our setup, we had the following validated:
Both applications running on the same EKS cluster
Both applications fronted by ALB as the ingress
Both applications under the same ALB group name
The v1 application is in the auth-service-v1-dev namespace
The v2 application is in the auth-service-v2-dev namespace
Both applications expose the same functionality on different ports: v1 exposes the app on port 8002, v2 on port 8080
The "Teleporter Proxy" Pattern
We can't directly apply an ALB weighted ingress configuration for this migration, because the ALB controller can only see services in the same namespace as the ingress. It can't route directly to auth-service-v2-dev-auth-service-v2, since that service lives in a different namespace.
Instead, we create a "proxy" service in the v1 namespace that ALB can see, but that secretly forwards all traffic to the v2 namespace using the ClusterIP. We call this the "Teleporter Proxy" pattern. Let's see how it is implemented first; then we will go through configuring the whole migration.
Here's how the traffic flow works:
ALB → Proxy Service (v1 namespace) → Real Service (v2 namespace) → v2 Pods
Traffic Flow:
ALB sends 20% of traffic → auth-service-v2-teleporter.auth-service-v1-dev:8080 (the proxy)
The proxy forwards → 123.10.28.114:8080 (the real v2 service's ClusterIP)
The real v2 service routes to the v2 pods
The response flows back through the same path
It's like having a traffic forwarding service - ALB sends traffic to the local address, but it gets automatically forwarded to the real destination in another neighborhood (namespace). The proxy service acts as a bridge, allowing ALB to register targets while secretly forwarding traffic across namespace boundaries.
This gives us cross-namespace weighted routing while keeping the services properly isolated in their own namespaces.
┌─────────────────────────────────────────────────────────────────┐
│                        Source Namespace                         │
│  ┌─────────────────┐   ┌─────────────────────────────────┐      │
│  │  Real Service   │   │    Teleporter Proxy Service     │      │
│  │                 │   │ (No Selector/Manual Endpoints)  │      │
│  └─────────────────┘   └─────────────────┬───────────────┘      │
└──────────────────────────────────────────┼──────────────────────┘
                                           │
                                           │ Teleports traffic via ClusterIP
                                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Target Namespace                         │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                     Target Service                      │   │
│   │                  ClusterIP: 172.20.x.x                  │   │
│   │                                                         │   │
│   │   ┌─────────┐       ┌─────────┐       ┌─────────┐       │   │
│   │   │  Pod 1  │       │  Pod 2  │       │  Pod 3  │       │   │
│   │   └─────────┘       └─────────┘       └─────────┘       │   │
│   └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
Migration Summary
Just implementing the pattern was not enough; we also had to make the migration weighted and seamless. To do so, we leveraged the ALB target group configuration.
Since target groups are created and attached automatically based on the configuration and ingress annotations, we had to create a service that acts as a proxy to the v2 namespace. However, there is another problem here.
The ExternalName service type, Kubernetes' native way to reference a service in another namespace, is not supported by the ALB controller. Hence we manually create a ClusterIP service and an Endpoints object, attaching the ClusterIP address of the v2 namespace's service to the proxy in the v1 namespace (the pattern explained above). Once deployed in a weighted configuration that references the service by name, the controller creates a target group in the AWS console with proper targets that ALB can route to, enabling weighted traffic distribution between v1 and v2.
We then gradually shift the weights to offload traffic to v2, moving from (100/0) to (0/100). At that point, all traffic is going to the v2 namespace.
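Since each phase of the shift only changes the two weight numbers, it can help to generate the annotation value from a single variable instead of hand-editing JSON at every step. A minimal sketch (the helper name is ours; the service names and ports come from the setup above):

```shell
# Hypothetical helper: emit the weighted-routing annotation value for a given v2 weight.
# v1 gets the remainder, so the two weights always sum to 100.
weighted_routing_json() {
  v2_weight="$1"
  v1_weight=$((100 - v2_weight))
  cat <<EOF
{
  "type": "forward",
  "forwardConfig": {
    "targetGroups": [
      {"serviceName": "auth-service-v1-dev-auth-service", "servicePort": "8002", "weight": ${v1_weight}},
      {"serviceName": "auth-service-v2-teleporter", "servicePort": "8080", "weight": ${v2_weight}}
    ]
  }
}
EOF
}

weighted_routing_json 10   # prints the 90/10 canary configuration
```

Each phase below then becomes a single call (`weighted_routing_json 30`, `50`, `80`, `100`) whose output you paste into the actions.weighted-routing annotation.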
Pre-Migration
Step 1: Pre-Configure Service v2 Proxy and Endpoints
What this step does:
Getting the ClusterIP of the v2 service in the auth-service-v2-dev namespace; we need it to configure our teleporter proxy
Creating a "teleporter proxy" service in the v1 namespace (auth-service-v1-dev)
Manually pointing the proxy service at the v2 service's ClusterIP, which essentially punches a hole from v1 straight to v2
OK, let's implement the above.
Get the ClusterIP of the v2 service
kubectl get svc auth-service-v2-dev-auth-service-v2 -n auth-service-v2-dev -o jsonpath='{.spec.clusterIP}'
Create the service in the v1 namespace using the following commands:
# Step 1: Get the v2 service ClusterIP
kubectl get svc auth-service-v2-dev-auth-service-v2 -n auth-service-v2-dev
# Note: ClusterIP is 123.10.28.114 (from your output)
# Step 2: Create proxy service and endpoints in v1 namespace
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: auth-service-v2-teleporter
  namespace: auth-service-v1-dev
  labels:
    app: auth-service-v2
    version: v2
spec:
  type: ClusterIP
  ports:
    - name: jmx-metrics
      port: 9020
      protocol: TCP
      targetPort: 9020
    - name: http
      port: 8080
      protocol: TCP
      targetPort: 8080
    - name: metrics
      port: 9000
      protocol: TCP
      targetPort: 9000
---
apiVersion: v1
kind: Endpoints
metadata:
  name: auth-service-v2-teleporter
  namespace: auth-service-v1-dev
subsets:
  - addresses:
      - ip: 123.10.28.114 # ClusterIP from auth-service-v2-dev namespace
    ports:
      - name: jmx-metrics
        port: 9020
        protocol: TCP
      - name: http
        port: 8080
        protocol: TCP
      - name: metrics
        port: 9000
        protocol: TCP
EOF
The teleporter service in auth-service-v1-dev will now proxy traffic to the actual v2 service in auth-service-v2-dev, and ALB can register it as a target for weighted load balancing!
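Before wiring the ALB to it, it is worth confirming the teleporter actually forwards across namespaces. One quick way is a throwaway curl pod in the v1 namespace hitting the proxy's DNS name. This is a sketch: /ping is a hypothetical health endpoint (substitute your app's real path), and the command is built and printed first so it can be reviewed before running:

```shell
# Sanity check (sketch): curl the teleporter proxy from inside the v1 namespace.
# /ping is a hypothetical endpoint; use your application's real health path.
PROXY_URL="http://auth-service-v2-teleporter.auth-service-v1-dev.svc.cluster.local:8080/ping"
CHECK_CMD="kubectl run curl-check --rm -i --restart=Never -n auth-service-v1-dev \
  --image=curlimages/curl --command -- curl -s ${PROXY_URL}"

echo "${CHECK_CMD}"
# eval "${CHECK_CMD}"   # run against the cluster; a v2 response confirms the proxy works
```

If the response comes back from v2, the cross-namespace hole is open and ALB registration can proceed.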
Step 2: Configure v1 Helm for Weighted Routing
What this step does:
Creates a real ALB target group for target registration
Makes the application ready for weighted routing between v1 and v2
Update your v1 Helm configuration in auth-service-v1-dev namespace:
name: auth-service
# ... existing configuration ...
ingress:
  enabled: true
  annotations:
    # ... existing configuration ...
    alb.ingress.kubernetes.io/target-group-attributes: slow_start.duration_seconds=120,deregistration_delay.timeout_seconds=30
    alb.ingress.kubernetes.io/target-type: ip
+   alb.ingress.kubernetes.io/group.order: "40" # Rule order within the shared ALB group
+   alb.ingress.kubernetes.io/actions.weighted-routing: |
+     {
+       "type": "forward",
+       "forwardConfig": {
+         "targetGroups": [
+           {
+             "serviceName": "auth-service-v1-dev-auth-service",
+             "servicePort": "8002",
+             "weight": 100
+           },
+           {
+             "serviceName": "auth-service-v2-teleporter",
+             "servicePort": "8080",
+             "weight": 0
+           }
+         ]
+       }
+     }
  hosts:
    - host: auth-service.dev.mycompany.com
      paths:
        - path: /*
          pathType: ImplementationSpecific
          backend:
            service:
-             name: auth-service-v1-dev-auth-service
+             name: weighted-routing # CHANGED: use the action name instead of the direct service
              port:
-               name: http
+               name: use-annotation # CHANGED: reference the annotation instead of the http port
Step 3: Register targets in the ALB-created target group from the console
This is a manual step: you need to take the endpoints for the proxy service and register them as targets in the target group. For some reason, the ALB controller doesn't populate them automatically; my guess is that it's because the targets reference a service across namespaces.
Go to Console > EC2 > Load Balancers > [YOUR ALB] > Listeners :443 > [Search for domain] > Click the new target group (the one with 0%) > Register Targets
You can get the target IP by looking at the targets in the v2 service's target group.
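If you prefer the AWS CLI over clicking through the console, the same registration can be scripted. This is a sketch: the target-group ARN below is a hypothetical placeholder (look yours up with `aws elbv2 describe-target-groups`), and the IP is the v2 ClusterIP from Step 1. The command is built and printed first so it can be reviewed before running:

```shell
# Sketch: register the v2 ClusterIP as a target in the new 0%-weight target group.
# TG_ARN is a hypothetical placeholder; find the real one via `aws elbv2 describe-target-groups`.
TG_ARN="arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/v2-teleporter/0123456789abcdef"
V2_CLUSTER_IP="123.10.28.114"   # ClusterIP of the v2 service (Step 1)

REGISTER_CMD="aws elbv2 register-targets --target-group-arn ${TG_ARN} --targets Id=${V2_CLUSTER_IP},Port=8080"
echo "${REGISTER_CMD}"
# eval "${REGISTER_CMD}"
# aws elbv2 describe-target-health --target-group-arn "${TG_ARN}"   # then verify the target turns healthy
```

After registering, give the health checks a minute and confirm the target is healthy before shifting any weight.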
Step 4: Configure v2 Helm for ALB target group ordering (Standby Mode)
Update your v2 Helm configuration in auth-service-v2-dev namespace:
ingress:
  enabled: true
  alb: true
  annotations:
    # ... existing configuration ...
    alb.ingress.kubernetes.io/group.name: dev-internal-alb
+   alb.ingress.kubernetes.io/group.order: "10" # Evaluated before v1's rules; different host, so no conflict
  ingressClassName: dev-alb-public
  hosts:
    - host: auth-service-v2.dev.mycompany.com
      paths:
        - path: /*
          # ... existing configuration ...
Step 5: Apply the Helm changes to both apps and monitor
Apply the Helm changes and monitor the applications. There should not be any impact to the applications or existing traffic.
Migration
Phase 1: Start Canary (10% traffic to v2)
Change the weight values and apply the Helm configuration to the v1 application:
alb.ingress.kubernetes.io/actions.weighted-routing: |
  {
    "type": "forward",
    "forwardConfig": {
      "targetGroups": [
        {
          "serviceName": "auth-service-v1-dev-auth-service",
          "servicePort": "8002",
+         "weight": 90
        },
        {
          "serviceName": "auth-service-v2-teleporter",
          "servicePort": "8080",
+         "weight": 10
        }
      ]
    }
  }
Phase 2: Increase v2 Traffic (30% traffic to v2)
Same as above:
alb.ingress.kubernetes.io/actions.weighted-routing: |
  {
    "type": "forward",
    "forwardConfig": {
      "targetGroups": [
        {
          "serviceName": "auth-service-v1-dev-auth-service",
          "servicePort": "8002",
+         "weight": 70
        },
        {
          "serviceName": "auth-service-v2-teleporter",
          "servicePort": "8080",
+         "weight": 30
        }
      ]
    }
  }
Phase 3: Balanced split (50% traffic to v2)
Same as above:
alb.ingress.kubernetes.io/actions.weighted-routing: |
  {
    "type": "forward",
    "forwardConfig": {
      "targetGroups": [
        {
          "serviceName": "auth-service-v1-dev-auth-service",
          "servicePort": "8002",
+         "weight": 50
        },
        {
          "serviceName": "auth-service-v2-teleporter",
          "servicePort": "8080",
+         "weight": 50
        }
      ]
    }
  }
Phase 4: Majority traffic to v2 (80% traffic to v2)
Same as above:
alb.ingress.kubernetes.io/actions.weighted-routing: |
  {
    "type": "forward",
    "forwardConfig": {
      "targetGroups": [
        {
          "serviceName": "auth-service-v1-dev-auth-service",
          "servicePort": "8002",
+         "weight": 20
        },
        {
          "serviceName": "auth-service-v2-teleporter",
          "servicePort": "8080",
+         "weight": 80
        }
      ]
    }
  }
Phase 5: Full traffic to v2 (100% traffic to v2)
Same as above:
alb.ingress.kubernetes.io/actions.weighted-routing: |
  {
    "type": "forward",
    "forwardConfig": {
      "targetGroups": [
        {
          "serviceName": "auth-service-v1-dev-auth-service",
          "servicePort": "8002",
+         "weight": 0
        },
        {
          "serviceName": "auth-service-v2-teleporter",
          "servicePort": "8080",
+         "weight": 100
        }
      ]
    }
  }
Phase 6: Final DNS Switch
Since we still want to use the existing DNS for v2, we need to change the host value in the v2 namespace to the current production DNS and update the group order relative to v1.
Step 1: Update v2 to take over the production DNS
Change the v2 group order, set the host value to auth-service.dev.mycompany.com, and apply the change.
Update v2 helmfile.yaml
- name: auth-service-v2-production
  namespace: auth-service-v2-dev
  values:
    - applications:
        - name: auth-service-v2
          ingress:
            enabled: true # Now enable v2 ingress
            ingressClassName: dev-alb-public
            annotations:
              alb.ingress.kubernetes.io/group.name: dev-internal-alb
+             alb.ingress.kubernetes.io/group.order: '200' # HIGHER than v1
              # ... other annotations ...
            hosts:
              - host: auth-service.dev.mycompany.com # SAME as v1 production DNS
                paths:
                  - path: /*
                    pathType: ImplementationSpecific
                    backend:
                      service:
                        name: auth-service-v2-dev-auth-service-v2
                        port:
                          name: http
Step 2: Update v1 to Lower priority and Different DNS
Update v1 helmfile.yaml
- name: auth-service
  values:
    - applications:
        - name: auth-service
          ingress:
            annotations:
              alb.ingress.kubernetes.io/group.order: '50' # LOWER than v2
              # Remove the weighted-routing annotation
            hosts:
              - host: auth-service-v1.dev.mycompany.com # DIFFERENT DNS for rollback
Monitor and Observe
Monitor ALB target group health
aws elbv2 describe-target-health --target-group-arn <target-group-arn>
Monitor application metrics
kubectl top pods -n auth-service-v1-dev
kubectl top pods -n auth-service-v2-dev
Check service endpoints
kubectl get endpoints auth-service-v2-teleporter -n auth-service-v1-dev
kubectl get endpoints -n auth-service-v2-dev
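One operational gotcha with hand-written Endpoints: they go stale if the v2 Service is ever recreated and assigned a new ClusterIP, and traffic through the teleporter silently breaks. A small drift check helps; this is a sketch, with the live kubectl lookups left commented out since they need cluster access:

```shell
# Drift check (sketch): the teleporter's manual Endpoints must keep matching the real
# v2 ClusterIP, which changes if the v2 Service is ever deleted and recreated.
check_drift() {
  # $1 = ClusterIP of the real v2 Service, $2 = IP in the teleporter Endpoints
  if [ "$1" = "$2" ]; then
    echo "endpoints in sync"
  else
    echo "DRIFT: proxy=$2 svc=$1 -- re-apply the teleporter Endpoints"
  fi
}

# Live lookups (require cluster access):
# V2_IP=$(kubectl get svc auth-service-v2-dev-auth-service-v2 -n auth-service-v2-dev -o jsonpath='{.spec.clusterIP}')
# PROXY_IP=$(kubectl get endpoints auth-service-v2-teleporter -n auth-service-v1-dev -o jsonpath='{.subsets[0].addresses[0].ip}')
# check_drift "$V2_IP" "$PROXY_IP"

check_drift 123.10.28.114 123.10.28.114   # prints: endpoints in sync
```

Running this periodically (or from CI) during the migration window catches the stale-Endpoints failure mode early.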
Monitor logs
# v1
kubectl logs -n auth-service-v1-dev -l app.kubernetes.io/name=auth-service-v1-dev --tail=100 -f
# v2
kubectl logs -n auth-service-v2-dev -l app.kubernetes.io/name=auth-service-v2-dev --tail=100 -f
Rollback
The beauty of this approach is the instant rollback capability. Simply flip the group order priorities:
1. Change the v1 Helmfile group order and weights back to v1
Here we move v1's group order ahead of v2's (a lower order number is evaluated first) and route traffic back to the service in the v1 namespace. Applying this immediately rolls the application back to the v1 namespace.
name: auth-service
# ... existing configuration ...
ingress:
  enabled: true
  annotations:
    # ... existing configuration ...
    alb.ingress.kubernetes.io/target-group-attributes: slow_start.duration_seconds=120,deregistration_delay.timeout_seconds=30
    alb.ingress.kubernetes.io/target-type: ip
+   alb.ingress.kubernetes.io/group.order: "60" # Lower order than v2's, so v1's rule is evaluated first
    alb.ingress.kubernetes.io/actions.weighted-routing: |
      {
        "type": "forward",
        "forwardConfig": {
          "targetGroups": [
            {
              "serviceName": "auth-service-v1-dev-auth-service",
              "servicePort": "8002",
+             "weight": 100
            },
            {
              "serviceName": "auth-service-v2-teleporter",
              "servicePort": "8080",
+             "weight": 0
            }
          ]
        }
      }
  hosts:
    - host: auth-service.dev.mycompany.com
      paths:
        - path: /*
          pathType: ImplementationSpecific
          backend:
            service:
+             name: weighted-routing
              port:
+               name: use-annotation
2. Change the v2 Helmfile group order and host back to the v2 DNS
We also adjust v2's group order and change its host value (DNS) back, to make sure the two ingresses don't conflict.
ingress:
  enabled: true
  alb: true
  annotations:
    # All annotations below configure the target group, not the ALB itself
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80},{"HTTPS":443}]'
    # Protocol for target group health check and routing
    alb.ingress.kubernetes.io/backend-protocol: HTTP
    alb.ingress.kubernetes.io/healthcheck-path: "/actuator/health/readiness"
    alb.ingress.kubernetes.io/healthcheck-port: "9000"
+   alb.ingress.kubernetes.io/group.order: '40'
    # Health check success codes
    alb.ingress.kubernetes.io/success-codes: "200-301"
    alb.ingress.kubernetes.io/target-group-attributes: "slow_start.duration_seconds=120,deregistration_delay.timeout_seconds=30"
    alb.ingress.kubernetes.io/target-type: ip
    # Name of the shared ingress group
    alb.ingress.kubernetes.io/group.name: dev-internal-alb
  ingressClassName: dev-alb-public
  hosts:
+   - host: auth-service-v2.dev.mycompany.com
      paths:
        - path: /*
          pathType: ImplementationSpecific
          backend:
            service:
              name: auth-service-v2-dev-auth-service-v2
              port:
                name: http
Post-Migration Cleanup
After a successful migration and DNS switch, perform the following steps to clean up the existing v1:
Remove v1 resources from EKS (ingress & deployment)
Scale down v1 deployment
kubectl scale deployment auth-service-v1-dev-auth-service -n auth-service-v1-dev --replicas=0
Delete ingress
kubectl delete ingress auth-service-weighted-ingress -n auth-service-v1-dev
Remove the proxy service (no longer needed)
kubectl delete service auth-service-v2-teleporter -n auth-service-v1-dev
kubectl delete endpoints auth-service-v2-teleporter -n auth-service-v1-dev
Update the monitoring and alerting dashboards to point to the v2 namespace
Keep an eye on traffic and dashboards for any anomalies
Verification commands
Before migration
while true; do
echo -n "$(date '+%H:%M:%S') - "
curl -s -w " Status: %{http_code}\n" --location 'https://auth-service.dev.mycompany.com/ping'
sleep 1
done
15:00:52 - {"V1":true} Status: 200
15:00:52 - {"V1":true} Status: 200
15:00:52 - {"V1":true} Status: 200
15:00:52 - {"V1":true} Status: 200
During migration
while true; do
echo -n "$(date '+%H:%M:%S') - "
curl -s -w " Status: %{http_code}\n" --location 'https://auth-service.dev.mycompany.com/ping'
sleep 1
done
15:00:52 - {"V1":true} Status: 200
15:00:52 - {"V1":true} Status: 200
15:00:52 - {"V1":true} Status: 200
15:00:52 - {"V2":true} Status: 200 <== occasional response from v2: weighted traffic distribution
15:00:52 - {"V2":true} Status: 200 <== occasional response from v2: weighted traffic distribution
15:00:52 - {"V1":true} Status: 200
15:00:52 - {"V1":true} Status: 200
15:00:52 - {"V1":true} Status: 200
After Migration
while true; do
echo -n "$(date '+%H:%M:%S') - "
curl -s -w " Status: %{http_code}\n" --location 'https://auth-service.dev.mycompany.com/ping'
sleep 1
done
15:00:52 - {"V2":true} Status: 200
15:00:52 - {"V2":true} Status: 200
15:00:52 - {"V2":true} Status: 200
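To turn those sampled responses into an observed split you can compare against the configured weights, a small awk pass over the loop's output is enough. This is a sketch with sample data inlined; in practice, tee the curl loop's output to a file and point awk at that:

```shell
# Count V1 vs V2 responses and compute the observed v2 percentage.
# Sample data inlined for illustration; replace with your real captured output.
cat <<'EOF' > /tmp/ping-samples.txt
15:00:52 - {"V1":true} Status: 200
15:00:53 - {"V1":true} Status: 200
15:00:54 - {"V2":true} Status: 200
15:00:55 - {"V1":true} Status: 200
EOF

awk '/"V1"/ {v1++} /"V2"/ {v2++} END {printf "v1=%d v2=%d v2_pct=%.0f%%\n", v1, v2, 100*v2/(v1+v2)}' /tmp/ping-samples.txt
# -> v1=3 v2=1 v2_pct=25%
```

With enough samples, the observed v2 percentage should converge toward the weight configured in the current phase; a large, persistent gap is a signal to check the target group before shifting more traffic.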
Key Benefits
True Zero Downtime: Traffic never stops flowing during migration
Gradual Risk Mitigation: Start with 10% traffic and observe
Instant Rollback: One configuration change reverts everything
Cross-Namespace Support: Works despite ALB controller limitations
Production Battle-Tested: Handles real-world complexity
Lessons Learned
ExternalName services don't work with ALB - manual endpoints are required
Group order is crucial for ALB rule precedence
Monitor target group health throughout the process
Keep rollback DNS ready for emergency situations
Clean up proxy services after successful migration
Conclusion
This teleporter proxy pattern solves a real limitation in Kubernetes networking while providing the safety and control needed for production migrations. The approach scales to any cross-namespace migration scenario and provides the operational confidence teams need when moving critical services.
The combination of weighted routing, namespace bridging, and priority-based rollback creates a robust migration framework that minimizes risk while maximizing control. Whether you're migrating between versions, namespaces, or even clusters, these patterns provide a solid foundation for zero-downtime operations.
Note: This is not the only solution; there are many ways to achieve the same thing. As the saying goes, "there is more than one way to fry a fish." Given our circumstances and setup, however, this implementation delivered the desired result. For example, this is much easier if your cluster has the Gateway API or a service mesh like Istio deployed; you could also do the same with a separate ALB. All of these have their pros and cons. For our case, this approach was practical and safe for a one-time migration, and required no major infra changes.
Have you implemented similar cross-namespace migration patterns? Share your experiences and alternative approaches in the comments below.