Fix HPA race condition by reading deployment replicas instead of HPA status #4214
base: main
Conversation
…status

When HPA is enabled, read the current Deployment.Spec.Replicas directly instead of HPA.Status.DesiredReplicas, which is eventually consistent and lags behind deployment changes. This prevents the controller from overwriting the HPA's replica count with stale values, eliminating pod churn and connection drops.

Fixes the race condition where HPA scales down → NGF reads stale HPA status → NGF overwrites the deployment with the old replica count → pods restart.
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #4214      +/-   ##
==========================================
+ Coverage   86.02%   86.04%   +0.02%
==========================================
  Files         131      131
  Lines       14111    14120       +9
  Branches       35       35
==========================================
+ Hits        12139    12150      +11
+ Misses       1770     1769       -1
+ Partials      202      201       -1
func (p *NginxProvisioner) determineReplicas(
	objectMeta metav1.ObjectMeta,
	deploymentCfg ngfAPIv1alpha2.DeploymentSpec,
) *int32 {
	replicas := deploymentCfg.Replicas

	if !isAutoscalingEnabled(&deploymentCfg) {
		return replicas
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	hpa := &autoscalingv2.HorizontalPodAutoscaler{}
	err := p.k8sClient.Get(ctx, types.NamespacedName{
		Namespace: objectMeta.Namespace,
		Name:      objectMeta.Name,
	}, hpa)
	if err != nil {
		return replicas
	}

	existingDeployment := &appsv1.Deployment{}
	err = p.k8sClient.Get(ctx, types.NamespacedName{
		Namespace: objectMeta.Namespace,
		Name:      objectMeta.Name,
	}, existingDeployment)

	if err == nil && existingDeployment.Spec.Replicas != nil {
		replicas = existingDeployment.Spec.Replicas
	}

	return replicas
}
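
For orientation, a minimal usage sketch showing where this helper might be called when building the desired Deployment. buildNginxDeployment is hypothetical and not taken from the NGF source; it only illustrates that the Deployment's Replicas field is populated from determineReplicas.

// Hypothetical call site (sketch only): the desired Deployment takes its
// replica count from determineReplicas, so an HPA-managed deployment keeps
// whatever replica count the HPA last wrote to the cluster.
func (p *NginxProvisioner) buildNginxDeployment(
	objectMeta metav1.ObjectMeta,
	deploymentCfg ngfAPIv1alpha2.DeploymentSpec,
) *appsv1.Deployment {
	return &appsv1.Deployment{
		ObjectMeta: objectMeta,
		Spec: appsv1.DeploymentSpec{
			Replicas: p.determineReplicas(objectMeta, deploymentCfg),
			// Selector, pod template, etc. omitted from this sketch.
		},
	}
}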
|
|
I like this approach, but I was wondering if there is a way to not set spec.replicas at all when HPA is enabled (or set it initially to the config value, and once an HPA is detected, stop letting the controller manage scaling for the deployment) and document that, since reconciliation is what leads us to the race issue. I'd like to understand the limitations of that approach. What I was thinking is: don't patch replicas on the deployment when HPA is enabled.
The tricky part would be the HPA's deletion from the cluster: the controller would need to detect it and fall back.
And I'm assuming that if an HPA was once enabled and in use and is then deleted, we patch the deployment with that information and default back to the replica count in the config?
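
A rough sketch of the alternative being suggested here, under stated assumptions: hpaExists is a hypothetical helper that checks for the HPA by name, and the provisioner applies the Deployment with a strategic merge patch or server-side apply, where a nil Replicas leaves the field unmanaged (with a full Update, a nil Replicas would instead default back to 1, which is one of the limitations worth understanding).

// Hypothetical variant (sketch only): never manage spec.replicas while an HPA
// owns scaling. If the HPA is later deleted, the next reconcile sees
// hpaExists == false and falls back to the replica count from the config.
func (p *NginxProvisioner) desiredReplicas(
	ctx context.Context,
	objectMeta metav1.ObjectMeta,
	deploymentCfg ngfAPIv1alpha2.DeploymentSpec,
) *int32 {
	if isAutoscalingEnabled(&deploymentCfg) && p.hpaExists(ctx, objectMeta) {
		return nil // leave the field to the HPA
	}
	return deploymentCfg.Replicas
}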
// HPA Replicas Management Strategy:
//
// When an HPA is managing a deployment, we must read the current deployment's replicas
// from the cluster and use that value, rather than trying to set our own value or read
// from HPA.Status.DesiredReplicas (which is eventually consistent and stale).
//
// Why we can't use HPA.Status.DesiredReplicas:
// - HPA.Status updates lag behind Deployment.Spec.Replicas changes
// - When HPA scales down: HPA writes Deployment.Spec → then updates its own Status
// - If we read Status during this window, we get the OLD value and overwrite HPA's new value
// - This creates a race condition causing pod churn
//
// Our approach:
// - When HPA exists: Read current deployment replicas from the cluster and use that
// - When HPA doesn't exist yet: Set replicas for initial deployment creation
// - When HPA exists but the Deployment doesn't exist yet: Set replicas for initial deployment creation
// - When HPA is disabled: Set replicas normally.
nit: I feel like this is more of a function implementation detail and should go inside the function.
I like this as a good overview before jumping into the details, since it's all so intertwined. Outside the function makes more sense to me.
Have you verified that the situation described in the bug report is resolved?
Yeah, we should also try to bottleneck the reconciliation/patch logic and see how that goes (whether stability is eventually reached), since this was a bigger pain point for prod environments.
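
One way to "bottleneck" reconciliation as suggested above would be an event filter that ignores Deployment updates whose only difference is spec.replicas, so HPA-driven scaling does not immediately trigger a re-patch. This is an illustrative controller-runtime sketch, not NGF's actual event filtering:

import (
	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/api/equality"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// ignoreReplicaOnlyChanges skips Deployment update events whose only change is
// spec.replicas (e.g. the HPA scaling), so the controller does not reconcile
// and re-patch the deployment in response to scaling alone.
func ignoreReplicaOnlyChanges() predicate.Predicate {
	return predicate.Funcs{
		UpdateFunc: func(e event.UpdateEvent) bool {
			oldDep, okOld := e.ObjectOld.(*appsv1.Deployment)
			newDep, okNew := e.ObjectNew.(*appsv1.Deployment)
			if !okOld || !okNew {
				return true // not a Deployment; let other predicates decide
			}
			oldSpec := oldDep.Spec.DeepCopy()
			newSpec := newDep.Spec.DeepCopy()
			oldSpec.Replicas = nil
			newSpec.Replicas = nil
			// Reconcile only if something other than replicas changed.
			return !equality.Semantic.DeepEqual(oldSpec, newSpec)
		},
	}
}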
Proposed changes
Problem: When autoscaling.enable: true is configured in the Helm chart, the NGF controller updates the deployment and modifies the spec.replicas field in conflict with the HPA. This causes the deployment to scale up and down in the same second, resulting in constant pod churn and preventing the HPA from scaling up or down consistently.
Solution: When HPA is enabled, read the current Deployment.Spec.Replicas directly instead of HPA.Status.DesiredReplicas, which is eventually consistent and lags behind deployment changes. This prevents the controller from overwriting HPA's replica count with stale values, eliminating pod churn and connection drops.
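
For contrast, a reconstructed sketch of the pre-fix behavior described above (based on this description only, not the exact original code): the replica count came from the HPA's status, which is eventually consistent and lags the deployment mid-scale.

// Reconstructed pre-fix logic (sketch): using HPA.Status.DesiredReplicas as
// the source of truth reads a stale value while the HPA is mid-scale, and the
// subsequent patch undoes the HPA's change.
func staleReplicasFromHPAStatus(
	ctx context.Context,
	c client.Client,
	nsName types.NamespacedName,
	fallback *int32,
) *int32 {
	hpa := &autoscalingv2.HorizontalPodAutoscaler{}
	if err := c.Get(ctx, nsName, hpa); err != nil {
		return fallback
	}
	desired := hpa.Status.DesiredReplicas // lags Deployment.Spec.Replicas
	return &desired
}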
Testing: Unit and local testing
Please focus on (optional): If there are any specific areas where you would like reviewers to focus their attention or provide specific feedback, add them here.
Closes #4007
Checklist
Before creating a PR, run through this checklist and mark each as complete.
Release notes
If this PR introduces a change that affects users and needs to be mentioned in the release notes,
please add a brief note that summarizes the change.