Description
Which component are you using?:
Cluster-autoscaler
What version of the component are you using?:
Component version: 1.32.2
What k8s version are you using (kubectl version)?:
kubectl version Output:
$ kubectl version
Server Version: v1.32.1
What environment is this in?:
OCI
What did you expect to happen?:
When a node pool is marked status: Unhealthy because OCI reports Out of Host Capacity, the cluster autoscaler should stop trying to schedule Pending pods on that node pool and fall back to a different node pool with lower priority.
What happened instead?:
The cluster autoscaler keeps trying to schedule the Pending pod onto a template node for an upcoming node that is never actually created. As seen in the logs:
Pod can be moved to template-node-for--upcoming-0
The same node pool is set to Unhealthy in the cluster-autoscaler-status ConfigMap because OCI is out of capacity, and the autoscaler cannot remove the upcoming node since OCI has not yet assigned it an instance ID:
Found 1 instances with errorCode OutOfResource.InternalError in nodeGroup
Deleting 1 from node group because of create errors
Error while trying to delete nodes from: Node doesn't have an instance id so it can't be deleted.
How to reproduce it (as minimally and precisely as possible):
1. In an OKE cluster, create two node pools with different instance types, where one node pool's shape is out of host capacity on OCI.
2. Configure the cluster autoscaler so that the out-of-capacity node pool has the higher priority (a sketch of the assumed priority configuration follows these steps).
3. Create new pods that need to be scheduled on these node pools, then wait for the cluster autoscaler to mark the out-of-capacity node pool as Unhealthy without scaling up the lower-priority node pool.
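For reference, the setup above presumably relies on the priority expander; below is a minimal sketch of a cluster-autoscaler-priority-expander ConfigMap expressing it. The node pool name patterns are placeholders, and higher keys mean higher priority.

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    # Higher value = higher priority: the out-of-capacity node pool is tried first
    20:
      - .*out-of-capacity-pool.*
    # Lower-priority fallback node pool the autoscaler should switch to
    10:
      - .*fallback-pool.*

This assumes the cluster autoscaler is started with --expander=priority so the ConfigMap is consulted during scale-up.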
Anything else we need to know?:
Not sure if it changes the behaviour, but the node pool is trying to scale up from 0 nodes.