Skip to content

Conversation

@mariafromano-25
Copy link
Contributor

/kind cleanup

Continuing the efforts of uber Issue: kubernetes/kubernetes#134172
complementary to the changes in kubernetes/kubernetes#134924

  • Removes NodeSwap from the --focus-regex, --skip-regex, --test-args, and --ginko.skip of jobs that test alpha/beta features.
  • feature-gates will be addressed separately
  • unsure if swap test lanes should be completely removed as well such as pull-kubernetes-node-swap-ubuntu-serial, or pull-kubernetes-node-swap-fedora-serial since there are already conformance tests like pull-kubernetes-node-swap-conformance-ubuntu-serial. Will keep them in the code unless advised otherwise.

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/config Issues or PRs related to code in /config size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. area/jobs labels Oct 28, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @mariafromano-25. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the area/release-eng Issues or PRs related to the Release Engineering subproject label Oct 28, 2025
@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Oct 28, 2025
@mariafromano-25
Copy link
Contributor Author

/assign @SergeyKanzhelev

@k8s-ci-robot k8s-ci-robot added the sig/release Categorizes an issue or PR as relevant to SIG Release. label Oct 28, 2025
@k8s-ci-robot k8s-ci-robot added the sig/testing Categorizes an issue or PR as relevant to SIG Testing. label Oct 28, 2025
@kannon92
Copy link
Contributor

/hold

kubernetes/kubernetes#134924

We should keep the swap features alone as this should not run in all lanes (swap is not enabled on all nodes).

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 28, 2025
- '--node-test-args=--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-process-name=/usr/bin/containerd --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/containerd.service"'
- --node-tests=true
- --provider=gce
- --test_args=--nodes=8 --focus="\[Feature:.+\]|\[Feature\]" --skip="\[Flaky\]|\[Serial\]|\[Feature:NodeSwap\]|\[Feature:InPlacePodVerticalScaling\]|\[Feature:UserNamespacesSupport]|\[Feature:UserNamespacesPodSecurityStandards]|\[Feature:KubeletCredentialProviders]|\[Feature:LockContention\]|\[Feature:SCTPConnectivity\]|\[Alpha\]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to keep swap exclusion here. When all features will mean "special environment is needed" than this whole test lane can be removed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now please revert this change

- '--node-test-args=--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-process-name=/usr/bin/containerd --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/containerd.service"'
- --node-tests=true
- --provider=gce
- --test_args=--nodes=8 --focus="\[NodeConformance\]" --skip="\[Flaky\]|\[Slow\]|\[Serial\]|\[Feature:NodeSwap\]|\[Alpha\]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how we ended up with this combination? Are there any tests that are marked as nodeConformance AND NodeSwap at the same time? If so - let's clean those up.

And yes, this clean up is accurate and must be done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I found 5 tests that list both. In this case, I will remove the [NodeConfromance] label from them since the Swap feature indeed needs a special env.

`./_output/local/go/bin/e2e_node.test --list-tests | grep "[NodeConformance]" | grep "[Feature:NodeSwap]"

k8s.io/kubernetes/test/e2e_node/swap_test.go:89: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with a critical pod - should avoid swap
k8s.io/kubernetes/test/e2e_node/swap_test.go:83: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with configuration QOS Best-effort
k8s.io/kubernetes/test/e2e_node/swap_test.go:84: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with configuration QOS Burstable
k8s.io/kubernetes/test/e2e_node/swap_test.go:85: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with configuration QOS Burstable with memory request equals to limit
k8s.io/kubernetes/test/e2e_node/swap_test.go:86: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with configuration QOS Guaranteed`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe leave as is since the test in swap_test.go might intend to run in a standard conformance environment (where swap is disabled) and verify that pods configured with swap-related settings (ex: QoS) still behave correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its okay to apply @SergeyKanzhelev request to k/k.

We can probably remove NodeConformance from the swap_test.go labels.

- --image-family=cos-stable
- --image-project=cos-cloud
- --provider=gce
- --test_args=--ginkgo.skip=\[Driver:.gcepd\]|\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]|\[Feature:NodeSwap\] --minStartupPods=8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good clean up. The [Feature:.+\] already covers the [Feature:NodeSwap\]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I have applied that similar logic to the redundant labels above^

- '--node-test-args=--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-process-name=containerd --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/containerd.service" --extra-log="{\"name\": \"containerd.log\", \"journalctl\": [\"-u\", \"containerd*\"]}"'
- --node-tests=true
- --provider=gce
- --test_args=--nodes=1 --timeout=4h --focus="\[Serial\]" --skip="\[Flaky\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:NodeSwap\]|\[Feature:DynamicResourceAllocation\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a very strange test job. Removing this feature here will likely break the job. Can you check how many Feature tests this job runs? I think ideally it should skip all features or just focus with NodeConformance. But we may not be able to make this clean up just yet before cleaning other features

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so based on the test filters, it runs 424 tests that are marked with a [Feature] tag.
For reference, I used this command to list and count the tests:
./_output/local/go/bin/e2e_node.test --list-tests --ginkgo.focus="\[Serial\]" --ginkgo.skip="\[Flaky\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:NodeSwap\]|\[Feature:DynamicResourceAllocation\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\]" | grep "\[Feature" | wc -l
Given that there are so many running, I can wait on the clean up until other features to not break anything.

- --gcp-zone=us-central1-b
- --ginkgo-parallel=30
- --provider=gce
# [Driver: gcepd] tests are skipped because CSIMigrationGCE is on in 1.23 and can be un-skipped once the cluster-up process
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment is outdated. Please remove (can be separate PR)

- --provider=gce
# [Driver: gcepd] tests are skipped because CSIMigrationGCE is on in 1.23 and can be un-skipped once the cluster-up process
# uses cloud-provider-gcp. see issue https://github.com/kubernetes/cloud-provider-gcp/issues/293
- --test_args=--ginkgo.skip=\[Slow\]|\[Serial\]|\[Disruptive\]|\[Driver:.gcepd\]|\[Flaky\]|\[Feature:.+\]|\[Feature:RuntimeHandler\]|\[Feature:NodeSwap\] --minStartupPods=8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no-op clean up since the [Feature:.+\]| already skips NodeSwap

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove other features here as well

- --provider=gce
# *Manager jobs are skipped because they have corresponding test lanes with the right image
# These jobs in serial get partially skipped and are long jobs.
- --test_args=--nodes=1 --timeout=4h --focus="\[Serial\]" --skip="\[Flaky\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:CPUManager\]|\[Feature:MemoryManager\]|\[Feature:TopologyManager\]|\[Feature:NodeSwap\]|\[Feature:DynamicResourceAllocation\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely all the features in https://testgrid.k8s.io/sig-node-presubmits#pr-node-kubelet-serial-containerd soon be cleaned up and we can just skip all Feature+

@SergeyKanzhelev
Copy link
Member

I commented on a first few jobs. Please apply the same logic to others

@mariafromano-25 mariafromano-25 force-pushed the cleanup-swap-test-infra branch from 93789b9 to 8253c24 Compare October 30, 2025 20:21
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 30, 2025
@kannon92
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 30, 2025
- --focus-regex=\[Serial\]
- --use-dockerized-build=true
- --target-build-arch=linux/arm64
- --skip-regex=\[Flaky\]|\[Slow\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:DynamicResourceAllocation\]|\[Feature:NodeSwap\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we likely will need to preserver the NodeSwap skipping here. This job is not setting any swap memory on the node

- --focus-regex=\[Serial\]
- --use-dockerized-build=true
- --target-build-arch=linux/arm64
- --skip-regex=\[Flaky\]|\[Slow\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:DynamicResourceAllocation\]|\[Feature:NodeSwap\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's keep this skip of NodeSwap. Removing it from here will likely result in failing tests

- --focus-regex=\[Serial\]
- --use-dockerized-build=true
- --target-build-arch=linux/arm64
- --skip-regex=\[Flaky\]|\[Slow\]|\[Benchmark\]|\[NodeSpecialFeature:.+\]|\[NodeSpecialFeature\]|\[NodeAlphaFeature:.+\]|\[NodeAlphaFeature\]|\[NodeFeature:Eviction\]|\[Feature:DynamicResourceAllocation\]|\[NodeFeature:NodeSwap\]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in 1.32 branch was it called Feature or NodeFeature? It might have changed

Copy link
Contributor Author

@mariafromano-25 mariafromano-25 Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was called NodeFeature. Should I leave it at that? I just changed it to Feature to keep it consistent with 1.33 and 1.34

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We are trying not to touch the release jobs much - code never changes there so all tags typically stay as-is

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mariafromano-25
Once this PR has been reviewed and has the lgtm label, please ask for approval from sergeykanzhelev. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/config Issues or PRs related to code in /config area/jobs area/release-eng Issues or PRs related to the Release Engineering subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/release Categorizes an issue or PR as relevant to SIG Release. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

Development

Successfully merging this pull request may close these issues.

4 participants