-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Cleanup NodeSwap Feature Labels #35791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Cleanup NodeSwap Feature Labels #35791
Conversation
|
Hi @mariafromano-25. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/assign @SergeyKanzhelev |
|
/hold We should keep the swap features alone as this should not run in all lanes (swap is not enabled on all nodes). |
| - '--node-test-args=--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-process-name=/usr/bin/containerd --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/containerd.service"' | ||
| - --node-tests=true | ||
| - --provider=gce | ||
| - --test_args=--nodes=8 --focus="\[Feature:.+\]|\[Feature\]" --skip="\[Flaky\]|\[Serial\]|\[Feature:NodeSwap\]|\[Feature:InPlacePodVerticalScaling\]|\[Feature:UserNamespacesSupport]|\[Feature:UserNamespacesPodSecurityStandards]|\[Feature:KubeletCredentialProviders]|\[Feature:LockContention\]|\[Feature:SCTPConnectivity\]|\[Alpha\]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to keep swap exclusion here. When all features will mean "special environment is needed" than this whole test lane can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for now please revert this change
| - '--node-test-args=--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-process-name=/usr/bin/containerd --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/containerd.service"' | ||
| - --node-tests=true | ||
| - --provider=gce | ||
| - --test_args=--nodes=8 --focus="\[NodeConformance\]" --skip="\[Flaky\]|\[Slow\]|\[Serial\]|\[Feature:NodeSwap\]|\[Alpha\]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how we ended up with this combination? Are there any tests that are marked as nodeConformance AND NodeSwap at the same time? If so - let's clean those up.
And yes, this clean up is accurate and must be done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I found 5 tests that list both. In this case, I will remove the [NodeConfromance] label from them since the Swap feature indeed needs a special env.
`./_output/local/go/bin/e2e_node.test --list-tests | grep "[NodeConformance]" | grep "[Feature:NodeSwap]"
k8s.io/kubernetes/test/e2e_node/swap_test.go:89: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with a critical pod - should avoid swap
k8s.io/kubernetes/test/e2e_node/swap_test.go:83: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with configuration QOS Best-effort
k8s.io/kubernetes/test/e2e_node/swap_test.go:84: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with configuration QOS Burstable
k8s.io/kubernetes/test/e2e_node/swap_test.go:85: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with configuration QOS Burstable with memory request equals to limit
k8s.io/kubernetes/test/e2e_node/swap_test.go:86: [sig-node] Swap [LinuxOnly] [Feature:NodeSwap] [Serial] [NodeConformance] with configuration QOS Guaranteed`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe leave as is since the test in swap_test.go might intend to run in a standard conformance environment (where swap is disabled) and verify that pods configured with swap-related settings (ex: QoS) still behave correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its okay to apply @SergeyKanzhelev request to k/k.
We can probably remove NodeConformance from the swap_test.go labels.
| - --image-family=cos-stable | ||
| - --image-project=cos-cloud | ||
| - --provider=gce | ||
| - --test_args=--ginkgo.skip=\[Driver:.gcepd\]|\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]|\[Feature:NodeSwap\] --minStartupPods=8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good clean up. The [Feature:.+\] already covers the [Feature:NodeSwap\]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I have applied that similar logic to the redundant labels above^
| - '--node-test-args=--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-process-name=containerd --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/containerd.service" --extra-log="{\"name\": \"containerd.log\", \"journalctl\": [\"-u\", \"containerd*\"]}"' | ||
| - --node-tests=true | ||
| - --provider=gce | ||
| - --test_args=--nodes=1 --timeout=4h --focus="\[Serial\]" --skip="\[Flaky\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:NodeSwap\]|\[Feature:DynamicResourceAllocation\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a very strange test job. Removing this feature here will likely break the job. Can you check how many Feature tests this job runs? I think ideally it should skip all features or just focus with NodeConformance. But we may not be able to make this clean up just yet before cleaning other features
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, so based on the test filters, it runs 424 tests that are marked with a [Feature] tag.
For reference, I used this command to list and count the tests:
./_output/local/go/bin/e2e_node.test --list-tests --ginkgo.focus="\[Serial\]" --ginkgo.skip="\[Flaky\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:NodeSwap\]|\[Feature:DynamicResourceAllocation\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\]" | grep "\[Feature" | wc -l
Given that there are so many running, I can wait on the clean up until other features to not break anything.
| - --gcp-zone=us-central1-b | ||
| - --ginkgo-parallel=30 | ||
| - --provider=gce | ||
| # [Driver: gcepd] tests are skipped because CSIMigrationGCE is on in 1.23 and can be un-skipped once the cluster-up process |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment is outdated. Please remove (can be separate PR)
| - --provider=gce | ||
| # [Driver: gcepd] tests are skipped because CSIMigrationGCE is on in 1.23 and can be un-skipped once the cluster-up process | ||
| # uses cloud-provider-gcp. see issue https://github.com/kubernetes/cloud-provider-gcp/issues/293 | ||
| - --test_args=--ginkgo.skip=\[Slow\]|\[Serial\]|\[Disruptive\]|\[Driver:.gcepd\]|\[Flaky\]|\[Feature:.+\]|\[Feature:RuntimeHandler\]|\[Feature:NodeSwap\] --minStartupPods=8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is no-op clean up since the [Feature:.+\]| already skips NodeSwap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove other features here as well
| - --provider=gce | ||
| # *Manager jobs are skipped because they have corresponding test lanes with the right image | ||
| # These jobs in serial get partially skipped and are long jobs. | ||
| - --test_args=--nodes=1 --timeout=4h --focus="\[Serial\]" --skip="\[Flaky\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:CPUManager\]|\[Feature:MemoryManager\]|\[Feature:TopologyManager\]|\[Feature:NodeSwap\]|\[Feature:DynamicResourceAllocation\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely all the features in https://testgrid.k8s.io/sig-node-presubmits#pr-node-kubelet-serial-containerd soon be cleaned up and we can just skip all Feature+
|
I commented on a first few jobs. Please apply the same logic to others |
93789b9 to
8253c24
Compare
|
/ok-to-test |
| - --focus-regex=\[Serial\] | ||
| - --use-dockerized-build=true | ||
| - --target-build-arch=linux/arm64 | ||
| - --skip-regex=\[Flaky\]|\[Slow\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:DynamicResourceAllocation\]|\[Feature:NodeSwap\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we likely will need to preserver the NodeSwap skipping here. This job is not setting any swap memory on the node
| - --focus-regex=\[Serial\] | ||
| - --use-dockerized-build=true | ||
| - --target-build-arch=linux/arm64 | ||
| - --skip-regex=\[Flaky\]|\[Slow\]|\[Benchmark\]|\[Feature:Eviction\]|\[Feature:DynamicResourceAllocation\]|\[Feature:NodeSwap\]|\[Feature:HugePages\]|\[Feature:PodLevelResources\] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's keep this skip of NodeSwap. Removing it from here will likely result in failing tests
| - --focus-regex=\[Serial\] | ||
| - --use-dockerized-build=true | ||
| - --target-build-arch=linux/arm64 | ||
| - --skip-regex=\[Flaky\]|\[Slow\]|\[Benchmark\]|\[NodeSpecialFeature:.+\]|\[NodeSpecialFeature\]|\[NodeAlphaFeature:.+\]|\[NodeAlphaFeature\]|\[NodeFeature:Eviction\]|\[Feature:DynamicResourceAllocation\]|\[NodeFeature:NodeSwap\] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in 1.32 branch was it called Feature or NodeFeature? It might have changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was called NodeFeature. Should I leave it at that? I just changed it to Feature to keep it consistent with 1.33 and 1.34
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We are trying not to touch the release jobs much - code never changes there so all tags typically stay as-is
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mariafromano-25 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/kind cleanup
Continuing the efforts of uber Issue: kubernetes/kubernetes#134172
complementary to the changes in kubernetes/kubernetes#134924
--focus-regex,--skip-regex,--test-args, and--ginko.skipof jobs that test alpha/beta features.pull-kubernetes-node-swap-ubuntu-serial, orpull-kubernetes-node-swap-fedora-serialsince there are already conformance tests likepull-kubernetes-node-swap-conformance-ubuntu-serial. Will keep them in the code unless advised otherwise.