Explicitly forget the zero remaining elements in vec::IntoIter::fold().
#148486
base: master
…()`. This seems to help LLVM notice that dropping the elements in the destructor is not necessary.
rustbot has assigned @Mark-Simulacrum.
@bors try @rust-timer queue
Explicitly forget the zero remaining elements in `vec::IntoIter::fold()`.
#[no_mangle]
pub fn vec_into_iter_drop_option(v: vec::Vec<(usize, Option<Bomb>)>) -> usize {
    // CHECK-NOT: panic
    // CHECK-NOT: Bomb$u20$as$u20$core..ops..drop..Drop
suggestion: CHECK-NOT tests are generally quite fragile, being prone to accidentally checking for something that never existed at all, or having the implementation in the standard library change such that a previously-useful test no longer does anything useful.
Thus I would suggest that, for whatever you're checking here, you also write another function that intentionally does trigger what you expect to be absent from this one, with positive CHECKs for the same things that are CHECK-NOTs here.
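A rough sketch of what such a pairing might look like, written as a standalone file rather than as the actual test in this PR. Everything here is an assumption for illustration: the `Bomb` definition, both function names, and the function bodies are hypothetical, and the FileCheck directives have not been run, so whether they pass depends on LLVM's inlining decisions. `#[unsafe(no_mangle)]` is used as the edition-independent spelling of `#[no_mangle]`.

```rust
// Hypothetical stand-in for the test's `Bomb`: dropping one panics, so any
// surviving drop glue for it is easy to spot in the generated IR.
pub struct Bomb;

impl Drop for Bomb {
    #[inline(never)]
    fn drop(&mut self) {
        panic!("a Bomb was dropped");
    }
}

// Negative case: every element is consumed by `fold()` and its `Bomb` is
// forgotten, so the hope is that no `Bomb` drop code remains.
// CHECK-LABEL: @sum_forgetting_bombs
#[unsafe(no_mangle)]
pub fn sum_forgetting_bombs(v: Vec<(usize, Option<Bomb>)>) -> usize {
    // CHECK-NOT: Bomb$u20$as$u20$core..ops..drop..Drop
    v.into_iter().fold(0usize, |acc, (n, bomb)| {
        std::mem::forget(bomb);
        acc.wrapping_add(n)
    })
}

// Positive control: the vector is dropped with its elements intact, so the
// same pattern is expected to appear. If this check ever stops matching, the
// CHECK-NOT above has probably become vacuous as well.
// CHECK-LABEL: @drop_all_control
#[unsafe(no_mangle)]
pub fn drop_all_control(v: Vec<(usize, Option<Bomb>)>) -> usize {
    // CHECK: Bomb$u20$as$u20$core..ops..drop..Drop
    let len = v.len();
    drop(v);
    len
}
```

The control function is the part that guards against symbol-name or mangling changes silently turning the negative check into a no-op.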
Good point. However, I encountered a problem: when I add more tests to this file, LLVM decides that they should call drop_in_place::<IntoIter>() instead of inlining it, defeating the entire point of this change. That’s a sign that this optimization is fragile and more work is needed, I guess. (Or maybe the perf results will show that it’s good despite that.)
I am now trying to write more explicit code that doesn’t rely on inlining to help.
I’ve pushed a new commit 5fa286d that performs explicit forgetting instead of relying on inlining and dead code elimination to achieve anything.
library/alloc/src/vec/into_iter.rs
Outdated
// There are in fact no remaining elements to forget, but by doing this we can avoid
// potentially generating a needless loop to drop the elements that cannot exist at
// this point.
self.forget_remaining_elements();
👍, makes sense to me. It being a consuming method also means that any writes into self that this does should optimize out comparatively easily. (Assuming, at least, that we get dead_on_return on the self parameter properly.)
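To illustrate the idea in the diff above outside of the standard library, here is a minimal toy model. It is not the real `vec::IntoIter`: `ToyIntoIter`, its index-based fields, and the body of `forget_remaining_elements` are assumptions made for this sketch. The point is only that a consuming `fold()` can mark the range as empty before the destructor runs, which costs one assignment.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Toy stand-in for `vec::IntoIter` (hypothetical layout, for illustration).
struct ToyIntoIter<T> {
    ptr: *mut T,  // start of the allocation taken from the Vec
    cap: usize,   // capacity of that allocation
    start: usize, // index of the next element to yield
    end: usize,   // one past the last live element
}

impl<T> ToyIntoIter<T> {
    fn new(v: Vec<T>) -> Self {
        // Take over the buffer without running Vec's destructor.
        let mut v = ManuallyDrop::new(v);
        ToyIntoIter { ptr: v.as_mut_ptr(), cap: v.capacity(), start: 0, end: v.len() }
    }

    /// Marks the remaining range as empty without dropping anything.
    fn forget_remaining_elements(&mut self) {
        self.start = self.end;
    }

    fn fold<B, F: FnMut(B, T) -> B>(mut self, init: B, mut f: F) -> B {
        let mut acc = init;
        while self.start != self.end {
            // SAFETY: `start < end`, so this slot holds a live element that
            // has not been moved out yet.
            let item = unsafe { ptr::read(self.ptr.add(self.start)) };
            self.start += 1;
            acc = f(acc, item);
        }
        // Nothing is left, but saying so explicitly makes it obvious that the
        // drop loop in `Drop::drop` below has zero iterations.
        self.forget_remaining_elements();
        acc
    }
}

impl<T> Drop for ToyIntoIter<T> {
    fn drop(&mut self) {
        unsafe {
            // Drop whatever was not consumed.
            let rest = ptr::slice_from_raw_parts_mut(
                self.ptr.add(self.start),
                self.end - self.start,
            );
            ptr::drop_in_place(rest);
            // Free the allocation; length 0 means no element is dropped twice.
            drop(Vec::from_raw_parts(self.ptr, 0, self.cap));
        }
    }
}
```

For example, `ToyIntoIter::new(vec![1, 2, 3]).fold(0, |a, x| a + x)` sums the elements, and the destructor that runs afterwards only frees the buffer.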
This has the same end goal as the previous commit but does not rely on compiler optimizations to delete the unwanted code; instead it enters a code path which explicitly frees the allocation and forgets the `IntoIter`.
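The approach described in this commit message can be sketched with a toy type as well. Again, this is not the code from the commit: `ToyIntoIter` and the helper name `dealloc_and_forget` are hypothetical. The sketch shows a completion path that frees the buffer directly and then skips the destructor, so no optimization is needed to remove the element-dropping loop.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Toy stand-in for `vec::IntoIter` (hypothetical layout, for illustration).
struct ToyIntoIter<T> {
    ptr: *mut T,
    cap: usize,
    start: usize,
    end: usize,
}

impl<T> ToyIntoIter<T> {
    fn new(v: Vec<T>) -> Self {
        let mut v = ManuallyDrop::new(v);
        ToyIntoIter { ptr: v.as_mut_ptr(), cap: v.capacity(), start: 0, end: v.len() }
    }

    /// Hypothetical helper: to be called only once every element is consumed.
    /// Frees the allocation and never runs `Drop::drop` for the iterator.
    fn dealloc_and_forget(self) {
        debug_assert_eq!(self.start, self.end);
        // Wrapping in ManuallyDrop suppresses the destructor of `self`.
        let this = ManuallyDrop::new(self);
        // SAFETY: `ptr` and `cap` came from a Vec; length 0 drops no elements.
        unsafe { drop(Vec::from_raw_parts(this.ptr, 0, this.cap)) };
    }

    fn fold<B, F: FnMut(B, T) -> B>(mut self, init: B, mut f: F) -> B {
        let mut acc = init;
        while self.start != self.end {
            // SAFETY: `start < end`, so this slot holds a live element.
            let item = unsafe { ptr::read(self.ptr.add(self.start)) };
            self.start += 1;
            acc = f(acc, item);
        }
        // Explicit completion path: no drop loop is ever reached from here.
        self.dealloc_and_forget();
        acc
    }
}

impl<T> Drop for ToyIntoIter<T> {
    // Still needed for iterators that are dropped early or during unwinding.
    fn drop(&mut self) {
        unsafe {
            let rest = ptr::slice_from_raw_parts_mut(
                self.ptr.add(self.start),
                self.end - self.start,
            );
            ptr::drop_in_place(rest);
            drop(Vec::from_raw_parts(self.ptr, 0, self.cap));
        }
    }
}
```

If the closure passed to `fold()` panics, the regular destructor still runs and drops the elements that were not yet consumed, so the explicit path only replaces the normal-completion case.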
Finished benchmarking commit (ae97583): comparison URL.
Overall result: ❌✅ regressions and improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary 3.3%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (secondary -2.7%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (primary 0.0%, secondary 0.1%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 473.413s -> 473.632s (0.05%)
@bors try @rust-timer queue
Awaiting bors try build completion.
@rustbot label: +S-waiting-on-perf
Queued 48bf163 with parent 53efb3d, future comparison URL.
[Original description:]
This seems to help LLVM notice that dropping the elements in the destructor of `IntoIter` is not necessary. In cases it doesn’t help, it should be cheap since it is just one assignment.
This PR adds a function to `vec::IntoIter()` which is used by `fold()` and `spec_extend()`, when those operations complete, to forget the zero remaining elements and only deallocate the allocation, ensuring that there will never be a useless loop to drop zero remaining elements when the iterator is dropped.
This is my first ever attempt at this kind of codegen micro-optimization in the standard library, so please let me know what should go into the PR or what sort of additional systematic testing might indicate this is a good or bad idea.