-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
Optimize BundleInserter::insert compile time #20647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
james7132
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some comparative benchmarks to at least show that this isn't regressing runtime performance.
| let new_archetype = &*new_archetype; | ||
| // SAFETY: We have no outstanding mutable references to world as they were dropped | ||
| let mut deferred_world = unsafe { self.world.into_deferred() }; | ||
| let deferred_world = unsafe { self.world.into_deferred() }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sparse_sets and table aren't dropped yet here. You probably need to put the above code into a block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be ok? The borrow checker might be smart enough to know that they aren't used after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, its fine, but im gonna make it explicit anyways
|
The compile time win seems to be variable per platform, its not as much as win on Windows as it is on Mac it seems. Additional testing would be appreciated |
|
Running on windows seems to be pretty in the noise with default number of jobs (this pr: 2m01s,2m04s, 2m03s) vs (main: 2m05s, 2m01s, 2m04s). Gain about 14s with 1 job: main: 23m 20s; pr: 23m 06s. Probably bottlenecked by some other crate. I seem to be getting regressions in the perf |
|
Wonder if a cfg-attr profile=release inline(always) would fix it |
seems to have helped. A few of the benches are actually faster now and the others seem to be in the noise. |
|
also ran |
Are the compilation speed gains still present with that on? |
Moreover nowadays inlining is also performed at the MIR level, not just at the LLVM level. |
Objective
Solution
Main: (92s total)

This PR: (82s total)
