⚡️ Speed up function _calculate_npmi_core by 19%
#168
📄 19% (0.19x) speedup for _calculate_npmi_core in mlflow/store/analytics/trace_correlation.py
⏱️ Runtime: 142 microseconds → 120 microseconds (best of 90 runs)
📝 Explanation and details
The optimized code achieves the 19% speedup through three micro-optimizations that reduce Python interpreter overhead:
What was optimized:
- Replaced the nested max(-1.0, min(1.0, npmi)) call (which creates intermediate values) with a faster if/elif/else branch that directly returns the clamped value.
- Changed -(log_n11 - log_N) to log_N - log_n11, eliminating the unary negation operation.
- Introduced log_n11_plus_log_N and log_n1_plus_log_n2 to reduce repeated arithmetic operations in the PMI calculation.
Why it's faster:
The original max(-1.0, min(1.0, npmi)) makes two built-in function calls and creates an intermediate value, while the optimized branching logic performs direct comparisons. Python's function call overhead is significant for such simple operations. The line profiler shows this optimization saves ~30 nanoseconds per call (from 41127 ns to various branch times).
Performance characteristics from tests:
- test_basic_independent_events (28.9% faster)
Impact on workloads:
Given that _calculate_npmi_core is called from calculate_npmi_from_counts and calculate_smoothed_npmi in trace correlation analysis, this optimization will provide meaningful speedups for MLflow's analytics pipeline, especially when processing large numbers of trace correlations where this function is called repeatedly.
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
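As a rough illustration of the kind of equivalence check a regression test for this change relies on, the sketch below compares a max/min-clamped NPMI computation against a branching variant that also applies the two arithmetic rewrites from the explanation. The function names npmi_reference and npmi_branching, the (n11, n1, n2, N) signature, and the sample counts are assumptions made for illustration; this is not the MLflow source or the generated test suite.

```python
import math
import timeit


def npmi_reference(n11: float, n1: float, n2: float, N: float) -> float:
    """Hypothetical baseline. Assumes 0 < n11 < N and n1, n2 > 0."""
    log_n11, log_N = math.log(n11), math.log(N)
    log_n1, log_n2 = math.log(n1), math.log(n2)
    pmi = (log_n11 + log_N) - (log_n1 + log_n2)
    denominator = -(log_n11 - log_N)  # unary negation of a difference
    npmi = pmi / denominator
    return max(-1.0, min(1.0, npmi))  # two built-in calls per clamp


def npmi_branching(n11: float, n1: float, n2: float, N: float) -> float:
    """Hypothetical optimized variant under the same assumptions."""
    log_n11, log_N = math.log(n11), math.log(N)
    # Named intermediate sums, mirroring the identifiers in the explanation.
    log_n11_plus_log_N = log_n11 + log_N
    log_n1_plus_log_n2 = math.log(n1) + math.log(n2)
    # log_N - log_n11 equals -(log_n11 - log_N) without the negation step.
    npmi = (log_n11_plus_log_N - log_n1_plus_log_n2) / (log_N - log_n11)
    # Clamp to [-1, 1] with plain comparisons instead of max/min.
    if npmi > 1.0:
        return 1.0
    elif npmi < -1.0:
        return -1.0
    else:
        return npmi


# Illustrative counts only: independent, perfectly associated, and rare pairs.
SAMPLE_COUNTS = [(25, 50, 50, 100), (10, 10, 10, 100), (1, 50, 50, 100)]

if __name__ == "__main__":
    for counts in SAMPLE_COUNTS:
        ref, opt = npmi_reference(*counts), npmi_branching(*counts)
        assert math.isclose(ref, opt, rel_tol=1e-12, abs_tol=1e-12)
        print(counts, opt)
    # Timings vary by machine and Python version; no figures are implied.
    for fn in (npmi_reference, npmi_branching):
        elapsed = timeit.timeit(lambda: fn(25, 50, 50, 100), number=200_000)
        print(fn.__name__, f"{elapsed:.3f}s")
```

Running the script prints the NPMI value for each sample and a wall-clock timing for each variant, which gives a quick local sanity check before relying on the reported numbers.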
To edit these changes, run git checkout codeflash/optimize-_calculate_npmi_core-mhx1tj3f and push.