-
Notifications
You must be signed in to change notification settings - Fork 83
Add kw arg to normalize kernel in distance weights. #791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #791 +/- ##
=======================================
+ Coverage 85.4% 85.4% +0.1%
=======================================
Files 150 151 +1
Lines 15971 15995 +24
=======================================
+ Hits 13632 13661 +29
+ Misses 2339 2334 -5
🚀 New features to boost your workflow:
|
|
Can we get it to |
Kernels in PySALThis note collects issues, background, and proposals for refactoring the treatment of kernels in PySAL, especially in relation to bandwidth handling, standardization, and cross-module reuse. BackgroundConceptual OverviewIn nonparametric statistics, kernels are generally used for smoothing. At a given focal point, a weighted average of surrounding observations is computed, with weights that decrease with distance from the focal point. The kernel function defines the distance decay pattern of these weights. In spatial analysis, kernels serve multiple purposes, which can be classified into three conceptual cases:
Current ImplementationLocation: Notation
Kernel TableA
Derivations: Do Kernels Integrate to 1?Triangular✅ Integrates to 1 Uniform✅ Integrates to 1 Quadratic (Epanechnikov)✅ Integrates to 1 Quartic (Biweight)✅ Integrates to 1 Gaussian✅ Integrates to 1 (by definition of the standard normal distribution) Why Set
|
|
One thing we need to figure out is handling of diagonal filling as a user typically does not know the value of |
|
100% w/ the In this pattern, when a weight is built & kernel calculated, we set the |
|
We've decided against this |
|
No matter what we do, we need to fix this, so that when distances are calculated, we populate the self-weight with |
Co-authored-by: knaaptime <[email protected]>
for more information, see https://pre-commit.ci
|
@jGaboardi do we want to add |
|
No, min should have the minimal set of deps to test our ability to work only with required deps. We should conditionally skip if pyproj is not available. |
for more information, see https://pre-commit.ci
martinfleis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a weird situation here right now.
The kernels.kernel function supports decay and taper but these are not exposed from within graph builders that use kernels. In graph._kernel we do use the keyword taper but don't pass it to the actual kernel.
I am fine with merging this as is (except that one note in the notebook) to get it in, but then we should ensure this propagates to the actual builders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the Gaussian kernel, a decay parameter controls how quickly weights fall off with distance. A larger decay implies a faster decrease of weights as distance increases.
This implies that decay can be a numeric value but it is a bool. And controls whether the kernel is interpreted as a distance-decay function or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I meant to revise that but missed it.
Let me revise.
for more information, see https://pre-commit.ci
These have now been propagated to the builders. |
| def build_kernel( | ||
| cls, | ||
| data, | ||
| kernel="gaussian", | ||
| k=None, | ||
| bandwidth=None, | ||
| metric="euclidean", | ||
| p=2, | ||
| coplanar="raise", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we expose decay and taper also in here? Because this is the public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same applies to build_distance_band and build_knn build_triangulation and build_travel_cost which` also allow kernels.
But again, happy to keep this for a follow up. However, without these being exposed directly in class methods, they are useful only internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the close review and catching these. Let me add these and ensure we have all kernel related changes incorporated before we merge.

This is responding to the discussion in #790.
The default
$$K(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2}$$
normalize=Truepreserves the current behavior of the Gaussian kernel weights (heavily used inspreg):Setting
$$K(z) = e^{-\frac{1}{2} z^2}$$
normalize=Falseprovides an option to use unnormalized Gaussian weights: