
Improve discussion of array types in "Working With Units" tutorial #2185

@jthielen


What can be better?

The apparent, user-facing failure to attach units to an array through the typically shown "multiplication on the right" is a long-standing issue (almost always with masked arrays) that frequently comes up in support questions (e.g., https://mobile.twitter.com/Nolan_Meister/status/1455629079110209553) and in offhand gripes I've heard about MetPy. While the masked array issue in particular still awaits resolution of numpy/numpy#15200, I think the crux is teaching users to know what kind of array they are working with, and then giving very clear direction on how to handle units with that array type.

Here's what I'd have in mind (but please do suggest improvements/alternatives!):

  1. In "Getting Started," change "In general, using units is only a small step on top of using the numpy.ndarray object." to something like "It's important to know what kind of array you are using units with. Luckily, for each kind of array, using units is just a small step on top of using that object!"

  2. In the next section "Adding Units to Data," replace the first two snippets "The easiest way..." and "It is also possible..." with a table like the following:

How should I add units to my data? That depends on your data type!

| Data Type | Recommended Approach | Example |
| --- | --- | --- |
| Scalars (like floats and integers), numpy.ndarray, sparse.COO, dask.array.Array | Construct Quantity object or multiply by unit | units.Quantity(array, 'hPa') or array * units.hPa |
| xarray.DataArray with a 'units' attribute | 1) If using only as function input, just use as-is; 2) If doing operations outside of MetPy calculations, call .metpy.quantify() after subsetting | 1) mpcalc.geostrophic_wind(ds['geopotential_height_isobaric']); 2) temperature = ds['temperature'].metpy.quantify() |
| xarray.DataArray without a 'units' attribute | Multiply by unit | temperature = ds['temperature'] * units.K |
| numpy.ma.MaskedArray (like those from netcdf4-python) | Construct Quantity object | units.Quantity(array, 'hPa') |

Curious about these differences? Take a look at "{INSERT TITLE HERE}" below to learn more.

  3. Change the "When using masked arrays..." note in "Common Mistakes" to recommend the Quantity constructor and point to the table in (2).

  4. Add a comment in the "Common Mistakes" section to the effect of: "Do you still have a units problem that isn't addressed here? Reach out to support {...}. Unit arrays involve lots of different libraries interacting with each other, so we want to be able to help sort out any issues when those libraries aren't cooperating as expected."

  5. Add a brief explainer after "Common Mistakes" about why there are different ways to add units, for those who are curious (e.g., a much less technical/more applied version of https://pint.readthedocs.io/en/stable/numpy.html#Technical-Commentary)
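
To make the "know what kind of array you have" idea concrete, the table in (2) can be read as a small decision procedure. The following is only a rough sketch of that reading, not anything that exists in MetPy: the function name `recommend_units_approach` and the returned strings are hypothetical, and it identifies array types by class and package name so that none of the array libraries need to be imported.

```python
import numbers


def recommend_units_approach(data):
    """Return the table's suggested way to attach units to ``data``.

    Hypothetical helper for illustration only; not part of MetPy or Pint.
    """
    # Collect (top-level package, class name) for the object's class and all
    # of its base classes, so subclasses are recognized as well.
    kinds = {
        (cls.__module__.split('.')[0], cls.__name__)
        for cls in type(data).__mro__
    }

    if ('xarray', 'DataArray') in kinds:
        # The advice for a DataArray depends on whether it carries 'units'.
        if 'units' in getattr(data, 'attrs', {}):
            return "use as-is in MetPy calculations, or call .metpy.quantify()"
        return "multiply by unit"

    # Check MaskedArray before ndarray: a masked array is also an ndarray.
    if ('numpy', 'MaskedArray') in kinds:
        return "construct Quantity object"

    if (('numpy', 'ndarray') in kinds
            or ('sparse', 'COO') in kinds
            or ('dask', 'Array') in kinds
            or isinstance(data, numbers.Number)):
        return "construct Quantity object or multiply by unit"

    return "unknown data type; ask for support"
```

Checking for masked arrays before plain arrays matters here because `numpy.ma.MaskedArray` subclasses `numpy.ndarray`, which is the same reason users often don't realize which of the two they are holding.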

Labels: Area: Docs (affects documentation), Area: Units (pertains to unit information), Type: Enhancement (enhancement to existing functionality)
