KenTSUI-dev/ppath

PATH Model Post-Processing Tool (ppath)

ppath is a specialized CLI tool designed to process WRF (Weather Research and Forecasting) and CMAQ (Community Multiscale Air Quality) NetCDF output files generated by the PATH Model. It allows users to:

  1. Load multiple NetCDF files lazily (memory efficient).
  2. Calculate new variables using custom formulas.
  3. Export data to ArcGIS Voxel Layer compliant NetCDF files.
  4. Extract time-series data for specific 3D coordinates into CSV format.

Installation

Option 1: Install into a New Conda Environment

  1. Clone the Repository:

    git clone <repository_url>
    cd ppath
  2. Set up the Environment and Install the Package:

    # Create the environment from the provided file
    conda env create -f environment.yml
    
    # Activate the environment
    conda activate ppath

Option 2: Install into an Existing Conda Environment

If you already have a working Conda environment and want to add ppath to it, follow these steps:

  1. Update the dependencies and install the package into your environment:

    conda env update --file environment.yml --prune -n my_env
    
    # Activate the environment
    conda activate my_env

Usage

Once installed, the tool can be run using the ppath command. You need to provide the mode (wrf or cmaq) and the path to a JSON configuration file.

Syntax

conda activate ppath # or my_env    
ppath [mode] [config_path]

Examples

1. Run WRF Pipeline:

ppath wrf examples/wrf_cfg.json

2. Run CMAQ Pipeline:

ppath cmaq examples/cmaq_cfg.json
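
When the pipelines are launched from a script rather than typed by hand, it can help to check the arguments before calling the CLI. The following is a minimal sketch under stated assumptions, not part of ppath itself: `build_command`, `missing_inputs`, and `run_ppath` are hypothetical helper names, and only the `ppath [mode] [config_path]` syntax and the `source` / `input_nc` keys are taken from this README.

```python
import json
import subprocess
from pathlib import Path

VALID_MODES = ("wrf", "cmaq")  # the two modes documented in this README


def build_command(mode, config_path):
    """Return the argument list for `ppath [mode] [config_path]`."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return ["ppath", mode, str(config_path)]


def missing_inputs(config):
    """List every input_nc path in the config that does not exist on disk."""
    return [
        path
        for source in config.get("source", [])
        for path in source.get("input_nc", [])
        if not Path(path).exists()
    ]


def run_ppath(mode, config_path):
    """Pre-check the config, then invoke the installed ppath command."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    missing = missing_inputs(config)
    if missing:
        raise FileNotFoundError(f"missing input files: {missing}")
    # check=True raises CalledProcessError if ppath exits with an error.
    subprocess.run(build_command(mode, config_path), check=True)
```

Calling `run_ppath("wrf", "examples/wrf_cfg.json")` only works inside the activated Conda environment, because it relies on the `ppath` command being on the PATH.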

Configuration File Structure

The tool relies on a JSON configuration file. Below are the specifications for WRF and CMAQ.

1. WRF Configuration Structure

{
  "source": [
    {
      "label": "WRF",
      "input_nc": [
        "D:/data/wrf/wrfout_d01_2025-11-17_00_00_00",
        "D:/data/wrf/wrfout_d01_2025-11-17_01_00_00"
      ]
    }
  ],
  "calc_var": [
    {
      "var": "actual_temperature",
      "unit": "K",
      "formula": "(WRF.T+300) * (((WRF.P+WRF.PB)/100000)**(287/1004))"
    }
  ],
  "voxel_ouput": [
    {
      "output_nc": "D:/output/WRF_voxel.nc",
      "selected_var": ["U", "V", "actual_temperature"],
      "selected_lay": [0, 1, 2],
      "epsg": 102100
    }
  ],
  "ts_ouput": [
    {
      "output_csv": "D:/output/WRF_station.csv",
      "selected_var": ["U", "V", "actual_temperature"],
      "station": [
        {
          "label": "Station_A",
          "west_east": 139,
          "south_north": 49,
          "bottom_top": 0
        }
      ]
    }
  ]
}
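
The `actual_temperature` formula above converts WRF's perturbation potential temperature (`T`, stored relative to 300 K) to air temperature using total pressure (`P + PB`), a reference pressure of 100000 Pa, and the ratio 287/1004. The sketch below evaluates that same formula string on a single hypothetical grid cell. It only illustrates the `Label.Variable` syntax on scalars: the `evaluate` helper and the sample values are assumptions, and the README does not say how ppath evaluates formulas internally.

```python
from types import SimpleNamespace

# Formula string copied from the WRF configuration example.
FORMULA = "(WRF.T+300) * (((WRF.P+WRF.PB)/100000)**(287/1004))"


def evaluate(formula, **sources):
    """Evaluate a formula string against labelled sources (illustration only)."""
    # Builtins are emptied so the expression sees only the labelled sources.
    # This is not a security sandbox; do not use it on untrusted input.
    return eval(formula, {"__builtins__": {}}, sources)


# Hypothetical values: T in K (perturbation), P and PB in Pa.
surface = SimpleNamespace(T=0.0, P=0.0, PB=100000.0)
aloft = SimpleNamespace(T=0.0, P=-500.0, PB=85500.0)

t_surface = evaluate(FORMULA, WRF=surface)  # total pressure equals the reference
t_aloft = evaluate(FORMULA, WRF=aloft)      # lower pressure gives a lower temperature
print(t_surface, t_aloft)
```

At the reference pressure the pressure term equals 1, so the result is simply `T + 300`; at lower pressure the same potential temperature corresponds to a cooler air temperature.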

2. CMAQ Configuration Structure

{
  "source": [
    {
      "label": "ACONC",
      "input_nc": [
          "D:/data/cmaq/CCTM.benchmark.ACONC.d04.2025323_00.ncf",
          "D:/data/cmaq/CCTM.benchmark.ACONC.d04.2025323_04.ncf",
          "D:/data/cmaq/CCTM.benchmark.ACONC.d04.2025323_16.ncf"
      ]
    },
    {
      "label": "APMDIAG",
      "input_nc": [
          "D:/data/cmaq/CCTM.benchmark.APMDIAG.d04.2025323_00.ncf",
          "D:/data/cmaq/CCTM.benchmark.APMDIAG.d04.2025323_04.ncf",
          "D:/data/cmaq/CCTM.benchmark.APMDIAG.d04.2025323_16.ncf"
      ]
    }
  ],
  "calc_var": [
    {
      "var": "PM25_",
      "unit": "ug/m3",
      "formula": "(ACONC.ASO4I+ACONC.ANO3I+ACONC.ANH4I+ACONC.ANAI+ACONC.ACLI +ACONC.AECI+ACONC.ALVPO1I+ACONC.ASVPO1I+ACONC.ASVPO2I+ACONC.APOCI+ACONC.APNCOMI+ACONC.ALVOO1I+ACONC.ALVOO2I+ACONC.ASVOO1I+ACONC.ASVOO2I+ACONC.AOTHRI)*APMDIAG.PM25AT+(ACONC.ASO4J+ACONC.ANO3J+ACONC.ANH4J+ACONC.ANAJ+ACONC.ACLJ+ACONC.AECJ+ACONC.ALVPO1J+ACONC.ASVPO1J+ACONC.ASVPO2J+ACONC.APOCJ+ACONC.ASVPO3J+ACONC.AIVPO1J+ACONC.APNCOMJ+ACONC.AISO1J+ACONC.AISO2J+ACONC.AISO3J+ACONC.AMT1J+ACONC.AMT2J+ACONC.AMT3J+ACONC.AMT4J+ACONC.AMT5J+ACONC.AMT6J+ACONC.AMTNO3J+ACONC.AMTHYDJ+ACONC.AGLYJ+ACONC.ASQTJ+ACONC.AORGCJ+ACONC.AOLGBJ+ACONC.AOLGAJ+ACONC.ALVOO1J+ACONC.ALVOO2J+ACONC.ASVOO1J+ACONC.ASVOO2J+ACONC.ASVOO3J+ACONC.APCSOJ+ACONC.AAVB1J+ACONC.AAVB2J+ACONC.AAVB3J+ACONC.AAVB4J+ACONC.AOTHRJ+ACONC.AFEJ+ACONC.ASIJ+ACONC.ATIJ+ACONC.ACAJ+ACONC.AMGJ+ACONC.AMNJ+ACONC.AALJ+ACONC.AKJ)*APMDIAG.PM25AC+(ACONC.ASOIL+ACONC.ACORS+ACONC.ASEACAT+ACONC.ACLK+ACONC.ASO4K+ACONC.ANO3K+ACONC.ANH4K)*APMDIAG.PM25CO"
    },
    {
      "var": "PM10_",
      "unit": "ug/m3",
      "formula": "(ACONC.ASO4I+ACONC.ANO3I+ACONC.ANH4I+ACONC.ANAI+ACONC.ACLI +ACONC.AECI+ACONC.ALVPO1I+ACONC.ASVPO1I+ACONC.ASVPO2I+ACONC.APOCI+ACONC.APNCOMI+ACONC.ALVOO1I+ACONC.ALVOO2I+ACONC.ASVOO1I+ACONC.ASVOO2I+ACONC.AOTHRI)*APMDIAG.PM10AT+(ACONC.ASO4J+ACONC.ANO3J+ACONC.ANH4J+ACONC.ANAJ+ACONC.ACLJ+ACONC.AECJ+ACONC.ALVPO1J+ACONC.ASVPO1J+ACONC.ASVPO2J+ACONC.APOCJ+ACONC.ASVPO3J+ACONC.AIVPO1J+ACONC.APNCOMJ+ACONC.AISO1J+ACONC.AISO2J+ACONC.AISO3J+ACONC.AMT1J+ACONC.AMT2J+ACONC.AMT3J+ACONC.AMT4J+ACONC.AMT5J+ACONC.AMT6J+ACONC.AMTNO3J+ACONC.AMTHYDJ+ACONC.AGLYJ+ACONC.ASQTJ+ACONC.AORGCJ+ACONC.AOLGBJ+ACONC.AOLGAJ+ACONC.ALVOO1J+ACONC.ALVOO2J+ACONC.ASVOO1J+ACONC.ASVOO2J+ACONC.ASVOO3J+ACONC.APCSOJ+ACONC.AAVB1J+ACONC.AAVB2J+ACONC.AAVB3J+ACONC.AAVB4J+ACONC.AOTHRJ+ACONC.AFEJ+ACONC.ASIJ+ACONC.ATIJ+ACONC.ACAJ+ACONC.AMGJ+ACONC.AMNJ+ACONC.AALJ+ACONC.AKJ)*APMDIAG.PM10AC+(ACONC.ASOIL+ACONC.ACORS+ACONC.ASEACAT+ACONC.ACLK+ACONC.ASO4K+ACONC.ANO3K+ACONC.ANH4K)*APMDIAG.PM10CO"
    },
    {
      "var": "NO2_",
      "unit": "ug/m3",
      "formula": "ACONC.NO2*1880"
    }
  ],
  "voxel_ouput": [
    {
      "output_nc": "D:/output/CMAQ_voxel.nc",
      "selected_var": ["NO2_", "PM25AT", "PM25_"],
      "selected_lay": [0],
      "epsg": 102100
    }
  ],
  "ts_ouput": [
    {
      "output_csv": "D:/output/CMAQ_station.csv",
      "selected_var": ["NO2_", "PM25AT", "PM25_"],
      "station": [
        {
          "label": "Station_B",
          "col": 50,
          "row": 60,
          "lay": 0
        }
      ]
    }
  ]
}
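
The `NO2_` formula multiplies `ACONC.NO2` by 1880. The README does not state where that factor comes from. One plausible reading, sketched below, is a conversion from ppmV to ug/m3 for an ideal gas at 25 °C and 1 atm. The reference conditions, the assumption that `ACONC.NO2` is in ppmV, and the function name `ppmv_to_ugm3_factor` are all assumptions made for this illustration.

```python
R = 8.314462618   # J/(mol K), molar gas constant
M_NO2 = 46.0055   # g/mol, molar mass of NO2


def ppmv_to_ugm3_factor(molar_mass, temperature_k=298.15, pressure_pa=101325.0):
    """Return ug/m3 per ppmV for an ideal gas at the given conditions."""
    molar_volume = R * temperature_k / pressure_pa  # m3 per mol of gas
    # 1 ppmV is 1e-6 mol/mol and 1 g is 1e6 ug, so the two factors cancel.
    return molar_mass / molar_volume


factor = ppmv_to_ugm3_factor(M_NO2)
print(round(factor, 1))
```

Under these assumptions the factor comes out close to 1880. A different reference temperature or pressure would shift it, so the constant in the formula should be adjusted if another convention is required.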

Configuration Details

| Key | Description |
|---|---|
| source | List of input sources. label is the prefix used to reference variables in formulas (e.g., WRF.T). input_nc is a list of file paths. |
| calc_var | Defines new variables. formula supports Python math syntax. Use Label.Variable to reference data. |
| voxel_ouput | Exports data to a NetCDF file formatted for ArcGIS Voxel Layers. selected_var determines which variables are written. |
| selected_lay | (Optional) List of integer indices indicating which vertical layers to include in the voxel output. If omitted, all layers are included. Special case (single layer): if only one layer is selected (e.g., [0]), the tool automatically duplicates it to create a "volume" of depth 2, because many Voxel Layer visualization tools require a 3D volume rather than a flat 2D slice. |
| epsg | (Optional) The code that determines the output Coordinate Reference System (CRS) of the voxel layer. If not specified, it defaults to 102100 (the Esri WKID for Web Mercator, equivalent to EPSG:3857). |
| ts_ouput | Extracts time-series data to CSV. |
| station | Defines the 3D grid points for extraction. WRF indices: west_east, south_north, bottom_top. CMAQ indices: col, row, lay. |
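
The `selected_lay` rules described above (all layers when the key is omitted, and duplication of a lone layer) can be sketched as follows. This is an illustration of the documented behaviour only: `resolve_layers` is a hypothetical name, and the bounds check is an added assumption rather than something the README specifies.

```python
def resolve_layers(n_layers, selected_lay=None):
    """Return the layer indices to write, following the selected_lay rules."""
    if selected_lay is None:
        return list(range(n_layers))  # key omitted: keep every layer

    layers = list(selected_lay)
    for index in layers:
        # Assumed validation; the README does not describe error handling.
        if not 0 <= index < n_layers:
            raise IndexError(f"layer {index} is outside 0..{n_layers - 1}")

    if len(layers) == 1:
        layers = layers * 2  # duplicate the single layer to get depth 2
    return layers


print(resolve_layers(5))             # [0, 1, 2, 3, 4]
print(resolve_layers(5, [0]))        # [0, 0]
print(resolve_layers(5, [0, 1, 2]))  # [0, 1, 2]
```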

Technical Notes

  1. WRF Destaggering: WRF variables on staggered grids (U, V, W) are automatically interpolated to the mass grid center during Voxel output generation.
  2. CMAQ Projections: The tool reads the IOAPI projection headers (Lambert Conformal Conic, LCC) to generate accurate Lat/Lon coordinates for the Voxel layer.
  3. CMAQ TFLAG: When merging CMAQ files (e.g., ACONC + APMDIAG), the tool concatenates the TFLAG variable along the VAR dimension so that the time-flag metadata stays consistent with the variables in the merged dataset.
  4. Lazy Evaluation: The tool uses xarray and dask to handle large datasets without loading everything into RAM.
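
To make the destaggering note concrete: a staggered axis has one more point than the mass grid, and a common way to move values to cell centers is to average each pair of neighbouring points. The sketch below uses plain Python lists and assumes that midpoint average. The README only says the values are "interpolated", so the exact method used by ppath may differ, and `destagger` / `destagger_rows` are hypothetical names.

```python
def destagger(values):
    """Average neighbouring staggered points onto the n-1 cell centers."""
    if len(values) < 2:
        raise ValueError("a staggered axis needs at least two points")
    return [(a + b) / 2 for a, b in zip(values, values[1:])]


def destagger_rows(grid):
    """Destagger a 2D field along its last axis (e.g. a staggered west-east axis)."""
    return [destagger(row) for row in grid]


# Hypothetical U field: 2 rows, 4 staggered points per row.
u_stag = [[1.0, 3.0, 5.0, 7.0], [2.0, 2.0, 4.0, 8.0]]
u_mass = destagger_rows(u_stag)
print(u_mass)  # [[2.0, 4.0, 6.0], [2.0, 3.0, 6.0]]
```

Each row shrinks from four staggered points to three cell-center values, which matches the shape of variables that are already on the mass grid.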
