Note
Go to the end to download the full example code.
PointStat: Python embedding for ASOS/METAR cloud obs to verify GFS
model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight.conf
Scientific Objective
This use case is intended to demonstrate using readily available ASOS/METAR cloud observations of cloud fraction and cloud base height (ceiling) to verify GFS forecasts of low cloud height and multiple cloud layer fractions. Python embedding is used to serve the observations from their NetCDF file format to PointStat. The use case also demonstrates interpolating the pressure of the low cloud altitude from GFS to a height value using the geopotential height field in order to compare with the observations.
Version Added
METplus version 6.1
Datasets
- Forecast: Global Forecast System (GFS) 0.25 degree global grid
Low cloud fraction Middle cloud fraction High cloud fraction Low cloud altitude (pressure) Geopotential height
- Observation: Global METAR reports from NCAR RDA
Low cloud area fraction Middle cloud area fraction High cloud area fraction Low cloud base altitude
Climatology: None
Location: All of the input data required for this use case can be found in a sample data tarball. Each use case category will have one or more sample data tarballs. It is only necessary to download the tarball with the use case’s dataset and not the entire collection of sample data. Click here to access the METplus releases page and download sample data for the appropriate release: https://github.com/dtcenter/METplus/releases This tarball should be unpacked into the directory that you will set the value of INPUT_BASE. See Running METplus section for more information.
METplus Components
The only tool this use case calls is PointStat. Within PointStat a Python script is used for ingesting forecast data only for verifying the low cloud altitude, and also for ingesting the observations for all variables once for each valid time requested.
METplus Workflow
Beginning time (INIT_BEG): 2024-03-07 00:00 UTC
End time (INIT_END): 2024-03-07 00:00 UTC
Increment between beginning and end times (INIT_INCREMENT): None
Sequence of forecast leads to process (LEAD_SEQ): 6
Only the 6-h forecast from the 00 UTC forecast on 20240307 is verified. PointStat computes contingency table counts (CTC), contingency table statistics (CTS), and continuous statistics (CNT) for the data variables being verified.
METplus Configuration
METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line, i.e. parm/use_cases/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight.conf
[config]
# Processes to run
PROCESS_LIST = PointStat(lcld_frac),PointStat(mcld_frac),PointStat(hcld_frac),PointStat(lcld_alt)
# Time info
LOOP_BY = INIT
INIT_TIME_FMT = %Y%m%d%H
INIT_BEG = 2024030700
INIT_END = 2024030700
LEAD_SEQ = 6
##########
# File I/O
##########
FCST_POINT_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight/GFS_0.25
PYEMBED_OBS_INPUT_DIR = {INPUT_BASE}/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight/metar
OBS_POINT_STAT_INPUT_TEMPLATE = PYTHON_NUMPY = {PARM_BASE}/use_cases/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight/read_asos_ceil.py {PYEMBED_OBS_INPUT_DIR}/Surface_METAR_{valid?fmt=%Y%m%d}_0000.nc:{OBS_VAR1_NAME}
POINT_STAT_OUTPUT_DIR = {OUTPUT_BASE}/point_stat
POINT_STAT_OUTPUT_TEMPLATE={POINT_STAT_OUTPUT_DIR}/{OBS_VAR1_NAME}
############
# Field Info
############
MODEL = gfs
[lcld_frac]
FCST_POINT_STAT_INPUT_TEMPLATE = {init?fmt=%Y%m%d%H}/gfs.0p25.{init?fmt=%Y%m%d%H}.f{lead?fmt=%3H}.grib2
FCST_VAR1_NAME = LCDC
FCST_VAR1_LEVELS = L0
FCST_VAR1_OPTIONS = GRIB2_ipdtmpl_index=8;GRIB2_ipdtmpl_val={lead?fmt=%H};
OBS_VAR1_NAME = low_cloud_area_fraction
OBS_VAR1_LEVELS = L0
[mcld_frac]
FCST_POINT_STAT_INPUT_TEMPLATE = {init?fmt=%Y%m%d%H}/gfs.0p25.{init?fmt=%Y%m%d%H}.f{lead?fmt=%3H}.grib2
FCST_VAR1_NAME = MCDC
FCST_VAR1_LEVELS = L0
FCST_VAR1_OPTIONS = GRIB2_ipdtmpl_index=8;GRIB2_ipdtmpl_val={lead?fmt=%H};
OBS_VAR1_NAME = middle_cloud_area_fraction
OBS_VAR1_LEVELS = L0
[hcld_frac]
FCST_POINT_STAT_INPUT_TEMPLATE = {init?fmt=%Y%m%d%H}/gfs.0p25.{init?fmt=%Y%m%d%H}.f{lead?fmt=%3H}.grib2
FCST_VAR1_NAME = HCDC
FCST_VAR1_LEVELS = L0
FCST_VAR1_OPTIONS = GRIB2_ipdtmpl_index=8;GRIB2_ipdtmpl_val={lead?fmt=%H};
OBS_VAR1_NAME = high_cloud_area_fraction
OBS_VAR1_LEVELS = L0
[lcld_alt]
PYEMBED_FCST_INPUT_DIR = {INPUT_BASE}/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight/GFS_0.25
PYEMBED_FCST_INPUT_TEMPLATE = {init?fmt=%Y%m%d%H}/gfs.0p25.{init?fmt=%Y%m%d%H}.f{lead?fmt=%3H}.grib2
PYEMBED_FCST_PATH = {PARM_BASE}/use_cases/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight/read_interp_gfs_025.py
PYEMBED_FCST_ARGS = {PYEMBED_FCST_INPUT_DIR}/{PYEMBED_FCST_INPUT_TEMPLATE} {valid?fmt=%Y%m%d_%H%M%S} {lead?fmt=%H}
FCST_POINT_STAT_INPUT_TEMPLATE = PYTHON_NUMPY
FCST_VAR1_NAME = {PYEMBED_FCST_PATH} {PYEMBED_FCST_ARGS}
FCST_VAR1_LEVELS = L0
FCST_VAR1_OPTIONS = GRIB2_ipdtmpl_index=8;GRIB2_ipdtmpl_val={lead?fmt=%H};
OBS_VAR1_NAME = low_cloud_base_altitude
OBS_VAR1_LEVELS = L0
#########################
# PointStat configuration
#########################
[config]
POINT_STAT_INTERP_TYPE_METHOD = BILIN
POINT_STAT_INTERP_TYPE_WIDTH = 2
POINT_STAT_OUTPUT_FLAG_CTC = STAT
POINT_STAT_OUTPUT_FLAG_CTS = STAT
POINT_STAT_OUTPUT_FLAG_CNT = STAT
POINT_STAT_REGRID_TO_GRID = NONE
POINT_STAT_REGRID_METHOD = BILIN
POINT_STAT_REGRID_WIDTH = 2
POINT_STAT_GRID = FULL
POINT_STAT_POLY =
POINT_STAT_STATION_ID =
POINT_STAT_MESSAGE_TYPE = ADPSFC
OBS_POINT_STAT_WINDOW_BEGIN = -1799
OBS_POINT_STAT_WINDOW_END = 1799
MET Configuration
METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.
YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!
If there is a setting in the MET configuration file that is currently not supported by METplus you’d like to control, please refer to: Overriding Unsupported MET config file settings
PointStatConfig_wrapped
////////////////////////////////////////////////////////////////////////////////
//
// Point-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////
//
// Output model name to be written
//
// model =
${METPLUS_MODEL}
//
// Output description to be written
// May be set separately in each "obs.field" entry
//
// desc =
${METPLUS_DESC}
////////////////////////////////////////////////////////////////////////////////
//
// Verification grid
//
// regrid = {
${METPLUS_REGRID_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// May be set separately in each "field" entry
//
censor_thresh = [];
censor_val = [];
cat_thresh = [ NA ];
cnt_thresh = [ NA ];
cnt_logic = UNION;
wind_thresh = [ NA ];
wind_logic = UNION;
eclv_points = 0.05;
//hss_ec_value =
${METPLUS_HSS_EC_VALUE}
rank_corr_flag = FALSE;
//
// Forecast and observation fields to be verified
//
fcst = {
${METPLUS_FCST_FILE_TYPE}
//field = [
${METPLUS_FCST_FIELD}
${METPLUS_FCST_CLIMO_MEAN_DICT}
${METPLUS_FCST_CLIMO_STDEV_DICT}
}
obs = {
${METPLUS_OBS_FILE_TYPE}
//field = [
${METPLUS_OBS_FIELD}
${METPLUS_OBS_CLIMO_MEAN_DICT}
${METPLUS_OBS_CLIMO_STDEV_DICT}
}
////////////////////////////////////////////////////////////////////////////////
//
// Point observation filtering options
// May be set separately in each "obs.field" entry
//
// message_type =
${METPLUS_MESSAGE_TYPE}
sid_exc = [];
//obs_quality_inc =
${METPLUS_OBS_QUALITY_INC}
//obs_quality_exc =
${METPLUS_OBS_QUALITY_EXC}
//duplicate_flag =
${METPLUS_DUPLICATE_FLAG}
//obs_summary =
${METPLUS_OBS_SUMMARY}
//obs_perc_value =
${METPLUS_OBS_PERC_VALUE}
//
// Mapping of message type group name to comma-separated list of values.
//
//message_type_group_map =
${METPLUS_MESSAGE_TYPE_GROUP_MAP}
//obtype_as_group_val_flag =
${METPLUS_OBTYPE_AS_GROUP_VAL_FLAG}
////////////////////////////////////////////////////////////////////////////////
//
// Climatology data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}
//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}
//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Land/Sea mask
// For LANDSF message types, only use forecast grid points where land = TRUE.
// For WATERSF message types, only use forecast grid points where land = FALSE.
// land_mask.flag may be set separately in each "obs.field" entry.
//
//land_mask = {
${METPLUS_LAND_MASK_DICT}
//
// Topography
// For SURFACE message types, only use observations where the topo - station
// elevation difference meets the use_obs_thresh threshold.
// For the observations kept, when interpolating forecast data to the
// observation location, only use forecast grid points where the topo - station
// difference meets the interp_fcst_thresh threshold.
// topo_mask.flag may be set separately in each "obs.field" entry.
//
//topo_mask = {
${METPLUS_TOPO_MASK_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Point observation time window
//
// obs_window = {
${METPLUS_OBS_WINDOW_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Verification masking regions
//
//mask = {
${METPLUS_MASK_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Confidence interval settings
//
ci_alpha = [ 0.05 ];
boot = {
interval = PCTILE;
rep_prop = 1.0;
n_rep = 0;
rng = "mt19937";
seed = "";
}
////////////////////////////////////////////////////////////////////////////////
//
// Interpolation methods
//
//interp = {
${METPLUS_INTERP_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// HiRA verification method
//
//hira = {
${METPLUS_HIRA_DICT}
////////////////////////////////////////////////////////////////////////////////
// Threshold for SEEPS p1 (Probability of being dry)
//seeps_p1_thresh =
${METPLUS_SEEPS_P1_THRESH}
////////////////////////////////////////////////////////////////////////////////
//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}
////////////////////////////////////////////////////////////////////////////////
//ugrid_dataset =
${METPLUS_UGRID_DATASET}
//ugrid_max_distance_km =
${METPLUS_UGRID_MAX_DISTANCE_KM}
//ugrid_coordinates_file =
${METPLUS_UGRID_COORDINATES_FILE}
////////////////////////////////////////////////////////////////////////////////
//point_weight_flag =
${METPLUS_POINT_WEIGHT_FLAG}
tmp_dir = "${MET_TMP_DIR}";
// output_prefix =
${METPLUS_OUTPUT_PREFIX}
//version = "V10.0.0";
////////////////////////////////////////////////////////////////////////////////
${METPLUS_MET_CONFIG_OVERRIDES}
Python Embedding
This use case calls a Python embedding script to read the observations for all variables, and a Python embedding script to read the forecast only when verifying the low cloud base altitude (ceiling) forecasts. The observation Python script is read_asos_ceil.py, which reads NetCDF formatted METAR observations and prepares the data in a format that MET expects. The forecast Python script for low cloud base altitude is read_interp_gfs_025.py. This script opens a GRIB2 formatted GFS file and reads in the 3D geopotential height field, the 2D low cloud base altitude (pressure) field, and the 2D orography (surface topography) field. It uses 1-D logarithmic interpolation from MetPy to interpolate the geopotential height field (converted to meters AGL from meters MSL using the orography field) to the height pressure altitude of the low cloud base altitude field from the model. This interpolation is done by a function contained in a third Python embedding file, gfs_025_interp_funcs.py. Because the Python multiprocessing module is used to parallelize the interpolation at each grid cell, this function had to exist in a separate file that read_interp_gfs_025.py imports from. The gridded low cloud base altitude field in meters AGL is passed back to PointStat for comparison with the obs. The location of each script is
parm/use_cases/model_applications/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight/read_asos_ceil.py
import sys
import os
from netCDF4 import Dataset
import numpy as np
from datetime import datetime,timedelta
# Set the MET 11-column "typ", "lvl", and "qc" variables
msg_type = ['ADPSFC']
level = [-9999.]
qc_string = ['NA']
# Require an input file and variable name in the format:
# path_to_file:variable_name
if len(sys.argv) < 2:
print(f"ERROR: {__file__} - Must provide at least 1 input file argument")
sys.exit(1)
# Separate out the input file and the variable name
input_file, var_name = sys.argv[1].split(":")
# Ensure the input file exists, and exit if it does not
if not os.path.exists(input_file):
print(f'ERROR: Input file does not exist: {input_file}')
sys.exit(1)
# Open the NetCDF input file
nc = Dataset(input_file, 'r')
# Load variables (no qc vars present in file)
latitude = nc.variables['latitude'][:]
longitude = nc.variables['longitude'][:]
altitude = nc.variables['altitude'][:] # Station altitude [m]
station_ids = nc.variables['station_id'][:]
parent_index = nc.variables['parent_index'][:]
time_observation = nc.variables['time_observation'][:] # seconds since 1970-01-01 00 UTC
observation = nc.variables[var_name][:]
# Close the NetCDF input file after all data have been acquired
nc.close()
# Set up variables to hold the "sid", "lat", "lon", and "elv" variables in the MET 11-column data format
stn_id = np.empty(len(parent_index),dtype="U10")
stn_lat = np.zeros(len(parent_index))
stn_lon = np.zeros(len(parent_index))
stn_elev = np.zeros(len(parent_index))
# Fill those variables with data from the input file
for i in range(len(parent_index)):
stn_id[i] = station_ids[parent_index[i]].tobytes().decode('utf-8').strip()
stn_lat[i] = latitude[parent_index[i]]
stn_lon[i] = longitude[parent_index[i]]
stn_elev[i] = altitude[parent_index[i]]
# Fill the "vld" column with time information formatted the way MET expects
epoch = datetime(1970, 1, 1)
formatted_times = [(epoch + timedelta(seconds=int(time))).strftime('%Y%m%d_%H%M%S') for time in time_observation]
vld_time = formatted_times
# Fill the "typ" variable
msg_type = msg_type*len(parent_index)
# Fill the "var" variable
var = [var_name]*len(parent_index)
# Fill the "lvl" variable
level = level*len(parent_index)
# Set the "hgt" column in the MET 11-column data to the same as "lvl"
height = level
# Set the "qc" variable
qc_string = qc_string*len(parent_index)
# Convert the "obs" value to float, and if it is cloud fraction then
# multiply the obs variable by 100 to match the forecast
obs_value = observation.astype(float)
if 'fraction' in var_name:
obs_value = obs_value*100.0
# Fill any missing data with the MET missing data value of -9999.
obs_val = obs_value.filled(-9999.)
# Create the point_data object MET expects
point_data = []
point_data = [[typ,sid,vid,lat,lon,elv,var,lvl,hgt,qc,obs] for typ,sid,vid,lat,lon,elv,var,lvl,hgt,qc,obs in tuple(zip(msg_type,stn_id,vld_time,stn_lat,stn_lon,stn_elev,var,level,height,qc_string,obs_val))]
parm/use_cases/model_applications/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight/read_interp_gfs_025.py
import datetime
import multiprocessing
import sys
import xarray as xr
from gfs_025_interp_funcs import interp_column
model_file = sys.argv[1]
valid = sys.argv[2]
leadhours = sys.argv[3]
# Process time information
valid = datetime.datetime.strptime(valid,'%Y%m%d_%H%M%S')
init = valid-datetime.timedelta(hours=int(leadhours))
cbz_var = xr.open_dataset(model_file,engine='cfgrib',filter_by_keys={'typeOfLevel':'lowCloudBottom'},indexpath='')
gph_var = xr.open_dataset(model_file,engine='cfgrib',filter_by_keys={'typeOfLevel':'isobaricInhPa','shortName':'gh'},indexpath='')
top_var = xr.open_dataset(model_file,engine='cfgrib',filter_by_keys={'typeOfLevel':'surface','shortName':'orog'},indexpath='')
if len(cbz_var.data_vars) != 1:
print("UNABLE TO DETERMINE CLOUD BASE HEIGHT VAR NAME IN read_interp_gfs_025.py")
sys.exit(1)
if len(gph_var.data_vars) != 1:
print("UNABLE TO DETERMINE GEOPOTENTIAL HEIGHT VAR NAME IN read_interp_gfs_025.py")
sys.exit(1)
if len(top_var.data_vars) != 1:
print("UNABLE TO DETERMINE OROGRAPHY VAR NAME IN read_interp_gfs_025.py")
sys.exit(1)
cbz_var_name = list(cbz_var.data_vars)[0]
gph_var_name = list(gph_var.data_vars)[0]
top_var_name = list(top_var.data_vars)[0]
# The geopotential height field is in meters above mean sea level (MSL). To convert the geopotential height field
# from meters MSL to meters AGL, we add the orography to the geopotential height field prior to interpolating so
# that the result of the interpolation is meters AGL to match the observations.
gph_var[gph_var_name] = gph_var[gph_var_name]+top_var[top_var_name]
# Stack the cloud bottom pressure to 1D where each cell is treated like a site (site ID, sid)
cbzstack = cbz_var[cbz_var_name].stack(sid=("latitude","longitude"))
# Stack the GPH to 1D where each "column"/grid cell is like a site (site ID, sid)
gphstack = gph_var[gph_var_name].stack(sid=("latitude","longitude"))
# array to hold the results
resstack = xr.full_like(cbzstack,-9999.).rename('lcld_alt')
# Condition for masking
mask_cond = cbzstack<=(gphstack.isobaricInhPa.max(dim='isobaricInhPa').values * 100.0)
# Mask the data
cbzmask = cbzstack[mask_cond]
gphmask = gphstack[:,mask_cond]
# Get a pool of workers
mp = multiprocessing.Pool(multiprocessing.cpu_count()-2)
# Compute the interpolated height of the cloud base pressure at each site
cells_to_process = cbzmask.sizes['sid']
print("")
print(f'INTERPOLATING CLOUD BASE HEIGHT AT {cells_to_process} GRID CELLS')
print("")
result = mp.starmap(interp_column,([cbzmask,gphmask,sidx] for sidx in list(range(0,cells_to_process))))
# Re-populate the DataArray
resstack[mask_cond] = result
# Re-populate the results at the 2D cells they belong
met_data = resstack.unstack()
met_data = met_data.values
attrs = {
'valid': valid.strftime('%Y%m%d_%H%M%S'),
'init': init.strftime('%Y%m%d_%H%M%S'),
'lead': '%02d0000' % (int(leadhours)),
'accum': '000000',
'name': 'lcld_alt',
'long_name': 'cloud_bottom_height',
'level': "L0",
'units': 'm',
'grid': 'G193'
}
For more information on the basic requirements to utilize Python Embedding in METplus, please refer to the MET User’s Guide section on Python embedding.
User Scripting
This use case does not use additional scripts.
Running METplus
Pass the use case configuration file to the run_metplus.py script along with any user-specific system configuration files if desired:
run_metplus.py /path/to/METplus/parm/use_cases/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight.conf /path/to/user_system.conf
See Running METplus for more information.
Expected Output
A successful run will output the following both to the screen and to the logfile:
INFO: METplus has successfully finished running.
Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in {OUTPUT_BASE}/model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight and will contain the following files:
* point_stat_060000L_20240307_060000V.stat
organized in directories for each variable being verified:
* high_cloud_area_fraction
* low_cloud_area_fraction
* middle_cloud_area_fraction
* low_cloud_base_altitude
Each file should contain corresponding statistics for the line type(s) requested.
Keywords
Note
PointStatToolUseCase
PythonEmbeddingFileUseCase
GRIB2FileUseCase
CloudsAppUseCase
Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.