Note
Go to the end to download the full example code.
EnsembleStat: Using Python Embedding for Aerosol Optical Depth
model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod.conf
Scientific Objective
To provide useful statistical information on the relationship between observation data for aersol optical depth (AOD) to an ensemble forecast. These values can be used to help correct ensemble member deviations from observed values.
Version Added
METplus version 4.0
Datasets
Forecast: International Cooperative for Aerosol Prediction (ICAP) ensemble netCDF file, 7 members
Observation: Aggregate netCDF file with MODIS observed AOD field
Climatology: None
Location: All of the input data required for this use case can be found in a sample data tarball. Each use case category will have one or more sample data tarballs. It is only necessary to download the tarball with the use case’s dataset and not the entire collection of sample data. Click here to access the METplus releases page and download sample data for the appropriate release: https://github.com/dtcenter/METplus/releases This tarball should be unpacked into the directory that you will set the value of INPUT_BASE. See Running METplus section for more information.
METplus Components
This use case utilizes the METplus EnsembleStat wrapper to read in files using Python Embedding.
METplus Workflow
Beginning time (INIT_BEG): 201608150000
End time (INIT_END): 201608150000
Increment between beginning and end times (INIT_INCREMENT): 06H
Sequence of forecast leads to process (LEAD_SEQ): 12H
EnsembleStat is the only tool called in this example. It processes a single run time with seven ensemble members, with each ensemble member receiving its own verification. Preprocessing of the ensemble forecast data is completed with Python Embedding, which takes 4 inputs: the full path to the forecast file, variable name, valid time of verification, and ensemble member number. The script passes back the variable field requested to EnsembleStat for verification. A similar process is completed for the observation data, which is preprocessed by a separate Python Embedding script which takes 3 inputs: the full path to the observation file, group name that contains the variable field, and variable name. The script passes back the requested variable field and begins the verification process. Three of the ensemble members do not have data for the AOD field, so EnsembleStat will only process four of the members for statistics. After a successful run, EnsembleStat will create the requested output and its corresponding files.
METplus Configuration
METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line, i.e. parm/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod.conf
[config]
# Documentation for this use case can be found at
# https://metplus.readthedocs.io/en/latest/generated/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod.html
# For additional information, please see the METplus Users Guide.
# https://metplus.readthedocs.io/en/latest/Users_Guide
###
# Processes to run
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#process-list
###
PROCESS_LIST = EnsembleStat
###
# Time Info
# LOOP_BY options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
# INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
# VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
# LEAD_SEQ is the list of forecast leads to process
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#timing-control
###
LOOP_BY = INIT
INIT_TIME_FMT = %Y%m%d%H%M
INIT_BEG=201608150000
INIT_END=201608150000
INIT_INCREMENT=06H
LEAD_SEQ = 12H
###
# File I/O
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#directory-and-filename-template-info
###
FCST_ENSEMBLE_STAT_INPUT_DATATYPE = PYTHON_NUMPY
FCST_ENSEMBLE_STAT_INPUT_DIR =
FCST_ENSEMBLE_STAT_INPUT_TEMPLATE = 0, 1, 2, 3
OBS_ENSEMBLE_STAT_INPUT_GRID_DATATYPE = PYTHON_NUMPY
OBS_ENSEMBLE_STAT_GRID_INPUT_DIR = {INPUT_BASE}/model_applications/air_quality_and_comp/aod
OBS_ENSEMBLE_STAT_GRID_INPUT_TEMPLATE = PYTHON_NUMPY
ENSEMBLE_STAT_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod
###
# Field Info
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#field-info
###
CONFIG_DIR = {PARM_BASE}/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod
FCST_VAR1_NAME = {CONFIG_DIR}/forecast_embedded.py {OBS_ENSEMBLE_STAT_GRID_INPUT_DIR}/icap_{init?fmt=%Y%m%d%H}_aod.nc:total_aod:{valid?fmt=%Y%m%d%H%M}:MET_PYTHON_INPUT_ARG
OBS_VAR1_NAME = {CONFIG_DIR}/analysis_embedded.py {OBS_ENSEMBLE_STAT_GRID_INPUT_DIR}/AGGR_HOURLY_{valid?fmt=%Y%m%d}T{valid?fmt=%H%M}_1deg_global_archive.nc:aod_nrl_total:Mean
###
# EnsembleStat Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#ensemblestat
###
MODEL = ICAP
OBTYPE = NRL_AOD
ENSEMBLE_STAT_OBS_WINDOW_BEGIN = -5400
ENSEMBLE_STAT_OBS_WINDOW_END = 5400
ENSEMBLE_STAT_N_MEMBERS = 4
ENSEMBLE_STAT_ENS_THRESH = 0.1
ENSEMBLE_STAT_VLD_THRESH = 0.1
ENSEMBLE_STAT_REGRID_TO_GRID = NONE
ENSEMBLE_STAT_OUTPUT_PREFIX =
ENSEMBLE_STAT_NC_ORANK_FLAG_LATLON = TRUE
ENSEMBLE_STAT_NC_ORANK_FLAG_RAW = TRUE
ENSEMBLE_STAT_OUTPUT_FLAG_ECNT = STAT
MET Configuration
METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.
YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!
If there is a setting in the MET configuration file that is currently not supported by METplus you’d like to control, please refer to: Overriding Unsupported MET config file settings
EnsembleStatConfig_wrapped
///////////////////////////////////////////////////////////////////////////////
//
// Ensemble-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////
//
// Output model name to be written
//
${METPLUS_MODEL}
//
// Output description to be written
// May be set separately in each "obs.field" entry
//
${METPLUS_DESC}
//
// Output observation type to be written
//
${METPLUS_OBTYPE}
////////////////////////////////////////////////////////////////////////////////
//
// Verification grid
//
${METPLUS_REGRID_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// May be set separately in each "field" entry
//
${METPLUS_CENSOR_THRESH}
${METPLUS_CENSOR_VAL}
cat_thresh = [];
nc_var_str = "";
//ens_member_ids =
${METPLUS_ENS_MEMBER_IDS}
//control_id =
${METPLUS_CONTROL_ID}
////////////////////////////////////////////////////////////////////////////////
//prob_cat_thresh =
${METPLUS_PROB_CAT_THRESH}
//prob_pct_thresh =
${METPLUS_PROB_PCT_THRESH}
//eclv_points =
${METPLUS_ECLV_POINTS}
////////////////////////////////////////////////////////////////////////////////
//
// Forecast and observation fields to be verified
//
fcst = {
${METPLUS_FCST_FILE_TYPE}
${METPLUS_ENS_THRESH}
${METPLUS_VLD_THRESH}
${METPLUS_FCST_FIELD}
${METPLUS_FCST_CLIMO_MEAN_DICT}
${METPLUS_FCST_CLIMO_STDEV_DICT}
}
obs = {
${METPLUS_OBS_FILE_TYPE}
${METPLUS_OBS_FIELD}
${METPLUS_OBS_CLIMO_MEAN_DICT}
${METPLUS_OBS_CLIMO_STDEV_DICT}
}
////////////////////////////////////////////////////////////////////////////////
//
// Point observation filtering options
// May be set separately in each "obs.field" entry
//
//message_type =
${METPLUS_MESSAGE_TYPE}
sid_exc = [];
//obs_thresh =
${METPLUS_OBS_THRESH}
//obs_quality_inc =
${METPLUS_OBS_QUALITY_INC}
//obs_quality_exc =
${METPLUS_OBS_QUALITY_EXC}
//duplicate_flag =
${METPLUS_DUPLICATE_FLAG}
obs_summary = NONE;
obs_perc_value = 50;
//skip_const =
${METPLUS_SKIP_CONST}
//
// Observation error options
// Set dist_type to NONE to use the observation error table instead
// May be set separately in each "obs.field" entry
//
obs_error = {
//flag =
${METPLUS_OBS_ERROR_FLAG}
dist_type = NONE;
dist_parm = [];
inst_bias_scale = 1.0;
inst_bias_offset = 0.0;
min = NA; // Valid range of data
max = NA;
}
//
// Mapping of message type group name to comma-separated list of values.
//
message_type_group_map = [
{ key = "SURFACE"; val = "ADPSFC,SFCSHP,MSONET"; },
{ key = "ANYAIR"; val = "AIRCAR,AIRCFT"; },
{ key = "ANYSFC"; val = "ADPSFC,SFCSHP,ADPUPA,PROFLR,MSONET"; },
{ key = "ONLYSF"; val = "ADPSFC,SFCSHP"; }
];
//obtype_as_group_val_flag =
${METPLUS_OBTYPE_AS_GROUP_VAL_FLAG}
//
// Ensemble bin sizes
// May be set separately in each "obs.field" entry
//
//ens_ssvar_bin_size =
${METPLUS_ENS_SSVAR_BIN_SIZE}
//ens_phist_bin_size =
${METPLUS_ENS_PHIST_BIN_SIZE}
////////////////////////////////////////////////////////////////////////////////
//
// Climatology data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}
//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}
//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Point observation time window
//
//obs_window = {
${METPLUS_OBS_WINDOW_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Verification masking regions
//
mask = {
//grid =
${METPLUS_MASK_GRID}
//poly =
${METPLUS_MASK_POLY}
sid = [];
llpnt = [];
}
////////////////////////////////////////////////////////////////////////////////
//
// Confidence interval settings
//
//ci_alpha =
${METPLUS_CI_ALPHA}
////////////////////////////////////////////////////////////////////////////////
//
// Interpolation methods
//
//interp = {
${METPLUS_INTERP_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Gridded verification output types
// May be set separately in each "obs.field" entry
//
//nc_orank_flag = {
${METPLUS_NC_ORANK_FLAG_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Random number generator
//
rng = {
type = "mt19937";
seed = "1";
}
////////////////////////////////////////////////////////////////////////////////
//grid_weight_flag =
${METPLUS_GRID_WEIGHT_FLAG}
//point_weight_flag =
${METPLUS_POINT_WEIGHT_FLAG}
//output_prefix =
${METPLUS_OUTPUT_PREFIX}
//version = "V9.0";
////////////////////////////////////////////////////////////////////////////////
tmp_dir = "${MET_TMP_DIR}";
${METPLUS_MET_CONFIG_OVERRIDES}
Python Embedding
This use case uses two Python embedding scripts to read input data: one for the forecast ensemble data, and one for the observation data. The script processing the ensemble data receives four input arguments: the full path to the forecast file, variable name, valid time of verification, and ensemble member number. Since seven ensemble members are being verified, this script will run seven times. The processing is very simple, with the script grabbing the initialization time from the file name, calculating the lead by finding the difference between the valid time argument and the initialization time, grabbing the variable name and index corresponding to the ensemble member input value, and then masking bad data (anything less than -800) to the expected METplus bad data value of -9999. The latitude and longitude variables are also extracted, and all of the information is returned to METplus for verification via array and accompanying attribute dictionary.
The second script for observational data behaves very similarly to the ensemble data script. The script receives three inputs at runtime: the full path to the observation file, group name that contains the variable field, and variable name. The requested variable field is extracted from the group name provided at runtime, bad data is deemed to be any value less than -800 and reset to METplus’ bad value of -9999, and the data array is inverted to properly align with METplus’ expected orientation. The latitude and longitude variables are also extracted, and all of the information is returned to METplus for verification via array and accompanying attribute dictionary.
For more information on the basic requirements to utilize Python Embedding in METplus, please refer to the MET User’s Guide section on Python embedding
parm/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod/forecast_embedded.py
import sys
import re
import numpy as np
import datetime as dt
from netCDF4 import Dataset, chartostring
#grab input from user
#should be (1)input file using full path (2) variable name (3) valid time for the forecast in %Y%m%d%H%M format and (4) ensemble member number, all separated by ':' characters
#program can only accept that 1 input, while still maintaining user flexability to change multiple
#variables, including valid time, ens member, etc.
input_file, var_name, val_time, ens_mem = sys.argv[1].split(':')
ens_mem = int(ens_mem)
val_time = dt.datetime.strptime(val_time,"%Y%m%d%H%M")
try:
#set pointers to file and group name in file
f = Dataset(input_file, 'r')
v = f[var_name]
#grab intialization time from file name and hold
#also compute the lead time
i_time_ind = input_file.split("_").index("aod.nc")-1
i_time = input_file.split("_")[i_time_ind]
i_time_obj = dt.datetime.strptime(i_time,"%Y%m%d%H")
lead, rem = divmod((val_time - i_time_obj).total_seconds(), 3600)
print("Ensemble Member evaluation for: "+f.members.split(',')[ens_mem])
#checks if the the valid time for the forecast from user is present in file.
#Exits if the time is not present with a message
if not val_time.timestamp() in f['time'][:]:
print("valid time of "+str(val_time)+" is not present. Check file initialization time, passed valid time.")
f.close()
sys.exit(1)
#grab index in the time array for the valid time provided by user (val_time)
val_time_ind = np.where(f['time'][:] == val_time.timestamp())[0][0]
#grab data from file
lat = np.float64(f.variables['lat'][:])
lon = np.float64(f.variables['lon'][:])
var = np.float64(v[val_time_ind:val_time_ind+1,ens_mem:ens_mem+1,::-1,:])
var[var < -800] = -9999
#squeeze out all 1d arrays, add fill value
met_data = np.squeeze(var).copy()
except NameError:
print("Can't find input file")
sys.exit(1)
##########
#create a metadata dictionary
attrs = {
'valid': str(val_time.strftime("%Y%m%d"))+'_'+str(val_time.strftime("%H%M%S")),
'init': i_time[:-2]+'_'+i_time[-2:]+'0000',
'name': var_name,
'long_name': 'UNKNOWN',
'lead': str(int(lead)),
'accum': '00',
'level': 'UNKNOWN',
'units': 'UNKNOWN',
'grid': {
'name': 'Global 1 degree',
'type': 'LatLon',
'lat_ll': -89.5,
'lon_ll': -179.5,
'delta_lat': 1.0,
'delta_lon': 1.0,
'Nlon': f.dimensions['lon'].size,
'Nlat': f.dimensions['lat'].size,
}
}
#print some output to show script ran successfully
print("Input file: " + repr(input_file))
print("Variable name: " + repr(var_name))
print("valid time: " + repr(val_time.strftime("%Y%m%d%H%M")))
print("Attributes:\t" + repr(attrs))
f.close()
parm/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod/analysis_embedded.py
import sys
import re
import numpy as np
import datetime as dt
from netCDF4 import Dataset, chartostring
#grab input from user
#should be (1)input file using full path (2) group name for the variable and (3) variable name
input_file, group_name, var_name = sys.argv[1].split(':')
try:
#set pointers to file and group name in file
f = Dataset(input_file, 'r')
g = f.groups[group_name]
#grab time from file name and hold
v_time_ind = input_file.split("_").index("HOURLY")+1
v_time = input_file.split("_")[v_time_ind]
#grab data from file
lat = np.float64(f.variables['latitude'][:])
lon = np.float64(f.variables['longitude'][:])
#the data is defined by (lon, lat), so it needs to be transposed
#in addition to being filled by fill value if data is missing
var_invert = np.float64(g.variables[var_name][:,::-1])
var_invert[var_invert < -800] = -9999
met_data = var_invert.T.copy()
except NameError:
print("Can't find input file")
sys.exit(1)
##########
#create a metadata dictionary
attrs = {
'valid': str(v_time.split('T')[0])+'_'+str(v_time.split('T')[1])+'00',
'init': str(v_time.split('T')[0])+'_'+str(v_time.split('T')[1])+'00',
'name': group_name+'_'+var_name,
'long_name': 'UNKNOWN',
'lead': '00',
'accum': '00',
'level': 'UNKNOWN',
'units': 'UNKNOWN',
'grid': {
'name': 'Global 1 degree',
'type': 'LatLon',
'lat_ll': -89.5,
'lon_ll': -179.5,
'delta_lat': 1.0,
'delta_lon': 1.0,
'Nlon': f.dimensions['longitude'].size,
'Nlat': f.dimensions['latitude'].size,
}
}
#print some output to show script ran successfully
print("Input file: " + repr(input_file))
print("Group name: " + repr(group_name))
print("Variable name: " + repr(var_name))
print("Attributes:\t" + repr(attrs))
f.close()
User Scripting
User Scripting is not used in this use case.
Running METplus
Pass the use case configuration file to the run_metplus.py script along with any user-specific system configuration files if desired:
run_metplus.py /path/to/METplus/parm/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod.conf /path/to/user_system.conf
See Running METplus for more information.
Expected Output
A successful run will output the following both to the screen and to the logfile:
INFO: METplus has successfully finished running.
Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in {OUTPUT_BASE}/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod and will contain the following files:
* ensemble_stat_aod_20160815_120000V_ecnt.txt
* ensemble_stat_aod_20160815_120000V_ens.nc
* ensemble_stat_aod_20160815_120000V_orank.nc
* ensemble_stat_aod_20160815_120000V_phist.txt
* ensemble_stat_aod_20160815_120000V_relp.txt
* ensemble_stat_aod_20160815_120000V_rhist.txt
* ensemble_stat_aod_20160815_120000V_ssvar.txt
* ensemble_stat_aod_20160815_120000V.stat
Keywords
Note
EnsembleStatToolUseCase
PythonEmbeddingFileUseCase
AirQualityAndCompAppUseCase
PythonEmbeddingFileUseCase
Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.