Setting up the model

Execution of TCRM is controlled by reading the simulation settings from a configuration file. The configuration file is a text file, and can be edited in any text editor (e.g. Notepad, Wordpad, vi, emacs, gedit). An example configuration file is provided in the examples folder to give users a starting point.

The configuration file

The TCRM configuration file is divided into a series of sections, each with a set of option/value pairs. Most options have default values and may not need to be specified in the configuration file. One value that has no default is the Region gridLimit option. This defines the model domain and must be set in any configuration file used.

Actions

This section defines which components of TCRM will be executed. The options are:

  • DownloadData - download input datasets (defaults are included)

  • DataProcess - process the input TC track database

  • ExecuteStat - calculate the TC statistics over the model domain

  • ExecuteTrackGenerator - generate a set of stochastic TC tracks

  • ExecuteWindfield - Calculate the wind field around a set of TC tracks

  • ExecuteHazard - Calculate the return period wind speeds from a set of wind field files

  • PlotHazard - Plot the return period wind speed maps and return period curves for locations in the model domain

  • PlotData - Plot some basic statistical analyses of the input TC track database

  • ExecuteEvaluate - Evaluate a set of stochastic TC tracks, comparing to the input TC track database.

All options are boolean (i.e. True or False).

[Actions]
DataProcess = True
ExecuteStat = True
ExecuteTrackGenerator = True
ExecuteWindfield = True
ExecuteHazard = True
PlotHazard = True
PlotData = False
ExecuteEvaluate = False
CreateDatabase = True
DownloadData = True
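
As a rough illustration of how these flags can be inspected outside TCRM, the sketch below reads an [Actions] section with Python's standard configparser module. The inline configuration text and the enabled_actions helper are hypothetical, and TCRM's own configuration handling may differ.

```python
import configparser

# Hypothetical inline configuration; normally this would be read from a file.
CONFIG_TEXT = """
[Actions]
DataProcess = True
ExecuteStat = True
PlotData = False
"""


def enabled_actions(text):
    """Return the names of the actions set to True in the [Actions] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(text)
    return [name for name in parser["Actions"]
            if parser.getboolean("Actions", name)]


print(enabled_actions(CONFIG_TEXT))
```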

Region

This section defines the simulation domain and the size of the grid over which statistics are calculated. The simulation domain (gridLimit) is specified as a Python dict with keys of xMin, xMax, yMin and yMax. This sets the domain over which the wind fields and hazard will be calculated. Stochastic tracks are generated over a broader domain (called the “track domain”). The gridSpace option controls the size of the grid cells, which are used for calculating statistics. At this time, the values here must be integer values, but can be different in the x (east-west) and y (north-south) directions. The gridInc option controls the incremental increase in grid cell size when insufficient observations are located within a grid cell (see the StatInterface description):

[Region]
gridLimit = {'xMin': 113.0, 'xMax': 124.0, 'yMin': -24.0, 'yMax': -13.0}
gridSpace = {'x':1.0,'y':1.0}
gridInc = {'x':1.0,'y':0.5}
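
Because the Region values are written as Python dict literals, they can be parsed safely with ast.literal_eval. The following is a minimal sketch, assuming the section is read with the standard configparser module; the read_region helper and its sanity check are illustrative and not part of TCRM.

```python
import ast
import configparser

CONFIG_TEXT = """
[Region]
gridLimit = {'xMin': 113.0, 'xMax': 124.0, 'yMin': -24.0, 'yMax': -13.0}
gridSpace = {'x':1.0,'y':1.0}
gridInc = {'x':1.0,'y':0.5}
"""


def read_region(text):
    """Parse the [Region] options into dicts and sanity-check the domain."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep 'gridLimit' rather than 'gridlimit'
    parser.read_string(text)
    # literal_eval only accepts Python literals, so arbitrary code is rejected.
    region = {key: ast.literal_eval(value)
              for key, value in parser["Region"].items()}
    limit = region["gridLimit"]
    if not (limit["xMin"] < limit["xMax"] and limit["yMin"] < limit["yMax"]):
        raise ValueError("gridLimit must satisfy xMin < xMax and yMin < yMax")
    return region


region = read_region(CONFIG_TEXT)
print(region["gridLimit"], region["gridSpace"], region["gridInc"])
```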

DataProcess

This section controls aspects of the processing of the input track database. Firstly, the InputFile option specifies the file to be processed. A relative or absolute path can be used. If no path name is included (as in the example below), then TCRM assumes the file is stored in the input path. If using an automatically downloaded dataset, then this file name must match the name specified in the appropriate dataset section (which is named by the Source option in this section) of the configuration file (further details below).

The Source option is a string value that acts as a pointer to a subsequent section in the configuration file, that holds details of the input track file structure. The additional section must have the same label as set here (the label is case-sensitive).

The StartSeason and FilterSeasons options control which years of the input track database are used in calibrating the model. In the default case, only data from 1981 onwards is used for model calibration. If FilterSeasons = False, no season filtering is performed and the full input track database is used.

[DataProcess]
InputFile = ibtracs.since1980.list.v04r00.csv
StartSeason = 1981
FilterSeasons = True
Source = IBTRACS
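
The effect of StartSeason and FilterSeasons can be sketched as a simple filter. This is only an illustration of the behaviour described above: the record layout (dicts with a season key) and the filter_by_season helper are assumptions, not TCRM's internal representation.

```python
def filter_by_season(records, start_season, filter_seasons=True):
    """Keep records from start_season onwards, or all records if filtering is off."""
    if not filter_seasons:
        return list(records)
    return [rec for rec in records if rec["season"] >= start_season]


# Hypothetical track records, one dict per observation.
records = [{"season": 1979}, {"season": 1981}, {"season": 2005}]
print(filter_by_season(records, 1981))
print(filter_by_season(records, 1981, filter_seasons=False))
```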

StatInterface

The StatInterface section controls the methods used to calculate distributions of TC parameters from the input track database.

kdeType specifies the kernel used in the kernel density estimation method for creating probability density functions that are used in selecting initial values for the stochastic TC events (e.g. longitude, latitude, initial pressure, speed and bearing). kdeStep defines the increment in the generated probability density functions and cumulative distribution functions.

Options for kdeType:

gau, epa, uni, tri, biw, triw, cos, cos2

kde2DType is deprecated.

minSamplesCell sets the minimum number of valid observations in each grid cell that are required for calculating the distributions, variances and autocorrelations used in the TrackGenerator module. If there are insufficient valid observations, then the bounds of the grid cell are incrementally increased (in steps as specified by the gridInc values) until sufficient observations are found.

[StatInterface]
kdeType = gau
kde2DType = gau
kdeStep = 0.2
minSamplesCell = 100
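
The interaction between minSamplesCell and gridInc can be sketched as a loop that widens a cell until it holds enough observations. This is a simplified illustration only: it assumes the cell grows by the gridInc amount on every side at each step, and the expand_cell helper and max_steps guard are hypothetical rather than TCRM's actual implementation.

```python
def expand_cell(observations, cell, grid_inc, min_samples, max_steps=50):
    """Grow a grid cell until it contains at least min_samples observations.

    observations: iterable of (lon, lat) pairs
    cell: dict with xMin, xMax, yMin, yMax
    grid_inc: dict with x and y increments (degrees)
    """
    bounds = dict(cell)
    for _ in range(max_steps + 1):
        inside = [(lon, lat) for lon, lat in observations
                  if bounds["xMin"] <= lon < bounds["xMax"]
                  and bounds["yMin"] <= lat < bounds["yMax"]]
        if len(inside) >= min_samples:
            return bounds, inside
        # Assumed behaviour: widen the cell symmetrically in x and y.
        bounds["xMin"] -= grid_inc["x"]
        bounds["xMax"] += grid_inc["x"]
        bounds["yMin"] -= grid_inc["y"]
        bounds["yMax"] += grid_inc["y"]
    raise ValueError("not enough observations within the search limit")


obs = [(0.5, 0.5), (1.5, 0.5), (-0.5, 0.5)]
cell = {"xMin": 0.0, "xMax": 1.0, "yMin": 0.0, "yMax": 1.0}
bounds, inside = expand_cell(obs, cell, {"x": 1.0, "y": 0.5}, min_samples=3)
print(bounds, len(inside))
```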

TrackGenerator

The TrackGenerator section controls the stochastic track generation module. It is here that users can control the number of events and the number of years generated.

The NumSimulations option sets the number of TC event sets that will be generated. Any integer number of events (up to 1,000,000) is possible. YearsPerSimulation sets the number of simulated years that will be generated for each event set. For evaluating hazard, the value should be set to 1, as the extreme value distribution fitting process assumes annual maxima. The annual frequency of events is based on a Poisson distribution around the mean annual frequency, which is determined from the input track database.

For track model evaluations, it is recommended to set YearsPerSimulation to a similar number to the number of years in the input track database. For example, in our testing that used data from 1981–2013, we set the value to 30.

NumTimeSteps controls the maximum lifetime an event can exist for. TimeStep sets the time interval (in hours) for the track generator.

SeasonSeed and TrackSeed are used to fix the random number generator on parallel systems to ensure truly random numbers on each individual processor. If they are absent, the seed is set using an integer representation of the current time, and is recorded in the output metadata (e.g. attributes in the netcdf files).

[TrackGenerator]
NumSimulations = 500
YearsPerSimulation = 1
NumTimeSteps = 360
TimeStep = 1.0
SeasonSeed = 1
TrackSeed = 1

This example will generate 500 realisations of one year of TC activity, with hourly timesteps to a maximum of 360 hours.
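
To make the relationship between these options concrete, the sketch below draws a Poisson-distributed number of events for each simulated year, using a fixed seed so that the result is reproducible. It is not TCRM's generator: the mean annual frequency of 4.0 is a made-up value (TCRM derives it from the input track database), and the Poisson sampler is Knuth's simple multiplication method from the standard library's random numbers.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate using Knuth's method (suitable for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_event_counts(num_simulations, years_per_simulation,
                          mean_annual_frequency, seed):
    """Return a list of simulations, each a list of event counts per year."""
    rng = random.Random(seed)  # fixed seed gives a repeatable sequence
    return [[poisson_draw(rng, mean_annual_frequency)
             for _ in range(years_per_simulation)]
            for _ in range(num_simulations)]


counts = simulate_event_counts(500, 1, mean_annual_frequency=4.0, seed=1)
total_years = len(counts) * len(counts[0])
print(total_years, sum(sum(year) for year in counts) / total_years)
```

With NumSimulations = 500 and YearsPerSimulation = 1, the total number of simulated years is 500.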

WindfieldInterface

The WindfieldInterface section controls how the wind fields from each track in the simulated tracks are calculated. There are two main components to the wind field – the radial profile and the boundary layer model.

The profileType option sets the radial profile used. Valid values are:

  • holland – the radial profile of Holland (1980) 1

  • powell – Similar to the Holland profile, but uses a variable beta parameter that is a function of latitude and size. 2

  • schloemer – From Schloemer (1954) – essentially the Holland profile with a beta value of 1 3

  • willoughby – From Willoughby and Rahn (2004). Again, the Holland profile, with beta a function of the maximum wind speed, radius to maximum wind and latitude 4

  • jelesnianski – From Jelesnianski (1966). 5

  • doubleHolland – A double exponential profile from McConochie et al. (2004) 6

The windFieldType value selects the boundary layer model used. Three boundary layer models have been implemented:

  • kepert – the linearised boundary layer model of Kepert (2001) 7

  • hubbert – a vector addition of forward speed and tangential wind speed from Hubbert et al. (1991) 8

  • mcconochie – a second vector addition model, from McConochie et al. (2004) 6

The beta option specifies the β parameter used in the Holland wind profile. The additional β options (beta1 and beta2) are used in the doubleHolland wind profile, which is a double exponential profile, therefore requiring two β parameters.

thetaMax is used in the McConochie and Hubbert boundary layer models to specify the azimuthal location of the maximum wind speed under the translating storm.

Margin defines the spatial extent over which the wind field is calculated and is in units of degrees. A margin of 5 is recommended for hazard models, to ensure low wind speeds from distant TCs are incorporated into the fitting procedure.

Resolution is the horizontal resolution (in degrees) of the wind fields. Values should be no larger than 0.05 degrees: at coarser resolutions the peak of the radial profile may not be adequately resolved, leading to an underestimation of the maximum wind speeds.

Domain is an alternative to setting Margin. If set to “bounded” (default), the wind field domain will be determined by the Margin option. If set to “full”, the wind field domain will be set to match Region - gridLimit.

If Domain is set to “full”, the execution time will be significantly longer. This setting is recommended for single scenarios only.

[WindfieldInterface]
profileType = holland
windFieldType = kepert
beta = 1.3
beta1 = 1.3
beta2 = 1.3
thetaMax = 70.0
Margin = 2
Resolution = 0.05
Domain = bounded
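
To show the role of the beta parameter, the sketch below evaluates a simplified form of the Holland (1980) radial profile. It is an illustration under stated assumptions, not TCRM's implementation: it uses the cyclostrophic form (the Coriolis term is neglected), an assumed air density of 1.15 kg/m^3, and made-up values for the pressure deficit and radius to maximum winds.

```python
import math


def holland_wind_speed(r, rmax, dp, beta=1.3, rho=1.15):
    """Cyclostrophic Holland (1980) wind speed in m/s at radius r.

    r, rmax: radius and radius to maximum winds (same units, r > 0)
    dp: pressure deficit (environmental minus central pressure) in Pa
    beta: Holland shape parameter
    rho: air density in kg/m^3 (assumed value)
    """
    ratio = (rmax / r) ** beta
    return math.sqrt(beta * dp / rho * ratio * math.exp(-ratio))


# Hypothetical storm: 50 hPa pressure deficit, 30 km radius to maximum winds.
rmax_km, dp_pa = 30.0, 5000.0
for radius_km in (15.0, 30.0, 60.0, 120.0):
    speed = holland_wind_speed(radius_km, rmax_km, dp_pa)
    print(f"{radius_km:6.1f} km  {speed:5.1f} m/s")
```

In this form the profile peaks exactly at r = rmax, which is why a wind field grid that is too coarse to sample near that radius can miss the maximum wind speed.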

Hazard

The Hazard section controls how the model calculates average recurrence interval (ARI) wind speeds, and whether to calculate confidence ranges.

ExtremeValueDistribution sets the method for calculating ARI wind speeds. Options are “emp” (empirical), “power” (power law), “GPD” (Generalised Pareto Distribution) or “GEV” (Generalised Extreme Value distribution).

The Years option is a comma separated list of integer values that specifies the return periods for which wind speeds will be calculated. For ExtremeValueDistribution = emp, the years cannot exceed the total number of simulated years in the TrackGenerator options.

MinimumRecords sets the minimum number of values required for performing the fitting procedure at a given grid point.

CalculateCI sets whether the hazard module will calculate confidence ranges using a bootstrap resampling method. If True, the module will run the fitting process multiple times and calculate upper and lower percentile values of the resulting return period wind speeds. The PercentileRange option sets the range: for a value of 90, the module will calculate the 5th and 95th percentile values. SampleSize sets the number of randomly selected values that will be used in each realisation of the extreme value fitting procedure for calculating the confidence range.

SmoothPlots will apply a Gaussian filter to the data before plotting on maps to minimise the inference of lines on maps. This may cause the maps to have large areas of no data due to the filtering function.

[Hazard]
ExtremeValueDistribution = emp
Years = 2,5,10,20,25,50,100,200,250,500,1000
MinimumRecords = 50
CalculateCI = True
PercentileRange = 90
SampleSize = 50
PlotSpeedUnits = mps
SmoothPlots = True
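
Two of the rules described in this section can be written down directly: how PercentileRange maps to lower and upper percentiles, and the restriction on Years for the empirical method. The helpers below are a hedged sketch; their names are hypothetical and the check is an interpretation of the text above, not TCRM's own validation code.

```python
def percentile_bounds(percentile_range):
    """Return the (lower, upper) percentiles for a central range, e.g. 90 -> (5, 95)."""
    lower = (100.0 - percentile_range) / 2.0
    return lower, 100.0 - lower


def parse_years(value, total_simulated_years=None):
    """Parse the comma-separated Years option into a list of integers.

    If total_simulated_years is given (as for ExtremeValueDistribution = emp),
    raise ValueError for return periods longer than the simulated record.
    """
    years = [int(item) for item in value.split(",")]
    if total_simulated_years is not None:
        too_long = [year for year in years if year > total_simulated_years]
        if too_long:
            raise ValueError(f"return periods exceed simulated years: {too_long}")
    return years


print(percentile_bounds(90))
print(parse_years("2,5,10,20,25,50,100,200,250,500", total_simulated_years=500))
```

Under this reading, a configuration with 500 simulated years and ExtremeValueDistribution = emp would need the Years list to stop at 500.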

RMW

The RMW section contains a single option: GetRMWDistFromInputData. Set this value to True if the input track database has reliable data on the radius to maximum winds.

If no suitable data exists (GetRMWDistFromInputData = False), TCRM will use a regression model to determine RMW from the intensity and latitude of the storm.

[RMW]
GetRMWDistFromInputData = False

Input

The Input section sets the source of some supplementary data, as well as the datasets to be automatically downloaded. The LandMask option specifies the path to a netcdf file (supplied) that contains a land/sea mask. The MSLPFile option specifies the path to a netcdf file (downloaded) that contains daily long-term mean sea level pressure data (e.g. from the NCEP/NCAR reanalysis products). The LocationFile option specifies the path to a point shape file that contains the longitude and latitude of locations for which to extract hazard information at the conclusion of a simulation.

The Datasets option is a comma-separated list of values indicating the data that should be downloaded on first execution. For each value in the list, there must be a corresponding section in the configuration file that has options of URL (the URL of the data to be downloaded), path (where to store the data once it has been downloaded) and filename (the filename to give to the data once downloaded).

In the example below, the IBTRACS section holds both the download options and additional options that describe the format of the track database, because the dataset and the DataProcess Source option share the same name. This is a legitimate approach, so long as there are no duplicate options.

Note that the filename option in the IBTRACS section matches the InputFile option in the DataProcess section, and the filename in the LTMSLP section matches the MSLPFile in the Input section.

The CoastlineGates option specifies the path to a comma-delimited text file that holds the points of a series of coastline gates that are used in the Evaluate.landfallRates module.

[Input]
LocationFile = input/stationlist.shp
LandMask = input/landmask.nc
MSLPFile = MSLP/slp.day.ltm.nc
Datasets = IBTRACS,LTMSLP
CoastlineGates = input/gates.csv

[IBTRACS]
URL = https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/ibtracs.since1980.list.v04r00.csv
path = input
Filename = ibtracs.since1980.list.v04r00.csv
Columns = tcserialno,season,num,skip,skip,skip,date,skip,lat,lon,skip,pressure
FieldDelimiter = ,
NumberOfHeadingLines = 2
PressureUnits = hPa
LengthUnits = km
SpeedUnits = kph
DateFormat = %Y-%m-%d %H:%M:%S

[LTMSLP]
URL = ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis.derived/surface/slp.day.1981-2010.ltm.nc
path = MSLP
filename = slp.day.ltm.nc
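
The requirement that every entry in Datasets has its own section with URL, path and filename options can be checked before running the model. The sketch below uses Python's configparser; the check_datasets helper and the example.org URL are hypothetical. Interpolation is disabled so that values containing % (such as DateFormat) are returned verbatim, and option names are matched case-insensitively, which is configparser's default.

```python
import configparser

REQUIRED_OPTIONS = ("url", "path", "filename")

# Hypothetical configuration with the LTMSLP section deliberately left out.
CONFIG_TEXT = """
[Input]
Datasets = IBTRACS,LTMSLP

[IBTRACS]
URL = https://example.org/tracks.csv
path = input
Filename = tracks.csv
DateFormat = %Y-%m-%d %H:%M:%S
"""


def check_datasets(text):
    """Return a list of problems found in the dataset sections (empty if none)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    names = [name.strip()
             for name in parser.get("Input", "Datasets").split(",")]
    problems = []
    for name in names:
        if not parser.has_section(name):  # section names are case-sensitive
            problems.append(f"missing section [{name}]")
            continue
        for option in REQUIRED_OPTIONS:
            if not parser.has_option(name, option):
                problems.append(f"[{name}] is missing option '{option}'")
    return problems


print(check_datasets(CONFIG_TEXT))
```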

Output

The Output section defines the destination of the model output. Set the Path option to the directory where you wish to store the data. Paths can be relative or absolute. By default, output is stored in a subdirectory of the working directory named output.

[Output]
Path = output

Logging

The Logging section controls how the model records progress to file (and optionally STDOUT). The LogFile option specifies the name of the log file. If no path is given, then the log file will be stored in the current working directory. For parallel execution, a separate log file is created for each thread, with the rank of the process appended to the name of the file.

The LogLevel option is one of the Python logging levels (DEBUG, INFO, WARNING, ERROR or CRITICAL). Default is INFO.

The Verbose option allows users to print all logging messages to the standard output (default False). This can be useful when attempting to identify problems with execution. For parallel execution, this is set to False (to prevent repeated messages being printed to the screen).

Setting the ProgressBar option to True will display a simple progress bar on the screen to indicate the status of the model execution (default False). This will be turned off if TCRM is executed on a parallel system, or if it is run in batch mode.

If Datestamp = True, a timestamp will be included in the filename for the log file (default False).

[Logging]
LogFile = main.log
LogLevel = INFO
Verbose = False
ProgressBar = False
Datestamp = False
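
As an illustration of how these options map onto Python's logging module, the sketch below converts a LogLevel string to its numeric level and builds a per-process log file name. The helper names are hypothetical, and the exact naming scheme for parallel runs (here the rank is appended after a hyphen) is an assumption rather than TCRM's documented behaviour.

```python
import logging


def resolve_level(name):
    """Translate a level name such as 'INFO' into the logging module's integer level."""
    level = getattr(logging, name.upper(), None)
    if not isinstance(level, int):
        raise ValueError(f"unknown logging level: {name}")
    return level


def log_file_name(log_file, rank=None):
    """Return the log file name, with the process rank appended for parallel runs."""
    if rank is None:
        return log_file
    return f"{log_file}-{rank}"  # assumed naming scheme


print(resolve_level("INFO"), log_file_name("main.log"), log_file_name("main.log", 3))
```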

Source format options

For the input data source specified in the Source option of the DataProcess section, there must be a corresponding section of the given name. In this example case, the source is specified as IBTRACS (the same as one of the Datasets values). The IBTRACS section therefore controls both the download dataset options and the textual format of the input track database.

The options that relate to the dataset download are URL, path and filename. URL specifies the location of the data to be downloaded. The path option specifies the path name for the storage location of the dataset. The filename option gives the name of the file to be saved (this can be different from the name of the dataset).

The remaining options relate to the format of the track database. Columns is a comma-separated list of the column names in the input database. If a column is to be ignored, it should be named skip. FieldDelimiter is the delimiter used in the input track database (the input file is assumed to be a text file). NumberOfHeadingLines indicates the number of lines at the top of the file that should be ignored. These are usually column headers; because some track databases use multiple header lines, TCRM does not attempt to decipher the column names from the header. PressureUnits, LengthUnits and SpeedUnits specify the units of the numerical values of pressure, distance and speed (respectively) used in the input track database. The DateFormat option is a string representation of the date format used in the track database. The format should use Python’s datetime format codes.

[IBTRACS]
URL = https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/ibtracs.since1980.list.v04r00.csv
path = input
Filename = ibtracs.since1980.list.v04r00.csv
Columns = tcserialno,season,num,skip,skip,skip,date,skip,lat,lon,skip,pressure
FieldDelimiter = ,
NumberOfHeadingLines = 2
PressureUnits = hPa
LengthUnits = km
SpeedUnits = kph
DateFormat = %Y-%m-%d %H:%M:%S
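
The way Columns, FieldDelimiter and DateFormat work together can be sketched with a small parser for a single data line. This is not TCRM's reader: the parse_row helper is hypothetical and the sample row is fabricated purely to match the column layout above.

```python
from datetime import datetime

COLUMNS = ("tcserialno,season,num,skip,skip,skip,date,skip,"
           "lat,lon,skip,pressure").split(",")
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_row(line, columns=COLUMNS, delimiter=",", date_format=DATE_FORMAT):
    """Split one line of the track file and keep only the named columns."""
    values = line.rstrip("\n").split(delimiter)
    # Columns named 'skip' are dropped; the rest are keyed by column name.
    record = {name: value.strip()
              for name, value in zip(columns, values) if name != "skip"}
    record["date"] = datetime.strptime(record["date"], date_format)
    for key in ("lat", "lon", "pressure"):
        record[key] = float(record[key])
    return record


# Fabricated example row (not real track data).
SAMPLE = "2000001S10120,2000,1,SI,WA,NAME,2000-01-01 06:00:00,TS,-10.5,120.0,30,1002"
record = parse_row(SAMPLE)
print(record)
```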

References

1

Holland, G. J. (1980): An Analytic Model of the Wind and Pressure Profiles in Hurricanes. Monthly Weather Review, 108

2

Powell, M., G. Soukup, S. Cocke, S. Gulati, N. Morisseau-Leroy, S. Hamid, N. Dorst, and L. Axe (2005): State of Florida hurricane loss projection model: Atmospheric science component. Journal of Wind Engineering and Industrial Aerodynamics, 93, 651–674

3

Schloemer, R. W. (1954): Analysis and synthesis of hurricane wind patterns over Lake Okeechobee. NOAA Hydrometeorology Report 31, 1954

4

Willoughby, H. E. and M. E. Rahn (2004): Parametric Representation of the Primary Hurricane Vortex. Part I: Observations and Evaluation of the Holland (1980) Model. Monthly Weather Review, 132, 3033–3048

5

Jelesnianski, C. P. (1966): Numerical Computations of Storm Surges without Bottom Stress. Monthly Weather Review, 94, 379–394

6

McConochie, J. D., T. A. Hardy, and L. B. Mason (2004): Modelling tropical cyclone over-water wind and pressure fields. Ocean Engineering, 31, 1757–1782

7

Kepert, J. D. (2001): The Dynamics of Boundary Layer Jets within the Tropical Cyclone Core. Part I: Linear Theory. Journal of the Atmospheric Sciences, 58, 2469–2484

8

Hubbert, G. D., G. J. Holland, L. M. Leslie and M. J. Manton (1991): A Real-Time System for Forecasting Tropical Cyclone Storm Surges. Weather and Forecasting, 6, 86–97