hazard package
Submodules
hazard.evd module
evd
– Calculate extreme value distributions
Calculate the parameters of the GEV distribution using the method of L-moments. The fitting functions not only estimate the distribution parameters, but also calculate the return levels for the specified return periods.
References: Hosking, J. R. M., 1990: L-moments: Analysis and Estimation of Distributions using Linear Combinations of Order Statistics. Journal of the Royal Statistical Society: Series B, 52(1), 105-124.
-
class
EMPDistribution
(intervals, numsim, nodata, minrecords) Bases:
hazard.evd.ExtremeValueDistribution
Empirical average recurrence intervals
Note
Empirical ARI values cannot be calculated for ARIs
greater than the number of simulated years. If using this distribution, the configuration option Years in the Hazard section must not contain values greater than the number of simulations (TrackGenerator – NumSimulation).
-
fit
(data) Calculate the average recurrence interval values, given a set of data values
- Parameters
data –
numpy.ndarray
of data values to use to
calculate return levels
- Returns
return levels at given intervals, along with
parameters for the distribution (if calculated)
-
-
class
ExtremeValueDistribution
(intervals, numsim, nodata, minrecords) Bases:
object
Abstract extreme value distribution model
-
calculate
(data, *args, **kwargs)
-
fit
(data, *args, **kwargs) Calculate the average recurrence interval values, given a set of data values
- Parameters
data –
numpy.ndarray
of data values to use to
calculate return levels
- Returns
return levels at given intervals, along with
parameters for the distribution (if calculated)
-
-
class
GEVDistribution
(intervals, numsim, nodata, minrecords) Bases:
hazard.evd.ExtremeValueDistribution
Generalised extreme value distribution, used for block maxima
-
fit
(data) Calculate the average recurrence interval values, given a set of data values
- Parameters
data –
numpy.ndarray
of data values to use to
calculate return levels
- Returns
return levels at given intervals, along with
parameters for the distribution (if calculated)
-
-
class
GPDDistribution
(intervals, numsim, nodata, minrecords, threshold=99.5) Bases:
hazard.evd.ExtremeValueDistribution
Generalised Pareto Distribution, fitted using a peaks-over-threshold approach. The threshold is an arbitrary percentile of the data values (default = 99.5).
-
fit
(data) Calculate the average recurrence interval values, given a set of data values
- Parameters
data –
numpy.ndarray
of data values to use to
calculate return levels
- Returns
return levels at given intervals, along with
parameters for the distribution (if calculated)
-
-
class
POWERDistribution
(intervals, numsim, nodata, minrecords) Bases:
hazard.evd.ExtremeValueDistribution
Fit a function of the form w = A - B * ARI^C to the empirical average recurrence intervals (ARI), where A > 0, 0 < B < A and C < 0.
-
fit
(data) Calculate the average recurrence interval values, given a set of data values
- Parameters
data –
numpy.ndarray
of data values to use to
calculate return levels
- Returns
return levels at given intervals, along with
parameters for the distribution (if calculated)
-
-
allSubclasses
(cls) Recursively find all subclasses of a given class.
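A recursive subclass search of this kind can be sketched as follows. This is a minimal illustration and not the module's own code: the function name all_subclasses and the example classes Base, Child, Sibling and GrandChild are hypothetical. It relies only on the built-in `__subclasses__()` method, which returns the direct subclasses of a class.

```python
def all_subclasses(cls):
    """Return every direct and indirect subclass of ``cls`` (hypothetical helper)."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)                    # direct subclass
        found.extend(all_subclasses(sub))    # then everything below it
    return found


# Hypothetical class hierarchy used only for illustration.
class Base(object):
    pass


class Child(Base):
    pass


class Sibling(Base):
    pass


class GrandChild(Child):
    pass


names = sorted(c.__name__ for c in all_subclasses(Base))
```

A lookup like this lets a caller discover every distribution class derived from a common base without maintaining a separate registry.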
-
empfit
(data, intervals, numsim, nodata=-9999.0, minrecords=50) Calculate empirical ARI values for a collection of wind speed records.
- Parameters
data –
numpy.ndarray
of data values
intervals –
numpy.ndarray
of years for which to calculate return period values. The values will be determined empirically, then interpolated to these intervals.
numsim (int) – number of simulations created.
nodata (float) – value to insert if fit does not converge.
minrecords (int) – minimum number of valid observations required to perform fitting.
- Returns
Rpeval – numpy.array of return period wind speed values
location – location parameter
scale – scale parameter
shape – shape parameter
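The empirical approach described above can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: the helper name empirical_levels is hypothetical, and the plotting position ARI = numsim / rank is one common convention that may differ from the one used in hazard.evd. Intervals longer than the number of simulated years receive the nodata value, in line with the note on EMPDistribution.

```python
import numpy as np


def empirical_levels(data, intervals, numsim, nodata=-9999.0):
    """Interpolate empirical return levels to the requested intervals.

    Hypothetical helper. Assumes the ARI of the k-th largest value is
    numsim / k; other plotting positions are equally defensible.
    """
    values = np.sort(np.asarray(data, dtype=float))[::-1]   # largest first
    ranks = np.arange(1, values.size + 1)
    ari = numsim / ranks                                     # decreasing with rank
    intervals = np.asarray(intervals, dtype=float)
    # np.interp needs increasing x values, so reverse both arrays.
    levels = np.interp(intervals, ari[::-1], values[::-1])
    # ARIs beyond the simulated record cannot be estimated empirically.
    levels[intervals > numsim] = nodata
    return levels


levels = empirical_levels([10.0, 20.0, 30.0, 40.0], [2, 3, 4, 8], numsim=4)
```

With four simulated years, the 8-year level cannot be estimated and is returned as nodata, while the 3-year level is interpolated between the 2-year and 4-year values.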
-
evargs
(name)
-
evfunc
(name)
-
gevfit
(data, intervals, nodata=-9999.0, minrecords=50, yrspersim=1) Calculate extreme value distribution parameters using the Lmoments module. Return period values are not calculated if the shape parameter is negative or zero.
- Parameters
data (
numpy.ndarray
) – array of data values. Values represent max events for each year of simulation at a single grid box
intervals (
numpy.ndarray
) – array of years for which to calculate return period values.
nodata (float) – value to insert if fit does not converge.
minrecords (int) – minimum number of valid observations required to perform fitting.
yrspersim (int) – data represent block maxima - this gives the length of each block in years.
- Returns
w – numpy.array of return period wind speed values
loc – location parameter
scale – scale parameter
shp – shape parameter
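To illustrate how return levels follow from fitted GEV parameters, the sketch below evaluates the GEV quantile function. It assumes Hosking's sign convention for the shape parameter, in which a positive shape gives a bounded upper tail; that convention is an assumption here, although it is consistent with the cited reference. The function name gev_return_level is hypothetical and is not part of hazard.evd.

```python
import math


def gev_return_level(interval, loc, scale, shape, yrspersim=1):
    """GEV return level for one recurrence interval (hypothetical helper).

    Assumes Hosking's convention: x(F) = loc + scale/shape * (1 - (-ln F)**shape),
    with F the non-exceedance probability for one block of ``yrspersim`` years.
    """
    if shape == 0.0:
        raise ValueError("shape must be non-zero for this form of the quantile function")
    if interval <= yrspersim:
        raise ValueError("interval must be longer than the block length")
    prob = 1.0 - float(yrspersim) / interval      # non-exceedance probability per block
    return loc + (scale / shape) * (1.0 - (-math.log(prob)) ** shape)


# Illustrative parameter values only; they are not taken from any real fit.
levels = [gev_return_level(t, 30.0, 5.0, 0.1) for t in (10, 100, 1000)]
upper_bound = 30.0 + 5.0 / 0.1   # limiting value for positive shape in this convention
```

With a positive shape in this convention, the return levels increase with the interval but never exceed loc + scale / shape.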
-
gpdReturnLevel
(intervals, mu, shape, scale, rate, npyr=365.25) Calculate return levels for specified intervals for a distribution with the given threshold, scale and shape parameters.
- Parameters
intervals –
numpy.ndarray
or float of recurrence intervals to evaluate return levels for.
mu (float) – Threshold parameter (also called location).
shape (float) – Shape parameter.
scale (float) – Scale parameter.
rate (float) – Rate of exceedances (i.e. number of observations greater than mu, divided by total number of observations).
npyr (float) – Number of observations per year.
- Returns
return levels for the specified recurrence intervals.
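The calculation can be sketched with the standard peaks-over-threshold return level formula, x = mu + (scale / shape) * ((T * npyr * rate)^shape - 1). The sketch assumes the common convention in which a negative shape gives a bounded tail. Whether the module uses exactly this form is an assumption, and the name gpd_return_level is hypothetical.

```python
import math


def gpd_return_level(interval, mu, shape, scale, rate, npyr=365.25):
    """GPD return level for one recurrence interval (hypothetical helper).

    ``interval * npyr * rate`` is the expected number of threshold
    exceedances in ``interval`` years.
    """
    m = interval * npyr * rate
    if m <= 0.0:
        raise ValueError("interval, npyr and rate must all be positive")
    if shape == 0.0:
        # Limiting (exponential) case as shape tends to zero.
        return mu + scale * math.log(m)
    return mu + (scale / shape) * (m ** shape - 1.0)


# Illustrative parameter values only.
levels = [gpd_return_level(t, 10.0, -0.1, 2.0, 0.005) for t in (10, 100, 1000)]
bound = 10.0 - 2.0 / -0.1   # upper end point when shape is negative
```

For a negative shape the levels approach, but do not reach, the end point mu - scale / shape.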
-
gpdfit
(data, intervals, numsim, nodata=-9999, minrecords=50, threshold=99.5) Fit a Generalised Pareto Distribution to the data. For a quick evaluation, the 99.5th percentile of the data is used as the default threshold.
- Parameters
data (
numpy.ndarray
) – array of data values.
intervals (
numpy.ndarray
) – array of years for which to calculate return period values.
numsim (int) – number of simulations created.
nodata (float) – value to insert if fit does not converge.
minrecords (int) – minimum number of valid observations required to perform fitting.
threshold (float) – Threshold for performing the fitting. Default is the 99.5th percentile
- Returns
Rpeval – numpy.array of return period wind speed values
location – location parameter
scale – scale parameter
shape – shape parameter
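The percentile-based threshold selection can be sketched as below. It covers only the first step of a peaks-over-threshold analysis (choosing the threshold, extracting the excesses and computing the exceedance rate), not the fit itself. The helper name threshold_exceedances is hypothetical, and the synthetic input is for illustration only.

```python
import numpy as np


def threshold_exceedances(data, threshold=99.5):
    """Return the threshold value, the excesses above it, and the exceedance rate.

    Hypothetical helper; ``threshold`` is a percentile of the data values.
    """
    data = np.asarray(data, dtype=float)
    mu = np.percentile(data, threshold)      # threshold in data units
    excesses = data[data > mu] - mu          # amounts by which values exceed mu
    rate = excesses.size / data.size         # fraction of observations above mu
    return mu, excesses, rate


# Synthetic data: the values 0, 1, ..., 999.
mu, excesses, rate = threshold_exceedances(np.arange(1000.0))
```

The rate computed here corresponds to the rate argument described for gpdReturnLevel: the number of observations greater than the threshold divided by the total number of observations.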
-
islocal
(func)
-
powerfit
(data, intervals, numsim, nodata=-9999.0, minrecords=50) Fit a modified power law to empirical return period values
w(t; a, b, c) = a - b * t^c
where a > 0, 0 < b < a and c < 0.
This function calculates the empirical return periods for the given set of wind speed values, then fits the function to the return period wind speeds. In this sense, the returned location, scale and shape parameters do not refer to the parameters of any given distribution, but rather the coefficients of the fitted function.
- Parameters
data –
numpy.ndarray
of data values
intervals –
numpy.ndarray
of years for which to calculate return period values.
numsim (int) – number of simulations created.
nodata (float) – value to insert if fit does not converge.
minrecords (int) – minimum number of valid observations required to perform fitting.
- Returns
w – numpy.array of return period wind speed values
location – location parameter
scale – scale parameter
shape – shape parameter
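One way to fit a curve of the form w = a - b * t^c is sketched below. It is not the module's method: the sketch swaps in a simple separable least-squares approach that scans candidate exponents c and, for each one, solves for a and b by linear least squares. The function name fit_power_law and the synthetic data are hypothetical, and the constraints a > 0 and 0 < b < a are not enforced.

```python
import numpy as np


def fit_power_law(t, w, c_grid=None):
    """Fit w = a - b * t**c by scanning c and solving for (a, b) linearly.

    Hypothetical helper; does not enforce a > 0 or 0 < b < a.
    """
    t = np.asarray(t, dtype=float)
    w = np.asarray(w, dtype=float)
    if c_grid is None:
        c_grid = np.linspace(-2.0, -0.01, 200)   # candidate exponents, all negative
    best = None
    for c in c_grid:
        # For a fixed c the model is linear in (a, b): w = a * 1 + b * (-t**c).
        design = np.column_stack([np.ones_like(t), -(t ** c)])
        coeffs = np.linalg.lstsq(design, w, rcond=None)[0]
        residual = w - design @ coeffs
        sse = float(residual @ residual)
        if best is None or sse < best[0]:
            best = (sse, float(coeffs[0]), float(coeffs[1]), float(c))
    return best[1], best[2], best[3]


# Synthetic return period wind speeds generated from a = 70, b = 50, c = -0.3.
ari = np.array([2.0, 5.0, 10.0, 20.0, 50.0, 100.0, 200.0, 500.0, 1000.0])
speeds = 70.0 - 50.0 * ari ** -0.3
a, b, c = fit_power_law(ari, speeds)
```

Because c < 0, the term b * t^c vanishes for long intervals and the fitted curve approaches the asymptote a.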
Module contents
hazard
– Hazard calculation
This module contains the core objects for the return period hazard calculation.
Hazard calculations can be run in parallel using MPI if the mpi4py library is found and TCRM is run using the mpirun command. For example, to run with 10 processors:
mpirun -n 10 python tcrm.py cairns.ini
hazard
can be initialised and started by
calling run() with the path to a configuration file:
import hazard
hazard.run('cairns.ini')
-
class
HazardCalculator
(configFile, tilegrid, numSim, minRecords, yrsPerSim, calcCI=False, evd='GEV') Bases:
object
Calculate return period wind speeds using GEV fitting
-
calculateHazard
(tilelimits) Load input hazard data and then calculate the return period and distribution parameters for a given tile. The extreme value distribution used in the calculation can be set in the config file. The default distribution is set to GEV.
- Parameters
tilelimits – tuple of tile limits
- Returns
Rp – numpy.ndarray of return period wind speed values for each lat/lon
loc – numpy.ndarray of location parameters for each lat/lon
scale – numpy.ndarray of scale parameters for each lat/lon
shp – numpy.ndarray of shape parameters for each lat/lon
RpUpper – Upper CI return period wind speed values for each lat/lon
RpLower – Lower CI return period wind speed values for each lat/lon
-
dumpHazardFromTiles
(tiles, progressCallback=None) Iterate over tiles to calculate return period hazard levels
- Parameters
tiles – generator that yields tuples of tile dimensions.
-
saveHazard
() Save hazard data to a netCDF file.
-
-
class
Tile
(number, input_limits, output_limits) Bases:
object
Tile object.
Because there is a buffer region around the outer edge of the
dict
gridLimit, the indices where data is pulled from (the wind field files) are different from those where the data is stored (the output hazard array).
This object holds the index ranges for the input array and output array, to indicate this relationship.
-
class
TileGrid
(gridLimit, wf_lon, wf_lat, xstep=100, ystep=100) Bases:
object
Tiling to minimise MemoryErrors and enable parallelisation.
-
getDomainExtent
() Return the longitude and latitude values that lie within the modelled domain
- Return lon
numpy.ndarray
containing longitude values
- Return lat
numpy.ndarray
containing latitude values
-
getGridLimit
(k) Return the limits for tile k. x-indices correspond to the east-west coordinate, y-indices correspond to the north-south coordinate.
- Parameters
k (int) – tile number
- Return x1
minimum x-index for tile k
- Return x2
maximum x-index for tile k
- Return y1
minimum y-index for tile k
- Return y2
maximum y-index for tile k
-
tileGrid
() Defines the indices required to subset a 2D array into smaller rectangular 2D arrays (of dimension xstep * ystep).
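The tiling idea can be sketched as below: a grid of nx by ny cells is cut into tiles of at most xstep by ystep cells, with smaller tiles along the far edges. The function name tile_limits, the half-open index ranges and the row-by-row tile ordering are assumptions made for illustration, and may not match the indices that TileGrid actually produces.

```python
def tile_limits(nx, ny, xstep=100, ystep=100):
    """Return (x1, x2, y1, y2) index limits for each tile of an nx-by-ny grid.

    Hypothetical helper; limits are half-open, so a tile covers
    x1 <= x < x2 and y1 <= y < y2.
    """
    limits = []
    for y1 in range(0, ny, ystep):
        for x1 in range(0, nx, xstep):
            x2 = min(x1 + xstep, nx)   # clip the last column of tiles
            y2 = min(y1 + ystep, ny)   # clip the last row of tiles
            limits.append((x1, x2, y1, y2))
    return limits


tiles = tile_limits(250, 120)
covered = sum((x2 - x1) * (y2 - y1) for x1, x2, y1, y2 in tiles)
```

Each tile can then be processed independently, which keeps memory use bounded and allows tiles to be distributed across processes.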
-
-
aggregateWindFields
(inputPath, numSimulations, tilelimits) Aggregate wind field data into annual maxima for use in fitting extreme value distributions.
- Parameters
inputPath (str) – path to individual wind field files.
numSimulations (int) – Number of simulated years of activity.
tilelimits (tuple) – tuple of index limits of a tile.
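Reducing event wind fields to annual maxima can be sketched as below. The sketch works on an in-memory array rather than on wind field files, and it rests on several assumptions: the helper name annual_maxima is hypothetical, each event is labelled with a zero-based simulated year, and years without any event are left at zero.

```python
import numpy as np


def annual_maxima(fields, event_years, num_years):
    """Reduce event wind fields (event, lat, lon) to one maximum field per year.

    Hypothetical helper; years with no events keep a value of zero.
    """
    fields = np.asarray(fields, dtype=float)
    event_years = np.asarray(event_years)
    maxima = np.zeros((num_years,) + fields.shape[1:])
    for year in range(num_years):
        in_year = fields[event_years == year]      # events falling in this year
        if in_year.shape[0] > 0:
            maxima[year] = in_year.max(axis=0)     # cell-by-cell maximum
    return maxima


# Three synthetic events on a 1 x 2 grid: two in year 0, none in year 1, one in year 2.
fields = np.array([[[10.0, 20.0]], [[30.0, 5.0]], [[7.0, 8.0]]])
maxima = annual_maxima(fields, np.array([0, 0, 2]), num_years=3)
```

The resulting array holds one block maximum per simulated year for every grid cell, which is the form of input described for the fitting functions.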
-
calculateCI
(Vr, years, nodata, minRecords, yrsPerSim=1, sample_size=50, prange=90) Fit a GEV to the wind speed records for a 2-D extent of wind speed values, providing a confidence range by resampling at random from the input values.
- Parameters
Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon)
years – numpy.ndarray of years for which to evaluate return period values.
nodata (float) – missing data value.
minRecords (int) – minimum number of valid wind speed values required to fit distribution.
yrsPerSim (int) – Values represent block maxima - this value indicates the time span of the block (default 1).
sample_size (int) – number of records to randomly sample for calculating confidence interval of the fit.
prange (float) – percentile range.
- Returns
RpUpper – Upper CI return period wind speed values for each lat/lon
RpLower – Lower CI return period wind speed values for each lat/lon
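The resampling idea can be sketched for a single grid cell as below. Several details are assumptions made for illustration: the helper name resampled_range is hypothetical, the number of resamples (200) and sampling with replacement are arbitrary choices, prange is read as the width of a central percentile band (so 90 gives the 5th and 95th percentiles), and a simple sample percentile stands in for the GEV fit.

```python
import numpy as np


def resampled_range(values, statistic, sample_size=50, prange=90,
                    n_resamples=200, seed=0):
    """Estimate a lower/upper range for ``statistic`` by random resampling.

    Hypothetical helper; ``statistic`` is any function of a 1-D sample.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    estimates = np.empty(n_resamples)
    for i in range(n_resamples):
        sample = rng.choice(values, size=sample_size, replace=True)
        estimates[i] = statistic(sample)
    tail = (100.0 - prange) / 2.0                  # e.g. 5 for prange=90
    lower, upper = np.percentile(estimates, [tail, 100.0 - tail])
    return float(lower), float(upper)


# Synthetic annual maxima for one grid cell; the values are illustrative only.
records = np.random.default_rng(42).gumbel(loc=30.0, scale=5.0, size=500)
lower, upper = resampled_range(records, lambda s: np.percentile(s, 90))
```

In a real calculation the statistic would be the return level from a distribution fitted to each resample, repeated for every grid cell.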
-
calculateEMP
(Vr, years, numsim, nodata, minRecords, yrsPerSim) Calculate empirical return levels for the wind speed records for a 2-D extent of wind speed values
- Parameters
Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon) block maxima processed with aggregateWindFields
years – numpy.ndarray of years for which to evaluate return period values
nodata (float) – missing data value.
minRecords (int) – minimum number of valid wind speed values required to fit distribution.
yrsPerSim (int) – Taken from the config file
Empirical return period wind speeds (and any fit parameters) for each grid cell in the simulation domain
- Returns
Rp – numpy.ndarray of return period wind speed values
loc – numpy.ndarray of location parameters in the domain of Vr
scale – numpy.ndarray of scale parameters in the domain of Vr
shp – numpy.ndarray of shape parameters in the domain of Vr
-
calculateGEV
(Vr, years, nodata, minRecords, yrsPerSim) Fit a GEV to the wind speed records for a 2-D extent of wind speed values
- Parameters
Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon) block maxima processed with aggregateWindFields
years – numpy.ndarray of years for which to evaluate return period values
nodata (float) – missing data value.
minRecords (int) – minimum number of valid wind speed values required to fit distribution.
yrsPerSim (int) – Taken from the config file
GEV fit parameters and return period wind speeds for each grid cell in the simulation domain
- Returns
Rp – numpy.ndarray of return period wind speed values
loc – numpy.ndarray of location parameters in the domain of Vr
scale – numpy.ndarray of scale parameters in the domain of Vr
shp – numpy.ndarray of shape parameters in the domain of Vr
-
calculateGPD
(Vr, years, numsim, nodata, minRecords, yrsPerSim) Fit a GPD to the wind speed records for a 2-D extent of wind speed values
- Parameters
Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon)
years – numpy.ndarray of years for which to evaluate return period values
numsim (int) – number of simulations created.
nodata (float) – missing data value.
minRecords (int) – minimum number of valid wind speed values required to fit distribution.
yrsPerSim (int) – Taken from the config file
GPD fit parameters and return period wind speeds for each grid cell in the simulation domain
- Returns
Rp – numpy.ndarray of return period wind speed values
loc – numpy.ndarray of location parameters in the domain of Vr
scale – numpy.ndarray of scale parameters in the domain of Vr
shp – numpy.ndarray of shape parameters in the domain of Vr
-
calculatePower
(Vr, years, numsim, nodata, minRecords, yrsPerSim) Fit a modified power law to the wind speed records for a 2-D extent of wind speed values
- Parameters
Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon)
years – numpy.ndarray of years for which to evaluate return period values
numsim (int) – number of simulations created.
nodata (float) – missing data value.
minRecords (int) – minimum number of valid wind speed values required to fit distribution.
yrsPerSim (int) – Taken from the config file
Power-law fit coefficients and return period wind speeds for each grid cell in the simulation domain
- Returns
Rp – numpy.ndarray of return period wind speed values
loc – numpy.ndarray of location parameters in the domain of Vr
scale – numpy.ndarray of scale parameters in the domain of Vr
shp – numpy.ndarray of shape parameters in the domain of Vr
-
getTileLimits
(tilegrid, tilenums) Generate a list of tuples of the x- and y-limits of a tile
- Parameters
tilegrid –
TileGrid
instance
tilenums – list of tile numbers (must be sequential)
- Returns
list of tuples of tile limits
-
getTiles
(tilegrid) Helper to obtain a generator that yields tile numbers
- Parameters
tilegrid –
TileGrid
instance
-
loadFile
(filename, limits) Load a subset of the data from the given file, with the extent of the subset specified in the limits tuple
- Parameters
filename (str) – full path to file to load.
limits (tuple) – tuple of index limits of a tile.
- Returns
2-D numpy.ndarray of wind speed values.
-
loadFilesFromPath
(inputPath, tilelimits) Load wind field data for each subset into a 3-D array.
- Parameters
inputPath (str) – path to wind field files.
tilelimits (tuple) – tuple of index limits of a tile.
- Returns
3-D numpy.ndarray of wind field records.
-
run
(configFile, callback=None) Run the hazard calculations.
This will attempt to run the calculation in parallel by tiling the domain, but also provides a sane fallback mechanism to execute in serial.
- Parameters
configFile (str) – path to configuration file
-
setDomain
(inputPath) Establish the full extent of input wind field files
- Parameters
inputPath (str) – path of folder containing wind field files
- Returns
Longitudes and latitudes of the wind field grid.
- Return type
numpy.ndarray