hazard package

Submodules

hazard.evd module

evd – Calculate extreme value distributions

Calculate the parameters of the GEV distribution using the method of L-moments. In addition to fitting the distribution parameters, the function calculates return levels for the specified return periods.

References: Hosking, J. R. M., 1990: L-moments: Analysis and Estimation of Distributions using Linear Combinations of Order Statistics. Journal of the Royal Statistical Society, Series B, 52(1), 105-124.

class EMPDistribution(intervals, numsim, nodata, minrecords)

Bases: hazard.evd.ExtremeValueDistribution

Empirical average recurrence intervals

Note

Empirical ARI values cannot be calculated for ARIs greater than the number of simulated years. If using this distribution, the configuration option Years in the Hazard section must not contain values greater than the number of simulations (TrackGenerator – NumSimulation).
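
The limit in the note follows from how empirical ARIs are assigned to ranked values. The sketch below is illustrative only and is not the package's implementation: the helper name `empirical_ari` and the plotting position `n / rank` are assumptions, and the input is taken to hold one maximum per simulated year.

```python
def empirical_ari(annual_maxima):
    """Pair each value with an empirical ARI (years), largest value first.

    Assumes one value per simulated year and the simple plotting
    position ARI = n / rank; both are illustrative choices.
    """
    n = len(annual_maxima)
    ranked = sorted(annual_maxima, reverse=True)
    return [(n / rank, value) for rank, value in enumerate(ranked, start=1)]


pairs = empirical_ari([31.0, 45.5, 28.2, 39.9, 52.3])
longest_ari = pairs[0][0]  # ARI attached to the largest value
```

With five simulated years the longest ARI in this sketch is five years, so a request for a 10-year value has no data to support it.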

fit(data)

Calculate the average recurrence interval values, given a set of data values

Parameters

data – numpy.ndarray of data values to use to calculate return levels

Returns

return levels at given intervals, along with parameters for the distribution (if calculated)

class ExtremeValueDistribution(intervals, numsim, nodata, minrecords)

Bases: object

Abstract extreme value distribution model

calculate(data, *args, **kwargs)
fit(data, *args, **kwargs)

Calculate the average recurrence interval values, given a set of data values

Parameters

data – numpy.ndarray of data values to use to calculate return levels

Returns

return levels at given intervals, along with parameters for the distribution (if calculated)

class GEVDistribution(intervals, numsim, nodata, minrecords)

Bases: hazard.evd.ExtremeValueDistribution

Generalised extreme value distribution, used for block maxima

fit(data)

Calculate the average recurrence interval values, given a set of data values

Parameters

data – numpy.ndarray of data values to use to calculate return levels

Returns

return levels at given intervals, along with parameters for the distribution (if calculated)

class GPDDistribution(intervals, numsim, nodata, minrecords, threshold=99.5)

Bases: hazard.evd.ExtremeValueDistribution

Generalised Pareto Distribution, fitted using a peaks-over-threshold approach. The threshold is an arbitrary percentile of the data values (default = 99.5).

fit(data)

Calculate the average recurrence interval values, given a set of data values

Parameters

data – numpy.ndarray of data values to use to calculate return levels

Returns

return levels at given intervals, along with parameters for the distribution (if calculated)

class POWERDistribution(intervals, numsim, nodata, minrecords)

Bases: hazard.evd.ExtremeValueDistribution

Fit a function of the form w = A - B * ARI^C to the empirical average recurrence intervals (ARI), where A > 0, 0 < B < A and C < 0.

fit(data)

Calculate the average recurrence interval values, given a set of data values

Parameters

data – numpy.ndarray of data values to use to calculate return levels

Returns

return levels at given intervals, along with parameters for the distribution (if calculated)

allSubclasses(cls)

Recursively find all subclasses of a given class.

empfit(data, intervals, numsim, nodata=-9999.0, minrecords=50)

Calculate empirical ARI values for a collection of wind speed records.

Parameters
  • data – numpy.ndarray of data values

  • intervals – numpy.ndarray of years for which to calculate return period values. The values will be determined empirically, then interpolated to these intervals.

  • numsim (int) – number of simulations created.

  • nodata (float) – value to insert if fit does not converge.

  • minrecords (int) – minimum number of valid observations required to perform fitting.

Returns
  • Rpeval – numpy.array of return period wind speed values

  • location – location parameter

  • scale – scale parameter

  • shape – shape parameter

evargs(name)
evfunc(name)
gevfit(data, intervals, nodata=-9999.0, minrecords=50, yrspersim=1)

Calculate extreme value distribution parameters using the Lmoments module. Return period values are not calculated if the shape parameter is negative or zero.

Parameters
  • data (numpy.ndarray) – array of data values. Values represent max events for each year of simulation at a single grid box

  • intervals (numpy.ndarray) – array of years for which to calculate return period values.

  • nodata (float) – value to insert if fit does not converge.

  • minrecords (int) – minimum number of valid observations required to perform fitting.

  • yrspersim (int) – data represent block maxima - this gives the length of each block in years.

Returns
  • w – numpy.array of return period wind speed values

  • loc – location parameter

  • scale – scale parameter

  • shp – shape parameter
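
As an illustration of how the returned parameters relate to the return period values, the sketch below evaluates a GEV quantile for a given interval. It assumes Hosking's parameterisation of the GEV (consistent with the L-moments reference, but not confirmed here as the package's exact code), and the helper name `gev_return_level` is hypothetical. Only a positive shape parameter is considered, since return period values are not calculated otherwise.

```python
import math


def gev_return_level(interval, loc, scale, shp, yrspersim=1):
    """Return level for one recurrence interval (years).

    Assumes Hosking's form of the GEV quantile function,
    x(F) = loc + (scale / shp) * (1 - (-ln F)**shp), with shp != 0,
    evaluated at the non-exceedance probability F = 1 - yrspersim / interval.
    """
    prob = 1.0 - yrspersim / interval
    return loc + (scale / shp) * (1.0 - (-math.log(prob)) ** shp)


# Hypothetical parameter values, chosen only to illustrate the behaviour.
levels = [gev_return_level(t, loc=30.0, scale=5.0, shp=0.1) for t in (10, 100, 1000)]
upper_bound = 30.0 + 5.0 / 0.1  # finite upper limit when shp > 0 in this convention
```

In this convention a positive shape gives return levels that increase with the interval but never exceed `loc + scale / shp`.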

gpdReturnLevel(intervals, mu, shape, scale, rate, npyr=365.25)

Calculate return levels for specified intervals for a distribution with the given threshold, scale and shape parameters.

Parameters
  • intervals – numpy.ndarray or float of recurrence intervals to evaluate return levels for.

  • mu (float) – Threshold parameter (also called location).

  • shape (float) – Shape parameter.

  • scale (float) – Scale parameter.

  • rate (float) – Rate of exceedances (i.e. number of observations greater than mu, divided by total number of observations).

  • npyr (float) – Number of observations per year.

Returns

return levels for the specified recurrence intervals.
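
A minimal sketch of this calculation for a single interval is shown below. It assumes the usual peaks-over-threshold return level expression; the source does not state the formula, so treat both the expression and the helper name `gpd_return_level` as assumptions.

```python
import math


def gpd_return_level(interval, mu, shape, scale, rate, npyr=365.25):
    """Return level for one recurrence interval (years).

    Assumes the usual peaks-over-threshold expression
    mu + (scale / shape) * ((interval * npyr * rate)**shape - 1),
    with the shape -> 0 limit mu + scale * ln(interval * npyr * rate).
    """
    m = interval * npyr * rate  # expected number of exceedances in the interval
    if shape == 0.0:
        return mu + scale * math.log(m)
    return mu + (scale / shape) * (m ** shape - 1.0)


# Hypothetical parameter values for illustration.
level_10 = gpd_return_level(10.0, mu=40.0, shape=0.1, scale=5.0, rate=0.005)
level_100 = gpd_return_level(100.0, mu=40.0, shape=0.1, scale=5.0, rate=0.005)
```

When the interval is short enough that only one exceedance is expected (`interval * npyr * rate == 1`), the expression reduces to the threshold `mu`.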

gpdfit(data, intervals, numsim, nodata=-9999, minrecords=50, threshold=99.5)

Fit a Generalised Pareto Distribution to the data. For a quick evaluation, we use the 99.5th percentile as a threshold.

Parameters
  • data (numpy.ndarray) – array of data values.

  • intervals (numpy.ndarray) – array of years for which to calculate return period values.

  • numsim (int) – number of simulations created.

  • nodata (float) – value to insert if fit does not converge.

  • minrecords (int) – minimum number of valid observations required to perform fitting.

  • threshold (float) – Threshold for performing the fitting. Default is the 99.5th percentile

Returns
  • Rpeval – numpy.array of return period wind speed values

  • location – location parameter

  • scale – scale parameter

  • shape – shape parameter
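
The percentile threshold and the exceedance rate used in a peaks-over-threshold fit can be sketched as follows. This is a dependency-free illustration, not the package's code: the helper names `percentile` and `exceedances` are hypothetical, and linear interpolation between order statistics is an assumed percentile definition.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def exceedances(values, threshold_pct=99.5):
    """Return the threshold value, the values above it, and the exceedance rate."""
    mu = percentile(values, threshold_pct)
    above = [v for v in values if v > mu]
    return mu, above, len(above) / len(values)


data = [float(v) for v in range(1, 1001)]  # synthetic records: 1.0 .. 1000.0
mu, above, rate = exceedances(data)
```

The rate here matches the definition given for gpdReturnLevel: the number of observations greater than the threshold divided by the total number of observations.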

islocal(func)
powerfit(data, intervals, numsim, nodata=-9999.0, minrecords=50)

Fit a modified power law to empirical return period values

w(t; a, b, c) = a - b * t^c

where a > 0, 0 < b < a and c < 0.

This function calculates the empirical return periods for the given set of wind speed values, then fits the function to the return period wind speeds. In this sense, the returned location, scale and shape parameters do not refer to the parameters of any given distribution, but rather the coefficients of the fitted function.

Parameters
  • data – numpy.ndarray of data values

  • intervals – numpy.ndarray of years for which to calculate return period values.

  • numsim (int) – number of simulations created.

  • nodata (float) – value to insert if fit does not converge.

  • minrecords (int) – minimum number of valid observations required to perform fitting.

Returns
  • w – numpy.array of return period wind speed values

  • location – location parameter

  • scale – scale parameter

  • shape – shape parameter
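
To make the shape of the fitted function concrete, the sketch below evaluates w(t; a, b, c) = a - b * t^c for a few intervals. The coefficient values are hypothetical and the helper name `power_law` is not part of the package.

```python
def power_law(t, a, b, c):
    """Evaluate w(t; a, b, c) = a - b * t**c for a recurrence interval t (years)."""
    return a - b * t ** c


# Hypothetical coefficients satisfying a > 0, 0 < b < a and c < 0.
a, b, c = 70.0, 50.0, -0.5
speeds = [power_law(t, a, b, c) for t in (1, 4, 25, 100)]
```

Because c < 0, the term b * t^c shrinks as t grows, so the curve rises from a - b at t = 1 towards the upper limit a.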

Module contents

hazard – Hazard calculation

This module contains the core objects for the return period hazard calculation.

Hazard calculations can be run in parallel using MPI if the mpi4py library is found and TCRM is run using the mpirun command. For example, to run with 10 processors:

mpirun -n 10 python tcrm.py cairns.ini

hazard can be initialised and started by calling run() with the location of a configuration file:

import hazard
hazard.run('cairns.ini')

class HazardCalculator(configFile, tilegrid, numSim, minRecords, yrsPerSim, calcCI=False, evd='GEV')

Bases: object

Calculate return period wind speeds using GEV fitting

calculateHazard(tilelimits)

Load input hazard data and then calculate the return period and distribution parameters for a given tile. The extreme value distribution used in the calculation can be set in the config file. The default distribution is set to GEV.

Parameters

tilelimits – tuple of tile limits

Returns
  • Rp – numpy.ndarray of return period wind speed values for each lat/lon

  • loc – numpy.ndarray of location parameters for each lat/lon

  • scale – numpy.ndarray of scale parameters for each lat/lon

  • shp – numpy.ndarray of shape parameters for each lat/lon

  • RpUpper – Upper CI return period wind speed values for each lat/lon

  • RpLower – Lower CI return period wind speed values for each lat/lon

dumpHazardFromTiles(tiles, progressCallback=None)

Iterate over tiles to calculate return period hazard levels

Parameters

tiles – generator that yields tuples of tile dimensions.

saveHazard()

Save hazard data to a netCDF file.

class Tile(number, input_limits, output_limits)

Bases: object

Tile object.

Because there is a buffer region around the outer edge of the dict gridLimit, the indices where data is pulled from (the wind field files) are different from those where the data is stored (the output hazard array).

This object holds the index ranges for the input array and output array, to indicate this relationship.

class TileGrid(gridLimit, wf_lon, wf_lat, xstep=100, ystep=100)

Bases: object

Tiling to minimise MemoryErrors and enable parallelisation.

getDomainExtent()

Return the longitude and latitude values that lie within the modelled domain

Return lon

numpy.ndarray containing longitude values

Return lat

numpy.ndarray containing latitude values

getGridLimit(k)

Return the limits for tile k. x-indices correspond to the east-west coordinate, y-indices correspond to the north-south coordinate.

Parameters

k (int) – tile number

Return x1

minimum x-index for tile k

Return x2

maximum x-index for tile k

Return y1

minimum y-index for tile k

Return y2

maximum y-index for tile k

tileGrid()

Defines the indices required to subset a 2D array into smaller rectangular 2D arrays (of dimension xstep * ystep).
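
The idea of subsetting a grid into tiles can be sketched as below. This is a simplified stand-in rather than the class's actual logic: the function name `tile_limits`, the half-open index convention and the truncation of edge tiles are all assumptions, and the buffer region handled by Tile is ignored.

```python
def tile_limits(nx, ny, xstep=100, ystep=100):
    """Return (x1, x2, y1, y2) index limits covering an nx-by-ny grid.

    Assumes half-open limits (x2 and y2 exclusive), with edge tiles
    truncated at the grid boundary; the real class may differ.
    """
    limits = []
    for y1 in range(0, ny, ystep):
        for x1 in range(0, nx, xstep):
            limits.append((x1, min(x1 + xstep, nx), y1, min(y1 + ystep, ny)))
    return limits


tiles = tile_limits(250, 120)
```

Each tuple can then be handed to a separate worker, which is what makes the tiling useful for both limiting memory use and parallel execution.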

aggregateWindFields(inputPath, numSimulations, tilelimits)

Aggregate wind field data into annual maxima for use in fitting extreme value distributions.

Parameters
  • inputPath (str) – path to individual wind field files.

  • numSimulations (int) – Number of simulated years of activity.

  • tilelimits (tuple) – tuple of index limits of a tile.

calculateCI(Vr, years, nodata, minRecords, yrsPerSim=1, sample_size=50, prange=90)

Fit a GEV to the wind speed records for a 2-D extent of wind speed values, providing a confidence range by resampling at random from the input values.

Parameters
  • Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon)

  • years – numpy.ndarray of years for which to evaluate return period values.

  • nodata (float) – missing data value.

  • minRecords (int) – minimum number of valid wind speed values required to fit distribution.

  • yrsPerSim (int) – Values represent block maxima - this value indicates the time span of the block (default 1).

  • sample_size (int) – number of records to randomly sample for calculating confidence interval of the fit.

  • prange (float) – percentile range.

Returns
  • RpUpper – Upper CI return period wind speed values for each lat/lon

  • RpLower – Lower CI return period wind speed values for each lat/lon
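
The resampling idea behind the confidence range can be sketched for a single grid cell as follows. This is not the package's implementation: `max` stands in for the GEV fit, sampling is done with replacement, and the `nresamples` and `seed` parameters are hypothetical additions for the illustration.

```python
import random


def resampled_range(records, statistic, sample_size=50, prange=90,
                    nresamples=200, seed=0):
    """Lower and upper percentile bounds of a statistic under random resampling.

    nresamples and seed are hypothetical parameters added for this sketch.
    """
    rng = random.Random(seed)
    estimates = sorted(
        statistic([rng.choice(records) for _ in range(sample_size)])
        for _ in range(nresamples)
    )
    tail = (100.0 - prange) / 2.0  # e.g. 5 for a 90 per cent range
    lo_idx = int(round(tail / 100.0 * (nresamples - 1)))
    hi_idx = int(round((100.0 - tail) / 100.0 * (nresamples - 1)))
    return estimates[lo_idx], estimates[hi_idx]


records = [20.0 + 0.1 * i for i in range(300)]  # synthetic wind speeds
lower, upper = resampled_range(records, statistic=max)
```

With prange=90 the bounds are the 5th and 95th percentiles of the resampled estimates, playing the role of RpLower and RpUpper for that cell.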

calculateEMP(Vr, years, numsim, nodata, minRecords, yrsPerSim)

Calculate empirical return levels for the wind speed records for a 2-D extent of wind speed values

Parameters
  • Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon) block maxima processed with aggregateWindFields

  • years – numpy.ndarray of years for which to evaluate return period values

  • numsim (int) – number of simulations created.

  • nodata (float) – missing data value.

  • minRecords (int) – minimum number of valid wind speed values required to fit distribution.

  • yrsPerSim (int) – Taken from the config file

Return period wind speeds and distribution parameters for each grid cell in simulation domain

Returns
  • Rp – numpy.ndarray of return period wind speed values

  • loc – numpy.ndarray of location parameters in the domain of Vr

  • scale – numpy.ndarray of scale parameters in the domain of Vr

  • shp – numpy.ndarray of shape parameters in the domain of Vr

calculateGEV(Vr, years, nodata, minRecords, yrsPerSim)

Fit a GEV to the wind speed records for a 2-D extent of wind speed values

Parameters
  • Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon) block maxima processed with aggregateWindFields

  • years – numpy.ndarray of years for which to evaluate return period values

  • nodata (float) – missing data value.

  • minRecords (int) – minimum number of valid wind speed values required to fit distribution.

  • yrsPerSim (int) – Taken from the config file

GEV fit parameters and return period wind speeds for each grid cell in simulation domain

Returns
  • Rp – numpy.ndarray of return period wind speed values

  • loc – numpy.ndarray of location parameters in the domain of Vr

  • scale – numpy.ndarray of scale parameters in the domain of Vr

  • shp – numpy.ndarray of shape parameters in the domain of Vr

calculateGPD(Vr, years, numsim, nodata, minRecords, yrsPerSim)

Fit a GPD to the wind speed records for a 2-D extent of wind speed values

Parameters
  • Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon)

  • years – numpy.ndarray of years for which to evaluate return period values

  • numsim (int) – number of simulations created.

  • nodata (float) – missing data value.

  • minRecords (int) – minimum number of valid wind speed values required to fit distribution.

  • yrsPerSim (int) – Taken from the config file

GPD fit parameters and return period wind speeds for each grid cell in simulation domain

Returns
  • Rp – numpy.ndarray of return period wind speed values

  • loc – numpy.ndarray of location parameters in the domain of Vr

  • scale – numpy.ndarray of scale parameters in the domain of Vr

  • shp – numpy.ndarray of shape parameters in the domain of Vr

calculatePower(Vr, years, numsim, nodata, minRecords, yrsPerSim)

Fit a modified power law to the wind speed records for a 2-D extent of wind speed values

Parameters
  • Vr – numpy.ndarray of wind speeds (3-D - event, lat, lon)

  • years – numpy.ndarray of years for which to evaluate return period values

  • numsim (int) – number of simulations created.

  • nodata (float) – missing data value.

  • minRecords (int) – minimum number of valid wind speed values required to fit distribution.

  • yrsPerSim (int) – Taken from the config file

Fitted power law coefficients and return period wind speeds for each grid cell in simulation domain

Returns
  • Rp – numpy.ndarray of return period wind speed values

  • loc – numpy.ndarray of location parameters in the domain of Vr

  • scale – numpy.ndarray of scale parameters in the domain of Vr

  • shp – numpy.ndarray of shape parameters in the domain of Vr

getTileLimits(tilegrid, tilenums)

Generate a list of tuples of the x- and y- limits of a tile

Parameters
  • tilegrid – TileGrid instance

  • tilenums – list of tile numbers (must be sequential)

Returns

list of tuples of tile limits

getTiles(tilegrid)

Helper to obtain a generator that yields tile numbers

Parameters

tilegrid – TileGrid instance

loadFile(filename, limits)

Load a subset of the data from the given file, with the extent of the subset specified in the limits tuple

Parameters
  • filename (str) – full path to file to load.

  • limits (tuple) – tuple of index limits of a tile.

Returns

2-D numpy.ndarray of wind speed values.

loadFilesFromPath(inputPath, tilelimits)

Load wind field data for each subset into a 3-D array.

Parameters
  • inputPath (str) – path to wind field files.

  • tilelimits (tuple) – tuple of index limits of a tile.

Returns

3-D numpy.ndarray of wind field records.

run(configFile, callback=None)

Run the hazard calculations.

This will attempt to run the calculation in parallel by tiling the domain, but also provides a sane fallback mechanism to execute in serial.

Parameters

configFile (str) – path to configuration file

setDomain(inputPath)

Establish the full extent of input wind field files

Parameters

inputPath (str) – path of folder containing wind field files

Returns

Longitudes and latitudes of the wind field grid.

Return type

numpy.ndarray