Calculating tide statistics and satellite biases¶
This guide demonstrates how to use the tide_stats
, pixel_stats
and tide_aliasing
functions from eo_tides.stats
to calculate local tide statistics and identify biases caused by interactions between tidal processes and satellite orbits.
Complex tide aliasing interactions between temporal tide dynamics and the regular overpass timing of sun-synchronous satellite sensors mean that satellites often do not always observe the entire tidal cycle. Biases in satellite coverage of the tidal cycle can mean that tidal extremes (e.g. the lowest or highest tides at a location) or particular tidal processes may either never be captured by satellites, or be over-represented in the satellite record. Local tide dynamics can cause these biases to vary greatly both through time and space, making it challenging to compare coastal processes consistently - particularly for large-scale coastal EO analyses.
To ensure that coastal EO analyses are not inadvertently affected by tide biases, it is important to compare how well the tides observed by satellites match the full range of tides at a location.
The tidal_stats
and pixel_stats
functions compares the subset of tides observed by satellite data against the full range of tides modelled at a regular interval through time (every two hours by default) across the entire time period covered by the satellite dataset.
This comparison is used to calculate several useful statistics and plots that summarise how well your satellite data capture real-world tidal conditions.
The tide_aliasing
function can be used to identify potential temporal tide biases, and ensure EO analyses include enough satellite data to adequately sample the entire tide cycle.
Figure 1: Example of tide biases in Landsat satellite data across coastal Australia (Bishop-Taylor et al. 2018). "Spread" represents the proportion of the astronomical tide range observed by satellites; low and high tide "offsets" represent the proportion of highest and lowest tides never observed.

Tip
For a more detailed discussion of the issue of tidal bias in sun-synchronous satellite observations of the coastline, refer to the 'Limitations and future work' section in Bishop-Taylor et al. 2018.
Getting started¶
As in the previous examples, our first step is to tell eo-tides
the location of our tide model directory (if you haven't set this up, refer to the setup instructions here).
We can also define the tide model we want to use the for analysis; for example, the default model "EOT20".
directory = "../../tests/data/tide_models/"
model = "EOT20"
Load Sentinel-2 satellite data using odc-stac¶
We can now load a time-series of satellite data over our area of interest using the Open Data Cube's odc-stac
package.
In this example, we will load Sentinel-2 satellite data from 2021-2023 over the city of Broome, Western Australia.
We will load this data from the Microsoft Planetary Computer STAC catalogue.
Tip
For a more detailed guide to using STAC metadata and odc-stac
to find and load satellite data, refer to the odc-stac Python package documentation. eo-tides
is compatible with satellite data loaded from any STAC API, for example Digital Earth Australia.
import odc.stac
import planetary_computer
import pystac_client
# Connect to STAC catalog
catalog = pystac_client.Client.open(
"https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
# Set cloud access defaults
odc.stac.configure_rio(
cloud_defaults=True,
aws={"aws_unsigned": True},
)
# Build a query and search the STAC catalog for all matching items
bbox = [122.160, -18.05, 122.260, -17.95]
query = catalog.search(
bbox=bbox,
collections=["sentinel-2-l2a"],
datetime="2021/2023",
query={"eo:cloud_cover": {"lt": 10}}, # Filter to images with <10% cloud
)
# Load data into xarray format
ds_s2 = odc.stac.load(
items=list(query.items()),
bands=["red"],
crs="utm",
resolution=100,
groupby="solar_day",
bbox=bbox,
fail_on_error=False,
chunks={},
)
# Inspect output
ds_s2
<xarray.Dataset> Size: 6MB Dimensions: (y: 112, x: 107, time: 117) Coordinates: * y (y) float64 896B 8.015e+06 8.015e+06 ... 8.004e+06 8.004e+06 * x (x) float64 856B 4.11e+05 4.112e+05 ... 4.216e+05 4.216e+05 spatial_ref int32 4B 32751 * time (time) datetime64[ns] 936B 2021-01-03T02:04:51.024000 ... 20... Data variables: red (time, y, x) float32 6MB dask.array<chunksize=(1, 112, 107), meta=np.ndarray>
Using tide_stats¶
Once we have loaded some satellite data, we can pass this to the tide_stats
function to calculate local tide statistics and reveal any potential tidal biases.
The tide_stats
function will return a plain-text summary below, as well as a visual plot that compares the distribution of satellite-observed tides (black dots) against the full range of modelled astronomical tide conditions (blue) using three useful metrics:
- Spread: The proportion of the full modelled astronomical tidal range that was observed by satellites. A high value indicating good coverage of the tide range.
- High tide offset: The proportion of the highest tides not observed by satellites at any time, as a proportion of the full modelled astronomical tidal range. A high value indicates that the satellite data is biased towards never capturing high tides.
- Low tide offset: The proportion of the lowest tides not observed by satellites at any time, as a proportion of the full modelled astronomical tidal range. A high value indicates that the satellite data is biased towards never capturing low tides.
Tip
For a more detailed description of these biases, see Bishop-Taylor et al. 2018.
from eo_tides.stats import tide_stats
statistics_df = tide_stats(
data=ds_s2,
model=model,
directory=directory,
)
Using tide modelling location: 122.21, -18.00 Modelling tides with EOT20 Using tide modelling location: 122.21, -18.00 Modelling tides with EOT20 🌊 Modelled astronomical tide range: 9.36 m (-4.64 to 4.73 m). 🛰️ Observed tide range: 6.31 m (-2.36 to 3.95 m). 🔴 67% of the modelled astronomical tide range was observed at this location. 🟢 The highest 8% (0.78 m) of the tide range was never observed. 🔴 The lowest 24% (2.27 m) of the tide range was never observed. 🌊 Mean modelled astronomical tide height: -0.00 m. 🛰️ Mean observed tide height: 0.95 m. ⬆️ The mean observed tide height was 0.95 m higher than the mean modelled astronomical tide height.
As we can see in the graph, Sentinel-2 captured a biased proportion of the tide range at this location: only observing ~68% of the tide range, and never observing the lowest 24% of tides.
The tide_stats
function also outputs a pandas.Series
object containing statistics for the results above, including:
mot
: mean tide height observed by the satellite (metres)mat
: mean modelled astronomical tide height (metres)hot
: maximum tide height observed by the satellite (metres)hat
: maximum tide height from modelled astronomical tidal range (metres)lot
: minimum tide height observed by the satellite (metres)lat
: minimum tide height from modelled astronomical tidal range (metres)otr
: tidal range observed by the satellite (metres)tr
: modelled astronomical tide range (metres)spread
: proportion of the full modelled tidal range observed by the satelliteoffset_low
: proportion of the lowest tides never observed by the satelliteoffset_high
: proportion of the highest tides never observed by the satellitey
: latitude used for modelling tide heightsx
: longitude used for modelling tide heights
statistics_df
mot 0.951000 mat -0.000000 hot 3.950000 hat 4.728000 lot -2.361000 lat -4.636000 otr 6.311000 tr 9.364000 spread 0.674000 offset_low 0.243000 offset_high 0.083000 x 122.209999 y -18.000000 dtype: float32
Compare against Sentinel-1 tide biases¶
In the previous example, we saw that Sentinel-2 data was biased towards never capturing low tide at our location. These biases are caused by the consistent 10:30 am local overpass time of the Sentinel-2 satellites, which due to a phenomenon called "tide-aliasing" means that certain tides never occur when the satellite overpasses.
One possible way around these biases is to use different satellite data from satellites that overpass at different times. For example, Sentinel-1 radar satellites follow a different orbit to Sentinel-2, overpassing at a local time of 6:00 pm instead of 10:30 am. This diference in overpass time potentially means that Sentinel-1 satellite data may capture different tides to Sentinel-2.
In the next example, we run the tide_stats
function on data loaded from Sentinel-1 for the same location and time period to see if this data is affected by the same tide biases.
# Build a query and search the STAC catalog for all matching items
bbox = [122.160, -18.05, 122.260, -17.95]
query = catalog.search(
bbox=bbox,
collections=["sentinel-1-rtc"],
datetime="2021/2023",
)
# Load data into xarray format
ds_s1 = odc.stac.load(
items=list(query.items()),
bands=["vv"],
crs="utm",
resolution=100,
groupby="solar_day",
bbox=bbox,
fail_on_error=False,
chunks={},
)
# Inspect output
ds_s1
<xarray.Dataset> Size: 4MB Dimensions: (y: 112, x: 107, time: 89) Coordinates: * y (y) float64 896B 8.015e+06 8.015e+06 ... 8.004e+06 8.004e+06 * x (x) float64 856B 4.11e+05 4.112e+05 ... 4.216e+05 4.216e+05 spatial_ref int32 4B 32751 * time (time) datetime64[ns] 712B 2021-01-07T21:23:20.996123 ... 20... Data variables: vv (time, y, x) float32 4MB dask.array<chunksize=(1, 112, 107), meta=np.ndarray>
When we run tide_stats
, we can see a very different pattern: Sentinel-1 data is biased towards low tide observations, and never observes high tide at our location!
statistics_df = tide_stats(
data=ds_s1,
model=model,
directory=directory,
)
Using tide modelling location: 122.21, -18.00 Modelling tides with EOT20 Using tide modelling location: 122.21, -18.00 Modelling tides with EOT20 🌊 Modelled astronomical tide range: 9.53 m (-4.81 to 4.73 m). 🛰️ Observed tide range: 6.40 m (-4.38 to 2.02 m). 🔴 67% of the modelled astronomical tide range was observed at this location. 🔴 The highest 28% (2.70 m) of the tide range was never observed. 🟢 The lowest 4% (0.42 m) of the tide range was never observed. 🌊 Mean modelled astronomical tide height: -0.00 m. 🛰️ Mean observed tide height: -1.31 m. ⬇️ The mean observed tide height was -1.31 m lower than the mean modelled astronomical tide height.
Calculate biases for multiple satellite sensors¶
At our location, Sentinel-2 optical satellites are biased towards high tide observations, while Sentinel-1 radar satellites are biased towards low tide observations. Could combining data from multiple EO sensors help us capture a more complete view of tides at this location?
To test this theory, we can combine Sentinel-2 and Sentinel-1 data into a single xarray.Dataset
, recording the name of each sensor using a new satellite_name
coordinate in our data:
import xarray as xr
# Give each observation a "satellite_name" based on its satellite
ds_s1 = ds_s1.assign_coords(satellite_name=("time", ["Sentinel-1"] * ds_s1.sizes["time"]))
ds_s2 = ds_s2.assign_coords(satellite_name=("time", ["Sentinel-2"] * ds_s2.sizes["time"]))
# Combine both Sentinel-1 and Sentinel-2 data into a single dataset
ds_all = xr.concat([ds_s1, ds_s2], dim="time")
ds_all
<xarray.Dataset> Size: 20MB Dimensions: (time: 206, y: 112, x: 107) Coordinates: * y (y) float64 896B 8.015e+06 8.015e+06 ... 8.004e+06 8.004e+06 * x (x) float64 856B 4.11e+05 4.112e+05 ... 4.216e+05 4.216e+05 spatial_ref int32 4B 32751 * time (time) datetime64[ns] 2kB 2021-01-07T21:23:20.996123 ... ... satellite_name (time) <U10 8kB 'Sentinel-1' 'Sentinel-1' ... 'Sentinel-2' Data variables: vv (time, y, x) float32 10MB dask.array<chunksize=(1, 112, 107), meta=np.ndarray> red (time, y, x) float32 10MB dask.array<chunksize=(1, 112, 107), meta=np.ndarray>
We can now run tide_stats
again.
This time, we pass our satellite name coordinate to the function using the plot_var="satellite_name"
parameter.
This will plot data from each of our satellites using a different symbol and colour.
statistics_df = tide_stats(
data=ds_all,
plot_var="satellite_name",
model=model,
directory=directory,
)
Using tide modelling location: 122.21, -18.00 Modelling tides with EOT20 Using tide modelling location: 122.21, -18.00 Modelling tides with EOT20 🌊 Modelled astronomical tide range: 9.36 m (-4.64 to 4.73 m). 🛰️ Observed tide range: 8.33 m (-4.38 to 3.95 m). 🟡 89% of the modelled astronomical tide range was observed at this location. 🟢 The highest 8% (0.78 m) of the tide range was never observed. 🟢 The lowest 3% (0.26 m) of the tide range was never observed. 🌊 Mean modelled astronomical tide height: -0.00 m. 🛰️ Mean observed tide height: -0.02 m. ⬇️ The mean observed tide height was -0.02 m lower than the mean modelled astronomical tide height.
We can see that at this location, combining Sentinel-2 and Sentinel-1 data greatly improves our biases: our satellite data now covers ~89% of the modelled astronomical tide range, and only fails to observe 8% of the highest tides and 2% of the lowest tides!
Using pixel_stats¶
Modelling tide statistics and biases spatially¶
Because tide regimes and satellite biases can vary greatly along the coast, it can be useful to plot these biases spatially.
To do this, we can use the pixel_stats
function.
pixel_stats
works similarly to tide_stats
, except that statistics are calculated across the entire extent of your satellite dataset.
The function will generate an xarray.Dataset
output containing the statistics discussed above as two-dimensional arrays.
Tip
The pixel_stats
function uses eo_tides.eo.pixel_tides
to model tides spatially. You can experiment passing in parameters like resolution
and buffer
to customise the modelling grid used for calculating tide biases. Be warned however that you can quickly run out of memory with large analyses, given the number of timesteps required to model astronomical low and high tide.
from eo_tides.stats import pixel_stats
stats_ds = pixel_stats(
data=ds_s2,
directory=directory,
)
stats_ds
Creating reduced resolution 5000 x 5000 metre tide modelling array Modelling tides with EOT20 Returning low resolution tide array Creating reduced resolution 5000 x 5000 metre tide modelling array Modelling tides with EOT20 Returning low resolution tide array
/workspaces/eo-tides/.venv/lib/python3.12/site-packages/numpy/lib/_nanfunctions_impl.py:1617: RuntimeWarning: All-NaN slice encountered return fnb._ureduce(a,
Reprojecting statistics into original resolution
<xarray.Dataset> Size: 529kB Dimensions: (y: 112, x: 107) Coordinates: tide_model <U5 20B 'EOT20' * y (y) float64 896B 8.015e+06 8.015e+06 ... 8.004e+06 8.004e+06 * x (x) float64 856B 4.11e+05 4.112e+05 ... 4.216e+05 4.216e+05 spatial_ref int32 4B 32751 Data variables: mot (y, x) float32 48kB 0.9604 0.9604 0.9604 ... 0.9451 0.945 0.945 mat (y, x) float32 48kB -0.0001393 -0.0001393 ... -0.0001331 hot (y, x) float32 48kB 3.894 3.894 3.894 ... 3.967 3.968 3.968 hat (y, x) float32 48kB 4.526 4.526 4.526 ... 4.803 4.804 4.805 lot (y, x) float32 48kB -2.242 -2.242 -2.242 ... -2.407 -2.408 lat (y, x) float32 48kB -4.457 -4.457 -4.457 ... -4.702 -4.702 otr (y, x) float32 48kB 6.136 6.136 6.136 ... 6.374 6.375 6.375 tr (y, x) float32 48kB 8.983 8.983 8.983 ... 9.504 9.506 9.508 spread (y, x) float32 48kB 0.6831 0.6831 0.6831 ... 0.6706 0.6706 offset_low (y, x) float32 48kB 0.2466 0.2466 0.2466 ... 0.2414 0.2414 offset_high (y, x) float32 48kB 0.07026 0.07026 0.07026 ... 0.08802 0.08808
We can plot these statistics to inspect any spatial tide bias patterns:
import matplotlib.pyplot as plt
stats_ds.spread.plot.imshow()
plt.gca().set_title("Spread (%)");
stats_ds.offset_low.plot.imshow()
plt.gca().set_title("Low tide offset (%)");
stats_ds.offset_high.plot.imshow()
plt.gca().set_title("High tide offset (%)");
Tide aliasing for satellite sensors¶
The biases above are caused by interactions between satellite overpasses and the natural cycles of tidal constituents called tide aliasing. Tide aliasing occurs because EO satellites typically capture imagery of the coastline at a frequency that does not align with the frequencies of the many harmonic tidal constituents that produce tides.
The aliasing period of a tidal constituent describes how long it would take for a satellite to sample the entire tidal cycle for each constituent, based on the satellite's observation frequency.
We can use the eo_tides.stats.tide_aliasing
function to estimate aliasing periods for tide constituents, based on the overpass frequency of common EO satellites:
- Short aliasing periods mean the satellite will observe the full range of tidal variation relatively quickly, reducing the risk of tide-related bias.
- Long aliasing periods indicate that it will take much longer to sample all tidal conditions, increasing the risk that satellite analyses may misrepresent tidal dynamics.
Note
Revisit periods are approximate and based on nominal repeat cycles at the equator. Actual observation frequency may vary due to latitude, cloud cover, sensor availability, and acquisition planning.
from eo_tides.stats import tide_aliasing
tide_aliasing(satellites=["landsat", "sentinel-2"], style=False)
Using 8 day revisit for landsat Using 5 day revisit for sentinel-2
name | type | period | aliasing_period | ||
---|---|---|---|---|---|
landsat | sentinel-2 | ||||
m2 | principal lunar semidiurnal | semidiurnal | 0.518 | 17.460 | 14.765 |
s2 | principal solar semidiurnal | semidiurnal | 0.500 | inf | inf |
k1 | lunar-solar diurnal | diurnal | 0.997 | 365.242 | 365.242 |
o1 | principal lunar diurnal | diurnal | 1.076 | 18.337 | 14.192 |
n2 | larger lunar elliptic semidiurnal | semidiurnal | 0.527 | 47.660 | 10.419 |
p1 | principal solar diurnal | diurnal | 1.003 | 365.242 | 365.242 |
k2 | lunar-solar semidiurnal | semidiurnal | 0.499 | 182.621 | 182.621 |
q1 | larger lunar elliptic diurnal | diurnal | 1.120 | 54.812 | 10.725 |
2n2 | lunar elliptic semidiurnal second order | semidiurnal | 0.538 | 65.318 | 16.753 |
mu2 | variational constituent | semidiurnal | 0.536 | 95.668 | 15.493 |
nu2 | larger lunar evectional | semidiurnal | 0.526 | 38.701 | 10.085 |
l2 | smaller lunar elliptic semidiurnal | semidiurnal | 0.508 | 31.812 | 31.812 |
t2 | larger solar elliptic | semidiurnal | 0.501 | 365.242 | 365.242 |
j1 | smaller lunar elliptic diurnal | diurnal | 0.962 | 25.622 | 25.622 |
m1 | smaller lunar elliptic diurnal | diurnal | 1.035 | 29.531 | 29.531 |
oo1 | lunar diurnal | diurnal | 0.929 | 20.383 | 13.168 |
rho1 | larger lunar evectional diurnal | diurnal | 1.113 | 43.288 | 10.194 |
mf | lunisolar fortnightly | long period | 13.661 | 19.306 | 13.661 |
mm | lunar monthly | long period | 27.555 | 27.555 | 27.555 |
ssa | solar semiannual | long period | 182.627 | 182.621 | 182.621 |
m4 | shallow water overtides of principal lunar | shallow water | 0.259 | 95.668 | 15.493 |
ms4 | shallow water quarter diurnal | shallow water | 0.254 | 17.460 | 14.765 |
mn4 | shallow water quarter diurnal | shallow water | 0.261 | 21.393 | 35.391 |
m6 | shallow water overtides of principal lunar | shallow water | 0.173 | 21.358 | 314.549 |
m8 | shallow water eighth diurnal | shallow water | 0.129 | 47.834 | 14.103 |
mk3 | shallow water terdiurnal | shallow water | 0.341 | 16.663 | 15.387 |
s6 | shallow water overtides of principal solar | shallow water | 0.167 | inf | inf |
2sm2 | shallow water semidiurnal | semidiurnal | 0.484 | 17.460 | 14.765 |
2mk3 | shallow water terdiurnal | shallow water | 0.349 | 75.811 | 16.179 |
msf | lunisolar synodic fortnightly | long period | 14.765 | 17.460 | 14.765 |
sa | solar annual | long period | 365.259 | 365.242 | 365.242 |
mt | meteorological tides | long period | 9.133 | 64.491 | 11.049 |
2q1 | lunar elliptic diurnal second order | diurnal | 1.167 | 55.409 | 17.559 |
In the examples above, we can see several interesting results:
- Constituents like
m2
ando1
can be observed effectively by EO satellites, requiring only 14-18 days of data to capture the full tide cycle. - Constituents like
sa
,k1
,p1
require a full year of data to be observed effectively, meaning shorter EO analysis will capture a biased portion of the tide cycle. - Constituents like
s2
have an infinite aliasing period. This means that no matter how long a satellite observes the coast, it will never capture the entire tidal cycle for this consituent.
The tide_aliasing
function can be customised to return results in different time units (e.g. years), or specific tide constituents only:
tide_aliasing(
satellites=["landsat", "sentinel-2"],
c=["m2", "s2", "k1", "o1", "p1", "sa"],
units="years",
style=False,
)
Using 8 day revisit for landsat Using 5 day revisit for sentinel-2
name | type | period | aliasing_period | ||
---|---|---|---|---|---|
landsat | sentinel-2 | ||||
m2 | principal lunar semidiurnal | semidiurnal | 0.001 | 0.048 | 0.040 |
s2 | principal solar semidiurnal | semidiurnal | 0.001 | inf | inf |
k1 | lunar-solar diurnal | diurnal | 0.003 | 1.000 | 1.000 |
o1 | principal lunar diurnal | diurnal | 0.003 | 0.050 | 0.039 |
p1 | principal solar diurnal | diurnal | 0.003 | 1.000 | 1.000 |
sa | solar annual | long period | 1.000 | 1.000 | 1.000 |
To run the function for satellites that are not supported by default, provide a dictionary to satellites
with the overpass frequency in days:
tide_aliasing(
satellites={"custom-sat": 3},
c=["m2", "s2", "k1", "o1", "p1", "sa"],
style=False,
)
Using 3 day revisit for custom-sat
name | type | period | aliasing_period | |
---|---|---|---|---|
custom-sat | ||||
m2 | principal lunar semidiurnal | semidiurnal | 0.518 | 14.765 |
s2 | principal solar semidiurnal | semidiurnal | 0.500 | inf |
k1 | lunar-solar diurnal | diurnal | 0.997 | 365.242 |
o1 | principal lunar diurnal | diurnal | 1.076 | 14.192 |
p1 | principal solar diurnal | diurnal | 1.003 | 365.242 |
sa | solar annual | long period | 365.259 | 365.242 |
Next steps¶
We have explored calculating tide statistics and biases in EO data. Now we can learn how to validate modelled tides against measured tide gauge data to ensure the tides we are modelling are accurate.