Sliderule data I/O#

This notebook will highlight analyzing various coincident elevation measurements. We will find regions with and use slideruleearth.io service to retrieve ICESat-2 and GEDI point elevation measurements.

Note

Keep in mind, these measurements are from different sensor types, close in time, but not at exactly the same time. All measurements also have uncertainties, so we do not expect perfect agreement.

import coincident
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
import matplotlib.pyplot as plt
import numpy as np

# For testing OR if you have access to a private cluster set that here
# https://slideruleearth.io/web/rtd/developer_guide/articles/private_clusters.html#private-clusters
# sliderule.init(verbose=True, organization="uw", desired_nodes=5, bypass_dns=True, time_to_live=60)
/home/docs/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.5/src/coincident/io/download.py:27: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)
  from tqdm.autonotebook import tqdm
%matplotlib inline
# %config InlineBackend.figure_format = 'retina'

Identify a primary dataset#

Start by loading a full resolution polygon of a 3DEP LiDAR workunit which has a known start_datetime and end_datatime:

workunit = "CO_WestCentral_2019"
df_wesm = coincident.search.wesm.read_wesm_csv()
gf_lidar = coincident.search.wesm.load_by_fid(
    df_wesm[df_wesm.workunit == workunit].index
)
gf_lidar
workunit workunit_id project project_id start_datetime end_datetime ql spec p_method dem_gsd_meters ... seamless_category seamless_reason lpc_link sourcedem_link metadata_link geometry collection datetime dayofyear duration
0 CO_WestCentral_2019 175984 CO_WestCentral_2019_A19 175987 2019-08-21 2019-09-19 QL 2 USGS Lidar Base Specification 1.3 linear-mode lidar 1.0 ... Meets Meets 3DEP seamless DEM requirements https://rockyweb.usgs.gov/vdelivery/Datasets/S... https://prd-tnm.s3.amazonaws.com/index.html?pr... https://prd-tnm.s3.amazonaws.com/index.html?pr... MULTIPOLYGON (((-106.13571 38.4146, -106.1702 ... 3DEP 2019-09-04 12:00:00 247 29

1 rows × 33 columns

Search secondary datasets#

Provide a list that will be searched in order. The list contains tuples of dataset aliases and the temporal pad in days to search before the primary dataset start and end dates

secondary_datasets = [
    ("gedi", 40),  # +/- 40 days from lidar
    ("icesat-2", 60),  # +/- 60 days from lidar
]

gf_gedi, gf_is2 = coincident.search.cascading_search(
    gf_lidar,
    secondary_datasets,
    min_overlap_area=30,  # km^2
)

Get ICESat-2 ATL06 Data#

We’ve identified 7 granules of icesat-2 data to examine, but there is no need to work with the entire granule, which spans a huge geographic extent. Instead we’ll retrieve a subset of elevation values in the area of interest for each granule.

# Hone in on single granule as a pandas series
i = 0
granule_gdf = gf_is2.iloc[[i]]
granule = gf_is2.iloc[i]
granule_gdf.plot()  # needs to be a geodataframe
plt.title(granule.id);
# The icesat-2 track trends N-S, and is crossed by multiple GEDI tracks, resulting in the crosshatched appearance
../_images/d4738c7412f63c98e97f138a33f0fff7ce0d3613183e975ed29ecbeab36d6404.png
data_is2 = coincident.io.sliderule.subset_atl06(
    granule_gdf,
    include_worldcover=True,
)
# Plot this data and overlay footprint
fig, ax = plt.subplots(figsize=(8, 10))
plt.scatter(
    x=data_is2.geometry.x,
    y=data_is2.geometry.y,
    c=data_is2.h_li,
    s=10,
)
granule_gdf.dissolve().boundary.plot(ax=ax, color="magenta")
cb = plt.colorbar()
cb.set_label("elevation_hr (m)")
../_images/21baed5e71fd2738adaedbeb5472a67d97a208e5d84725d2ddaea73931ae643a.png

Note

sliderule gets data from the envelope of multipolygon, not only data within intersecting GEDI tracks. Along-track gaps are where there is missing data.

# Here, we specify aoi=granule_gdf to only return GEDI data
# in our icesat-2 track of interest
data_gedi = coincident.io.sliderule.subset_gedi02a(
    gf_gedi,
    aoi=granule_gdf,
    include_worldcover=True,
)
data_gedi
orbit track elevation_lm solar_elevation shot_number flags worldcover.time_ns worldcover.fileid elevation_hr worldcover.value beam sensitivity srcid geometry
time_ns
2019-08-13 00:26:32.668279296 3775 3020 1680.471191 18.899099 37750600300666419 130 [2021-06-30T00:00:00.000000000] [17179869184] 1684.217529 Grassland 6 0.966716 121 POINT (-107.73893 38.75799)
2019-08-13 00:26:32.684807424 3775 3020 1681.909668 18.898209 37750600300666421 130 [2021-06-30T00:00:00.000000000] [17179869184] 1686.067993 Grassland 6 0.963738 121 POINT (-107.73791 38.75734)
2019-08-13 00:26:32.693071360 3775 3020 1685.991699 18.897764 37750600300666422 130 [2021-06-30T00:00:00.000000000] [17179869184] 1689.700439 Grassland 6 0.960438 121 POINT (-107.7374 38.75701)
2019-08-13 00:26:32.701335552 3775 3020 1688.838013 18.897318 37750600300666423 130 [2021-06-30T00:00:00.000000000] [17179869184] 1692.659180 Grassland 6 0.966184 121 POINT (-107.7369 38.75669)
2019-08-13 00:26:32.709599488 3775 3020 1690.289673 18.896873 37750600300666424 130 [2021-06-30T00:00:00.000000000] [17179869184] 1693.998413 Grassland 6 0.966138 121 POINT (-107.73639 38.75636)
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2019-08-20 14:38:27.640957696 3893 583 2454.182129 23.838963 38930000200401361 130 [2021-06-30T00:00:00.000000000] [25769803776] 2458.303223 Shrubland 0 0.936784 159 POINT (-107.69186 38.61128)
2019-08-20 14:38:27.649223680 3893 583 2453.085205 23.839365 38930000200401362 130 [2021-06-30T00:00:00.000000000] [25769803776] 2456.831787 Grassland 0 0.916494 159 POINT (-107.69135 38.6116)
2019-08-20 14:38:27.657489152 3893 583 2451.478516 23.839769 38930000200401363 130 [2021-06-30T00:00:00.000000000] [25769803776] 2455.000244 Grassland 0 0.918110 159 POINT (-107.69085 38.61193)
2019-08-20 14:38:27.665753088 3893 583 2449.634277 23.840174 38930000200401364 130 [2021-06-30T00:00:00.000000000] [25769803776] 2453.418213 Grassland 0 0.918575 159 POINT (-107.69034 38.61226)
2019-08-20 14:38:27.674017280 3893 583 2447.391357 23.840578 38930000200401365 130 [2021-06-30T00:00:00.000000000] [25769803776] 2450.875488 Grassland 0 0.913983 159 POINT (-107.68983 38.61259)

10225 rows × 14 columns

# Plot this data and overlay footprint
fig, ax = plt.subplots(figsize=(8, 10))
plt.scatter(
    x=data_gedi.geometry.x,
    y=data_gedi.geometry.y,
    c=data_gedi.elevation_hr,
    s=10,
)
granule_gdf.dissolve().boundary.plot(ax=ax, color="magenta")
cb = plt.colorbar()
cb.set_label("elevation_hr (m)")
../_images/13bc7a5cd0b8b9a1e5e17931ea49b4ba0d564657fbc0bb6e97e9f06c767591e5.png

NOTE Like ICESat-2, it’s possible to have points falling outside the estimated footprints

Spatial join: nearest points#

NOTE: we will not worry about the difference in time of acquisition between adjacent points for now

utm_crs = granule_gdf.estimate_utm_crs()
data_is2_utm = data_is2.to_crs(utm_crs)
data_gedi_utm = data_gedi.to_crs(utm_crs)
# find nearest IS2 point to each GEDI
close_gedi = data_gedi_utm.sjoin_nearest(
    data_is2_utm,
    how="left",
    max_distance=100,  # at most 100m apart
    distance_col="distances",
)
# index is GEDI subset (with _left added), with _right appended to is2 columns + Distances in meters
close_gedi = close_gedi[close_gedi["distances"].notna()]
fig, ax = plt.subplots()
plt.scatter(close_gedi.h_li, close_gedi.elevation_lm)
ax.axline((0, 0), slope=1, color="k", transform=ax.transAxes)
plt.xlabel("ATL06 elevation (m)")
plt.ylabel("GEDI L2A elevation_lm (m)")
plt.title(f"{len(close_gedi)} points within 100m")


# Add statistics and inset histogram to plot
inset_ax = inset_axes(ax, width="40%", height="30%", borderpad=2, loc="lower right")


residuals = close_gedi.h_li - close_gedi.elevation_lm
mean_residual = np.mean(residuals)
median_residual = np.median(residuals)
std_residual = np.std(residuals)
textstr = "\n".join(
    (
        f"Mean: {mean_residual:.2f}",
        f"Median: {median_residual:.2f}",
        f"Std: {std_residual:.2f}",
    )
)
props = dict(boxstyle="round", facecolor="wheat", alpha=0.5)
ax.text(
    0.05,
    0.95,
    textstr,
    transform=ax.transAxes,
    fontsize=10,
    verticalalignment="top",
    bbox=props,
)

_ = inset_ax.hist(residuals, bins=50, range=(-20, 20), color="gray")
inset_ax.axvline(0, color="k", linestyle="-", linewidth=1);
../_images/1402afd38cbc313cd7aba2f18ce9560febc67c664aed67e66fcfb33b720b8cfb.png

Sample 3DEP#

Sample 3DEP 1m DEM at the subset of GEDI Points

# Warning: this relies on TNM API, which is often down :(
close_gedi_3dep = coincident.io.sliderule.sample_raster(
    close_gedi,
    asset_key="usgs3dep-1meter-dem",
    # Restrict to only DEMs derived from specific WESM LiDAR project
    project_name=gf_lidar["project"].iloc[0],
)
# Add 3DEP elevation values to existing dataframe
close_gedi["elevation_3dep"] = close_gedi_3dep.value.values
# Geometries of nearest IS2 points
close_is2 = data_is2_utm.loc[close_gedi.time_ns_right]
# Sample 3DEP at the subset of GEDI Points
close_is2_3dep = coincident.io.sliderule.sample_raster(
    close_is2,
    asset_key="usgs3dep-1meter-dem",
    project_name=gf_lidar["project"].iloc[0],
)
close_is2["elevation_3dep"] = close_is2_3dep.value.values
plt.axvline(0, color="k", linestyle="dashed", linewidth=1)

diff_3dep = close_is2.h_li.values - close_is2.elevation_3dep
label = f"ICESat-2 (median={np.nanmedian(diff_3dep):.2f}, mean={np.nanmean(diff_3dep):.2f}, std={np.nanstd(diff_3dep):.2f}, n={len(diff_3dep)})"
plt.hist(diff_3dep, range=[-5, 5], bins=100, color="m", alpha=0.5, label=label)

diff_3dep = close_gedi.elevation_lm.values - close_gedi.elevation_3dep
label = f"GEDI2A (median={np.nanmedian(diff_3dep):.2f}, mean={np.nanmean(diff_3dep):.2f}, std={np.nanstd(diff_3dep):.2f}, n={len(diff_3dep)})"
plt.hist(diff_3dep, range=[-5, 5], bins=100, color="c", alpha=0.5, label=label)

plt.xlabel("elevation difference (m)")
plt.title("Altimeter Elevation - 3DEP 1m DEM")
plt.xlim(-5, 5)
plt.legend(loc="upper left");
../_images/b06641413d1e860424ff1d18ae504e9f02999e481c9ce950242a991b8500692b.png