Sliderule data I/O#
This notebook will highlight analyzing various coincident elevation measurements. We will find regions with and use slideruleearth.io service to retrieve ICESat-2 and GEDI point elevation measurements.
Note
Keep in mind, these measurements are from different sensor types, close in time, but not at exactly the same time. All measurements also have uncertainties, so we do not expect perfect agreement.
import coincident
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
import matplotlib.pyplot as plt
import numpy as np
# For testing
# import sliderule
# sliderule.init(url='slideruleearth.io', verbose=True)
/home/docs/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/src/coincident/io/download.py:27: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)
from tqdm.autonotebook import tqdm
%matplotlib inline
# %config InlineBackend.figure_format = 'retina'
Identify a primary dataset#
Start by loading a full resolution polygon of a 3DEP LiDAR workunit which has a known start_datetime and end_datatime:
workunit = "CO_WestCentral_2019"
df_wesm = coincident.search.wesm.read_wesm_csv()
gf_lidar = coincident.search.wesm.load_by_fid(
df_wesm[df_wesm.workunit == workunit].index
)
gf_lidar
| workunit | workunit_id | project | project_id | start_datetime | end_datetime | ql | spec | p_method | dem_gsd_meters | ... | seamless_category | seamless_reason | lpc_link | sourcedem_link | metadata_link | geometry | collection | datetime | dayofyear | duration | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CO_WestCentral_2019 | 175984 | CO_WestCentral_2019_A19 | 175987 | 2019-08-21 | 2019-09-19 | QL 2 | USGS Lidar Base Specification 1.3 | linear-mode lidar | 1.0 | ... | Meets | Meets 3DEP seamless DEM requirements | https://rockyweb.usgs.gov/vdelivery/Datasets/S... | http://prd-tnm.s3.amazonaws.com/index.html?pre... | http://prd-tnm.s3.amazonaws.com/index.html?pre... | MULTIPOLYGON (((-106.17143 38.42061, -106.3208... | 3DEP | 2019-09-04 12:00:00 | 247 | 29 |
1 rows × 33 columns
Search secondary datasets#
Provide a list that will be searched in order. The list contains tuples of dataset aliases and the temporal pad in days to search before the primary dataset start and end dates
secondary_datasets = [
("gedi", 40), # +/- 40 days from lidar
("icesat-2", 60), # +/- 60 days from lidar
]
gf_gedi, gf_is2 = coincident.search.cascading_search(
gf_lidar,
secondary_datasets,
min_overlap_area=30, # km^2
)
Get ICESat-2 ATL06 Data#
We’ve identified 7 granules of icesat-2 data to examine, but there is no need to work with the entire granule, which spans a huge geographic extent. Instead we’ll retrieve a subset of elevation values in the area of interest for each granule.
# Hone in on single granule as a pandas series
i = 0
granule_gdf = gf_is2.iloc[[i]]
granule = gf_is2.iloc[i]
granule_gdf.plot() # needs to be a geodataframe
plt.title(granule.id);
# The icesat-2 track trends N-S, and is crossed by multiple GEDI tracks, resulting in the crosshatched appearance
data_is2 = coincident.io.sliderule.subset_atl06(
granule_gdf,
include_worldcover=True,
)
# Plot this data and overlay footprint
fig, ax = plt.subplots(figsize=(8, 10))
plt.scatter(
x=data_is2.geometry.x,
y=data_is2.geometry.y,
c=data_is2.h_li,
s=10,
)
granule_gdf.dissolve().boundary.plot(ax=ax, color="magenta")
cb = plt.colorbar()
cb.set_label("elevation_hr (m)")
Note
sliderule gets data from the envelope of multipolygon, not only data within intersecting GEDI tracks. Along-track gaps are where there is missing data.
# Here, we specify aoi=granule_gdf to only return GEDI data
# in our icesat-2 track of interest
data_gedi = coincident.io.sliderule.subset_gedi02a(
gf_gedi,
aoi=granule_gdf,
include_worldcover=True,
)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[8], line 3
1 # Here, we specify aoi=granule_gdf to only return GEDI data
2 # in our icesat-2 track of interest
----> 3 data_gedi = coincident.io.sliderule.subset_gedi02a(
4 gf_gedi,
5 aoi=granule_gdf,
6 include_worldcover=True,
7 )
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/src/coincident/_utils.py:29, in depends_on_optional.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
25 message = (
26 f"Optional dependency {module_name} not found ({func.__name__})."
27 )
28 raise ImportError(message)
---> 29 return func(*args, **kwargs)
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/src/coincident/io/sliderule.py:131, in subset_gedi02a(gf, aoi, include_worldcover, include_3dep, sliderule_params)
128 if sliderule_params is not None:
129 params.update(sliderule_params)
--> 131 gfsr = gedi.gedi02ap(params, resources=granule_names)
133 if include_worldcover:
134 gfsr = _decode_worldcover(gfsr)
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/sliderule/gedi.py:337, in gedi02ap(parm, callbacks, resources, keep_id, as_numpy_array, height_key)
295 def gedi02ap(parm, callbacks={}, resources=None, keep_id=False, as_numpy_array=False, height_key=None):
296 '''
297 Performs subsetting in parallel on GEDI data and returns geolocated footprints. This function expects that the **parm** argument
298 includes a polygon which is used to fetch all available resources from the CMR system automatically. If **resources** is specified
(...) 335 >>> rsps = gedi.gedi02ap(parms, asset=asset, resources=resources)
336 '''
--> 337 return __processing_request(parm, "gedil2a", callbacks, resources, keep_id, as_numpy_array, 'gedi02ap', 'gedi02arec', height_key)
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/sliderule/gedi.py:178, in __processing_request(parm, asset, callbacks, resources, keep_id, as_numpy_array, api, rec, height_key)
175 rsps = sliderule.source(api, rqst, stream=True, callbacks=callbacks)
177 # Return GeoDataFrame
--> 178 return __flattenbatches(rsps, rec, 'footprint', parm, keep_id, as_numpy_array, height_key)
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/sliderule/gedi.py:145, in __flattenbatches(rsps, rectype, batch_column, parm, keep_id, as_numpy_array, height_key)
143 for field_set in field_dictionary:
144 df = geopandas.pd.DataFrame(field_dictionary[field_set])
--> 145 gdf = geopandas.pd.merge(gdf, df, how='left', on='shot_number').set_axis(gdf.index)
147 # Delete Shot Number Column
148 if len(gdf) > 0 and not keep_id:
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/pandas/core/frame.py:5357, in DataFrame.set_axis(self, labels, axis, copy)
5319 @Appender(
5320 """
5321 Examples
(...) 5355 copy: bool | None = None,
5356 ) -> DataFrame:
-> 5357 return super().set_axis(labels, axis=axis, copy=copy)
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/pandas/core/generic.py:792, in NDFrame.set_axis(self, labels, axis, copy)
746 def set_axis(
747 self,
748 labels,
(...) 751 copy: bool_t | None = None,
752 ) -> Self:
753 """
754 Assign desired index to given axis.
755
(...) 790 %(klass)s.rename_axis : Alter the name of the index%(see_also_sub)s.
791 """
--> 792 return self._set_axis_nocheck(labels, axis, inplace=False, copy=copy)
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/pandas/core/generic.py:804, in NDFrame._set_axis_nocheck(self, labels, axis, inplace, copy)
800 else:
801 # With copy=False, we create a new object but don't copy the
802 # underlying data.
803 obj = self.copy(deep=copy and not using_copy_on_write())
--> 804 setattr(obj, obj._get_axis_name(axis), labels)
805 return obj
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/geopandas/geodataframe.py:223, in GeoDataFrame.__setattr__(self, attr, val)
221 object.__setattr__(self, attr, val)
222 else:
--> 223 super().__setattr__(attr, val)
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/pandas/core/generic.py:6313, in NDFrame.__setattr__(self, name, value)
6311 try:
6312 object.__getattribute__(self, name)
-> 6313 return object.__setattr__(self, name, value)
6314 except AttributeError:
6315 pass
File properties.pyx:69, in pandas._libs.properties.AxisProperty.__set__()
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/pandas/core/generic.py:814, in NDFrame._set_axis(self, axis, labels)
809 """
810 This is called from the cython code when we set the `index` attribute
811 directly, e.g. `series.index = [1, 2, 3]`.
812 """
813 labels = ensure_index(labels)
--> 814 self._mgr.set_axis(axis, labels)
815 self._clear_item_cache()
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/pandas/core/internals/managers.py:238, in BaseBlockManager.set_axis(self, axis, new_labels)
236 def set_axis(self, axis: AxisInt, new_labels: Index) -> None:
237 # Caller is responsible for ensuring we have an Index object.
--> 238 self._validate_set_axis(axis, new_labels)
239 self.axes[axis] = new_labels
File ~/checkouts/readthedocs.org/user_builds/coincident/checkouts/v0.4/.pixi/envs/dev/lib/python3.12/site-packages/pandas/core/internals/base.py:98, in DataManager._validate_set_axis(self, axis, new_labels)
95 pass
97 elif new_len != old_len:
---> 98 raise ValueError(
99 f"Length mismatch: Expected axis has {old_len} elements, new "
100 f"values have {new_len} elements"
101 )
ValueError: Length mismatch: Expected axis has 15190 elements, new values have 10296 elements
data_gedi
# Plot this data and overlay footprint
fig, ax = plt.subplots(figsize=(8, 10))
plt.scatter(
x=data_gedi.geometry.x,
y=data_gedi.geometry.y,
c=data_gedi.elevation_hr,
s=10,
)
granule_gdf.dissolve().boundary.plot(ax=ax, color="magenta")
cb = plt.colorbar()
cb.set_label("elevation_hr (m)")
NOTE Like ICESat-2, it’s possible to have points falling outside the estimated footprints
Spatial join: nearest points#
NOTE: we will not worry about the difference in time of acquisition between adjacent points for now
utm_crs = granule_gdf.estimate_utm_crs()
data_is2_utm = data_is2.to_crs(utm_crs)
data_gedi_utm = data_gedi.to_crs(utm_crs)
# find nearest IS2 point to each GEDI
close_gedi = data_gedi_utm.sjoin_nearest(
data_is2_utm,
how="left",
max_distance=100, # at most 100m apart
distance_col="distances",
)
# index is GEDI subset (with _left added), with _right appended to is2 columns + Distances in meters
close_gedi = close_gedi[close_gedi["distances"].notna()]
fig, ax = plt.subplots()
plt.scatter(close_gedi.h_li, close_gedi.elevation_lm)
ax.axline((0, 0), slope=1, color="k", transform=ax.transAxes)
plt.xlabel("ATL06 elevation (m)")
plt.ylabel("GEDI L2A elevation_lm (m)")
plt.title(f"{len(close_gedi)} points within 100m")
# Add statistics and inset histogram to plot
inset_ax = inset_axes(ax, width="40%", height="30%", borderpad=2, loc="lower right")
residuals = close_gedi.h_li - close_gedi.elevation_lm
mean_residual = np.mean(residuals)
median_residual = np.median(residuals)
std_residual = np.std(residuals)
textstr = "\n".join(
(
f"Mean: {mean_residual:.2f}",
f"Median: {median_residual:.2f}",
f"Std: {std_residual:.2f}",
)
)
props = dict(boxstyle="round", facecolor="wheat", alpha=0.5)
ax.text(
0.05,
0.95,
textstr,
transform=ax.transAxes,
fontsize=10,
verticalalignment="top",
bbox=props,
)
_ = inset_ax.hist(residuals, bins=50, range=(-20, 20), color="gray")
inset_ax.axvline(0, color="k", linestyle="-", linewidth=1);
Sample 3DEP#
Sample 3DEP 1m DEM at the subset of GEDI Points
Note
SlideRule returns all elevation data in EPSG:7912
close_gedi_3dep = coincident.io.sliderule.sample_3dep(
close_gedi,
# Restrict to only DEMs derived from specific WESM LiDAR project
project_name=gf_lidar["project"].iloc[0],
)
# Add 3DEP elevation values to existing dataframe
close_gedi["elevation_3dep"] = close_gedi_3dep.value.values
# Geometries of nearest IS2 points
close_is2 = data_is2_utm.loc[close_gedi.time_right]
# Sample 3DEP at the subset of GEDI Points
close_is2_3dep = coincident.io.sliderule.sample_3dep(
close_is2,
project_name=gf_lidar["project"].iloc[0],
)
close_is2["elevation_3dep"] = close_is2_3dep.value.values
plt.axvline(0, color="k", linestyle="dashed", linewidth=1)
diff_3dep = close_is2.h_li.values - close_is2.elevation_3dep
label = f"ICESat-2 (median={np.nanmedian(diff_3dep):.2f}, mean={np.nanmean(diff_3dep):.2f}, std={np.nanstd(diff_3dep):.2f}, n={len(diff_3dep)})"
plt.hist(diff_3dep, range=[-5, 5], bins=100, color="m", alpha=0.5, label=label)
diff_3dep = close_gedi.elevation_lm.values - close_gedi.elevation_3dep
label = f"GEDI2A (median={np.nanmedian(diff_3dep):.2f}, mean={np.nanmean(diff_3dep):.2f}, std={np.nanstd(diff_3dep):.2f}, n={len(diff_3dep)})"
plt.hist(diff_3dep, range=[-5, 5], bins=100, color="c", alpha=0.5, label=label)
plt.xlabel("elevation difference (m)")
plt.title("Altimeter Elevation - 3DEP 1m DEM")
plt.xlim(-5, 5)
plt.legend(loc="upper left");