Changelog

All notable changes to pySDP are documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

[0.6.0] - 2026-05-15

Added

Data-products issue tracker — ports rSDP v0.6's sdp_issues.R. Dataset feedback lives in a dedicated GitHub repo (rmbl-sdp/sdp-products), separate from the pysdp code repo.
pysdp.report_issue(catalog_id, type=None, open=True) — opens the prefilled GitHub Issue Form. Validates catalog_id and type; warns on an unknown CatalogID but still builds the URL.
pysdp.known_issues(catalog_id=None, refresh=False) — paginated GitHub API fetch returning a tidy DataFrame of open issues (columns: CatalogID, number, title, type, severity, status, created, updated, url). On-disk JSON cache at $XDG_CACHE_HOME/pysdp/open_issues.json (TTL 1 h) with offline-graceful fallback to stale cache. Honors GITHUB_TOKEN / GITHUB_PAT to bump the rate limit.
get_catalog(with_issue_counts=True) attaches an OpenIssues Int64 column (zero-filled where no issues exist).
browse(with_issue_counts=True) renders a ⚠ N open issue(s) badge linking to the filtered issues list.
23 new tests in tests/test_issues.py.

Issue-tracker infrastructure (Issue Forms, the STAC patch-in-place refresh script, the validation bot) lives in rSDP/stac-gen/ and the upstream sdp-products repo; the generated STAC catalog is shared between rSDP and pySDP, so this release ports only the client-side helpers.
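The on-disk cache behavior described for known_issues() (a 1 h TTL with a fallback to stale data when offline) can be illustrated with a minimal standard-library sketch. The function name load_with_cache, the cache file layout (fetched_at / data keys), and the choice of OSError as the "offline" signal are assumptions made for illustration; they are not taken from the pySDP source.

```python
import json
import time
from pathlib import Path


def load_with_cache(cache_path, fetch, ttl_seconds=3600, now=None):
    """Return (data, source), where source is 'cache', 'network', or 'stale-cache'.

    Hypothetical helper: `fetch` is any zero-argument callable returning
    JSON-serializable data and raising OSError when the network is down.
    """
    cache_path = Path(cache_path)
    now = time.time() if now is None else now

    cached = None
    if cache_path.exists():
        cached = json.loads(cache_path.read_text(encoding="utf-8"))
        # A fresh cache entry short-circuits the network call entirely.
        if now - cached["fetched_at"] < ttl_seconds:
            return cached["data"], "cache"

    try:
        data = fetch()
    except OSError:
        # Offline-graceful path: serve the expired entry if one exists.
        if cached is not None:
            return cached["data"], "stale-cache"
        raise

    cache_path.parent.mkdir(parents=True, exist_ok=True)
    payload = {"fetched_at": now, "data": data}
    cache_path.write_text(json.dumps(payload), encoding="utf-8")
    return data, "network"
```

Passing now explicitly keeps the sketch testable without sleeping; real code would normally rely on the default.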

[0.5.0] - 2026-05-15

Added

Dataset version-control — ports rSDP v0.5. The catalog now carries a NewVersionID column pointing deprecated products at their replacements.
get_catalog(include_deprecated=False) (new canonical kwarg) hides deprecated rows by default; True includes them.
browse(include_deprecated=True) renders deprecated cards with a tinted background and a deprecated → NEWID badge.
open_raster(), get_metadata(), and get_dates() emit a UserWarning when called with a deprecated CatalogID, pointing at the successor.
Catalog loader hard-fails when Data.URL / Metadata.URL are empty (rather than silently producing broken thumbnails or license="proprietary" fallbacks downstream).
Packaged catalog rotated to the 2026-05-15 snapshot.
22 new tests in tests/test_versioning.py.
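As a rough illustration of the deprecation warning described above, the following sketch looks up a product in a toy catalog and warns when it is deprecated, naming the successor from NewVersionID. Only the column names Deprecated and NewVersionID come from this changelog; the function name check_deprecated, the dictionary layout, and the CatalogIDs are made up for the example.

```python
import warnings

# Hypothetical two-row catalog; the IDs are placeholders, not real products.
CATALOG = {
    "X0D001": {"Deprecated": True, "NewVersionID": "X0D002"},
    "X0D002": {"Deprecated": False, "NewVersionID": None},
}


def check_deprecated(catalog_id, catalog=CATALOG):
    """Return the catalog row, warning if the product is deprecated."""
    row = catalog[catalog_id]
    if row["Deprecated"]:
        successor = row["NewVersionID"]
        hint = f" Use {successor!r} instead." if successor else ""
        # stacklevel=2 attributes the warning to the caller's line.
        warnings.warn(
            f"{catalog_id!r} is deprecated.{hint}", UserWarning, stacklevel=2
        )
    return row
```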

Changed

Soft-deprecation of get_catalog(deprecated=...) and browse(deprecated=...). The legacy deprecated= kwarg is still accepted but emits DeprecationWarning and will be removed in a future release. Use include_deprecated= instead (semantic shift: include_deprecated=True returns BOTH current and deprecated rows; to get deprecated-only, filter the result on the Deprecated column).
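A common way to implement this kind of soft deprecation is to keep accepting the legacy keyword, emit a DeprecationWarning, and route to the new keyword. The sketch below works on a toy list of rows; get_catalog_sketch and the CatalogIDs are hypothetical, and the way the legacy value is forwarded is an assumption, since this changelog does not say how pySDP maps deprecated= onto include_deprecated=.

```python
import warnings

# Toy rows; only the Deprecated column name comes from the changelog.
ROWS = [
    {"CatalogID": "X0D001", "Deprecated": True},
    {"CatalogID": "X0D002", "Deprecated": False},
]


def get_catalog_sketch(include_deprecated=False, deprecated=None, rows=ROWS):
    """Filter rows, accepting the legacy `deprecated=` kwarg with a warning."""
    if deprecated is not None:
        warnings.warn(
            "deprecated= is deprecated; use include_deprecated= instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: simply forward the truthiness of the legacy value.
        include_deprecated = bool(deprecated)
    if include_deprecated:
        return list(rows)
    return [r for r in rows if not r["Deprecated"]]


# Deprecated-only view under the new semantics: include everything,
# then filter on the Deprecated column.
only_deprecated = [
    r for r in get_catalog_sketch(include_deprecated=True) if r["Deprecated"]
]
```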

[0.2.0] - 2026-05-01

Added

Weekly drone imagery + irregular time-series — ports rSDP v0.3.
New TimeSeriesType="Weekly". R6D001 (RGB orthomosaics) and R6D002 (multispectral indices) are the first products to use it. Bbox / resolution / nodata vary per date, so weekly imagery returns dict[str, xarray.Dataset] (one entry per date) instead of a single stacked Dataset.
open_raster(..., dates=[...]) for explicit date selection on irregular products; also accepts date_start / date_end.
open_raster(..., bands=[...]) to subset bands of a multi-band raster (lazy isel(band=...)).
pysdp.get_dates(catalog_id, source="auto"|"stac"|"manifest") — discovers available dates. Regular products (Yearly / Monthly / Daily) compute dates deterministically from MinDate / MaxDate; irregular products read from a baked manifest (pysdp/data/manifests/<CatalogID>.json) or query the live STAC catalog.
{calendarday} template placeholder for products keyed by ordinal day-of-year strings instead of zero-padded {day}.
Packaged catalog rotated to the 2026-04-29 snapshot (162 products).
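For regular products, computing dates "deterministically from MinDate / MaxDate" might look roughly like the sketch below. The function name regular_dates is hypothetical, and anchoring Monthly dates to the first of the month and Yearly dates to January 1 is an assumption borrowed from the time-coordinate convention described under 0.1.0; get_dates() itself may behave differently.

```python
from datetime import date, timedelta


def regular_dates(min_date, max_date, series_type):
    """Enumerate dates between min_date and max_date (inclusive).

    series_type is one of 'Yearly', 'Monthly', 'Daily'.
    """
    if series_type == "Daily":
        n_days = (max_date - min_date).days
        return [min_date + timedelta(days=i) for i in range(n_days + 1)]
    if series_type == "Monthly":
        out = []
        year, month = min_date.year, min_date.month
        # Step month by month, rolling the year over after December.
        while (year, month) <= (max_date.year, max_date.month):
            out.append(date(year, month, 1))
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        return out
    if series_type == "Yearly":
        return [date(y, 1, 1) for y in range(min_date.year, max_date.year + 1)]
    raise ValueError(f"unsupported series_type: {series_type!r}")
```

Irregular (Weekly) products cannot be enumerated this way, which is why they need a manifest or a STAC query.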

[0.1.3] - 2026-04-24

Fixed

browse() renders reliably across Positron, JupyterLab, VS Code, and classic Notebook. Replaced iframe / inline-CSS / JavaScript rendering with plain HTML attributes (width, bgcolor, cellpadding) and a <table> layout, so the same output passes every notebook host's HTML sanitizer.
Each browse() card now includes a thumbnail, product metadata, an SDP Browser link using the simplified #add=CatalogID URL format, and a copyable pysdp.open_raster() snippet.
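To make the sanitizer-friendly approach concrete, here is a small sketch that lays cards out in a table using only HTML attributes (no style=, no scripts, no iframes). The function name render_cards, the card dictionary keys, and the color value are illustrative assumptions; this is not the markup browse() actually emits.

```python
from html import escape


def render_cards(cards, columns=3, width=220):
    """Render cards as an attribute-only HTML <table> grid."""
    cells = []
    for card in cards:
        cells.append(
            f'<td width="{width}" bgcolor="#f5f5f5" valign="top">'
            f'<img src="{escape(card["thumbnail"], quote=True)}" width="{width}"><br>'
            f'<b>{escape(card["catalog_id"])}</b><br>'
            f'{escape(card["title"])}'
            "</td>"
        )
    # Group cells into rows of `columns` cards each.
    rows = [
        "<tr>" + "".join(cells[i:i + columns]) + "</tr>"
        for i in range(0, len(cells), columns)
    ]
    return '<table cellpadding="6">' + "".join(rows) + "</table>"
```

In a notebook, the resulting string could be wrapped in IPython.display.HTML for display.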

[0.1.2] - 2026-04-23

Added

browse() cards gain SDP Browser links (open the web map viewer) and a copyable pysdp.open_raster() snippet alongside each thumbnail.

[0.1.1] - 2026-04-17

Added

pysdp.browse(...): HTML-grid catalog browser for Jupyter notebooks, with product thumbnails and overlaid metadata. Accepts the same filters as get_catalog() plus layout controls (columns, width, max_products).
Thumbnail.URL column added to get_catalog() output — derived from Data.URL using the stac-gen thumbnail convention.

[0.1.0] - 2026-04-16

First feature-complete release: port of rSDP v0.2 with idiomatic Python types. Tested on Python 3.11 / 3.12 / 3.13 × Linux / macOS / Windows.

Added

Phase 6b docs expansion — visual assets in the User Guides (PNG plots + folium HTML maps generated by scripts/build_guide_assets.py from real SDP data). Guides cloud-data.md, field-sampling.md, and pretty-maps.md now include rendered images / iframes showing domain coverage, field sites on the UG 3 m bare-earth DEM (R3D009 — matching the rSDP vignettes), extracted elevations, and multi-panel overlays. All 8 assets live at docs/guides/assets/. The asset script aggressively coarsens the 1 GB DEM before plotting (factor 60×60) so docs pages stay lightweight.
mapclassify>=2.6 added to the [viz] extra (required by GeoDataFrame.explore(column=...)).

Fixed

pysdp.io.vsicurl.gdal_defaults() no longer sets CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".tif,.TIF,.tiff". That env var leaked process-globally and blocked any VSICURL-backed open of GeoJSON / GeoPackage / Shapefile URLs after a pysdp raster was opened in the same process. GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR alone already achieves the main performance goal (skipping sidecar probes) without the side-effect. Matches ROADMAP §2 Principle 5 (scoped defaults, never clobber).
Phase 5 — Bulk download (SPEC.md §9):
pysdp.download(urls=..., output_dir=..., ...) — fetch SDP COGs to local disk. Accepts urls= (string or list) or catalog_ids= (mutually exclusive, exactly one required).
catalog_ids expansion: Single → 1 URL; Yearly → every catalog year; Monthly → every catalog month between MinDate and MaxDate. Daily raises a descriptive ValueError because the expansion would be open-ended; pass explicit urls= instead, or open the product via open_raster(...) with a date range first.
Existing-file pre-check mirrors rSDP: files > 1 kB on disk are considered valid and skipped unless overwrite=True. Partial files (< 1 kB) use HTTP Range resume when resume=True.
Returns a pandas.DataFrame status report with columns [url, dest, success, status, size, error] — one row per URL (including skipped existing files). return_status=False → None.
Backend: threaded requests via concurrent.futures.ThreadPoolExecutor (core deps only). Faster backends (obstore, fsspec+s3fs) deferred to ROADMAP §Phase 7.
Tests: 22 unit tests (input validation, catalog_id expansion for each TimeSeriesType, pre-check logic, responses-mocked end-to-end flow, overwrite + resume, HTTP error propagation) + 1 @pytest.mark.network integration test that downloads R1D001 (~4 MB UER streamlines) from real S3 in ~1.5 s.
Smoke test's NotImplementedError placeholder parametrize list is now empty — all public API functions are implemented.
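The existing-file pre-check in the Phase 5 notes above can be sketched as a small decision function. The name plan_download, the returned action labels, and the interpretation of "1 kB" as 1024 bytes are assumptions for illustration; the Range header format is standard HTTP.

```python
import os

# Assumption: "1 kB" is taken as 1024 bytes here.
MIN_VALID_BYTES = 1024


def plan_download(dest, overwrite=False, resume=True):
    """Decide what to do for one destination path.

    Returns (action, headers) where action is 'download', 'skip', or 'resume'.
    """
    if overwrite or not os.path.exists(dest):
        return "download", {}
    size = os.path.getsize(dest)
    if size > MIN_VALID_BYTES:
        # Large enough to be treated as a complete file.
        return "skip", {}
    if resume and size > 0:
        # Ask the server for the remaining bytes only.
        return "resume", {"Range": f"bytes={size}-"}
    return "download", {}
```

Each planned download could then be submitted to a concurrent.futures.ThreadPoolExecutor, which is the backend the notes above name.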
Phase 4 — Point & polygon extraction (SPEC.md §9):
pysdp.extract_points(raster, locations, ...) — sample raster values at point locations. Accepts a GeoDataFrame or a plain DataFrame with x/y columns + explicit crs. Auto-reprojects locations to raster CRS when they differ (with a verbose message). method="linear" (default, bilinear via xr.interp) or method="nearest" (via xvec.extract_points).
pysdp.extract_polygons(raster, locations, stats="mean", ...) — zonal stats via xvec.zonal_stats. Default exact=False (centroid inclusion, parity with rSDP / terra::extract). exact=True and all_cells=True raise NotImplementedError pointing at ROADMAP §Phase 8a; exact=False covers the common case.
Port of rSDP's .filter_raster_layers_by_time() as _filter_by_time: years= or date_start/date_end= filters for time-indexed rasters, with error-on-empty-overlap and warn-on-partial-overlap semantics.
Output is long-form GeoDataFrame: one row per point (or polygon) for single-layer rasters; one row per (geometry × time) for time-series. bind=True (default) merges input attribute columns onto the output; bind=False returns just geometry + extracted values.
Tests: 34 new unit tests with synthetic local rasters + one @pytest.mark.network integration test that extracts elevation at three real RMBL field sites from the UG 3 m DEM (R3D009); verifies elevations fall in the sensible 2000–4500 m range for the Gunnison basin. All pass locally.
scipy>=1.11 added to core deps (required for xarray's interp(method='linear'), the bilinear extraction path; standard scientific Python dep).
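The error-on-empty-overlap and warn-on-partial-overlap semantics mentioned for _filter_by_time in the Phase 4 notes can be illustrated on a plain list of layer dates. This is a sketch under assumptions: filter_by_date_range is a hypothetical name, and the real function operates on time-indexed rasters rather than lists.

```python
import warnings
from datetime import date


def filter_by_date_range(layer_dates, date_start, date_end):
    """Keep layer dates inside [date_start, date_end].

    Raises ValueError when nothing overlaps; warns when the requested
    range extends beyond the available layers.
    """
    available_min, available_max = min(layer_dates), max(layer_dates)
    if date_end < available_min or date_start > available_max:
        raise ValueError(
            f"requested range {date_start}..{date_end} does not overlap "
            f"available layers {available_min}..{available_max}"
        )
    if date_start < available_min or date_end > available_max:
        warnings.warn(
            "requested range only partially overlaps the available layers",
            UserWarning,
            stacklevel=2,
        )
    kept = [d for d in layer_dates if date_start <= d <= date_end]
    if not kept:
        # The range falls in a gap between layers.
        raise ValueError("no layers fall inside the requested range")
    return kept
```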
Phase 3 — Raster access (SPEC.md §9):
pysdp.open_raster(catalog_id, ...) — lazy cloud COG access via rioxarray.open_rasterio over GDAL VSICURL. Returns xarray.Dataset with one data variable named after the product's canonical short name. Dims (y, x) / (band, y, x) for Single; (time, y, x) for Yearly / Monthly / Daily.
pysdp.open_raster(url=...) — URL-direct branch, no catalog lookup, no scale/offset application (matches rSDP).
pysdp.open_stack(catalog_ids, align="exact") — multi-product loader. align="exact" default verifies CRS + transform + shape consistency; mismatches raise a descriptive error listing which products drifted. align="reproject" is planned for Phase 7 and raises NotImplementedError today.
Time coordinate is uniformly pandas.DatetimeIndex: Daily → actual date, Monthly → first-of-month, Yearly → Jan 1. Enables ds.sel(time="2019"), .resample(), groupby("time.year") across all TimeSeriesTypes.
CRS set to EPSG:32613 via rio.write_crs(). Scale/offset from the catalog attached as CF scale_factor / add_offset attrs on the data variable (scale = 1 / DataScaleFactor per rSDP's convention; xarray.decode_cf() or mask_and_scale=True materializes the real values).
pysdp.io.vsicurl.gdal_defaults() + ensure_gdal_defaults(): minimal GDAL VSICURL env for cloud COGs (GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR, CPL_VSIL_CURL_ALLOWED_EXTENSIONS=.tif,.TIF,.tiff, VSI_CACHE=TRUE, VSI_CACHE_SIZE=5000000). Applied via os.environ.setdefault, so user-set values are never clobbered. The CPL_VSIL_CURL_ALLOWED_EXTENSIONS default was subsequently removed (see the gdal_defaults() entry under Fixed above). The full cloud-tuned set (HTTP/2 etc.) lands in Phase 7.
chunks="auto" (default) gracefully falls back to eager reads with a UserWarning when dask is not installed; install pysdp[dask] for lazy Dask-backed reads.
download=True raises NotImplementedError pointing at Phase 5's pysdp.download().
Added pysdp._catalog_data.lookup_catalog_row() as a shared helper; get_metadata and open_raster now use it, emitting the same descriptive "unknown catalog_id" error.
Tests: 30 new unit tests in tests/test_raster.py (canonical-variable naming, time-coord construction, chunks fallback, GDAL-env safety, Dataset assembly via synthetic local COGs, open_stack grid-alignment checks) + 3 network integration tests against real S3 (Single + Daily + URL-direct branches).
dask[array]>=2024.1 added to the test dependency group so CI covers the chunked read path.
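The "never clobber" behavior described for ensure_gdal_defaults() in the Phase 3 notes follows directly from dict-style setdefault. The sketch below uses the three settings that remain after the CPL_VSIL_CURL_ALLOWED_EXTENSIONS fix listed under Fixed; the function name ensure_defaults and the optional environ parameter (added so the example can run against a plain dict) are assumptions.

```python
import os

# Values as listed in this changelog, minus the removed extensions filter.
GDAL_DEFAULTS = {
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    "VSI_CACHE": "TRUE",
    "VSI_CACHE_SIZE": "5000000",
}


def ensure_defaults(defaults=GDAL_DEFAULTS, environ=None):
    """Apply defaults without overriding existing values.

    Returns the subset of defaults that was actually applied.
    """
    environ = os.environ if environ is None else environ
    applied = {}
    for key, value in defaults.items():
        # setdefault only writes when the key is absent.
        if environ.setdefault(key, value) == value and key not in applied:
            applied[key] = value
    return {k: v for k, v in applied.items() if environ[k] == defaults[k]}


# A user-set value survives; missing keys are filled in.
env = {"VSI_CACHE": "FALSE"}
ensure_defaults(environ=env)
```

Note that a user value identical to the default is indistinguishable from an applied default in this sketch, which is harmless for the purpose shown.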
Phase 2 — Argument validation + time-slice resolvers (SPEC.md §9):
pysdp.io.template.substitute_template() — {year}/{month}/{day} URL-template substitution with scalar/vector recycling and length-consistency checks. Port of rSDP's .substitute_template().
pysdp._validate.validate_user_args(): pre-catalog-lookup validation; zero-pads months to two-digit strings; rejects invalid combinations of catalog_id/url/date_start/date_end/download_files/download_path.
pysdp._validate.validate_args_vs_type(): post-lookup check for whether a time-arg combination is valid for a given TimeSeriesType (Single rejects all time args; Yearly rejects months, and rejects years combined with dates; Monthly requires months with years; Daily requires dates only).
pysdp._resolve.resolve_time_slices() and per-type resolvers (resolve_single, resolve_yearly, resolve_monthly, resolve_daily) returning a TimeSlices(paths, names) named tuple. Pure functions, no network, no raster I/O.
Preserved behavior carry-overs from rSDP: anchor-day seq(by="year"/"month") semantics for Yearly/Monthly date-range branches; 30-layer default clip for Daily datasets with no date bounds; error-on-empty-overlap; warn-on-partial-overlap.
52 new unit tests: test_template.py (8 tests), test_validate.py (20 tests), test_resolve.py (24 tests). Ports rSDP's 32 testthat tests across test-internal_resolve.R and test-internal_validate.R, plus additional edge-case coverage.
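The template substitution described in the Phase 2 notes (scalar/vector recycling with a length-consistency check, plus two-digit zero-padding of months and days) can be sketched as follows. This is an independent illustration, not the pySDP implementation; substitute_template_sketch and the example URL are hypothetical.

```python
def substitute_template_sketch(template, year=None, month=None, day=None):
    """Fill {year}/{month}/{day} placeholders, recycling scalar arguments.

    Each argument may be a scalar or a list; all lists must share one length.
    """
    given = {"year": year, "month": month, "day": day}
    given = {k: v for k, v in given.items() if v is not None}
    as_lists = {
        k: list(v) if isinstance(v, (list, tuple)) else [v]
        for k, v in given.items()
    }
    n = max((len(v) for v in as_lists.values()), default=1)
    for key, values in as_lists.items():
        if len(values) not in (1, n):
            raise ValueError(f"{key} has length {len(values)}, expected 1 or {n}")

    urls = []
    for i in range(n):
        # Length-1 inputs are recycled across all n outputs.
        fields = {k: (v[0] if len(v) == 1 else v[i]) for k, v in as_lists.items()}
        for key in ("month", "day"):
            if key in fields:
                fields[key] = f"{int(fields[key]):02d}"
        urls.append(template.format(**fields))
    return urls
```

For example, a scalar year with months [1, 2] yields two URLs that share the year.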
Phase 1 — Catalog + metadata (SPEC.md §9):
pysdp.get_catalog() with three sources: packaged (the default; works offline and emits a UserWarning when the snapshot is older than SDP_STALENESS_MONTHS, which defaults to 6 months), live (refetches the CSV from S3), and stac (returns a pystac.Catalog for the SDP static STAC v1 catalog; filter args are ignored with a warning).
pysdp.get_metadata(catalog_id, as_dict=True) — fetches QGIS-style XML metadata; returns a dict via xmltodict or an lxml element. Descriptive KeyError on unknown catalog_id includes the snapshot date.
Packaged catalog CSV snapshot: SDP_product_table_04_14_2026.csv (156 products across UG/UER/GT/GMUG domains). Loaded via importlib.resources in pysdp._catalog_data. Handles both m/d/y and m/d/Y date formats mixed across rows, and preserves rSDP's sysdata.rda baking model.
scripts/update_catalog.py — mirrors rSDP's data-raw/SDP_catalog.R; downloads a fresh CSV from S3 and rotates the packaged snapshot.
Test suite: 48 unit tests (filter validation, date parsing, staleness warning, synthetic-DataFrame filter logic, responses-mocked HTTP) + 3 live integration tests under @pytest.mark.network.
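Two details from the Phase 1 notes lend themselves to a short sketch: parsing dates that mix m/d/y and m/d/Y formats, and warning when a snapshot is older than a month threshold. The helper names below are hypothetical, and measuring age in whole calendar months is an assumption about how the staleness check counts.

```python
import warnings
from datetime import date, datetime


def parse_mixed_date(text):
    """Parse 'm/d/YYYY' or 'm/d/YY', trying the four-digit year first."""
    for fmt in ("%m/%d/%Y", "%m/%d/%y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def months_between(older, newer):
    """Whole calendar months from `older` to `newer`, ignoring the day."""
    return (newer.year - older.year) * 12 + (newer.month - older.month)


def warn_if_stale(snapshot, today, max_months=6):
    """Warn and return True when the snapshot exceeds the age threshold."""
    age = months_between(snapshot, today)
    if age > max_months:
        warnings.warn(
            f"packaged catalog snapshot is {age} months old",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```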
Initial Phase 0 scaffolding: pyproject.toml (hatchling + hatch-vcs), src/pysdp/ package skeleton with public-API stubs, constants.py with real SDP catalog values (CRS, domains, types, releases, timeseries types), tests/ smoke tests, CI workflows (lint, type-check, test matrix on Python 3.11/3.12/3.13 × linux/macOS/windows), release workflow with PyPI Trusted Publishing, docs workflow stub, .pre-commit-config.yaml, MIT license. See SPEC.md §9 Phase 0.