Accept URLs for describe, validate, and convert input #98

Merged 1 commit on Oct 13, 2023

@tschaub commented Oct 13, 2023

This adds support for using URLs in addition to file paths as the input for the describe, validate, and convert commands.

# gpq describe https://github.com/opengeospatial/geoparquet/raw/v1.0.0/examples/example.parquet
╭────────────────────┬────────┬────────────┬────────────┬─────────────┬──────────┬───────────────────────┬───────────────────────────┬──────────────────────────╮
│ COLUMN             │ TYPE   │ ANNOTATION │ REPETITION │ COMPRESSION │ ENCODING │ GEOMETRY TYPES        │ BOUNDS                    │ DETAIL                   │
├────────────────────┼────────┼────────────┼────────────┼─────────────┼──────────┼───────────────────────┼───────────────────────────┼──────────────────────────┤
│ pop_est            │ double │            │ 0..1       │ snappy      │          │                       │                           │                          │
│ continent          │ binary │ string     │ 0..1       │ snappy      │          │                       │                           │                          │
│ name               │ binary │ string     │ 0..1       │ snappy      │          │                       │                           │                          │
│ iso_a3             │ binary │ string     │ 0..1       │ snappy      │          │                       │                           │                          │
│ gdp_md_est         │ int64  │            │ 0..1       │ snappy      │          │                       │                           │                          │
│ geometry           │ binary │            │ 0..1       │ snappy      │ WKB      │ Polygon, MultiPolygon │ [-180, -90, 180, 83.6451] │  edges │ planar          │
│                    │        │            │            │             │          │                       │                           │  crs   │ WGS 84 (CRS84)  │
├────────────────────┼────────┴────────────┴────────────┴─────────────┴──────────┴───────────────────────┴───────────────────────────┴──────────────────────────┤
│ Rows               │ 5                                                                                                                                        │
│ Row Groups         │ 1                                                                                                                                        │
│ GeoParquet Version │ 1.0.0                                                                                                                                    │
╰────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
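
To illustrate the idea of accepting either a URL or a file path as input, here is a minimal Python sketch. It is not gpq's code: the function names `is_remote` and `open_input` are hypothetical, only `http` and `https` are treated as remote, and the whole response body is read into memory, which a real tool may well avoid.

```python
import io
from urllib.parse import urlparse
from urllib.request import urlopen


def is_remote(input_arg: str) -> bool:
    """Return True when the argument looks like an http(s) URL.

    Hypothetical helper. Anything else, including Windows paths such as
    C:\\data\\file.parquet, is treated as a local file path.
    """
    parsed = urlparse(input_arg)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def open_input(input_arg: str) -> io.BufferedIOBase:
    """Return a seekable binary reader for a URL or a local path.

    Assumption: the remote file is small enough to buffer in memory.
    """
    if is_remote(input_arg):
        with urlopen(input_arg) as response:
            return io.BytesIO(response.read())
    return open(input_arg, "rb")
```

Restricting the check to known schemes keeps ordinary paths that happen to contain a colon from being mistaken for URLs.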

You can also validate a file given its URL, with or without the --metadata-only flag (the flag skips scanning all geometries):

# gpq validate https://github.com/opengeospatial/geoparquet/raw/v1.0.0/examples/example.parquet --metadata-only
Summary: Passed 16 checks.

Metadata and schema checks only.  Skipped 4 data scanning checks.

 ✓ file must include a "geo" metadata key
 ✓ metadata must be a JSON object
 ✓ metadata must include a "version" string
 ✓ metadata must include a "primary_column" string
 ✓ metadata must include a "columns" object
 ✓ column metadata must include the "primary_column" name
 ✓ column metadata must include a valid "encoding" string
 ✓ column metadata must include a "geometry_types" list
 ✓ optional "crs" must be null or a PROJJSON object
 ✓ optional "orientation" must be a valid string
 ✓ optional "edges" must be a valid string
 ✓ optional "bbox" must be an array of 4 or 6 numbers
 ✓ optional "epoch" must be a number
 ✓ geometry columns must not be grouped
 ✓ geometry columns must be stored using the BYTE_ARRAY parquet type
 ✓ geometry columns must be required or optional, not repeated
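
A few of the metadata checks listed above can be sketched in Python to show what they test. This is an illustration only, not how gpq implements validation: the function name `check_geo_metadata` is hypothetical, it covers only a subset of the checks, and "valid encoding" is simplified to "a non-empty string".

```python
import json


def check_geo_metadata(raw: str) -> list[str]:
    """Return descriptions of failed checks for a "geo" metadata value.

    Hypothetical sketch covering a subset of the checks in the output above.
    """
    try:
        meta = json.loads(raw)
    except json.JSONDecodeError:
        return ["metadata must be a JSON object"]
    if not isinstance(meta, dict):
        return ["metadata must be a JSON object"]

    failures = []
    if not isinstance(meta.get("version"), str):
        failures.append('metadata must include a "version" string')
    primary = meta.get("primary_column")
    if not isinstance(primary, str):
        failures.append('metadata must include a "primary_column" string')
    columns = meta.get("columns")
    if not isinstance(columns, dict):
        failures.append('metadata must include a "columns" object')
        return failures
    if isinstance(primary, str) and primary not in columns:
        failures.append('column metadata must include the "primary_column" name')

    for col in columns.values():
        if not isinstance(col, dict):
            continue
        encoding = col.get("encoding")
        # Simplification: any non-empty string counts as a valid encoding here.
        if not (isinstance(encoding, str) and encoding):
            failures.append('column metadata must include a valid "encoding" string')
        if not isinstance(col.get("geometry_types"), list):
            failures.append('column metadata must include a "geometry_types" list')
        bbox = col.get("bbox")
        if bbox is not None:
            is_number = lambda v: isinstance(v, (int, float)) and not isinstance(v, bool)
            if not (isinstance(bbox, list) and len(bbox) in (4, 6) and all(map(is_number, bbox))):
                failures.append('optional "bbox" must be an array of 4 or 6 numbers')
    return failures


# Sample metadata assembled from the values in the describe output above.
sample = json.dumps({
    "version": "1.0.0",
    "primary_column": "geometry",
    "columns": {
        "geometry": {
            "encoding": "WKB",
            "geometry_types": ["Polygon", "MultiPolygon"],
            "bbox": [-180, -90, 180, 83.6451],
        }
    },
})
print(check_geo_metadata(sample))
```

All of these checks need only the file metadata, which is why they can run without scanning any geometries.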

And the same works for the convert command:

# gpq convert https://github.com/opengeospatial/geoparquet/raw/v1.0.0/examples/example.parquet example.geojson

This doesn't yet add support for reading from blob storage. I'll add that separately.

Fixes #93.

@tschaub merged commit 449433e into main Oct 13, 2023
4 checks passed
@tschaub deleted the urls branch October 13, 2023 14:52
Development

Successfully merging this pull request may close these issues.

describe and validate remote geoparquet files