Home / Data: Planet, Coastlines, Buildings, Admins, Roads / Attribution

Daylight Earth Table

Overview: Building, Infrastructure, Land, Landuse, Placename, POI, Road, Transit, Water

The earth table is a new data schema that classifies OpenStreetMap-style tags into a 3-level ontology: theme, class, and subclass. A small fishpond, for example, would be found in the water theme, pond class, and fishpond subclass. A kindergarten school is found in the building theme, education class, kindergarten subclass.

The Daylight Earth Table is the result of running the earth table classification over the latest release (v1.43) of the Daylight Map Distribution. The daylight earth table is available as parquet files on Amazon S3. The instructions on this page will create the table inside your AWS account so that you can access the features with Amazon Athena.

Resources

Available Themes

ThemeClassesSubclassesFeatures in v1.43
building 20 223 592,128,847
building_detail** 1 1 3,062,309
infrastructure 7 48 43,204,249
land 10 19 50,813,775
landuse 25 97 44,975,701
placename 3 11 3,680,869
poi 22 232 45,784,428
road 10 32 240,906,467
transit 6 32 7,338,780
water 10 29 44,934,174

** The building detail theme includes all building:part=yes features in OSM. They do not currently have different class/subclass values. Instead, they contain the IDs of the building footprint(s) with which they overlap.

Daylight Earth Table Schema

Name Type Description
geometry_id string unique identifier of the form (n/w/r) + osm id + @ (osm version). Example, version 22 of node id=1 : n1@22
theme string theme of the feature (one of the 10 themes above)
class string class of the feature
subclass string subclass of the feature
names string JSON-formatted key/value pairs containing the placename in different languages. Keys include local (the common name used in the place) and ISO language codes like en, en-US, and de
metadata string JSON-formatted key/value pairs of feature metadata. To retrieve height of a feature, for example: JSON_EXTRACT_SCALAR(metadata, '$.height_m')
original_source_tags string JSON-formatted key/value pairs of original OSM tags
source string The daylight release, currently v1.18
quadkey string The zoom level 15 quadkey that contains the centroid of the feature.
wkt string well-known text of the feature Coordinate system: EPSG 4326

Getting started with Amazon Athena

  1. Run the following query in the Athena query editor to create the daylight_earth table

     CREATE EXTERNAL TABLE `daylight_earth`(
       `geometry_id` string,
       `class` string,
       `subclass` string,
       `metadata` string,
       `original_source_tags` string,
       `names` string,
       `quadkey` string,
       `wkt` string)
     PARTITIONED BY (
       `release` varchar(5),
       `theme` string)
     STORED AS PARQUET
     LOCATION
       's3://daylight-openstreetmap/earth'
     TBLPROPERTIES (
       'has_encrypted_data'='false',
       'parquet.compression'='GZIP')
    
  2. run MSCK REPAIR TABLE daylight_earth to load the partitions.

Now you have access to nearly 1B OSM features each with theme, class, and subclass attributes. metadata, original_source_tags, and names are stored JSON-formatted strings. These attributes can be accessed with the JSON_EXTRACT_SCALAR function.

Example Query:

For example, let’s extract all of the parks in Minneapolis / St. Paul:

SELECT
	geometry_id,
	class,
	subclass,
	JSON_EXTRACT_SCALAR(names, '$.local') as name,
	CAST(JSON_EXTRACT_SCALAR(metadata, '$.surface_area_sq_m') AS double) as area,
	wkt
from daylight_earth
WHERE release = 'v1.43'
  AND theme = 'landuse'
  AND class = 'park'
  AND subclass <> 'grass'
  AND ST_CONTAINS(
  	ST_GEOMETRYFROMTEXT(
  		'POLYGON((-93.67167390882966 44.88741416375285,-93.29662598669525 45.24265977467033,-92.84310065209863 44.96670966763787,-93.21197278797622 44.64095920238205,-93.67167390882966 44.88741416375285))'),
  	ST_GEOMETRYFROMTEXT(wkt)
  )