Subdivide All the Things

One of the things that makes managing geospatial data challenging is the huge variety of scales it covers: areas as large as a continent or as small as a manhole cover.

The data in the database also covers a wide range, from single points to polygons described with thousands of vertices. And size matters! A large object takes more time to retrieve from storage, and more time to run calculations on.

The Natural Earth countries file is a good example of that variation. Load the data into CartoDB and inspect the object sizes using SQL:

SELECT admin, ST_NPoints(the_geom), ST_MemSize(the_geom)
FROM ne_10m_admin_0_countries
ORDER BY ST_NPoints;

Coral Sea Islands are represented with a 4-point polygon, only 112 bytes.
Canada is represented with a 68159-point multi-polygon, 1 megabyte in size!

[Figure: Countries by Size in KB]

Over half (149) of the countries in the table are larger than the database page size (8KB), which means they will take extra time to retrieve.

SELECT Count(*)
FROM ne_10m_admin_0_countries
WHERE ST_MemSize(the_geom) > 8192;

We can see the overhead involved in working with large data by forcing a large retrieval and computation.

Load the Natural Earth populated places into CartoDB as well, and then run a full spatial join between the two tables:

SELECT Count(*)
FROM ne_10m_admin_0_countries countries
JOIN ne_10m_populated_places_simple places
ON ST_Contains(countries.the_geom, places.the_geom);

Even though the places table (7322 rows) and countries table (255 rows) are quite small, the computation still takes a long time (about 30 seconds on my computer).
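If you want to time the join yourself rather than rely on a stopwatch, one option is to wrap it in EXPLAIN ANALYZE, which is standard PostgreSQL. This is only a sketch: it assumes you can run the statement directly against the database (for example through psql), and the timings you see will differ from mine.

```sql
-- Runs the join for real and reports the plan plus actual execution time.
EXPLAIN ANALYZE
SELECT Count(*)
FROM ne_10m_admin_0_countries countries
JOIN ne_10m_populated_places_simple places
  ON ST_Contains(countries.the_geom, places.the_geom);
```

The same wrapper can be put around the subdivided version of the query later on to compare the two plans.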

The large objects cause a number of inefficiencies:

Geographically large areas (like Canada or Russia) have large bounding boxes, so the indexes don’t work as efficiently in winnowing out points that don’t fall within the countries.
Physically large objects have large vertex lists, which take a long time to pass through the containment calculation. This combines with the poor winnowing to make a bad situation worse.
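To get a rough feel for the poor winnowing, you can compare how many point/country pairs pass the bounding-box test with how many are true containments. The sketch below uses the PostGIS && operator, which tests bounding-box overlap only; the column aliases are just illustrative names. It still runs the full containment test on every candidate, so expect it to be about as slow as the join above.

```sql
-- bbox_candidates: pairs whose bounding boxes overlap (what the index can offer up)
-- true_matches:    pairs where the point really is inside the country
SELECT
  Count(*) AS bbox_candidates,
  Sum(CASE WHEN ST_Contains(countries.the_geom, places.the_geom)
           THEN 1 ELSE 0 END) AS true_matches
FROM ne_10m_admin_0_countries countries
JOIN ne_10m_populated_places_simple places
  ON countries.the_geom && places.the_geom;
```

The wider the gap between the two numbers, the more containment tests are being spent on points that were never inside the polygon.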

How can we speed things up? Make the large objects smaller using ST_Subdivide()!

First, generate a new, sub-divided countries table:

CREATE TABLE ne_10m_admin_0_countries_subdivided AS
SELECT ST_Subdivide(the_geom) AS the_geom, admin
FROM ne_10m_admin_0_countries;

Remember to register the table with CartoDB, so that the editor interface can pick it up:

SELECT CDB_CartodbfyTable('ne_10m_admin_0_countries_subdivided');
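The speed-up from subdividing depends on the new table having a spatial index. This post does not say whether the registration step builds one, so if you are following along in a plain PostGIS database instead, you may need to create it yourself. A minimal sketch, assuming no index exists yet on the new table; the index name is made up for illustration:

```sql
-- Assumption: the subdivided table has no spatial index yet.
-- The index name below is hypothetical; pick whatever fits your conventions.
CREATE INDEX ne_10m_admin_0_countries_subdivided_geom_idx
  ON ne_10m_admin_0_countries_subdivided
  USING GIST (the_geom);

-- Refresh planner statistics for the new table.
ANALYZE ne_10m_admin_0_countries_subdivided;
```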

Now we have the same data, but no object has more than 255 vertices (about 4KB)!
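You can check the result with the same functions used earlier to inspect the original table. A quick sketch; the column aliases are illustrative:

```sql
-- How many pieces were produced, and how big is the largest one?
SELECT
  Count(*) AS pieces,
  Max(ST_NPoints(the_geom)) AS max_points,
  Max(ST_MemSize(the_geom)) AS max_bytes
FROM ne_10m_admin_0_countries_subdivided;
```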

[Figure: Subdivided Countries by Size in KB]

Run the spatial join torture test again, and see the change!

SELECT Count(*)
FROM ne_10m_admin_0_countries_subdivided countries
JOIN ne_10m_populated_places_simple places
ON ST_Contains(countries.the_geom, places.the_geom);

On my computer, the return time is about 0.5 seconds, or 60 times faster, even though the countries table is now 8633 rows. The subdivision has accomplished two things:

Each polygon now covers a smaller area, so index searches are less likely to pull up points that are not within the polygon.
Each polygon is now below the page size, so retrieval from disk will be much faster.
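One consequence of subdividing is that each country is now spread across many rows, so per-country results have to be grouped back together. The sketch below counts places per country using the admin column carried over from the original table; the places alias for the count is illustrative.

```sql
-- Each country appears as many pieces; GROUP BY admin merges them again.
SELECT countries.admin, Count(*) AS places
FROM ne_10m_admin_0_countries_subdivided countries
JOIN ne_10m_populated_places_simple places
  ON ST_Contains(countries.the_geom, places.the_geom)
GROUP BY countries.admin
ORDER BY places DESC;
```

One caveat worth keeping in mind: ST_Contains is false for a point lying exactly on a polygon boundary, so a place that falls precisely on a cut line between two pieces would be missed. Switching to ST_Intersects would catch it, but could then count it once for each piece that touches it.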

Subdividing big things can make map drawing faster too, but beware: once your polygons are subdivided you’ll have to turn off the polygon outlines to avoid showing the funny square boundaries in your rendered map.

Happy mapping and querying!
