Introduction
The purpose of this post is to highlight the value of vector tiles in delivering large, complex geometries to a frontend client for visualization. Rather than simply converting a GTFS feed’s shapes.txt file to GeoJSON and serving that file, the data can be compressed and optimized for delivery using tippecanoe, which converts GeoJSON to vector tiles.
Streaming of geodata on read and write
In addition, I’ll use newline-delimited JSON, which lets me stream each feature geometry in the feature collection to the output GeoJSON file one shape at a time. This reduces memory overhead (there is no need to hold the whole geometry, or the whole shapes.txt file, in memory) and allows the approach to be applied to larger or more geometrically complex datasets.
Overview of algorithm
First, write the initial part of the Feature Collection:
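A minimal sketch of this opening step, along with the matching closing step. The helper names `open_feature_collection` and `close_feature_collection` are hypothetical, chosen for illustration, and an in-memory buffer stands in for the output file:

```python
import io
import json


def open_feature_collection(out):
    """Write the opening of a FeatureCollection; features follow, one per line."""
    out.write('{"type": "FeatureCollection", "features": [\n')


def close_feature_collection(out):
    """Write the closing bracket and brace that end the FeatureCollection."""
    out.write("\n]}\n")


# With no features written in between, the wrapper alone is still valid JSON.
buf = io.StringIO()
open_feature_collection(buf)
close_feature_collection(buf)
parsed = json.loads(buf.getvalue())
```

Because the wrapper is written by hand, the features in between can be appended one at a time without ever building the full collection in memory.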
After this, we can assemble the line string representing each route from the GTFS and append it to the file on its own line. Then we write the end of the Feature Collection, and the file is complete.
Here’s the full code for reading through the GTFS zip file’s shapes.txt file and writing each new geometry as a new line:
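What follows is a sketch of how such a script could look, not a definitive implementation. The function names `iter_shapes` and `write_geojson` are hypothetical, and the sketch assumes that all rows for a given shape_id are contiguous in shapes.txt. The GTFS spec does not guarantee that ordering, so a feed with interleaved shapes would need a pre-sort first.

```python
import csv
import io
import json
import zipfile
from itertools import groupby


def iter_shapes(gtfs_zip_path):
    """Yield (shape_id, [[lon, lat], ...]) one shape at a time.

    ASSUMPTION: rows for each shape_id are contiguous in shapes.txt.
    """
    with zipfile.ZipFile(gtfs_zip_path) as zf:
        with zf.open("shapes.txt") as raw:
            # utf-8-sig drops a byte-order mark if the feed has one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            reader = csv.DictReader(text)
            for shape_id, rows in groupby(reader, key=lambda r: r["shape_id"]):
                # Only the points of the current shape are held in memory.
                pts = sorted(rows, key=lambda r: int(r["shape_pt_sequence"]))
                coords = [
                    [float(p["shape_pt_lon"]), float(p["shape_pt_lat"])]
                    for p in pts
                ]
                yield shape_id, coords


def write_geojson(gtfs_zip_path, out_path):
    """Stream one LineString feature per line; return the feature count."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        out.write('{"type": "FeatureCollection", "features": [\n')
        for shape_id, coords in iter_shapes(gtfs_zip_path):
            if len(coords) < 2:
                continue  # a LineString needs at least two positions
            feature = {
                "type": "Feature",
                "properties": {"shape_id": shape_id},
                "geometry": {"type": "LineString", "coordinates": coords},
            }
            if count:
                out.write(",\n")  # separator before every feature but the first
            out.write(json.dumps(feature))
            count += 1
        out.write("\n]}\n")
    return count
```

Calling `write_geojson("gtfs.zip", "transit_lines.geojson")` (the zip path here is a placeholder) would produce a file in which each feature sits on its own line, while the file as a whole still parses as a regular GeoJSON FeatureCollection.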
Using tippecanoe to compress geodata and serve vector tiles
The output of the prior Python process is the file transit_lines.geojson. For the LA Metro system, this file is over 19 MB, which is too large to reasonably serve to the client.
With tippecanoe, we can compress and simplify these geometries using little more than tippecanoe’s defaults.
Here’s the CLI command that converts the GeoJSON to a vector tile set, using two flags to auto-clean the geometries so that they can be served as compressed vector tiles:
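One plausible form of that command is sketched below as a small Python wrapper, so it can follow directly on the conversion script. The two flags used here, `-zg` (let tippecanoe guess a maximum zoom) and `--drop-densest-as-needed` (drop features in tiles that would otherwise be too large), are an assumption about which flags were meant. Both are real tippecanoe options, but the original command may have differed. The function names are hypothetical.

```python
import shutil
import subprocess


def tippecanoe_command(src="transit_lines.geojson", dst="transit_lines.mbtiles"):
    """Build the argument list for a tippecanoe run.

    Shell equivalent:
      tippecanoe -zg --drop-densest-as-needed -o transit_lines.mbtiles transit_lines.geojson
    """
    return ["tippecanoe", "-zg", "--drop-densest-as-needed", "-o", dst, src]


def build_tiles(src="transit_lines.geojson", dst="transit_lines.mbtiles"):
    """Run tippecanoe, failing early if it is not installed."""
    if shutil.which("tippecanoe") is None:
        raise RuntimeError("tippecanoe was not found on PATH")
    subprocess.run(tippecanoe_command(src, dst), check=True)
```

Keeping the argument list in its own function makes the command easy to inspect or log before anything is executed.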
This operation outputs the file transit_lines.mbtiles. The original 19 MB file has been reduced to 344 KB, roughly a 55x reduction in file size (19,000 KB / 344 KB ≈ 55).
We can now view this file in the browser, as shown in the screenshot below:
To quickly view the lines, there’s an npm package called mbview (package details here) that will serve the data locally:
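A hedged sketch of launching the viewer, written in Python to match the conversion script. It assumes mbview has been installed globally (for example with `npm install -g @mapbox/mbview`) and that a Mapbox access token is available in the `MAPBOX_ACCESS_TOKEN` environment variable, which mbview expects. The port number and function names are illustrative.

```python
import os
import subprocess


def mbview_command(mbtiles="transit_lines.mbtiles", port=9000):
    """Build the argument list for mbview.

    Shell equivalent: mbview --port 9000 transit_lines.mbtiles
    """
    return ["mbview", "--port", str(port), mbtiles]


def serve_tiles(mbtiles="transit_lines.mbtiles", port=9000):
    """Start mbview; blocks until the local server is stopped."""
    if "MAPBOX_ACCESS_TOKEN" not in os.environ:
        raise RuntimeError("set MAPBOX_ACCESS_TOKEN before starting mbview")
    subprocess.run(mbview_command(mbtiles, port), check=True)
```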
Next steps
The automatic configuration for tippecanoe is not always perfect. With such a large and complex dataset, simplification may have some undesired effects.
For example, different lines may be simplified in different ways, so that when you zoom in, not all route path lines stay aligned with the road centerline or lane centerline, and a “spaghetti” mess of lines can show up along a road at higher zooms:
To tackle this, either retain more accuracy at higher zoom levels (at the cost of a larger file) or explore other settings in the tippecanoe documentation.
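As a starting point for that exploration, the sketch below builds a tippecanoe command that trades file size for detail. The flags `--maximum-zoom`, `--simplification`, `--no-line-simplification`, and `--force` are existing tippecanoe options. The specific values (zoom 16, the function name `detail_command`) are assumptions for illustration, not tuned settings for this dataset.

```python
def detail_command(src="transit_lines.geojson", dst="transit_lines.mbtiles",
                   max_zoom=16, simplification=None):
    """Build a tippecanoe command that keeps more geometric detail.

    simplification=None turns line simplification off entirely; a number is
    passed through as a multiplier on tippecanoe's simplification tolerance.
    """
    cmd = ["tippecanoe", f"--maximum-zoom={max_zoom}"]
    if simplification is None:
        cmd.append("--no-line-simplification")
    else:
        cmd.append(f"--simplification={simplification}")
    # --force overwrites an existing .mbtiles file from an earlier run.
    cmd += ["--force", "-o", dst, src]
    return cmd
```

Comparing the output size and the rendered lines across a few of these variants is a reasonable way to find where the spaghetti effect stops being noticeable.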