Using just parcel data shapes to estimate segments of road that are >45' feet wide. pic.twitter.com/SIvbCf0nIw
— Kuan Butts (@buttsmeister) February 16, 2018
Above: The tweet that was the basis for this blog post.
Introduction
Senate Bill 827 proposes upzoning within one quarter and one half mile of high quality transit stops. Identifying those zones is a task that others have already taken on. Two examples are Transit Rich Housing, which identifies all upzoned areas in California, and my previous post on identifying issues with what constitutes a high quality bus stop.
Using either of these resources, we can identify high quality stop locations. With those stops in hand, we face an additional hurdle. While we know that all qualifying sites are being upzoned, the proposal is that those facing streets wider than 45’ be upzoned even further.
The proposition currently states 45’ curb to curb but, as this San Francisco Planning Department memo points out, street width is more often measured via the right of way available. Regardless of how it is measured, this is the aspect of the bill that is the most difficult to evaluate, because there is limited data available on street widths.
Post Structure and Logic
This post will propose a method of identifying street width that uses only parcel data. The reason for using only parcel data is that it is the most common and most consistent data available for major urban metros in California. While OpenStreetMap has good road segment data for most of the major urban areas of the state, it lacks sufficiently accurate street width information.
Because of this, I want to explore a method that relies on as little data as possible. In this post, I will describe a method of identifying street width that relies solely on the geometries of the parcels that are within a quarter mile of each qualifying bus stop. We will also use the valid stops dataset that was generated from the workflow presented in my last blog post as a reference layer to find the bus stop locations to be evaluated.
Getting started
Here are the libraries we will be using to execute this post’s workflow:
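The original import list is not reproduced here, so the following is a plausible minimal set based on the libraries referenced throughout this post (GeoPandas with Rtree installed for spatial indexing, and Shapely for geometry operations):

```python
# A plausible set of imports for this workflow (the original post's exact
# import list is not shown, so this is an assumption)
import geopandas as gpd               # GeoDataFrames + rtree-backed spatial indexes
from shapely.ops import unary_union   # merge many geometries into one shape
```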
Generating base data
As described earlier, the base stop points are based on the data used in this example project.
For the purposes of this post, we will work on just a subset of all these stops. The operation described is not terribly performant, so processing all stops in the state would be a task best performed with parallelization.
I’ve subset the stops to the area within a set of bounds that I drew as a GeoJSON at geojson.io.
For parcels data, I’ll use the parcels dataset available from DataSF, San Francisco’s open data portal.
Both the stops and the parcels are then loaded up into GeoDataFrames and cast to EPSG:2163, an equal-area projection measured in meters:
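A sketch of that loading step, assuming the stops layer (already subset to the Dogpatch-area bounds) and the DataSF parcels export sit on disk at the hypothetical paths below:

```python
# Hypothetical file paths; the stops come from the previous post's output
# and the parcels from the DataSF parcels export
stops_gdf = gpd.read_file('data/valid_stops_subset.geojson')
parcels_gdf = gpd.read_file('data/sf_parcels.geojson')

# Cast both layers to EPSG:2163 (an equal-area, meter-based projection)
# so that all buffers and distances below can be expressed in meters
stops_gdf = stops_gdf.to_crs(epsg=2163)
parcels_gdf = parcels_gdf.to_crs(epsg=2163)
```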
Creating target base layers
Now that we have our data represented as Python objects, we should be able to trim each down to just what we care about. In the case of the parcel data, we only care about the geometric shapes. As a result, we can pare the data down to just a GeoSeries:
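Continuing from the GeoDataFrames loaded above, that paring step is a one-liner:

```python
# Keep only the parcel geometries; attributes are not needed for this analysis
parcels = parcels_gdf.geometry
```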
Then, we can convert the bus stops in the target area to a series of Shapely Point objects:
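Likewise for the stops, which become a GeoSeries of Point geometries:

```python
# Pull out the stop locations as a GeoSeries of Shapely Points
stop_points = stops_gdf.geometry
```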
Once that is done, we can also buffer the resulting GeoSeries of points and union the buffered shapes:
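A sketch of that buffer-and-union step, using roughly 402 meters for a quarter mile:

```python
QUARTER_MILE_M = 402.336  # one quarter mile expressed in meters

# Buffer each stop point and dissolve the buffers into one coverage shape
stop_buffers = stop_points.buffer(QUARTER_MILE_M)
coverage_area = stop_buffers.unary_union
```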
This gives us the 1/4 mile coverage area around each bus stop that we will be examining. All parcels within this area qualify to be upzoned under SB 827.
What still needs to be identified is which of these parcels lie along a street wider than 45’. We will attempt to estimate this by performing a series of geometric operations on each parcel shape against its neighbors to assess which parcels are on streets that are likely wider than 45’.
To do this, let’s first subset all the parcel shapes to just those that intersect with these 1/4-mile buffer areas:
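That subsetting step, continuing with the coverage_area shape from above:

```python
# Keep only the parcels that intersect the quarter-mile coverage area,
# and reset the index so positional and label indexing stay aligned
parcels_sub = parcels[parcels.intersects(coverage_area)].reset_index(drop=True)
```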
Here, we can see the resulting subset:
Similarly, let’s generate the same data, but for areas at a 45-foot and 200-foot buffer around each parcel:
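A sketch of those two pre-buffered layers, converting the foot distances to meters for the EPSG:2163 projection:

```python
FT_45_M = 45 * 0.3048    # 45 feet, roughly 13.7 meters
FT_200_M = 200 * 0.3048  # 200 feet, roughly 61.0 meters

# Pre-buffer every subset parcel at both distances
parcels_buff_45 = parcels_sub.buffer(FT_45_M)
parcels_buff_200 = parcels_sub.buffer(FT_200_M)
```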
Because we will be performing a number of spatial intersections on these GeoDataFrames as a whole, it is worthwhile to produce spatial indices for each:
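Building the indexes is cheap relative to the per-parcel work that follows:

```python
# Rtree-backed spatial indexes, built once up front, so the many
# per-parcel lookups below do not have to scan every geometry
sindex_45 = parcels_buff_45.sindex
sindex_200 = parcels_buff_200.sindex
```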
These will be used in conjunction with the following function:
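Roughly, a helper like the one below (the name and signature are my reconstruction, not necessarily the original's):

```python
def get_intersecting(target_geom, buffered_series, sindex):
    """Return the members of buffered_series that intersect target_geom,
    using the spatial index to pre-filter candidates by bounding box."""
    # First pass: cheap bounding-box lookup against the rtree index
    candidate_idxs = list(sindex.intersection(target_geom.bounds))
    candidates = buffered_series.iloc[candidate_idxs]

    # Second pass: exact geometric intersection on the candidates only
    return candidates[candidates.intersects(target_geom)]
```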
This allows us to perform the intersections while leveraging Rtree, which enables faster lookups on these larger parcel datasets. Note: Thanks to Twitter user @sidkap_ for reminding me to use spatial indexes here.
At this point, we are ready to enter the main analysis logic.
Core algorithm introduction
At this point, we will iterate through each of the geometries in the parcels subset GeoDataFrame.
Note that this isn’t going to be the most elegant method, but will be sufficient for outlining how such a system would work. Further refinements should be made to improve this in the name of both efficiency and legibility.
Core algorithm
Each of the steps described below is performed once per iteration of the for loop.
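The outer loop skeleton might look like the following, where street_shapes is an illustrative name for the list that accumulates one street geometry per parcel:

```python
# Outer loop skeleton: one pass per parcel in the quarter-mile subset
street_shapes = []

for target_parcel in parcels_sub:
    # ...the steps described below run here, once per parcel...
    pass
```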
First, we find all parcels that are closer than the 45’ distance. We assume the space between these parcels is either nothing or a street narrower than 45’. We use the pre-buffered 45’ dataset against the target parcel to find all other parcels that fall within this definition.
We do the same with the 200’ buffered dataset. 200’ is an arbitrary value; I use it as a distance that I assume most street widths will not exceed, so I feel safe omitting all parcels beyond it when trying to find street area.
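Inside the loop, those two lookups might look like this, using the spatial-index helper from earlier (variable names are assumptions):

```python
# Parcels whose 45-foot buffer reaches the target parcel (i.e. within 45')
nearby_45 = get_intersecting(target_parcel, parcels_buff_45, sindex_45)

# Parcels whose 200-foot buffer reaches the target parcel (i.e. within 200')
nearby_200 = get_intersecting(target_parcel, parcels_buff_200, sindex_200)
```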
Note: Prior to the update to include spatial indexing, the above two intersection operations looked like this:
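That is, something along these lines, running the exact intersection test against every buffered parcel with no bounding-box pre-filter:

```python
# Slower, pre-spatial-index version: check every buffered parcel directly
nearby_45 = parcels_buff_45[parcels_buff_45.intersects(target_parcel)]
nearby_200 = parcels_buff_200[parcels_buff_200.intersects(target_parcel)]
```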
Now what I want is just the parcels that are within 200’ of the target parcel but also not within 45’ of the target parcel:
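Since both buffered layers share the index of parcels_sub, this can be expressed as a simple index difference (note that the target parcel itself appears in both sets, so it drops out here as well):

```python
# Within 200 feet but not within 45 feet of the target parcel
all_nearby = nearby_200[~nearby_200.index.isin(nearby_45.index)]
```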
From this point on, we need to iterate through all_nearby and find just the parcels that are “across” from the target parcel. To do this, we perform the following looped operation:
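The sketch below reconstructs that inner loop under the assumptions established so far (the 10-square-meter corner threshold is illustrative, not a value from the original):

```python
intersecting_areas = []  # tracking list for the street-area overlaps

for idx in all_nearby.index:
    # Map back from the buffered shape to the original parcel footprint
    other_parcel = parcels_sub.loc[idx]

    # Distance between the two parcel footprints
    dist = target_parcel.distance(other_parcel)

    # Redundant sanity check: toss anything clearly out of range
    if dist > FT_200_M:
        continue

    # Buffer the target parcel out to the edge of the other parcel
    target_buff = target_parcel.buffer(dist)

    # "Janky" corner check: if a 1-meter buffer of the other parcel only
    # grazes the target buffer, the parcel is probably diagonal to the
    # target (only a corner intersects) and gets ignored
    graze = target_buff.intersection(other_parcel.buffer(1))
    if graze.area < 10:  # square meters; illustrative threshold
        continue

    # Buffer both parcels by the same separation distance and keep the
    # overlap, a rough proxy for the street area between them
    other_buff = other_parcel.buffer(dist)
    intersecting_areas.append(target_buff.intersection(other_buff))

# Union the overlaps into a single geometry of this parcel's adjoining
# street shapes, and track it for the post-processing step
adjoining_streets = unary_union(intersecting_areas)
street_shapes.append(adjoining_streets)
```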
I’ll break this down here.
For each nearby parcel, we first get its distance from the target parcel:
There’s a redundant sanity check to just toss any definite outliers:
Then we buffer the target parcel to the edge of the other nearby parcel:
Note: The above image is an example showing the operation that is being performed. The site used in this example is a different location than the Dogpatch subset; it is shown for explanatory purposes only.
Then I do something janky: I buffer the other parcel by 1 meter and check what the intersection size is.
If the size of the intersection area is small, I can assume we have a situation where the parcel is diagonal to the target parcel. Thus, only a corner is intersecting. These parcels will be ignored.
If the nearby parcel passes this sniff test, I buffer both it and the target parcel by the same distance (the distance each is away from the other). I then intersect these two buffers and add that shape to the intersecting areas tracking list:
Note: The above image is an example showing the operation that is being performed. The site used in this example is a different location than the Dogpatch subset; it is shown for explanatory purposes only.
Once the loop is done, I can union all the intersection areas and create a single geometry for that parcel’s adjoining street shapes:
Note: The above image is an example showing the operation that is being performed. The site used in this example is a different location than the Dogpatch subset; it is shown for explanatory purposes only.
Processing intersection shapes of nearby streets
Once each parcel has been processed, we have a list of nearby street shapes for each parcel. These need to be joined together and then the shape of the parcel footprints punched back out of the shapes (to trim them up):
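A sketch of that post-processing step, assuming street_shapes is the per-parcel list accumulated by the loop above:

```python
# Merge every parcel's adjoining-street geometry into one shape, then
# punch the parcel footprints back out so only street area remains
all_streets = unary_union(street_shapes)
uus_diffed = all_streets.difference(parcels_sub.unary_union)
```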
Final results
Once we have generated the final variation of the uus_diffed variable, we simply need to identify which parcels touch this shape representing large streets. Those parcels that abut this geometry are the ones that get the additional wide street bonus (highlighted in green); those that do not abut it do not receive the additional height increase.
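That last check might be expressed like so (because the parcel footprints were punched out of uus_diffed, abutting parcels share a boundary with it, so a simple intersects test works):

```python
# Parcels that abut the wide-street geometry get the wide street bonus
wide_street_parcels = parcels_sub[parcels_sub.intersects(uus_diffed)]
```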
As you can see in the resulting image, most of San Francisco is likely to be upzoned at the wide street height, due to the size of its roads. This is an observation that was echoed in the SF Planning Department’s memo from last week.
A note on the addition of spatial indexing
Thanks to Twitter user @sidkap_ for reminding me to use spatial indexing to speed up the intersection checks. Originally, the Dogpatch operation shown above took about 35 minutes to complete. With spatial indexing implemented, the runtime improved to under 15 minutes.