stplanpy.geo module

This module performs various operations on geospatial vector data. Shapefiles of traffic analysis zones (TAZ), places, and counties can be found online.

stplanpy.geo.cent(gdf: geopandas.geodataframe.GeoDataFrame, column_name='tazce') → geopandas.geodataframe.GeoDataFrame

Compute centroids from a GeoDataFrame.

This function computes the centroids of the geometries in a GeoDataFrame and returns them as a new GeoDataFrame. By default, the “tazce” column is carried over to the new GeoDataFrame.
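To make the notion of a centroid concrete, the following is a minimal pure-Python sketch of the area-weighted centroid of a simple polygon, using the standard shoelace formula. It is an illustration only, not the implementation used by cent(); the function name polygon_centroid and the rectangle zone are hypothetical.

```python
def polygon_centroid(ring):
    """Return the area-weighted centroid (x, y) of a simple polygon.

    `ring` is a list of (x, y) vertices; the last vertex is joined
    back to the first one implicitly.
    """
    area2 = 0.0  # twice the signed area of the polygon
    cx = 0.0
    cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)


# Hypothetical zone: a rectangle that is 4 units wide and 2 units high
zone = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
centroid = polygon_centroid(zone)
```

For a concave zone the centroid can fall outside the geometry itself, which is one situation where a manual correction with corr_cent may be useful.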

Parameters

column_name (str, defaults to "tazce") – Name of an input column to include in the output GeoDataFrame. A list of column names is also accepted.

Returns

GeoDataFrame with the centroids of the geometries in the input GeoDataFrame. The coordinate reference system (crs) of the output GeoDataFrame is the same as that of the input GeoDataFrame.

Return type

geopandas.GeoDataFrame

See also

corr_cent

Examples

The example data file, “tl_2011_06_taz10.zip”, can be downloaded from GitHub.

from stplanpy import geo

# Limit calculation to these counties
counties = ["001", "013", "041", "055", "075", "081", "085", "095", "097"]

# Read TAZ data from zip file
taz = geo.read_shp("tl_2011_06_taz10.zip")

# Rename columns
taz.rename(columns = {"countyfp10":"countyfp", "tazce10":"tazce"}, inplace = True)

# Filter by county
taz = taz[taz["countyfp"].isin(counties)]

# Compute centroids
taz_cent = taz.cent()
stplanpy.geo.corr_cent(gdf: geopandas.geodataframe.GeoDataFrame, index, lon, lat, index_name='tazce', crs='EPSG:4326') → geopandas.geodataframe.GeoDataFrame

Correct centroid coordinates.

This function manually corrects a centroid position computed by the cent() function. The required input parameters are the index of the centroid to correct and its new longitude, lon, and latitude, lat.

Parameters
  • index (str) – Index of the centroid to modify.

  • lon (float) – New longitude of the corrected centroid, expressed in the coordinate reference system given by crs (“EPSG:4326” by default).

  • lat (float) – New latitude of the corrected centroid, expressed in the coordinate reference system given by crs (“EPSG:4326” by default).

  • index_name (str, defaults to "tazce") – Name of the column that the index value refers to.

  • crs (str, defaults to "EPSG:4326") – Coordinate reference system (crs) of the lon and lat variables.
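The role of index and index_name can be sketched without any geospatial library. The example below replaces the coordinates of one record, selected by the value in its identifier column. It is only an illustration under simplifying assumptions: the records are plain dictionaries with hypothetical "x" and "y" keys, the new coordinates are assumed to be in the same coordinate reference system as the existing ones (so no reprojection is shown), and the helper name correct_centroid is hypothetical. The sketch returns a modified copy, which is a choice made here and says nothing about how corr_cent itself behaves.

```python
def correct_centroid(rows, index, x, y, index_name="tazce"):
    """Return a copy of `rows` in which the matching record is moved to (x, y)."""
    corrected = []
    for row in rows:
        if row[index_name] == index:
            # Build a new record so the input list is left untouched
            row = {**row, "x": x, "y": y}
        corrected.append(row)
    return corrected


# Hypothetical centroid records
centroids = [
    {"tazce": "00101155", "x": 1.0, "y": 2.0},
    {"tazce": "00101156", "x": 3.0, "y": 4.0},
]
fixed = correct_centroid(centroids, "00101155", 10.0, 20.0)
```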

Returns

GeoDataFrame with the corrected centroid. The coordinate reference system (crs) of the output GeoDataFrame is the same as that of the input GeoDataFrame.

Return type

geopandas.GeoDataFrame

See also

cent

Examples

The example data file, “tl_2011_06_taz10.zip”, can be downloaded from GitHub.

from stplanpy import geo

# Limit calculation to these counties
counties = ["001", "013", "041", "055", "075", "081", "085", "095", "097"]

# Read TAZ data from zip file
taz = geo.read_shp("tl_2011_06_taz10.zip")

# Rename columns
taz.rename(columns = {"countyfp10":"countyfp", "tazce10":"tazce"}, inplace = True)

# Filter by county
taz = taz[taz["countyfp"].isin(counties)]

# Compute centroids
taz_cent = taz.cent()

# Correct centroid location
taz_cent.corr_cent("00101155", -122.078052, 37.423328)
stplanpy.geo.in_county(plc: geopandas.geodataframe.GeoDataFrame, cnt: geopandas.geodataframe.GeoDataFrame, area_min=0.1) → geopandas.geodataframe.GeoDataFrame

Check in which county a place is located.

From one GeoDataFrame with places and one GeoDataFrame containing counties, this function computes in which county a place is situated. A threshold value is used to handle potential misalignment of the borders.

Parameters
  • cnt (geopandas.GeoDataFrame) – GeoDataFrame with the county geometries in which the places are located.

  • area_min (float, defaults to 0.1) – If the surface area of a place inside a county, divided by the full surface area of the place, is smaller than this threshold value, the match is discarded. This is a workaround for geometries whose borders are not fully aligned.
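The effect of the area_min threshold can be illustrated with a small pure-Python sketch. To keep it self-contained, places and counties are modelled as axis-aligned rectangles (xmin, ymin, xmax, ymax) rather than real geometries; the helper names, the rectangles, and the way county codes are assigned to them are all hypothetical, and this is not the implementation used by in_county.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    return rect_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))


def counties_for_place(place, counties, area_min=0.1):
    """Return the codes of the counties that hold at least `area_min` of the place."""
    full = rect_area(place)
    matches = []
    for code, county in counties.items():
        ratio = overlap_area(place, county) / full
        if ratio >= area_min:  # smaller overlaps are treated as border noise
            matches.append(code)
    return matches


# A place that sticks 5% into a neighbouring county due to misaligned borders
place = (0.0, 0.0, 10.0, 10.0)
counties = {
    "085": (-5.0, -5.0, 9.5, 15.0),
    "081": (9.5, -5.0, 20.0, 15.0),
}
result = counties_for_place(place, counties)
```

In this sketch 95% of the place lies in the first rectangle and 5% in the second, so only the first county is kept with the default threshold of 0.1.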

Returns

GeoDataFrame with column names “placefp”, “name”, “countyfp”, and “geometry”. “countyfp” contains the index of the county.

Return type

geopandas.GeoDataFrame

See also

in_place

Examples

The example data files, “ca-county-boundaries.zip” and “tl_2020_06_place.zip”, can be downloaded from GitHub.

from stplanpy import geo

# Limit calculation to these counties
counties = ["001", "013", "041", "055", "075", "081", "085", "095", "097"]

# Read County data from zip file
county = geo.read_shp("ca-county-boundaries.zip")

# Filter on county codes
county = county[county["countyfp"].isin(counties)]

# Select columns to keep
county = county[["name", "countyfp", "geometry"]]

# Read Place data from zip file
place = geo.read_shp("tl_2020_06_place.zip")

# Rename to Mountain View, Martinez
place.loc[(place["placefp"] == "49651"), "name"] = "Mountain View, Martinez"

# Compute which places lie inside which county
place = place.in_county(county)
stplanpy.geo.in_place(taz: geopandas.geodataframe.GeoDataFrame, plc: geopandas.geodataframe.GeoDataFrame, area_min=0.001, area_thr=0.9999) → geopandas.geodataframe.GeoDataFrame

Check in which place a traffic analysis zone (TAZ) is located.

From one GeoDataFrame with traffic analysis zones (TAZ) and one GeoDataFrame containing places, this function computes in which place a TAZ is situated. Threshold values are used to handle potential misalignment of the borders. If a TAZ is situated across multiple places, additional rows are created for the parts that are situated within each place.

Parameters
  • plc (geopandas.GeoDataFrame) – GeoDataFrame with the place geometries in which the TAZ are located.

  • area_min (float, defaults to 0.001) – If the surface area of a TAZ inside a place, divided by the full surface area of the TAZ, is smaller than this threshold value, the match is discarded. This is a workaround for geometries whose borders are not fully aligned.

  • area_thr (float, defaults to 0.9999) – If the surface area of a TAZ inside a place, divided by the full surface area of the TAZ, is larger than this threshold value, the TAZ is considered to be located completely within this place. This is a workaround for geometries whose borders are not fully aligned.
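The interplay of the two thresholds can be summarised as a three-way decision on the area ratio. The sketch below is a minimal illustration of that decision only; the function name classify_overlap, the labels, the place codes, and the areas are hypothetical, and the real in_place function works on geometries rather than on precomputed areas.

```python
def classify_overlap(ratio, area_min=0.001, area_thr=0.9999):
    """Classify the share of a TAZ that lies inside a place.

    `ratio` is the area of the TAZ inside the place divided by the
    full area of the TAZ.
    """
    if ratio < area_min:
        return "discard"   # sliver caused by misaligned borders
    if ratio > area_thr:
        return "inside"    # treated as lying completely within the place
    return "partial"       # the TAZ is genuinely split across places


# Hypothetical TAZ with a total area of 1000.0 and its overlap with three places
taz_area = 1000.0
overlaps = {"49670": 600.0, "55282": 399.5, "68252": 0.5}
labels = {placefp: classify_overlap(area / taz_area)
          for placefp, area in overlaps.items()}
```

Here the first two places each receive a row for their part of the TAZ, while the third overlap is small enough to be attributed to border misalignment.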

Returns

GeoDataFrame with column names “tazce”, “countyfp”, “placefp”, “geometry”, and “area”. “placefp” contains the index of the place.

Return type

geopandas.GeoDataFrame

See also

in_county

Examples

The example data files, “tl_2020_06_place.zip” and “tl_2011_06_taz10.zip”, can be downloaded from GitHub.

from stplanpy import geo

# Limit calculation to these counties
counties = ["001", "013", "041", "055", "075", "081", "085", "095", "097"]

# Read TAZ data from zip file
taz = geo.read_shp("tl_2011_06_taz10.zip")

# Rename columns for consistency
taz.rename(columns = {"countyfp10":"countyfp", "tazce10":"tazce"}, inplace = True)

# Filter on county codes
taz = taz[taz["countyfp"].isin(counties)]

# Read Place data from zip file
place = geo.read_shp("tl_2020_06_place.zip")

# Compute which TAZ lie inside which place, and which part of each
taz = taz.in_place(place)
stplanpy.geo.read_shp(file_name, tmp_dir='tmp', crs='EPSG:6933')

Read (zipped) shape files.

Read (zipped) shape files into a GeoDataFrame with a number of default options. The coordinate reference system (crs) defaults to “EPSG:6933” and all column names are converted to lower case.

Parameters
  • file_name (str) – Name and path of the (zipped) shape file.

  • tmp_dir (str, defaults to "tmp") – Name of temporary directory to extract the zip archive to.

  • crs (str, defaults to "EPSG:6933") – The coordinate reference system (crs) of the output GeoDataFrame. The default value is “EPSG:6933”.
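The two conveniences described above, transparent handling of zip archives and lower-case column names, can be sketched with the standard library alone. This is an assumption about the general approach, not the actual code of read_shp; the helper names find_shapefile and lower_case_columns and the demo archive are hypothetical.

```python
import os
import tempfile
import zipfile


def find_shapefile(file_name, tmp_dir="tmp"):
    """Return the path of a .shp file, extracting a zip archive first if needed."""
    if not file_name.lower().endswith(".zip"):
        return file_name
    with zipfile.ZipFile(file_name, "r") as zip_ref:
        zip_ref.extractall(tmp_dir)
        members = [name for name in zip_ref.namelist()
                   if name.lower().endswith(".shp")]
    if not members:
        raise FileNotFoundError("no .shp file found in " + file_name)
    return os.path.join(tmp_dir, members[0])


def lower_case_columns(columns):
    """Return the column names converted to lower case."""
    return [name.lower() for name in columns]


# Demonstration with an empty placeholder archive in a throw-away directory
with tempfile.TemporaryDirectory() as work:
    archive = os.path.join(work, "demo.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("demo.shp", b"")
        zf.writestr("demo.dbf", b"")
    shp_path = find_shapefile(archive, tmp_dir=os.path.join(work, "tmp"))
    shp_exists = os.path.exists(shp_path)
    shp_name = os.path.basename(shp_path)

columns = lower_case_columns(["TAZCE10", "COUNTYFP10", "geometry"])
```

Lower-casing the column names is what allows the examples in this module to refer to columns such as "tazce10" and "countyfp10" regardless of the capitalisation used in the source file.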

Returns

GeoDataFrame with all the column names found in the shape file inside the zip archive, converted to lower case.

Return type

geopandas.GeoDataFrame

See also

to_geojson

Examples

The example data file, “tl_2011_06_taz10.zip”, can be downloaded from GitHub. Read a shape file:

import shutil
import zipfile
from stplanpy import geo

# Extract to temporary location
with zipfile.ZipFile("tl_2011_06_taz10.zip", "r") as zip_ref:
    zip_ref.extractall("tmp")

# Read taz data from shp file
taz = geo.read_shp("tmp/" + "tl_2011_06_taz10.shp")

# Clean up tmp files
shutil.rmtree("tmp")

Read a shape file from a zip file:

from stplanpy import geo

# Read taz data from zip file
taz = geo.read_shp("tl_2011_06_taz10.zip")
stplanpy.geo.to_geojson(gdf: geopandas.geodataframe.GeoDataFrame, file_name, crs='EPSG:4326')

Write GeoDataFrame to a GeoJSON file.

Write a GeoDataFrame to a GeoJSON file with the default coordinate reference system (crs) “EPSG:4326”. If a GeoDataFrame has multiple columns containing geometries, only the column GeoDataFrame.geometry.name is kept.

Parameters

crs (str, defaults to "EPSG:4326") – The coordinate reference system (crs) of the output GeoJSON file.
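For readers unfamiliar with the output format, the sketch below builds a small GeoJSON FeatureCollection with the standard json module. It only illustrates the structure of such a file and the idea of keeping a single geometry column; it is not the implementation of to_geojson, and the helper name feature_collection and the sample record are hypothetical. GeoJSON stores coordinates in longitude, latitude order, which matches the “EPSG:4326” default.

```python
import json


def feature_collection(rows, geometry_name="geometry"):
    """Build a GeoJSON FeatureCollection from a list of record dictionaries.

    The entry named `geometry_name` becomes the feature geometry; all
    other entries become feature properties.
    """
    features = []
    for row in rows:
        properties = {key: value for key, value in row.items()
                      if key != geometry_name}
        features.append({
            "type": "Feature",
            "properties": properties,
            "geometry": row[geometry_name],
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical record: one centroid given as longitude, latitude
rows = [
    {
        "tazce": "00101155",
        "geometry": {"type": "Point", "coordinates": [-122.078052, 37.423328]},
    }
]
text = json.dumps(feature_collection(rows), indent=2)
```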

See also

read_shp

Examples

The example data file, “tl_2011_06_taz10.zip”, can be downloaded from GitHub.

from stplanpy import geo

# Read taz data from zip file
taz = geo.read_shp("tl_2011_06_taz10.zip")

# Write to file
taz.to_geojson("taz.GeoJson")