stplanpy.geo module
This module performs various operations on geospatial vector data. Shapefiles of traffic analysis zones (TAZ), places, and counties can be found online.
- stplanpy.geo.cent(gdf: geopandas.geodataframe.GeoDataFrame, column_name='tazce') → geopandas.geodataframe.GeoDataFrame
Compute centroids from a GeoDataFrame.
This function computes the centroids of the geometries in a GeoDataFrame and returns them as a GeoDataFrame. By default the column name “tazce” is included in the new GeoDataFrame.
- Parameters
column_name (str or list of str, defaults to "tazce") – Name of an input column to include in the output GeoDataFrame. A list of column names is also accepted.
- Returns
GeoDataFrame with the centroids of the geometries in the input GeoDataFrame. The coordinate reference system (crs) of the output GeoDataFrame is the same as that of the input GeoDataFrame.
- Return type
geopandas.GeoDataFrame
Examples
The example data file, “tl_2011_06_taz10.zip”, can be downloaded from GitHub.

    from stplanpy import geo

    # Limit calculation to these counties
    counties = ["001", "013", "041", "055", "075", "081", "085", "095", "097"]

    # Read TAZ data from zip file
    taz = geo.read_shp("tl_2011_06_taz10.zip")

    # Rename columns
    taz.rename(columns={"countyfp10": "countyfp", "tazce10": "tazce"}, inplace=True)

    # Filter by county
    taz = taz[taz["countyfp"].isin(counties)]

    # Compute centroids
    taz_cent = taz.cent()
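To make the notion of a centroid concrete, the following is a minimal pure-Python sketch of the area-weighted centroid of a simple polygon (shoelace formula). It is an illustration only and not the code path used by cent(), which operates on GeoDataFrame geometries. The function name polygon_centroid and the zone codes are hypothetical.

```python
def polygon_centroid(vertices):
    """Return the (x, y) centroid of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area) = sum / (3 * twice_area)
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical zones keyed by a made-up "tazce" code
zones = {
    "000001": [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)],  # rectangle
    "000002": [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)],              # triangle
}
centroids = {tazce: polygon_centroid(ring) for tazce, ring in zones.items()}
```

As with cent(), the result keeps the identifying column (here the dictionary key) next to each centroid, so the points can be joined back to their zones.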
- stplanpy.geo.corr_cent(gdf: geopandas.geodataframe.GeoDataFrame, index, lon, lat, index_name='tazce', crs='EPSG:4326') → geopandas.geodataframe.GeoDataFrame
Correct centroid coordinates
This function can be used to manually correct centroid positions computed using the cent() function. The required input parameters are the index of the centroid that is corrected, and the new longitude, lon, and latitude, lat.
- Parameters
index (str) – Index of the centroid that is modified.
lon (float) – New longitude of the corrected centroid. The default coordinate reference system (crs) is “EPSG:4326”.
lat (float) – New latitude of the corrected centroid. The default coordinate reference system (crs) is “EPSG:4326”.
index_name (str, defaults to "tazce") – Name of the column that the index variable refers to. The default name is “tazce”.
crs (str, defaults to "EPSG:4326") – Coordinate reference system (crs) of the lon and lat variables. The default value is “EPSG:4326”.
- Returns
GeoDataFrame with the corrected centroid. The coordinate reference system (crs) of the output GeoDataFrame is the same as that of the input GeoDataFrame.
- Return type
geopandas.GeoDataFrame
Examples
The example data file, “tl_2011_06_taz10.zip”, can be downloaded from GitHub.

    from stplanpy import geo

    # Limit calculation to these counties
    counties = ["001", "013", "041", "055", "075", "081", "085", "095", "097"]

    # Read TAZ data from zip file
    taz = geo.read_shp("tl_2011_06_taz10.zip")

    # Rename columns
    taz.rename(columns={"countyfp10": "countyfp", "tazce10": "tazce"}, inplace=True)

    # Filter by county
    taz = taz[taz["countyfp"].isin(counties)]

    # Compute centroids
    taz_cent = taz.cent()

    # Correct centroid location
    taz_cent.corr_cent("00101155", -122.078052, 37.423328)
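The lookup-and-overwrite step can be sketched without any geospatial dependencies. The snippet below is a simplified illustration, not the implementation of corr_cent(): it assumes the new coordinates are already expressed in the same coordinate reference system as the stored centroids, so the reprojection from the crs argument is left out. The function name correct_centroid, the record layout, and the zone codes are hypothetical.

```python
def correct_centroid(centroids, index, x, y, index_name="tazce"):
    """Return a copy of `centroids` with the matching record moved to (x, y)."""
    corrected = []
    found = False
    for record in centroids:
        record = dict(record)  # copy so the input records are left untouched
        if record[index_name] == index:
            record["x"], record["y"] = x, y
            found = True
        corrected.append(record)
    if not found:
        raise KeyError(f"no centroid with {index_name}={index!r}")
    return corrected


# Hypothetical centroid records; codes and coordinates are made up
centroids = [
    {"tazce": "000001", "x": 0.0, "y": 0.0},
    {"tazce": "000002", "x": 5.0, "y": 5.0},
]
fixed = correct_centroid(centroids, "000001", 1.5, 2.5)
```

Raising an error for an unknown index is a design choice of this sketch; it avoids silently leaving a misplaced centroid in the data.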
- stplanpy.geo.in_county(plc: geopandas.geodataframe.GeoDataFrame, cnt: geopandas.geodataframe.GeoDataFrame, area_min=0.1) → geopandas.geodataframe.GeoDataFrame
Check in which county a place is located
From one GeoDataFrame with places and one GeoDataFrame containing counties, this function computes in which county a place is situated. A threshold value is used to handle potential misalignment of the borders.
- Parameters
cnt (geopandas.GeoDataFrame) – GeoDataFrame with the county geometries in which the places are located.
area_min (float, defaults to 0.1) – If the ratio of the surface area of a place inside a county divided by the full surface area of the place is smaller than this threshold value, the match is discarded. This is a workaround for geometries whose borders are not fully aligned.
- Returns
GeoDataFrame with column names “placefp”, “name”, “countyfp”, and “geometry”. “countyfp” contains the index of the county.
- Return type
geopandas.GeoDataFrame
Examples
The example data files, “ca-county-boundaries.zip” and “tl_2020_06_place.zip”, can be downloaded from GitHub.

    from stplanpy import geo

    # Limit calculation to these counties
    counties = ["001", "013", "041", "055", "075", "081", "085", "095", "097"]

    # Read county data from zip file
    county = geo.read_shp("ca-county-boundaries.zip")

    # Filter on county codes
    county = county[county["countyfp"].isin(counties)]

    # Select columns to keep
    county = county[["name", "countyfp", "geometry"]]

    # Read place data from zip file
    place = geo.read_shp("tl_2020_06_place.zip")

    # Rename to Mountain View, Martinez
    place.loc[(place["placefp"] == "49651"), "name"] = "Mountain View, Martinez"

    # Compute which places lie inside which county
    place = place.in_county(county)
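The effect of the area_min threshold can be illustrated with plain numbers. The sketch below assumes the overlap areas between one place and each county have already been computed. It is not the implementation of in_county(); the function name assign_counties and the area values are hypothetical.

```python
def assign_counties(place_area, overlap_by_county, area_min=0.1):
    """Keep the counties whose share of the place's area is at least area_min."""
    kept = []
    for countyfp, overlap_area in overlap_by_county.items():
        ratio = overlap_area / place_area
        if ratio >= area_min:  # smaller ratios are treated as border misalignment
            kept.append(countyfp)
    return kept


# A place of area 100 that overlaps county "085" almost entirely and
# county "081" only by a sliver along the shared border (made-up values)
kept = assign_counties(100.0, {"085": 99.5, "081": 0.5})
```

Here the sliver covers only 0.5% of the place, well below the default threshold of 10%, so only county "085" is kept.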
- stplanpy.geo.in_place(taz: geopandas.geodataframe.GeoDataFrame, plc: geopandas.geodataframe.GeoDataFrame, area_min=0.001, area_thr=0.9999) → geopandas.geodataframe.GeoDataFrame
Check in which place a traffic analysis zone (TAZ) is located
From one GeoDataFrame with traffic analysis zones (TAZ) and one GeoDataFrame containing places, this function computes in which place a TAZ is situated. Threshold values are used to handle potential misalignment of the borders. If a TAZ is situated across multiple places, additional rows are created for the parts that are situated within each place.
- Parameters
plc (geopandas.GeoDataFrame) – GeoDataFrame with the place geometries in which the TAZ are located.
area_min (float, defaults to 0.001) – If the ratio of the surface area of a TAZ inside a place divided by the full surface area of the TAZ is smaller than this threshold value, the match is discarded. This is a workaround for geometries whose borders are not fully aligned.
area_thr (float, defaults to 0.9999) – If the ratio of the surface area of a TAZ inside a place divided by the full surface area of the TAZ is larger than this threshold value, the TAZ is considered to be located completely within this place. This is a workaround for geometries whose borders are not fully aligned.
- Returns
GeoDataFrame with column names “tazce”, “countyfp”, “placefp”, “geometry”, and “area”. “placefp” contains the index of the place.
- Return type
geopandas.GeoDataFrame
Examples
The example data files, “tl_2020_06_place.zip” and “tl_2011_06_taz10.zip”, can be downloaded from GitHub.

    from stplanpy import geo

    # Limit calculation to these counties
    counties = ["001", "013", "041", "055", "075", "081", "085", "095", "097"]

    # Read TAZ data from zip file
    taz = geo.read_shp("tl_2011_06_taz10.zip")

    # Rename columns for consistency
    taz.rename(columns={"countyfp10": "countyfp", "tazce10": "tazce"}, inplace=True)

    # Filter on county codes
    taz = taz[taz["countyfp"].isin(counties)]

    # Read place data from zip file
    place = geo.read_shp("tl_2020_06_place.zip")

    # Compute which TAZ lie inside a place, and which part of each
    taz = taz.in_place(place)
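The two thresholds split the overlap ratio into three cases. The sketch below shows that decision in isolation, assuming the overlap area of a TAZ with a place is already known. It is not the implementation of in_place(); the function name classify_overlap, the labels, and the area values are hypothetical.

```python
def classify_overlap(taz_area, overlap_area, area_min=0.001, area_thr=0.9999):
    """Classify how much of a TAZ lies inside a place."""
    ratio = overlap_area / taz_area
    if ratio < area_min:
        return "discard"  # sliver caused by misaligned borders
    if ratio > area_thr:
        return "inside"   # treated as lying completely within the place
    return "partial"      # the TAZ is split across places


# Made-up overlap areas for a TAZ with a total area of 1000
labels = [classify_overlap(1000.0, overlap) for overlap in (0.5, 400.0, 999.95)]
```

Only the "partial" case leads to a TAZ being represented by more than one row, one for each place it overlaps.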
- stplanpy.geo.read_shp(file_name, tmp_dir='tmp', crs='EPSG:6933')
Read (zipped) shape files
Read (zipped) shape files into a GeoDataFrame with a number of default options. The coordinate reference system (crs) defaults to “EPSG:6933” and all the column names are made lower case.
- Parameters
file_name (str) – Name and path of the (zipped) shape file.
tmp_dir (str, defaults to "tmp") – Name of temporary directory to extract the zip archive to.
crs (str, defaults to "EPSG:6933") – The coordinate reference system (crs) of the output GeoDataFrame. The default value is “EPSG:6933”.
- Returns
GeoDataFrame with all the column names found in the shape file inside the zip archive, in lower case.
- Return type
geopandas.GeoDataFrame
Examples
The example data file, “tl_2011_06_taz10.zip”, can be downloaded from GitHub.
Read a shape file:

    import shutil
    import zipfile
    from stplanpy import geo

    # Extract to temporary location
    with zipfile.ZipFile("tl_2011_06_taz10.zip", "r") as zip_ref:
        zip_ref.extractall("tmp")

    # Read TAZ data from shp file
    taz = geo.read_shp("tmp/" + "tl_2011_06_taz10.shp")

    # Clean up tmp files
    shutil.rmtree("tmp")
Read a shape file from a zip file:

    from stplanpy import geo

    # Read TAZ data from zip file
    taz = geo.read_shp("tl_2011_06_taz10.zip")
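The two conveniences described for read_shp(), unpacking a zip archive and lower-casing column names, can be sketched with the standard library alone. This is an illustration under assumptions and not the implementation of read_shp(): the helper names extract_shapefile and lower_case are hypothetical, the demo archive contains empty placeholder files, and the geometry is never read.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_shapefile(zip_path, tmp_dir):
    """Extract a zip archive and return the path of the first .shp file found."""
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(tmp_dir)
    shp_files = sorted(Path(tmp_dir).rglob("*.shp"))
    if not shp_files:
        raise FileNotFoundError(f"no .shp file in {zip_path}")
    return shp_files[0]


def lower_case(columns):
    """Return the column names in lower case."""
    return [name.lower() for name in columns]


# Demo with a placeholder archive; the members are empty files
with tempfile.TemporaryDirectory() as work_dir:
    archive = Path(work_dir) / "demo.zip"
    with zipfile.ZipFile(archive, "w") as zip_ref:
        for suffix in (".shp", ".shx", ".dbf"):
            zip_ref.writestr("demo" + suffix, b"")
    shp_path = extract_shapefile(archive, Path(work_dir) / "tmp")
    shp_name = shp_path.name

columns = lower_case(["TAZCE10", "COUNTYFP10", "geometry"])
```

Lower-casing the column names is what allows the examples in this module to refer to columns such as "tazce10" and "countyfp10" regardless of the capitalization used in the source file.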
- stplanpy.geo.to_geojson(gdf: geopandas.geodataframe.GeoDataFrame, file_name, crs='EPSG:4326')
Write GeoDataFrame to GeoJSON file
Write a GeoDataFrame to a GeoJSON file with the default coordinate reference system (crs) “EPSG:4326”. If a GeoDataFrame has multiple columns containing geometries, only the column GeoDataFrame.geometry.name is kept.
- Parameters
crs (str, defaults to "EPSG:4326") – The coordinate reference system (crs) of the output GeoJSON file. The default value is “EPSG:4326”.
Examples
The example data file, “tl_2011_06_taz10.zip”, can be downloaded from GitHub.

    from stplanpy import geo

    # Read TAZ data from zip file
    taz = geo.read_shp("tl_2011_06_taz10.zip")

    # Write to file
    taz.to_geojson("taz.GeoJson")
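For orientation, the structure of a GeoJSON file can be sketched with the standard library. The GeoJSON specification (RFC 7946) stores coordinates as WGS 84 longitude and latitude, which is consistent with the “EPSG:4326” default of to_geojson(). The snippet is an illustration only and not the implementation of to_geojson(); the function name point_feature_collection is hypothetical, and the point reuses the coordinates from the corr_cent() example.

```python
import json


def point_feature_collection(points):
    """Build a GeoJSON FeatureCollection from {tazce: (lon, lat)} pairs."""
    features = [
        {
            "type": "Feature",
            # GeoJSON positions are ordered longitude first, then latitude
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"tazce": tazce},
        }
        for tazce, (lon, lat) in points.items()
    ]
    return {"type": "FeatureCollection", "features": features}


collection = point_feature_collection({"00101155": (-122.078052, 37.423328)})
text = json.dumps(collection)
```

Each feature carries exactly one geometry, which matches the note above that only the active geometry column of a GeoDataFrame is written.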