Distance utilities API
ChoiceModels also includes tools for constructing pairwise distance matrices and calculating which geographies are within various distance bands of some reference geography.
Distance matrices

choicemodels.tools.great_circle_distance_matrix(df, x, y, earth_radius=6371009, return_int=True)
Calculate a pairwise great-circle distance matrix from a DataFrame of points. Distances are returned in the units of earth_radius (meters by default).
Parameters:  df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
 x (str) – label of the x coordinate column in the DataFrame
 y (str) – label of the y coordinate column in the DataFrame
 earth_radius (numeric) – radius of earth in units in which distance will be returned (default is meters)
 return_int (bool) – if True, convert all distances to integers
Returns: Multi-indexed distance vector in units of earth_radius, with top-level index representing “from” and second-level index representing “to”.
Return type: pandas Series
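To make the shape of the return value concrete, here is a minimal sketch of a pairwise great-circle calculation using the haversine formula. It is not the ChoiceModels implementation: the function name great_circle_sketch, the sample points, and the "from"/"to" level names are assumptions made for illustration, and it assumes x holds longitudes and y holds latitudes in decimal degrees.

```python
import numpy as np
import pandas as pd


def great_circle_sketch(df, x, y, earth_radius=6371009, return_int=True):
    """Illustrative pairwise haversine distances (hypothetical helper)."""
    # Assumption: x is longitude and y is latitude, both in decimal degrees.
    lng = np.radians(df[x].to_numpy(dtype=float))
    lat = np.radians(df[y].to_numpy(dtype=float))

    # Broadcast to (n, n) arrays of coordinate differences.
    dlat = lat[:, None] - lat[None, :]
    dlng = lng[:, None] - lng[None, :]

    # Haversine formula; clip guards against tiny floating-point overshoot.
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat[:, None]) * np.cos(lat[None, :]) * np.sin(dlng / 2) ** 2)
    dist = 2 * earth_radius * np.arcsin(np.sqrt(np.clip(a, 0, 1)))

    if return_int:
        dist = dist.round().astype(int)

    # Row-major flattening matches from_product: "from" varies slowest.
    index = pd.MultiIndex.from_product([df.index, df.index],
                                       names=["from", "to"])
    return pd.Series(dist.ravel(), index=index)


# Hypothetical points: two on the equator and one at the north pole.
points = pd.DataFrame(
    {"lng": [0.0, 0.0, 90.0], "lat": [0.0, 90.0, 0.0]},
    index=["a", "b", "c"],
)
gc = great_circle_sketch(points, x="lng", y="lat")
print(gc.loc["a"])  # distances from "a" to every point, in meters
```

Because the result is a long Series rather than a square matrix, the distances from one origin can be selected with gc.loc["a"], and a single pair with gc.loc[("a", "b")].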

choicemodels.tools.euclidean_distance_matrix(df)
Calculate a pairwise Euclidean distance matrix from a DataFrame of points. Distances are returned in the units of the x and y columns.
Parameters: df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
Returns: Multi-indexed distance vector in units of df’s values, with top-level index representing “from” and second-level index representing “to”.
Return type: pandas Series
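The Euclidean case can be sketched the same way. The helper below is hypothetical (euclidean_sketch is not part of ChoiceModels), and it assumes that df contains only the coordinate columns, since the documented signature takes no column labels.

```python
import numpy as np
import pandas as pd


def euclidean_sketch(df):
    """Illustrative pairwise Euclidean distances (hypothetical helper)."""
    # Assumption: every column of df is a coordinate.
    coords = df.to_numpy(dtype=float)

    # (n, 1, d) - (1, n, d) broadcasts to all pairwise differences.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    index = pd.MultiIndex.from_product([df.index, df.index],
                                       names=["from", "to"])
    return pd.Series(dist.ravel(), index=index)


# Hypothetical points forming a 3-4-5 right triangle with the origin.
pts = pd.DataFrame({"x": [0.0, 3.0], "y": [0.0, 4.0]}, index=["p1", "p2"])
eu = euclidean_sketch(pts)
print(eu.loc[("p1", "p2")])  # 5.0
```

The result has n * n rows, including the zero distance from each point to itself.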

choicemodels.tools.distance_matrix(df, method='euclidean', x='lng', y='lat', earth_radius=6371009, return_int=True)
Calculate a pairwise distance matrix from a DataFrame of two-dimensional points.
Parameters:  df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
 method (str) – one of {‘euclidean’, ‘greatcircle’, ‘network’}; the algorithm used to calculate pairwise distances
 x (str) – if method=’greatcircle’ or ‘network’, label of the x coordinate column in the DataFrame
 y (str) – if method=’greatcircle’ or ‘network’, label of the y coordinate column in the DataFrame
 earth_radius (numeric) – if method=’greatcircle’, radius of earth in units in which distance will be returned (default is meters)
 return_int (bool) – if method=’greatcircle’ and return_int is True, convert all distances to integers
Returns: Multi-indexed distance vector in units of df’s values, with top-level index representing “from” and second-level index representing “to”.
Return type: pandas Series
Distance bands

choicemodels.tools.distance_bands(dist_vector, distances)
Identify all geographies located within each distance band of each geography.
The list of distances is treated pairwise to create distance bands, with the first element of each pair forming the band’s inclusive lower limit and the second element of each pair forming the band’s exclusive upper limit. For example, if distances=[0, 10, 30], band 0 will contain all geographies with a distance >= 0 and < 10 units (e.g., meters) from the reference geography, and band 1 will contain all geographies with a distance >= 10 and < 30 units from the reference geography.
To make the final distance band include all geographies beyond a certain distance, make the final value in the distances list np.inf.
Parameters:  dist_vector (pandas Series) – Multi-indexed distance vector, with top-level index representing “from” and second-level index representing “to”.
 distances (list) – a list of distance band boundaries (e.g., [0, 10, 30])
Returns: a Series multi-indexed by geography ID and distance band number, whose values are arrays of the IDs of the geographies that fall within that band of that geography
Return type: pandas Series
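The band semantics described above (inclusive lower limit, exclusive upper limit, np.inf for an open-ended final band) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: distance_bands_sketch, the sample distances, and the "from"/"band" level names are all hypothetical.

```python
import numpy as np
import pandas as pd


def distance_bands_sketch(dist_vector, distances):
    """Illustrative distance banding (hypothetical helper)."""
    found = {}
    # Consecutive values form the bands: [0, 10, 30] -> (0, 10), (10, 30).
    pairs = zip(distances[:-1], distances[1:])
    for band, (lower, upper) in enumerate(pairs):
        # Inclusive lower limit, exclusive upper limit.
        in_band = dist_vector[(dist_vector >= lower) & (dist_vector < upper)]
        for origin, group in in_band.groupby(level=0):
            found[(origin, band)] = group.index.get_level_values(1).to_numpy()

    # Fill an object array element by element so that arrays of equal
    # length are not merged into a single 2-D array.
    keys = list(found)
    values = np.empty(len(keys), dtype=object)
    for i, key in enumerate(keys):
        values[i] = found[key]

    index = pd.MultiIndex.from_tuples(keys, names=["from", "band"])
    return pd.Series(values, index=index).sort_index()


# Hypothetical distance vector for three geographies.
ids = ["a", "b", "c"]
idx = pd.MultiIndex.from_product([ids, ids], names=["from", "to"])
dist_vector = pd.Series([0, 5, 25, 5, 0, 12, 25, 12, 0], index=idx)

# Band 0 covers [0, 10); band 1 covers everything from 10 upward.
banded = distance_bands_sketch(dist_vector, [0, 10, np.inf])
print(banded.loc[("a", 0)])  # IDs within 10 units of "a"
```

In this sketch, an origin with no geographies in a given band has no entry for that band. Whether ChoiceModels behaves the same way is not stated in the documentation above.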