# Distance utilities API

ChoiceModels also includes tools for constructing pairwise distance matrices and calculating which geographies are within various distance bands of some reference geography.

## Distance matrices

`choicemodels.tools.great_circle_distance_matrix(df, x, y, earth_radius=6371009, return_int=True)`

Calculate a pairwise great-circle distance matrix from a DataFrame of points. Distances returned are in units of earth_radius (default is meters).

Parameters:

- `df` (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
- `x` (str) – label of the x coordinate column in the DataFrame
- `y` (str) – label of the y coordinate column in the DataFrame
- `earth_radius` (numeric) – radius of the earth, in the units in which distances will be returned (default is meters)
- `return_int` (bool) – if True, convert all distances to integers

Returns: multi-indexed distance vector in units of `earth_radius`, with top-level index representing “from” and second-level index representing “to”.

Return type: pandas Series

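
To make the shape of the result concrete, here is a minimal standard-library sketch of a pairwise great-circle calculation. It is not the library's implementation. It assumes the haversine formula, coordinates in decimal degrees, and truncation with `int()` when `return_int` is set, none of which are stated above. The names `haversine` and `pairwise_great_circle` and the place IDs are hypothetical, and a plain dict keyed by `(from, to)` tuples stands in for the multi-indexed Series.

```python
import math
from itertools import product

EARTH_RADIUS_M = 6371009  # matches the documented default for earth_radius


def haversine(lat1, lng1, lat2, lng2, earth_radius=EARTH_RADIUS_M):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp h to 1.0 to guard against floating-point overshoot before asin
    return 2 * earth_radius * math.asin(math.sqrt(min(1.0, h)))


def pairwise_great_circle(points, earth_radius=EARTH_RADIUS_M, return_int=True):
    """points: dict of place ID -> (lng, lat).

    Returns a dict of (from_id, to_id) -> distance in units of earth_radius.
    """
    out = {}
    for (a, (lng_a, lat_a)), (b, (lng_b, lat_b)) in product(points.items(), repeat=2):
        d = haversine(lat_a, lng_a, lat_b, lng_b, earth_radius)
        # Assumption: truncate toward zero when integers are requested
        out[(a, b)] = int(d) if return_int else d
    return out


# Hypothetical place IDs with (lng, lat) in decimal degrees
points = {"a": (0.0, 0.0), "b": (90.0, 0.0), "c": (0.0, 90.0)}
dists = pairwise_great_circle(points)
print(dists[("a", "b")])  # roughly a quarter of the earth's circumference, in meters
```

Passing a radius in kilometers (for example `earth_radius=6371.009`) would return kilometers instead, which is the sense in which distances are "in units of earth_radius".
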
`choicemodels.tools.euclidean_distance_matrix(df)`

Calculate a pairwise euclidean distance matrix from a DataFrame of points. Distances returned are in units of x and y columns.

Parameters:

- `df` (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns

Returns: multi-indexed distance vector in units of df’s values, with top-level index representing “from” and second-level index representing “to”.

Return type: pandas Series

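
The "from"/"to" layout of the returned vector can be illustrated with a small standard-library sketch. This is not the library's code: `pairwise_euclidean` and the point IDs are hypothetical, and a dict keyed by `(from, to)` tuples stands in for the multi-indexed Series.

```python
import math
from itertools import product


def pairwise_euclidean(points):
    """points: dict of place ID -> coordinate tuple.

    Returns a dict of (from_id, to_id) -> Euclidean distance,
    in the same units as the input coordinates.
    """
    return {
        (a, b): math.dist(pa, pb)
        for (a, pa), (b, pb) in product(points.items(), repeat=2)
    }


# Hypothetical points in projected coordinates (e.g., meters)
points = {"p1": (0.0, 0.0), "p2": (3.0, 4.0), "p3": (6.0, 8.0)}
dists = pairwise_euclidean(points)
print(dists[("p1", "p2")])  # 5.0
```

Every ordered pair appears, including each point paired with itself at distance 0, so three points yield nine entries.
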
`choicemodels.tools.distance_matrix(df, method='euclidean', x='lng', y='lat', earth_radius=6371009, return_int=True)`

Calculate a pairwise distance matrix from a DataFrame of two-dimensional points.

Parameters:

- `df` (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
- `method` (str) – one of {'euclidean', 'greatcircle', 'network'}; the algorithm to use for calculating pairwise distances
- `x` (str) – if `method='greatcircle'` or `'network'`, label of the x coordinate column in the DataFrame
- `y` (str) – if `method='greatcircle'` or `'network'`, label of the y coordinate column in the DataFrame
- `earth_radius` (numeric) – if `method='greatcircle'`, radius of the earth, in the units in which distances will be returned (default is meters)
- `return_int` (bool) – if `method='greatcircle'` and True, convert all distances to integers

Returns: multi-indexed distance vector in units of df’s values (or of `earth_radius` when `method='greatcircle'`), with top-level index representing “from” and second-level index representing “to”.

Return type: pandas Series

## Distance bands

`choicemodels.tools.distance_bands(dist_vector, distances)`

Identify all geographies located within each distance band of each geography.

The list of distances is treated pairwise to create distance bands, with the first element of each pair forming the band’s inclusive lower limit and the second element of each pair forming the band’s exclusive upper limit. For example, if distances=[0, 10, 30], band 0 will contain all geographies with a distance >= 0 and < 10 units (e.g., meters) from the reference geography, and band 1 will contain all geographies with a distance >= 10 and < 30 units from the reference geography.

To make the final distance band include all geographies beyond a certain distance, make the final value in the distances list np.inf.

Parameters:

- `dist_vector` (pandas Series) – multi-indexed distance vector, with top-level index representing “from” and second-level index representing “to”
- `distances` (list) – a list of distance band limits, treated pairwise as described above

Returns: a Series multi-indexed by geography ID and distance band number, whose values are arrays of the IDs of the geographies that fall within that band of that geography.

Return type: pandas Series
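
The pairwise banding rule (inclusive lower limit, exclusive upper limit, optional open-ended final band) can be sketched in plain Python. This illustrates the rule only and is not the library's implementation: `bands_sketch` and the geography IDs are hypothetical, dicts stand in for the multi-indexed Series, and `math.inf` plays the role of `np.inf`. The sketch omits bands with no members; how the library represents empty bands is not stated here.

```python
import math
from collections import defaultdict


def bands_sketch(dist_vector, distances):
    """dist_vector: dict of (from_id, to_id) -> distance.
    distances: list of band limits, e.g. [0, 10, 30].

    Returns a dict of (from_id, band_number) -> list of to_ids in that band.
    """
    # Consecutive limits form (lower, upper) pairs: [0, 10, 30] -> (0, 10), (10, 30)
    limits = list(zip(distances[:-1], distances[1:]))
    out = defaultdict(list)
    for (origin, dest), d in dist_vector.items():
        for band, (lower, upper) in enumerate(limits):
            if lower <= d < upper:  # inclusive lower, exclusive upper
                out[(origin, band)].append(dest)
                break
    return dict(out)


# Hypothetical distances in meters between geographies a, b, c, d
dist_vector = {
    ("a", "a"): 0, ("a", "b"): 5, ("a", "c"): 10, ("a", "d"): 45,
    ("b", "a"): 5, ("b", "b"): 0,
}
bands = bands_sketch(dist_vector, [0, 10, 30, math.inf])
print(bands[("a", 0)])  # ['a', 'b']
print(bands[("a", 1)])  # ['c']
print(bands[("a", 2)])  # ['d']
```

Geography `c` sits exactly 10 units from `a`, so it lands in band 1 rather than band 0, and the final `math.inf` limit lets band 2 capture `d` at 45 units.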