Performs spatially-constrained hierarchical clustering using Ward's minimum variance method. Only spatially contiguous areas can be merged, ensuring all resulting regions are spatially connected.
Usage
ward_spatial(
data,
attrs = NULL,
n_regions,
weights = "queen",
bridge_islands = FALSE,
scale = TRUE,
verbose = FALSE
)Arguments
- data
An sf object with polygon or point geometries.
- attrs
Character vector of column names to use for clustering (e.g.,
c("var1", "var2")). If NULL, uses all numeric columns.- n_regions
Integer. Number of regions (clusters) to create.
- weights
Spatial weights specification. Can be:
"queen"(default): Polygons sharing any boundary point are neighbors"rook": Polygons sharing an edge are neighborsAn
nbobject from spdep or created withsp_weights()A list for other weight types:
list(type = "knn", k = 6)for k-nearest neighbors, orlist(type = "distance", d = 5000)for distance-based weights
- bridge_islands
Logical. If TRUE, automatically connect disconnected components (e.g., islands) using nearest-neighbor edges. If FALSE (default), the function will error when the spatial weights graph is disconnected.
- scale
Logical. If TRUE (default), standardize attributes before clustering.
- verbose
Logical. Print progress messages.
Value
An sf object with a .region column containing cluster assignments.
Metadata is stored in the "spopt" attribute.
Details
This function implements spatially-constrained agglomerative hierarchical clustering using Ward's minimum variance criterion. Unlike standard Ward clustering, this version enforces spatial contiguity by only allowing clusters that share a border to be merged.
The algorithm:
Starts with each observation as its own cluster
At each step, finds the pair of adjacent clusters with minimum Ward distance (increase in total within-cluster variance)
Merges them into a single cluster
Repeats until the desired number of regions is reached
The result guarantees that all regions are spatially contiguous.
