County similarity
What “similar” really means.
When we say two counties are similar, we mean they look alike across 32 dimensions — not just that they voted the same way. A demographic fingerprint, not a partisan match.
Dimensions: 32
Counties indexed: 3,143
Archetypes: 12
What's in the fingerprint
Thirty-two dimensions, three families
Demographic: Who lives there
Political: How they vote
Composite: How it all fits together
How matching works
Nearest neighbors across the fingerprint
For a given county, we find the other counties whose 32-dimensional vectors lie closest to its own. This is pure vector math, not a partisan lookup.
Holistic, not partisan
HNSW indexing for speed
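The lookup described above can be sketched as a brute-force nearest-neighbor scan. This is a minimal illustration, not the production implementation: the Euclidean metric, the function name `nearest_counties`, the four-dimensional toy vectors, and the county labels are all assumptions made for the example. The page does not state which distance metric is used or how the dimensions are scaled.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_counties(target, fingerprints, k=3):
    """Return the k county names closest to `target`, nearest first.

    `fingerprints` maps a county name to its vector. The target itself
    is skipped so a county never matches itself.
    """
    query = fingerprints[target]
    distances = [
        (euclidean(query, vec), name)
        for name, vec in fingerprints.items()
        if name != target
    ]
    distances.sort()  # sorts by distance, then by name on ties
    return [name for _, name in distances[:k]]


# Hypothetical toy fingerprints (4 dimensions instead of 32).
toy = {
    "County A": [0.9, 0.1, 0.8, 0.2],
    "County B": [0.85, 0.15, 0.75, 0.25],
    "County C": [0.1, 0.9, 0.2, 0.8],
    "County D": [0.8, 0.2, 0.9, 0.1],
}

neighbors = nearest_counties("County A", toy, k=2)
print(neighbors)
```

The scan compares the query against every county, which is exact but linear in the number of counties. An HNSW index trades a small amount of accuracy for speed by searching a layered proximity graph instead of the full list. Note that no party label appears anywhere in the lookup: the result depends only on distances between vectors.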
The 12 archetypes
K-Means clusters of the U.S. county space
Every county belongs to one cluster based on its fingerprint. Examples below are illustrative of each cluster's characteristic shape.
Dense urban cores
Inner suburbs
Sunbelt growth suburbs
Exurban fringe
Rust Belt industrial
Farm Belt rural
Appalachian rural
Southern Black Belt
Latino-majority borderlands
Mountain West lean-R
Pacific Northwest competitive
New England small-town
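Assigning each county to an archetype can be sketched as a small K-Means loop (Lloyd's algorithm). This is an illustration under stated assumptions, not the pipeline behind the 12 archetypes: the two-dimensional toy points, the choice of two clusters, the fixed starting centroids, and the function names are all hypothetical. The real clustering would run over 32-dimensional fingerprints with 12 centroids.

```python
def squared_distance(a, b):
    """Squared Euclidean distance; enough for comparing which centroid is nearer."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            new_centroids.append(old)  # an empty cluster keeps its old centroid
            continue
        new_centroids.append(
            [sum(m[d] for m in members) / len(members) for d in range(len(old))]
        )
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until labels stop changing."""
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical toy data: two well-separated groups of three points each.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
labels, centroids = kmeans(points, centroids=[[0.0, 0.0], [10.0, 10.0]])
print(labels)
```

Each point ends up with exactly one label, which mirrors the statement that every county belongs to one cluster. K-Means results depend on the starting centroids, so a real run would typically use several random initializations and keep the best one.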
What it's for
Strategic implications
Transferable strategy
Understanding realignment
Limitations
Where similarity engines fall short
Similarity is a static snapshot. We update the embedding when the underlying census data refreshes: every decade for decennial counts, every year for the American Community Survey (ACS). Between refreshes, a rapidly changing county’s fingerprint may lag reality.
Similar counties are not identical. Two counties in the same archetype will still diverge on any given race. Similarity is a prior, not a prediction.
Thirty-two dimensions capture most of the variation across U.S. counties but not all of it. Local dynamics — a charismatic local figure, an industry-specific shock, a recent scandal — do not live in the embedding.