County similarity

What “similar” really means.

When we say two counties are similar, we mean they look alike across 32 dimensions — not just that they voted the same way. A demographic fingerprint, not a partisan match.
Dimensions: 32 (demographic, political, composite)
Counties indexed: 3,143 (every U.S. county with complete data)
Archetypes: 12 (K-Means clusters across the U.S.)
What's in the fingerprint

Thirty-two dimensions, three families

Demographic
Who lives there
Age distribution, race and ethnicity, educational attainment, income, urbanization, housing tenure (rent vs own), household composition.
Political
How they vote
Partisan lean over multiple cycles, turnout rate, turnout volatility, third-party support, split-ticket tendencies.
Composite
How it all fits together
Industry mix, religious adherence, unionization, foreign-born share, migration patterns, housing type mix.
How matching works

Nearest neighbors across the fingerprint

For a given county, we find other counties whose 32-dimensional vector is closest. This is pure vector math — not a partisan lookup.
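As a rough illustration of the lookup described above, here is a minimal brute-force sketch. It assumes each dimension is z-scored and that closeness is measured by Euclidean distance; the page does not state the actual scaling or metric, so both are assumptions. The names `standardize` and `nearest_counties` and the toy vectors are hypothetical.

```python
import math
from statistics import mean, pstdev

def standardize(vectors):
    """Z-score each dimension so no single feature dominates the distance."""
    columns = list(zip(*vectors.values()))
    mus = [mean(col) for col in columns]
    # Guard against zero variance: a constant column is left unscaled.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return {
        key: [(x - mu) / sigma for x, mu, sigma in zip(vec, mus, sigmas)]
        for key, vec in vectors.items()
    }

def nearest_counties(vectors, target, k=3):
    """Return the k counties closest to `target` (Euclidean distance, assumed)."""
    scaled = standardize(vectors)
    origin = scaled[target]
    distances = [
        (math.dist(origin, vec), key)
        for key, vec in scaled.items()
        if key != target
    ]
    return [key for _, key in sorted(distances)[:k]]

# Toy 3-dimensional fingerprints (hypothetical values, not real county data).
toy = {
    "A": [0.10, 0.80, 0.30],
    "B": [0.12, 0.78, 0.33],
    "C": [0.90, 0.20, 0.70],
    "D": [0.85, 0.25, 0.65],
}
print(nearest_counties(toy, "A", k=2))  # ['B', 'D']
```

Nothing in the function looks at vote share on its own; partisan lean would simply be one coordinate among the others.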
Holistic, not partisan
Two counties that voted identically in 2020 can have very different demographic fingerprints and therefore very different nearest neighbors. Similarity is not about matching vote share — it's about matching underlying structure.
HNSW indexing for speed
We use a hierarchical navigable small-world (HNSW) index, via pgvector, so similarity lookups complete in milliseconds across all 3,143 counties. The lookup feels instant, even though the search behind it is not trivial.
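For readers curious what an HNSW lookup through pgvector can look like, here is a hedged sketch. The table `counties`, its columns `fips`, `name`, and `embedding`, the index name, and the choice of L2 distance (`vector_l2_ops`, `<->`) are all assumptions for illustration; the page does not describe the real schema or metric.

```python
# Hypothetical schema: counties(fips text primary key, name text, embedding vector(32)).
CREATE_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS counties_embedding_hnsw
    ON counties USING hnsw (embedding vector_l2_ops);
"""

# `<->` is pgvector's L2 distance operator; ordering by it with a LIMIT
# is the query shape an HNSW index can serve.
NEAREST_SQL = """
SELECT fips, name
FROM counties
WHERE fips <> %(fips)s
ORDER BY embedding <-> (SELECT embedding FROM counties WHERE fips = %(fips)s)
LIMIT %(k)s;
"""

def nearest_counties(cursor, fips, k=10):
    """Run the nearest-neighbor query on a DB-API cursor (for example psycopg)."""
    cursor.execute(NEAREST_SQL, {"fips": fips, "k": k})
    return cursor.fetchall()
```

HNSW is an approximate index, so results can occasionally differ from an exact scan. Whether the planner actually uses the index depends on the database configuration.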
The 12 archetypes

K-Means clusters of the U.S. county space

Every county belongs to one cluster based on its fingerprint. Examples below are illustrative of each cluster's characteristic shape.
Dense urban cores
Central cities of the largest metros. Young, educated, high renter share, strongly Democratic, high turnout variance.
Cook (IL), New York (NY), Suffolk (MA)
Inner suburbs
First-ring suburbs of major metros. College-educated, professional-class employment, Democratic lean that has intensified since 2016.
Montgomery (MD), Fairfax (VA), Oakland (MI)
Sunbelt growth suburbs
Fast-growing suburban counties in the Sunbelt. In-migration from both coasts, educated newcomers, increasingly competitive.
Gwinnett (GA), Maricopa (AZ), Wake (NC)
Exurban fringe
Outer-ring counties at the metro edge. Lower density, more family households, Republican-leaning, with weaker Democratic performance than traditional suburbs.
Collin (TX), Carroll (MD), Douglas (CO)
Rust Belt industrial
Older industrial counties in the Upper Midwest and Northeast. Manufacturing heritage, aging population, union-inflected politics that have realigned.
Erie (PA), Mahoning (OH), Macomb (MI)
Farm Belt rural
Agricultural counties in the Plains and Midwest. Small population, older, white, Republican by wide margins.
Scott (KS), Lyon (MN), Butler (NE)
Appalachian rural
Hill-country counties in Appalachia and adjacent regions. Low-density, declining population, economically depressed, strongly Republican.
McDowell (WV), Pike (KY), Perry (TN)
Southern Black Belt
Majority-Black counties across the Deep South. Democratic strongholds with lower turnout and persistent structural poverty.
Dallas (AL), Holmes (MS), Macon (AL)
Latino-majority borderlands
Heavily Latino counties along the southwestern border and in agricultural regions. Historically Democratic, recently competitive in places.
Starr (TX), Imperial (CA), Doña Ana (NM)
Mountain West lean-R
Rural and small-metro counties in the Mountain West. Sparse, older, Republican-leaning, with a libertarian flavor in spots.
Park (WY), Mesa (CO), Twin Falls (ID)
Pacific Northwest competitive
Secondary metros and mid-density counties in the Pacific Northwest. Mixed demographics, competitive at the margins.
Pierce (WA), Clackamas (OR), Snohomish (WA)
New England small-town
Low-density counties across New England. Older, white, educated, Democratic-leaning with a strong independent streak.
Washington (VT), Cheshire (NH), Waldo (ME)
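To make the clustering step behind these archetypes concrete, here is a minimal K-Means sketch using Lloyd's algorithm in plain Python. The random initialization, iteration cap, and seed are assumptions; the page only states that the archetypes come from K-Means with 12 clusters. The function name `kmeans` and the toy points are hypothetical.

```python
import math
import random

def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct points picked at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
    return labels, centroids

# Toy 2-dimensional points forming two obvious groups (hypothetical values).
toy = [[0.10, 0.20], [0.15, 0.22], [0.12, 0.18],
       [0.90, 0.80], [0.88, 0.85], [0.92, 0.79]]
labels, centroids = kmeans(toy, k=2)
print(labels)
```

With real fingerprints the call would use 32-dimensional vectors and `k=12`. Each county gets exactly one label, which matches the one-cluster-per-county rule described above.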
What it's for

Strategic implications

Transferable strategy
If a campaign message or tactic works in County A, it is more likely to work in counties with a similar fingerprint than in counties with merely similar partisan lean. Similarity predicts cultural receptivity better than vote share alone.
Understanding realignment
When a cluster's partisan lean shifts, you are seeing realignment in action. The Sunbelt growth cluster has moved noticeably over the last decade. Our similarity engine surfaces which clusters are moving and which are stable.
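One simple way to quantify which clusters are moving is to average the change in partisan lean across each cluster's counties. The sketch below assumes an unweighted mean of signed margins in percentage points; the engine's actual method is not described on this page, and the function name `cluster_lean_shift`, the cluster labels, and the numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def cluster_lean_shift(records):
    """Average change in partisan lean per cluster.

    `records` is an iterable of (cluster, lean_then, lean_now) tuples,
    where lean is a signed margin in percentage points.
    """
    shifts = defaultdict(list)
    for cluster, lean_then, lean_now in records:
        shifts[cluster].append(lean_now - lean_then)
    # Unweighted mean (an assumption); population weighting is another option.
    return {cluster: mean(values) for cluster, values in shifts.items()}

# Hypothetical margins (positive = Democratic), not real election results.
toy_margins = [
    ("Cluster X", -8.0, -2.0),
    ("Cluster X", -4.0, 1.0),
    ("Cluster Y", 12.0, 12.5),
]
print(cluster_lean_shift(toy_margins))  # {'Cluster X': 5.5, 'Cluster Y': 0.5}
```

A cluster with a large average shift is a candidate for realignment; one near zero is stable by this measure.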
Limitations

Where similarity engines fall short

Similarity is a static snapshot. We update the embedding when the underlying census data refreshes: every decade for decennial counts, every year for the American Community Survey (ACS). Between refreshes, a rapidly changing county’s fingerprint may lag reality.

Similar counties are not identical. Two counties in the same archetype will still diverge on any given race. Similarity is a prior, not a prediction.

Thirty-two dimensions capture most of the variation across U.S. counties but not all of it. Local dynamics — a charismatic local figure, an industry-specific shock, a recent scandal — do not live in the embedding.