Methodology.
A citable, table-and-column-level description of how every page on Akashic is built — the sources we use, the choices we make, the algorithms we run, and the things we deliberately leave out.
1. Scope & limits
Akashic is a static reference site for US presidential elections at every level of geography the federal government tracks. Every page profiles a single place. Every page carries the full presidential election history of that place from 1876 through 2024, the demographics of the place from the most recent American Community Survey, the religious-adherence profile from the 2020 US Religion Census, a typology classification, and the set of places with the most similar voting trajectory.
What is not here: ballot measures, downballot races, primary results, turnout by demographic group, polling, and prediction-market prices. Several of these are on the public roadmap, but they live in the broader Akashic Intelligence platform, not on the public place-page surface.
What is coming: state-legislative and US House election history (not just presidential roll-ups), historical congressional district boundaries per redistricting era, comparison pages, and embeddable widgets. See the roadmap for the full list.
2. Election results
We compose the 1876–2024 county-level presidential series from three primary sources, each authoritative for a different era.
- 1876–1915. ICPSR historical archive (Inter-university Consortium for Political and Social Research). County-level totals are sparse in this window: about 60% of county-year combinations before 1928 have no recorded result. We render those as explicit gaps in the elections table rather than interpolating.
- 1916–2020. MIT Election Data and Science Lab county-level presidential series. This is the canonical modern dataset for academic election analysis; we use it verbatim and never adjust the vote totals.
- 2024. State-certified official returns, ingested directly from each state’s election authority. We do not use newswire totals.
- Precinct level, 2024. Voting and Election Science Team (VEST) precinct shapefiles and totals, aggregated to the modern county boundaries where they differ.
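A minimal sketch of how an era-to-source lookup and the explicit-gap rule could fit together. The source labels (`icpsr`, `mit_edsl`, `state_certified`) and both function names are hypothetical; only the year ranges come from the list above.

```python
from typing import Optional

# Year ranges from the list above; the labels are illustrative, not real identifiers.
ERAS = [
    (1876, 1915, "icpsr"),
    (1916, 2020, "mit_edsl"),
    (2024, 2024, "state_certified"),
]

def source_for_year(year: int) -> str:
    """Return the label of the source that covers a given election year."""
    for start, end, label in ERAS:
        if start <= year <= end:
            return label
    raise ValueError(f"no source covers {year}")

def build_series(results: dict[int, dict[str, int]], years: list[int]) -> list[dict]:
    """One row per election year; a year with no recorded result keeps totals=None."""
    rows = []
    for year in years:
        totals: Optional[dict[str, int]] = results.get(year)  # None marks an explicit gap
        rows.append({"year": year, "source": source_for_year(year), "totals": totals})
    return rows
```

A missing year stays in the output as a row with `totals=None`, so a renderer can show the gap instead of silently skipping or interpolating it.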
Boundary changes. A small number of counties have changed name or boundary over the 148-year window. We carry forward the modern five-digit FIPS code as the stable URL key, and we crosswalk historical vote totals onto the modern geometry. The two consequential cases:
- Miami-Dade (FL). Renamed from Dade County in 1997. All pre-1997 results are reported under FIPS 12086.
- Connecticut planning regions. In 2022 Connecticut formally replaced its eight legacy counties with nine Council of Government planning regions as its primary subdivision. Pre-2022 county-level election totals are apportioned to planning regions by 2020 town-level population.
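The Connecticut apportionment can be sketched as a population-weighted split. This is an illustration under stated assumptions, not the production crosswalk: the function name, the input shapes, and the toy numbers are all hypothetical, and apportioned votes are left as fractions rather than rounded.

```python
from collections import defaultdict

def apportion(county_votes, towns):
    """Split county vote totals across planning regions by town population.

    county_votes: {county: {party: votes}}
    towns: iterable of (county, region, population) rows, one per town
    """
    county_pop = defaultdict(int)
    pair_pop = defaultdict(int)
    for county, region, pop in towns:
        county_pop[county] += pop
        pair_pop[(county, region)] += pop

    region_votes = defaultdict(lambda: defaultdict(float))
    for (county, region), pop in pair_pop.items():
        weight = pop / county_pop[county]  # share of the county's people in this region
        for party, votes in county_votes.get(county, {}).items():
            region_votes[region][party] += votes * weight
    return {region: dict(parties) for region, parties in region_votes.items()}

# Toy data: one legacy county whose towns fall into two planning regions.
votes = {"Old County": {"D": 400, "R": 200}}
towns = [("Old County", "Region 1", 300), ("Old County", "Region 2", 100)]
print(apportion(votes, towns))
```

In the toy case, three quarters of the county's population sits in Region 1, so Region 1 receives three quarters of each party's votes.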
3. Demographics
Every place page reports demographic data from the most recent US Census Bureau American Community Survey 5-year file. As of the current build that is ACS 2024 5-year (reference period 2020–2024). The 5-year file is the only ACS product available for every county regardless of population.
Suppression handling. The ACS suppresses estimates for very small populations to protect respondent confidentiality. We display suppressed values as “—” rather than zero. Where a derived figure (such as median household income) is suppressed for an entire geography, the figure is omitted from the page and from the JSON record.
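A short sketch of the suppression rule, assuming a suppressed estimate arrives as `None`. The function names and the number formatting are illustrative.

```python
from typing import Optional

SUPPRESSED = "\u2014"  # the dash shown on the page in place of a suppressed value

def display_value(estimate: Optional[float]) -> str:
    """Format an estimate for display; suppressed values never render as zero."""
    if estimate is None:
        return SUPPRESSED
    return f"{estimate:,.0f}"

def to_json_record(fields: dict) -> dict:
    """Drop suppressed figures from the JSON record instead of emitting nulls."""
    return {key: value for key, value in fields.items() if value is not None}
```

Keeping `None` distinct from `0` matters here: a true zero estimate still renders as `0`, while a suppressed one renders as the dash.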
Non-Hispanic White share. Per Census convention, race and Hispanic origin are separate dimensions. The figure we label “Non-Hispanic White” is the share of population that self-identifies as White alone (single race) and not of Hispanic or Latino origin.
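As an illustration, the share can be computed from ACS table B03002 (Hispanic or Latino Origin by Race), where `B03002_001E` is the total population and `B03002_003E` is "Not Hispanic or Latino: White alone". That this page is built from that particular table is an assumption; the function name is hypothetical.

```python
from typing import Optional

def non_hispanic_white_share(total: Optional[int], nh_white_alone: Optional[int]) -> Optional[float]:
    """Share of population that is White alone and not Hispanic or Latino.

    total          -> B03002_001E (assumed input)
    nh_white_alone -> B03002_003E (assumed input)
    Returns None when either input is suppressed or the total is zero.
    """
    if total is None or nh_white_alone is None or total == 0:
        return None
    return nh_white_alone / total
```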
Connecticut. Demographic data is delivered at the planning-region level (the post-2022 successor to Connecticut’s county system); historical comparability with pre-2022 county-level ACS files is approximate.
4. Religious adherence
Religious-adherence figures come from the 2020 US Religion Census, published by the Association of Statisticians of American Religious Bodies (ASARB). The Religion Census reports the number of adherents per religious body per US county on a decennial cadence.
Bucketing. ASARB reports ~250 distinct religious bodies. For display, we aggregate them into seven traditions:
- Baptist — Southern Baptist Convention, National Baptist Convention USA, American Baptist Churches, and other Baptist bodies.
- Methodist — United Methodist Church, AME, AME Zion, CME, and other Methodist bodies.
- Pentecostal & Holiness — Assemblies of God, Church of God in Christ, Church of the Nazarene, and related bodies.
- Catholic & Orthodox — Roman Catholic Church plus all Eastern and Oriental Orthodox bodies.
- Mainline Protestant — Presbyterian Church (USA), ELCA, Episcopal Church, Disciples of Christ, UCC, and similar bodies.
- Other Christian — LDS, Jehovah’s Witnesses, non-denominational Evangelical, and Christian bodies not above.
- Non-Christian — Jewish, Muslim, Hindu, Buddhist, Bahá’í, and other non-Christian bodies.
The bucketing decisions are editorial. They are intended to produce groups roughly comparable in voting alignment, not to adjudicate theological taxonomies.
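The bucketing amounts to a lookup table from religious body to tradition. The sketch below uses only a handful of the bodies named above; the table name and the choice to fail loudly on an unmapped body are assumptions, not a description of the actual pipeline.

```python
from collections import Counter

# Partial, illustrative mapping; a full table would cover every body ASARB reports.
TRADITION_BY_BODY = {
    "Southern Baptist Convention": "Baptist",
    "United Methodist Church": "Methodist",
    "Assemblies of God": "Pentecostal & Holiness",
    "Church of God in Christ": "Pentecostal & Holiness",
    "Roman Catholic Church": "Catholic & Orthodox",
    "Presbyterian Church (USA)": "Mainline Protestant",
    "Episcopal Church": "Mainline Protestant",
    "Jehovah's Witnesses": "Other Christian",
}

def tradition_totals(adherents: dict[str, int]) -> dict[str, int]:
    """Sum adherents per tradition; raise if any body lacks an editorial bucket."""
    totals = Counter()
    unmapped = [body for body in adherents if body not in TRADITION_BY_BODY]
    if unmapped:
        raise KeyError(f"no tradition assigned for: {sorted(unmapped)}")
    for body, count in adherents.items():
        totals[TRADITION_BY_BODY[body]] += count
    return dict(totals)
```

Raising on an unmapped body, rather than defaulting it to a catch-all bucket, keeps each editorial decision explicit.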
5. Geography
All boundary geometry is sourced from US Census Bureau TIGER/Line 2024 shapefiles. County polygons are simplified for web delivery using topojson-simplify with a tolerance tuned to keep visible coastline detail while reducing payload size by an order of magnitude.
Precinct geometries are 2024 boundaries, from VEST where available and the state election authority otherwise. Counties for which we do not have precinct geometry on file fall back to a hex-grid layout that preserves the aggregate county margin while visually distinguishing the precinct-level view.
6. The cluster typology
Every place is assigned to one of thirteen types. Unlike a hand-tuned decision tree, the typology is discovered from the data: an unsupervised clustering over seven feature families that groups each place with the American communities it most resembles.
The seven feature families are vote share, vote swing (including the 2008–2024 and the headline 2020–2024 shifts), race and ethnicity, income, language spoken at home, religion, and ancestry. Each family is weighted equally, so the two vote columns are not drowned out by the dozens of demographic ones. The pipeline lives in analysis/typology/.
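One common way to give feature families equal weight is to scale each family's columns by 1/sqrt(number of columns), so a family's total contribution to squared Euclidean distance does not grow with its width. The sketch below shows that technique; whether the code in analysis/typology/ does exactly this is an assumption, and it presumes the columns are already standardized.

```python
import math

def weight_families(families: dict[str, list[float]]) -> list[float]:
    """Concatenate family columns into one vector, scaling each family by 1/sqrt(width).

    With this scaling, a family of k columns that each differ by d between two
    places adds k * (d / sqrt(k))**2 = d**2 to the squared distance, whatever k is.
    """
    vector: list[float] = []
    for name in sorted(families):  # fixed order so every place gets the same layout
        columns = families[name]
        if not columns:
            continue
        scale = 1.0 / math.sqrt(len(columns))
        vector.extend(value * scale for value in columns)
    return vector

# A one-column family and a four-column family end up with equal squared norm.
print(weight_families({"vote": [1.0], "race": [1.0, 1.0, 1.0, 1.0]}))
```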
Two steps produce the thirteen types:
- New American. The places in the top decile of non-English-speaking households are carved out first as a single type. We tested this against the data: 97% of them swung toward the Republican Party between 2020 and 2024, regardless of which language is spoken. They behave as one political bloc and are too coherent to let a clustering algorithm shatter them by ethnicity.
- Twelve clusters. The remaining places are grouped by k-means into twelve types, with the vote-swing family emphasized so the typology is organized around trajectory, not just current partisan level.
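The two steps above can be sketched in plain Python: a top-decile carve-out followed by Lloyd's k-means on the remainder. This is a toy illustration, not the pipeline itself. The nearest-rank decile rule, the random initialization, the fixed iteration count, and the `cluster-N` labels are all assumptions, and the extra emphasis on the vote-swing family is not shown.

```python
import random

def top_decile_cutoff(values):
    """Smallest value still inside the top 10% (nearest-rank rule, an assumption)."""
    ranked = sorted(values)
    return ranked[min(int(0.9 * len(ranked)), len(ranked) - 1)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm with seeded random initialization."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p, centroids) for p in points], centroids

def assign_types(places, k=12):
    """places: list of (non_english_share, feature_vector) pairs."""
    cutoff = top_decile_cutoff([share for share, _ in places])
    rest = [vec for share, vec in places if share < cutoff]
    labels, centroids = kmeans(rest, k)
    remaining = iter(labels)
    types = [
        "New American" if share >= cutoff else f"cluster-{next(remaining)}"
        for share, _ in places
    ]
    return types, centroids
```

Carving out the top decile before clustering means those places never influence the twelve centroids, which is the point of step one.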
The model is trained on counties only; the same trained model then labels every other tier (state, CBSA, congressional district, media market, and state-legislative districts) from its own feature vector, so the types are consistent across all 11,000-plus places.
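Labeling a non-county place with a model trained on counties reduces to nearest-centroid assignment. The sketch below assumes the place's feature vector is built the same way as the county vectors; the centroid values are toy numbers, and the New American carve-out is not handled here.

```python
def label_place(vector, centroids, type_names):
    """Return the name of the type whose trained centroid is closest to the vector."""
    distances = [
        sum((x - c) ** 2 for x, c in zip(vector, centroid))
        for centroid in centroids
    ]
    return type_names[distances.index(min(distances))]

# Toy centroids standing in for two of the county-trained clusters.
centroids = [[0.0, 0.0], [10.0, 10.0]]
names = ["Appalachian Realigners", "Realigning Affluent Suburb"]
print(label_place([9.0, 8.0], centroids, names))
```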
Worked examples to make the types concrete:
- Los Angeles County, CA (06037). Among the most multilingual counties in the country; it moved eleven points toward the Republican candidate from 2020 to 2024. New American.
- McDowell County, WV (54047). Voted Democratic for decades before a deep, lasting swing to the Republican Party in a low-income Appalachian setting. Appalachian Realigners.
- Johnson County, KS (20091). An affluent, college-educated Kansas City suburb that has trended Democratic as the surrounding state has not. Realigning Affluent Suburb.
7. The similar-counties model
For every county we compute the ten counties with the most similar recent voting trajectory. The model is intentionally simple: cosine similarity over the last-ten-election two-party margin vector.
Let m_i = (D − R) / total for election i, where D and R are the Democratic and Republican vote totals in that election. For two counties A and B with margin vectors a and b over the same ten elections, the similarity is (a · b) / (||a|| × ||b||).
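The formula translates directly into code. A minimal sketch follows; the function names, the tie-break by FIPS code, and the treatment of an all-zero vector as similarity 0.0 are assumptions.

```python
import math

def margin(dem: int, rep: int, total: int) -> float:
    """Margin for one election: (D - R) / total."""
    return (dem - rep) / total

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """(a . b) / (||a|| * ||b||) for two margin vectors over the same elections."""
    if len(a) != len(b):
        raise ValueError("margin vectors must cover the same elections")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # cosine is undefined for a zero vector; 0.0 is an assumed convention
    return dot / (norm_a * norm_b)

def most_similar(target: str, margins: dict[str, list[float]], n: int = 10) -> list[str]:
    """FIPS codes of the n counties most similar to the target, best first."""
    scored = [
        (cosine_similarity(margins[target], vector), fips)
        for fips, vector in margins.items()
        if fips != target
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by FIPS code
    return [fips for _, fips in scored[:n]]
```

Cosine similarity compares the direction of two vectors, not their length: margin vectors [0.1, 0.2] and [0.2, 0.4] score 1.0 even though the second county's margins are twice as large.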
The model uses no demographic features. The result reflects political similarity over recent decades, not demographic or geographic similarity. Two counties on opposite coasts can score very high if their margin trajectories rhyme; two neighboring counties can score low if one realigned while the other didn’t.
We chose this over a feature-rich model deliberately. A small, transparent, fully reproducible similarity metric is more useful to a journalist or researcher than a black-box embedding, and the margin vector turns out to capture the variation that matters for the editorial question (“where else does this pattern show up?”) well enough.
8. Headline and narrative generation
Every place page carries a generated headline and a multi-paragraph narrative summary. The implementation lives in lib/headline.ts and lib/narrative.ts.
Both modules are deterministic templates: same place data in, same text out. No LLM is in the runtime path; nothing is generated at request time. The templates are conditioned on the typology, the most recent presidential margin, the demographic snapshot, and the similar-counties result.
The 40-character floor. Where an editor-curated subhead exists in the editorial layer and is at least 40 characters long, it overrides the templated subhead. Below the floor, we fall back to the template. This lets editorial copy ship one place at a time without blocking the bulk render.
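The override rule fits in a few lines. The real logic lives in lib/headline.ts; this Python sketch only mirrors the rule as described, and the function name and the decision not to trim whitespace before measuring length are assumptions.

```python
from typing import Optional

SUBHEAD_FLOOR = 40  # minimum length for a curated subhead to take effect

def choose_subhead(curated: Optional[str], templated: str) -> str:
    """Prefer the editor-curated subhead when it meets the floor; else use the template."""
    if curated is not None and len(curated) >= SUBHEAD_FLOOR:
        return curated
    return templated
```

Because the fallback is always available, a missing or too-short curated subhead never blocks a page from rendering.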
9. Editorial copy
Editorial copy falls into three tiers of provenance, distinguished by a source field on every editorial record.
- curated. Written or hand-reviewed by an editor. The lead paragraph of every county page falls in this tier where coverage exists; subheads on the marquee counties (state capitals, major CBSAs, swing counties) are curated.
- generated_reviewed. Generated by a template or drafted with LLM assistance, then reviewed by an editor before publication. Used for the non-county tier subheads where we are working through the backlog (state, CBSA, DMA, CD, SLD).
- generated. Generated deterministically from the underlying data, with no review. Used for the templated paragraphs after the lead, and for the long-tail places where editorial coverage is not yet possible.
No editorial copy is generated at request time. Every string on every page is either committed to the repo (templated) or stored in Neon (curated / reviewed) and read at build time.
10. Updates & versioning
Cadence. The election layer is updated after every federal election cycle (next: November 2028). The demographic layer is updated annually as the Census Bureau releases each new ACS 5-year file (typically December). The religion layer is updated decennially with each new ASARB Religion Census release.
Data freshness contract. Every build emits a machine-readable data_freshness.json with the as-of date for each source layer. The sitemap’s lastmod field on each place page derives from the most recent source-layer update touching that place.
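A sketch of how a page's lastmod could be derived from the freshness file. The schema of data_freshness.json is not specified here, so the flat layer-to-ISO-date mapping, the layer names, and the sample dates below are all made up for illustration.

```python
import json
from datetime import date

# Hypothetical file contents; the real schema and dates may differ.
SAMPLE = """{
  "elections": "2025-01-15",
  "demographics": "2025-12-11",
  "religion": "2022-11-01"
}"""

def lastmod(place_layers: list[str], freshness: dict[str, str]) -> str:
    """Most recent as-of date among the source layers that touch a place."""
    return max(date.fromisoformat(freshness[layer]) for layer in place_layers).isoformat()

freshness = json.loads(SAMPLE)
print(lastmod(["elections", "religion"], freshness))
```

Parsing the strings into `date` objects before taking the maximum avoids relying on string ordering, which only works when every date is zero-padded ISO format.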
11. Citation
Cite Akashic by the canonical URL of the page, not the backing JSON. Recommended citation forms:
Plain text.
Akashic Intelligence. (2026). Akashic: {Place Name}, {State}.
Retrieved {YYYY-MM-DD} from {canonical URL}.
BibTeX.
@misc{akashic-place,
author = {Akashic Intelligence},
title = {Akashic: {Place Name}, {State}},
year = {2026},
url = {https://akashic.app/county/{FIPS}/},
note = {Accessed {YYYY-MM-DD}}
}
For the underlying source data, cite the original source (MIT Election Lab, ICPSR, US Census Bureau, ASARB) directly; Akashic is the compilation, not the primary source. See about and /ATTRIBUTION.txt for the per-source breakdown.
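For anyone generating citations in bulk, the plain-text form can be filled in with a small helper. The function name and parameters are hypothetical; the wording follows the recommended plain-text form.

```python
def plain_citation(place: str, state: str, url: str, retrieved: str, year: int = 2026) -> str:
    """Fill the recommended plain-text citation form for one place page."""
    return (
        f"Akashic Intelligence. ({year}). Akashic: {place}, {state}. "
        f"Retrieved {retrieved} from {url}."
    )

print(plain_citation(
    "McDowell County", "WV",
    "https://akashic.app/county/54047/", "2026-01-02",
))
```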
License
Original editorial copy, the 13-type cluster typology, computed derived data, and the bulk dataset releases are published under CC BY 4.0. Underlying sources keep their own licenses — see /ATTRIBUTION.txt for the per-source breakdown and /LICENSE.txt for the original-content terms. AI training and indexing are explicitly welcomed (/robots.txt, /llms.txt).
See also
For a one-page project summary, see about. For term definitions, see the glossary. For what we’re building next, see the roadmap. To explore a place, start with the search box.