Data Sources & Methodology

Akashic Edge aggregates, validates, and harmonizes U.S. election data from authoritative academic and government sources. This page documents every data source, our processing methodology, and conventions used throughout the platform.

Coverage at a Glance

Election Results: 19.4M+ (rows across 51 state partitions)
Elections Covered: 224 (1868–2024)
Contests: 151,488 (President, Senate, Governor, House, State Leg)
Counties: 3,143 (with geometry, demographics, and embeddings)
Congressional Districts: 435 (118th Congress boundaries)
Precincts (2024): 163,926 (with full boundary geometry)
Census Blocks: 8M+ (50 states, disaggregated election data)
Geographic Entities: 8.3M+ (states, counties, CDs, SLDs, precincts, blocks)

Office Coverage

| Office | Coverage | Geography | Notes |
| --- | --- | --- | --- |
| President | 1868–2024 | County + Precinct + Block | All 51 states. Block disagg for 50 states. |
| U.S. Senate | 1908–2024 | County | All regular and special elections. |
| Governor | 1865–2025 | County | Includes off-year elections (VA, NJ, etc.). |
| U.S. House | 1976–2024 | Congressional district | MEDSL source. 10,869 contests. |
| CD Presidential | 2008–2024 | Congressional district (118th) | 435 CDs. 2008–2020 projected onto current boundaries. |
| State Senate | 1968–2023 | State legislative district | Klarner dataset, all 50 states. |
| State House | 1968–2023 | State legislative district | Klarner dataset, all 50 states (excl. NE unicameral). |

Historical Data Sources (Pre-2000)

Algara-Amlani Historical Election Dataset

Carlos Algara & Daniel Amlani (UC Davis / University of Houston)

Coverage: President, Senate, Governor: 1868–2020
Geography: County-level (3,100+ counties)
License: Academic (CC-BY)

1.015M rows. Primary source for historical county data.

ICPSR 0001 — U.S. Historical Election Returns

Inter-university Consortium for Political and Social Research

Coverage: 1824–1968 (codebooks only — data access restricted)
Geography: County-level
License: ICPSR member institutions

Used for validation cross-referencing.

ICPSR 0013 — General Election Data

ICPSR

Coverage: 1950–1990
Geography: County-level, all federal + Governor
License: ICPSR member institutions

Modern Data Sources (2000–Present)

MIT Election Data + Science Lab (MEDSL)

Massachusetts Institute of Technology

Coverage: County presidential 2000–2024; U.S. House/Senate (county-level)
Geography: County + partial precinct
License: CC-BY

12.5K county pres rows + 33.7K House rows.

VEST — Voting and Election Science Team

Harvard Dataverse / VEST

Coverage: Precinct-level results 2016–2020 (46/51 states)
Geography: Precinct boundaries + results + block crosswalks
License: CC-BY

429K precinct results with geometry.

New York Times 2024 Precinct Results

The New York Times

Coverage: 2024 presidential (50/51 states, AK unavailable)
Geography: 163,926 precincts with full boundary geometry
License: Derived from official certified results

481,541 results (D/R/OTH) with geometry.

Klarner State Legislative Dataset

Carl Klarner (Indiana State University)

Coverage: State Senate + State House: 1968–2023
Geography: State legislative districts
License: Academic

347K results across all 50 states.

State Secretaries of State

Official state election authorities

Coverage: Certified results (varies by state)
Geography: Precinct or county-level
License: Public domain

Used for Alaska (results certified by the state Division of Elections) and for manual 2024 backfills.

Census & Demographic Sources

Decennial Census (PL 94-171)

U.S. Census Bureau

Coverage: Total population + race at block level (2020)
Geography: 8M+ census blocks across 50 states
License: Public domain

Used as population weights for block-level disaggregation.

American Community Survey (ACS) 5-Year

U.S. Census Bureau

Coverage: Socioeconomic data (2024 5-year estimates)
Geography: County + congressional district
License: Public domain

3,639 ancestry rows (European, MENA, African, Latino, Asian, AIAN, NHPI).

TIGER/Line Shapefiles

U.S. Census Bureau

Coverage: All census geographies (2020 vintage)
Geography: States, counties, CDs, blocks, precincts
License: Public domain

PostGIS boundaries for 8.3M geographic entities.

Processing Methodology

Margin Convention

All partisan margins on Akashic Edge are calculated as:

**margin = (Democratic votes − Republican votes) / total votes × 100**

This uses the share of ALL votes cast (not two-party share). Positive values indicate Democratic leads; negative values indicate Republican leads. This convention is applied consistently across all offices, time periods, and geographic levels.
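As a small illustration of the convention, the sketch below computes the margin for a made-up contest and contrasts it with a two-party margin. The function name `margin_pct` and the vote counts are hypothetical and are not taken from the platform's code.

```python
def margin_pct(dem_votes: int, rep_votes: int, total_votes: int) -> float:
    """Democratic-minus-Republican margin as a percentage of ALL votes cast."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    return (dem_votes - rep_votes) / total_votes * 100


# Hypothetical contest: 5,200 D, 4,500 R, 300 other (10,000 total).
example = margin_pct(5_200, 4_500, 10_000)  # about +7.0 (Democratic lead)

# For contrast, the two-party margin drops third-party votes from the
# denominator, so it comes out slightly larger in magnitude.
two_party = (5_200 - 4_500) / (5_200 + 4_500) * 100
```

Because third-party votes stay in the denominator, the all-votes margin is never larger in magnitude than the two-party margin for the same contest.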

Block-Level Disaggregation

For precinct-to-block disaggregation, we use a population-weighted centroid-in-polygon method. Each census block's 2020 PL 94-171 population determines its share of the enclosing precinct's votes. Hare quota rounding ensures vote totals reconcile exactly to precinct-level certified results. This process covers 50 states (Alaska excluded due to unavailable precinct geometry), producing 8M+ block-level records from 17M+ result rows.
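The rounding step can be sketched as a largest-remainder apportionment, which is one common reading of "Hare quota rounding"; the production pipeline may differ in details such as tie-breaking. The sketch assumes the centroid-in-polygon step has already assigned each block to a precinct, and the names `disaggregate` and `block_pops` are hypothetical.

```python
def disaggregate(precinct_votes: int, block_pops: dict) -> dict:
    """Split a precinct's votes across its blocks in proportion to population.

    Each block first receives the floor of its proportional share; leftover
    votes go to the blocks with the largest remainders, so the block totals
    sum exactly to the precinct total.
    """
    total_pop = sum(block_pops.values())
    if total_pop <= 0:
        raise ValueError("precinct has no populated blocks")

    # Integer arithmetic avoids floating-point drift in the remainders.
    floors = {b: precinct_votes * p // total_pop for b, p in block_pops.items()}
    remainders = {b: precinct_votes * p % total_pop for b, p in block_pops.items()}

    leftover = precinct_votes - sum(floors.values())
    # Assumption: ties between equal remainders are broken by block id.
    ranked = sorted(block_pops, key=lambda b: (-remainders[b], b))
    for block in ranked[:leftover]:
        floors[block] += 1
    return floors


# Hypothetical precinct with 100 votes and three blocks.
allocation = disaggregate(100, {"A": 50, "B": 30, "C": 25})
# allocation == {"A": 48, "B": 28, "C": 24}
```

Running the function once per candidate (or party) keeps each candidate's block totals reconciled to the precinct-level count.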

CD Presidential Projection

Congressional district presidential results for 2008–2020 are projected onto 118th Congress boundaries using area-weighted spatial crosswalks between historical precinct boundaries and current CD geometry. 2024 results use actual 118th Congress boundaries. All projected values are flagged with is_estimated=true.
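A minimal sketch of an area-weighted projection follows, assuming the crosswalk has already been reduced to, for each precinct, the fraction of its area falling in each district. The function name, the input shapes, and the sample figures are hypothetical; only the `is_estimated` flag comes from the text above.

```python
from collections import defaultdict


def project_to_districts(precinct_votes: dict, overlap_fractions: dict) -> dict:
    """Allocate precinct votes to districts in proportion to overlapping area."""
    totals = defaultdict(lambda: defaultdict(float))
    for precinct, votes in precinct_votes.items():
        for district, fraction in overlap_fractions[precinct].items():
            for party, count in votes.items():
                totals[district][party] += count * fraction
    # Every projected row carries the estimate flag.
    return {d: {**dict(p), "is_estimated": True} for d, p in totals.items()}


# Hypothetical inputs: P1 lies wholly in CD-1; P2 is split 25% / 75%.
precinct_votes = {"P1": {"D": 100, "R": 50}, "P2": {"D": 40, "R": 60}}
overlap_fractions = {"P1": {"CD-1": 1.0}, "P2": {"CD-1": 0.25, "CD-2": 0.75}}

projected = project_to_districts(precinct_votes, overlap_fractions)
# projected["CD-1"] == {"D": 110.0, "R": 65.0, "is_estimated": True}
```

Area weighting assumes voters are spread evenly within a precinct, which is why the outputs are treated as estimates rather than certified counts.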

Winner Determination

The is_winner flag is populated algorithmically: for each contest, the candidacy with the highest vote total at the state level is marked as the winner. This matches official certified results in >99.9% of cases.
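The rule can be sketched as below. The record layout and the function name `mark_winner` are hypothetical, and leaving an exact tie unflagged is an assumption of this sketch, not something the text above specifies.

```python
def mark_winner(candidacies: list) -> list:
    """Return copies of the candidacy records with an is_winner flag added."""
    if not candidacies:
        return []
    top = max(c["votes"] for c in candidacies)
    leaders = [c for c in candidacies if c["votes"] == top]
    # Assumption: on an exact tie, no candidacy is flagged as the winner.
    unique_leader = len(leaders) == 1
    return [
        {**c, "is_winner": unique_leader and c["votes"] == top}
        for c in candidacies
    ]


# Hypothetical statewide totals for one contest.
contest = [
    {"candidate": "A", "votes": 5_200},
    {"candidate": "B", "votes": 4_500},
    {"candidate": "C", "votes": 300},
]
flags = [c["is_winner"] for c in mark_winner(contest)]
# flags == [True, False, False]
```

A plurality rule like this cannot capture outcomes decided by runoffs or similar mechanisms, which is one plausible reason the match rate is stated as >99.9% and not 100%.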

Fusion Voting Handling

In states with fusion voting (NY, CT, etc.), candidate vote totals are aggregated across all party lines. A candidate running on both the Democratic and Working Families lines will show consolidated totals, not separate rows per party line.
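The consolidation amounts to a group-by on candidate. In the sketch below, the row layout, the candidate names, and the vote counts are made up for illustration.

```python
from collections import defaultdict


def consolidate_fusion(rows: list) -> dict:
    """Sum each candidate's votes across every party line they appear on."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["candidate"]] += row["votes"]
    return dict(totals)


# Hypothetical ballot lines from a fusion-voting state.
rows = [
    {"candidate": "Candidate X", "party_line": "Democratic", "votes": 4_800},
    {"candidate": "Candidate X", "party_line": "Working Families", "votes": 400},
    {"candidate": "Candidate Y", "party_line": "Republican", "votes": 4_500},
]
consolidated = consolidate_fusion(rows)
# consolidated == {"Candidate X": 5200, "Candidate Y": 4500}
```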

Data Validation

Automated validation runs daily, checking: vote total consistency (sum of candidate votes = total_votes), geographic coverage completeness, margin calculation accuracy, and cross-source reconciliation. Results are logged and anomalies are flagged for manual review.
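Two of these checks, vote-total consistency and margin accuracy, can be sketched as follows. Apart from `total_votes`, the field names, the issue labels, and the tolerance are assumptions made for the example.

```python
def validate_row(row: dict, tolerance: float = 0.01) -> list:
    """Return a list of issue labels for one result row (empty if clean)."""
    problems = []
    votes = row["candidate_votes"]  # hypothetical field: party label -> votes
    total = row["total_votes"]

    # Check 1: candidate votes must sum to the reported total.
    if sum(votes.values()) != total:
        problems.append("vote_total_mismatch")

    # Check 2: the stored margin must match (D - R) / total * 100.
    if total > 0:
        expected = (votes.get("D", 0) - votes.get("R", 0)) / total * 100
        if abs(expected - row["margin"]) > tolerance:
            problems.append("margin_mismatch")
    return problems


good = {"candidate_votes": {"D": 520, "R": 450, "OTH": 30},
        "total_votes": 1000, "margin": 7.0}
bad = {"candidate_votes": {"D": 520, "R": 450, "OTH": 30},
       "total_votes": 1010, "margin": 7.0}
```

Here `validate_row(good)` returns an empty list, while `validate_row(bad)` reports both issues, because the inflated total also shifts the expected margin.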

County Similarity (pgvector)

County similarity is computed using 32-dimensional embeddings combining 13 demographic features, 7 political features, and 12 composite features. An HNSW index enables sub-millisecond nearest-neighbor queries. K-Means clustering (k=12) groups counties into political-demographic archetypes.
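To make the nearest-neighbor idea concrete, the sketch below does an exact brute-force search in plain Python in place of the HNSW index, which returns approximate neighbors much faster at scale. Cosine similarity is an assumption here, since the distance metric is not stated above, and the 3-dimensional vectors and county keys are toy stand-ins for the 32-dimensional embeddings.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest_counties(query: str, embeddings: dict, k: int = 3) -> list:
    """Return the k counties most similar to `query`, best match first."""
    scored = [
        (cosine_similarity(embeddings[query], vec), key)
        for key, vec in embeddings.items()
        if key != query
    ]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]


# Toy embeddings (hypothetical); real vectors would have 32 dimensions.
embeddings = {
    "county_a": [1.0, 0.0, 0.0],
    "county_b": [0.9, 0.1, 0.0],
    "county_c": [0.0, 1.0, 0.0],
    "county_d": [-1.0, 0.0, 0.0],
}
neighbors = nearest_counties("county_a", embeddings, k=2)
# neighbors == ["county_b", "county_c"]
```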

How to Cite

APA Format

Akashic Edge. (2026). U.S. Election Data [Dataset]. Retrieved from https://akashicedge.com

Chicago/Turabian

Akashic Edge. “U.S. Election Data.” Accessed [date]. https://akashicedge.com.

BibTeX

@misc{akashicedge2026,
  title  = {U.S. Election Data},
  author = {Akashic Edge},
  year   = {2026},
  url    = {https://akashicedge.com},
  note   = {Accessed: [date]}
}

For specific data subsets, please include the office, geographic level, and date range in your citation. Example: “Presidential county-level results, 1868–2024, via Akashic Edge.”

Source Priority Matrix

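When multiple sources cover the same geography and time period, we use the highest-priority source that has data, so each result comes from a single, predictable source. In the matrix, arrows (→) separate sources in descending order of priority.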

| Period | County Source (Priority) | Precinct Source |
| --- | --- | --- |
| 1868–1949 | Algara-Amlani → ICPSR 0001 | N/A |
| 1950–1968 | Algara-Amlani → ICPSR 0013 | N/A |
| 1969–1999 | Algara-Amlani → State SOS | N/A |
| 2000–2015 | MEDSL → Algara-Amlani → State SOS | State SOS (partial) |
| 2016–present | MEDSL (county) | VEST (2020) → NYT (2024) → State SOS |
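The county column of the matrix can be read as an ordered fallback list per period. The sketch below encodes it that way; the function name and the idea of passing in the set of sources that actually hold data for a given county and year are assumptions for illustration.

```python
from typing import Optional

# County-level priority by period, transcribed from the matrix above.
# None as the end year means "present".
COUNTY_PRIORITY = [
    (1868, 1949, ["Algara-Amlani", "ICPSR 0001"]),
    (1950, 1968, ["Algara-Amlani", "ICPSR 0013"]),
    (1969, 1999, ["Algara-Amlani", "State SOS"]),
    (2000, 2015, ["MEDSL", "Algara-Amlani", "State SOS"]),
    (2016, None, ["MEDSL"]),
]


def pick_county_source(year: int, available: set) -> Optional[str]:
    """Return the highest-priority source with data for this year, if any."""
    for start, end, sources in COUNTY_PRIORITY:
        if year >= start and (end is None or year <= end):
            for source in sources:
                if source in available:
                    return source
            return None
    return None


# Hypothetical case: MEDSL has no row for this county-year, so fall back.
choice = pick_county_source(2004, {"Algara-Amlani", "State SOS"})
# choice == "Algara-Amlani"
```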

Technical Infrastructure

Database: PostgreSQL 16 + PostGIS 3.4 + pgvector + pg_trgm
Hosting: Vercel (edge) + Neon (serverless Postgres)
Partitioning: 51 state partitions for election_results
Spatial Index: GiST on all geometry columns
Similarity Index: HNSW (pgvector) for county embeddings
Vector Tiles: 21,146 .pbf files (states z0-8, counties z4-10, CDs z3-10)
Materialized Views: 4 MVs (19.4M + 18.3M + 6.6M + 510K rows)
Update Cadence: Continuous during election seasons; daily validation

Questions About Our Data?

We welcome inquiries from researchers, librarians, and data professionals. For institutional access, bulk data requests, or methodology questions:

team@akashicedge.com