Atlas methodology

How Atlas computes the numbers it ships, where the data comes from, and what the confidence grades mean.

What Atlas is

Atlas is a question-answering engine for U.S. political geography. Ask a question in English, get a durable, citable answer composed of typed sections (overview, historical trend, demographic lens, flip analysis, peer list, comparison table, shape map). Every answer has a permalink and an auditable plan.

Atlas is not a forecast model. It does not predict future elections. It is not a polling aggregator. It does not collect or synthesize survey data. It is not partisan commentary. It reports results as they were certified.

Transparency matters because Atlas is an analytical tool used by journalists, academic researchers, campaign staff, and policy teams. The numbers drive decisions. This page documents every assumption the numbers rest on.

Data substrate

Every Atlas answer pulls from four substrates: precinct-level presidential results, block-level disaggregation, block-and-block-group demographics, and census geography.

Precinct-level presidential results

Cycle	Source	Coverage
2016	VEST	51 / 51 jurisdictions, 174K precincts
2020	VEST	51 / 51 jurisdictions (46 data + 5 geometry-only: FL, MD, NJ, TX, AK), 163K precincts
2024	NYT precinct map	50 / 51 jurisdictions (AK unavailable), 163,926 precincts, 481,541 D/R/OTH results

Block disaggregation

Precinct tallies are pushed down to Census blocks using a population-weighted centroid-in-polygon assignment with Hare largest-remainder rounding. Every block inherits the vote shares of the precinct it falls inside, scaled by its share of the precinct’s 18+ population.

Cycle	Coverage	Scale
2020	50 / 51 jurisdictions (AK excluded)	8.0M blocks, 17M results, 154M votes, 0.12% orphan-block skip rate
2016	In progress (50 / 51 jurisdictions planned, AK excluded)	Same method as 2020

The 0.12% orphan-block rate counts blocks whose centroid falls outside every VEST precinct polygon (typically offshore islands, tribal enclaves, and block-group slivers). These blocks are excluded from disaggregation rather than back-filled.
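The rounding step described above can be illustrated with a minimal sketch. It assumes a precinct's integer vote tally is split across its blocks in proportion to each block's 18+ population, with Hare largest-remainder rounding so that the block totals add back to the precinct total. The function name and the sample numbers are hypothetical, not taken from the Atlas codebase.

```python
from math import floor


def largest_remainder(total: int, weights: list[float]) -> list[int]:
    """Split an integer total across weights so the parts sum exactly to total."""
    weight_sum = sum(weights)
    if total <= 0 or weight_sum <= 0:
        # Nothing to allocate, or no population to weight by.
        return [0] * len(weights)
    # Exact (fractional) share of the total owed to each block.
    quotas = [total * w / weight_sum for w in weights]
    alloc = [floor(q) for q in quotas]
    leftover = total - sum(alloc)
    # Hand the remaining votes to the largest fractional remainders;
    # ties go to the lower index so the result is deterministic.
    order = sorted(range(len(weights)), key=lambda i: (-(quotas[i] - alloc[i]), i))
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc


# Hypothetical precinct: 101 votes for one party, three blocks
# with 18+ populations of 50, 30, and 20.
block_vap = [50, 30, 20]
dem_by_block = largest_remainder(101, block_vap)
print(dem_by_block, sum(dem_by_block))
```

Rounding each block independently could gain or lose votes at the precinct level; the largest-remainder step is what keeps the disaggregated tallies reconcilable with the source precinct.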

Demographics

Field class	Source	Vintage
Population, race, ethnicity (block-grain)	Census PL 94-171	2020
Income, poverty, education, age, tenure, language (county + CD + place)	ACS 5-year	2020–2024 release

Geography

Layer	Source	Vintage
States, counties	Census Cartographic Boundary	CB 2020, 500k resolution
118th + 120th Congressional Districts, State Legislative Districts	Census Cartographic Boundary	CB 2023
Census blocks and block groups	Census TIGER	CB 2020, 239,502 block-group polygons at 100% ST_IsValid

Margin convention

Every margin on Akashic Edge, Atlas included, uses a single formula:

margin = (dem_votes - rep_votes) / total_votes * 100

Positive values mean a Democratic lead; negative values mean a Republican lead. The denominator is total votes cast, not the two-party total. Third-party and write-in votes count in the denominator.

Example: a county with 60 Democratic votes, 40 Republican votes, and 100 other votes has 200 total votes. The margin is (60 − 40) / 200 × 100 = D+10, not D+20. Two-party normalization would inflate the figure to 20, obscuring the 50% third-party share.

Two-party normalization is rejected on two grounds. First, it hides third-party strength in cycles where it mattered (1912, 1968, 1992, 1996, 2016, 2024). Second, it diverges from the way Secretaries of State and wire services such as the AP report results.
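As a sketch of the convention above, the following computes the total-vote margin next to the rejected two-party figure for the worked county example. The function names and the label format are illustrative assumptions; only the formula itself comes from this page.

```python
def margin(dem_votes: int, rep_votes: int, total_votes: int) -> float:
    """Total-vote margin in percentage points; positive means a Democratic lead."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    # Same formula as above; multiplying first avoids a little float noise.
    return 100.0 * (dem_votes - rep_votes) / total_votes


def two_party_margin(dem_votes: int, rep_votes: int) -> float:
    """The rejected alternative: the denominator ignores third-party votes."""
    return margin(dem_votes, rep_votes, dem_votes + rep_votes)


def label(m: float) -> str:
    """Format a margin as D+x.x or R+x.x (hypothetical display helper)."""
    if m == 0:
        return "EVEN"
    return f"{'D' if m > 0 else 'R'}+{abs(m):.1f}"


# The county from the example: 60 D, 40 R, 100 other, 200 total.
atlas_margin = margin(60, 40, 200)
normalized = two_party_margin(60, 40)
print(label(atlas_margin), "vs two-party", label(normalized))
```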

Reaggregation methods

Atlas resolves every shape to one of five execution paths. The planner picks the path based on the shape’s type and whether it maps to canonical geo_ids.

Method	When it runs	Accuracy bound
direct_mv_lookup	Named geo with canonical geo_id (state, county, CD, SLD, place)	≤ 0.01 pp vs certified totals
block_group_rollup	Derived shape composed of known block-group members (precomputed membership)	≤ 0.5 pp vs block-level truth
block_group_spatial	Arbitrary drawn shape or isochrone; ST_Intersects against block-group centroids	≤ 0.5 pp vs block-level truth
block_precise	Reserved for v2; not active in v1	Target: ≤ 0.1 pp
state_fallback	Alaska only; borough-level rollup because block disaggregation is unavailable	Certified state totals; sub-state precision unavailable

The 0.01 pp bound on direct lookups was verified against Allegheny County, PA in 2024: Atlas returned D+20.3, the materialized view stored 20.31, and the certified county total matched both. The 0.5 pp bound on block-group methods was verified by comparing spatial rollups against the corresponding direct lookups for every state.

One hazard the planner guards against: when a resolver returns a single canonical geo_id but fails to stamp source_geo_ids on the resolved shape, downstream tools fall into the spatial path and pick up cross-border block groups. On Allegheny this produced a systematic 4-point error (D+16.3 observed vs D+20.3 true) before the fix landed. Every resolver now populates source_geo_ids when the shape is canonical.
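The path selection and the source_geo_ids hazard can be sketched as follows. This is a simplified illustration under stated assumptions, not the planner's actual code: ResolvedShape, its kind, block_group_ids, and alaska_only fields, and pick_method are hypothetical names, and block_precise is omitted because it is not active in v1. Raising on a named shape that lacks source_geo_ids is one possible guard, chosen here only to show why a silent fall-through to the spatial path is dangerous.

```python
from dataclasses import dataclass, field
from enum import Enum


class Method(str, Enum):
    DIRECT_MV_LOOKUP = "direct_mv_lookup"
    BLOCK_GROUP_ROLLUP = "block_group_rollup"
    BLOCK_GROUP_SPATIAL = "block_group_spatial"
    STATE_FALLBACK = "state_fallback"


@dataclass
class ResolvedShape:
    kind: str  # hypothetical values: "named", "derived", "drawn", "isochrone"
    source_geo_ids: list[str] = field(default_factory=list)
    block_group_ids: list[str] = field(default_factory=list)
    alaska_only: bool = False


def pick_method(shape: ResolvedShape) -> Method:
    """Choose an execution path for a resolved shape (illustrative only)."""
    if shape.alaska_only:
        return Method.STATE_FALLBACK
    if shape.source_geo_ids:
        # Canonical geography: read the materialized view directly.
        return Method.DIRECT_MV_LOOKUP
    if shape.kind == "named":
        # Guard: a named geo without source_geo_ids would otherwise drift
        # into the spatial path and pick up cross-border block groups.
        raise ValueError("named shape resolved without source_geo_ids")
    if shape.block_group_ids:
        return Method.BLOCK_GROUP_ROLLUP
    return Method.BLOCK_GROUP_SPATIAL


county = ResolvedShape(kind="named", source_geo_ids=["42003"])
drawn = ResolvedShape(kind="drawn")
print(pick_method(county).value, pick_method(drawn).value)
```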

Alaska

Alaska has no block-level disaggregation for 2016 or 2020. VEST does not publish precinct shapefiles for Alaska, and the state’s House-districts-as-precincts structure does not translate cleanly to Census blocks.

Atlas handles Alaska in three ways. Shape-bounded questions that depend on block math exclude Alaska and say so in the answer. Statewide Alaska questions fall back to certified borough-level totals aggregated from the Alaska Division of Elections precinct file (2024 presidential, 30 boroughs, 120 results). All Alaska answers carry a note in the reaggregation.notes field, which the UI surfaces inline.

Confidence grades

Every Atlas answer carries a confidence grade on each section. The grade maps to the reaggregation method and the resolver path.

Grade	Conditions	What it means
high	direct_mv_lookup on a canonical geo_id; no spatial math; data vintage matches question year	Trust the headline number. Suitable for publication.
medium	block_group_rollup on a composed shape with known block-group membership	Trust the direction and magnitude; cite with the method note.
low	block_group_spatial on arbitrary or isochrone shapes, or state_fallback on Alaska	Directionally correct; sanity-check against the containing geo before citing a specific margin.

A low-confidence answer is not a wrong answer. It is an answer whose precision is bounded by spatial intersection or certified-total fallback rather than exact geo_id matching. Power users who need a specific number should prefer shapes that resolve to canonical geographies.
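The grade table can be read as a lookup from reaggregation method to grade. The sketch below is one way to express it; the function name and the vintage_matches parameter are hypothetical. The table does not say what grade a direct lookup receives when the data vintage does not match the question year, so the downgrade to medium in that case is purely an assumption made for illustration.

```python
GRADE_BY_METHOD = {
    "direct_mv_lookup": "high",
    "block_group_rollup": "medium",
    "block_group_spatial": "low",
    "state_fallback": "low",
}


def confidence_grade(method: str, vintage_matches: bool = True) -> str:
    """Map a reaggregation method to a confidence grade (illustrative only)."""
    if method not in GRADE_BY_METHOD:
        raise ValueError(f"unknown reaggregation method: {method}")
    grade = GRADE_BY_METHOD[method]
    if grade == "high" and not vintage_matches:
        # ASSUMPTION: the table only lists vintage match as a condition
        # for "high"; the fallback grade used here is not documented.
        return "medium"
    return grade


for m in GRADE_BY_METHOD:
    print(m, "->", confidence_grade(m))
```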

AI narrator

Atlas uses Claude Opus 4.7 with adaptive-high thinking and 1M context to plan the analysis. The planner reads the user’s question, picks section types, and selects the tools and parameters that feed each section. The resulting AnalysisPlan is JSON, so every tool call, geo filter, and threshold is auditable.

The narrator is a separate model per section, prompted with a cached ~102K-token system prompt that carries the full tool catalog, the Akashic semantic layer, the 133-template Historian corpus, and few-shot plan examples. Prompt caching keeps per-call cost and latency bounded.

The narrator reads pre-computed numbers and writes sentences about them. It cannot query the database. It cannot change a margin, a population count, or a cycle year. The numbers come from SQL and materialized-view reads; the prose is a function of those numbers plus the section’s narrative angle.

Voice is enforced by 41 forbidden-phrase regexes plus one automatic retry on violation. If the retry still fails, the narrator returns the raw section data without prose rather than shipping a voice-violating sentence. The full voice guide lives at plans/atlas/atlas-voice.md.
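The check-retry-fallback flow described above might look like the following sketch. The two patterns are placeholders rather than entries from the real 41-regex list, and narrate, voice_flags, and the write_prose callback are hypothetical names standing in for the model call.

```python
import re
from typing import Callable

# Placeholder patterns; the real list is defined by the voice guide.
FORBIDDEN = [re.compile(p, re.IGNORECASE) for p in (r"\bshocking\b", r"\blandslide\b")]


def voice_flags(text: str) -> list[str]:
    """Return the patterns that the text violates."""
    return [p.pattern for p in FORBIDDEN if p.search(text)]


def narrate(section_data: dict, write_prose: Callable[[dict], str]) -> dict:
    """Try once, retry once, then fall back to raw data without prose."""
    flags: list[str] = []
    for _attempt in range(2):  # first attempt plus one automatic retry
        prose = write_prose(section_data)
        flags = voice_flags(prose)
        if not flags:
            return {"data": section_data, "prose": prose, "voice_flags": []}
    # Both attempts violated the voice rules: ship the numbers only.
    return {"data": section_data, "prose": None, "voice_flags": flags}


# Usage with a fake writer that fails once, then succeeds.
drafts = iter(["A shocking result.", "The margin was D+10."])
result = narrate({"margin": 10.0}, lambda data: next(drafts))
print(result["prose"])
```

Because the writer only ever receives section_data and returns text, the numbers in the returned record are the ones that were passed in, whatever the prose says.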

What the model can still fail on: mis-nesting a plan field under a placeholder key, over-routing compound questions to the Historian tool, or classifying an ambiguous similarity question as a compound query. The Plan Inspector on every answer surfaces voice_flags and structural warnings so reviewers can spot these cases.

How to verify any Atlas answer

Every Atlas answer is auditable four ways.

Permalink. Each answer has an immutable 12-character answer_id at /atlas/a/[answer_id]. The plan, the data, and the prose are frozen at write time.
Plan Inspector. "How Atlas thought about this" opens the full AnalysisPlan JSON: archetype, sections, tools, parameters, narrative angle. Users can trace every number back to the tool call that produced it.
Containing-geo comparisons. Every answer lists the state, county, and 120th-Congress district that contain the shape. Users can sanity-check an arbitrary-shape margin against its parent geographies without leaving the page.
Exports. JSON, CSV, and GeoJSON export endpoints will let users re-run the analysis in their own tools. They ship in Phase 4 Batch B-2.

Known limitations (v1)

No forecasts. Atlas is explanatory, not predictive. For forecasts, see the forecasts product.
No exit-poll or survey data. Atlas reports precinct returns and demographics, not voter attitudes.
No international data.
Block-level precision covers 2016–2024. Pre-2016 questions resolve at state or county grain only.
Drawing requires a pointer device. Mapbox Draw does not support keyboard-only drawing. Touch is supported on tablet-class viewports.
Alaska precinct data gap. See the Alaska section above.
pct_non_hispanic_white is approximated at block grain. PL 94-171 publishes race and Hispanic-origin cross-tabulations at block-group grain but not at block grain; Atlas derives block-level NHW shares from the block-group rate applied to block totals.
Choropleth fills on compound block-group shapes are deferred to a future release. The underlying helper is built; the companion polygon fetch is not wired.
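The pct_non_hispanic_white approximation listed above amounts to applying the parent block group's rate to each block's total population. A minimal sketch, assuming hypothetical function and argument names and a 0 to 100 percentage scale:

```python
def approx_block_nhw(block_total: int, bg_nhw: int, bg_total: int) -> float:
    """Estimate a block's non-Hispanic white count from its block group's rate."""
    if bg_total <= 0:
        return 0.0
    return block_total * bg_nhw / bg_total


def approx_pct_nhw(block_total: int, bg_nhw: int, bg_total: int) -> float:
    """Block-level share in percent; equals the block-group rate by construction."""
    if block_total <= 0:
        return 0.0
    return 100.0 * approx_block_nhw(block_total, bg_nhw, bg_total) / block_total


# Hypothetical block of 120 people inside a block group of 1,500,
# of whom 900 are non-Hispanic white.
print(approx_block_nhw(120, 900, 1500), approx_pct_nhw(120, 900, 1500))
```

Every block in a block group receives the same share under this method, so within-block-group variation is invisible to it.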

Source citations

VEST (Voting and Election Science Team). Precinct-level results and shapefiles, 2016 and 2020. Harvard Dataverse. CC-BY.
The New York Times. Presidential precinct map 2024. GitHub.
Alaska Division of Elections. Certified 2024 precinct results. Official results.
U.S. Census Bureau. PL 94-171 Redistricting Data Summary File, 2020 decennial census. Census.gov. Public domain.
U.S. Census Bureau. American Community Survey 5-Year Estimates, 2020–2024 release. Census.gov. Public domain.
U.S. Census Bureau. Cartographic Boundary shapefiles (CB 2020, CB 2023) and TIGER/Line block and block-group geometry. Census.gov. Public domain.
MIT Election Data and Science Lab. County-level presidential and U.S. House returns, 1976–2024. MEDSL. CC-BY.
Carl Klarner. State legislative election returns, 1967–2023. Used with permission via the State Legislative Elections Database. Academic access.
Algara, Carlos; Amlani, Sharif. Replication data for U.S. county-level presidential, Senate, and gubernatorial returns, 1868–2020. Harvard Dataverse. CC-BY.
Pettigrew, Stephen; Miller, Michael. U.S. House primary returns, 1956–2018. Harvard Dataverse. Academic access.

Last updated: 2026-04-23 · v1.0