Microsoft Research releases open-source U.S. power grid dataset derived from public data
A new pipeline converts publicly available geographic and energy datasets into transmission-level models of the U.S. power grid, spanning 48 states and 20,000+ buses, enabling physics-based analysis without restricted infrastructure data.
- Microsoft Research constructed geographically grounded, electrically coherent transmission models of the U.S. power grid entirely from open data sources including OpenStreetMap, EIA statistics, and Census data.
- The dataset spans 48 states and multi-state interconnections, with models ranging from 11-bus systems to the full Eastern Interconnection at 21,697 buses.
- The pipeline checks solvability with AC optimal power flow (AC-OPF) analysis, confirming that the models are electrically coherent and practically usable rather than synthetic toy networks.
- Released models support applications including transmission expansion planning, targeted line upgrades, and datacenter load placement studies without requiring proprietary grid data.
- The work addresses a gap in power systems research where realistic transmission data is classified as critical infrastructure and restricted, forcing researchers to choose between small toy networks or unrealistic synthetic models.
Microsoft Research has released an open dataset of transmission-level models representing the U.S. power grid, constructed entirely from publicly available geographic, energy, and demographic data. The effort addresses a longstanding constraint in power systems research: realistic transmission data in the United States is classified as critical infrastructure and subject to strict access controls, forcing researchers to work with either small academic networks or synthetic models that do not represent actual system behavior.
The pipeline ingests OpenStreetMap data for the physical layout of transmission corridors and substations, then augments it with U.S. Energy Information Administration statistics and Census data to represent generation capacity, fuel mix, demand patterns, and operational boundaries. The resulting models are geographically grounded, meaning they preserve the actual spatial structure of the transmission network. They are also tested for electrical coherence through AC optimal power flow (AC-OPF) analysis, a standard engineering check that a model can be solved and reflects realistic physics.
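To give a sense of what a solvability check involves, here is a minimal sketch of the AC power flow equations that AC-OPF builds on. It is not the released pipeline and it is not a full AC-OPF, which also minimizes generation cost subject to operating limits. The two-bus network, the function name `solve_two_bus`, and all per-unit values are assumptions made for illustration, and the solver is a plain Gauss-Seidel iteration.

```python
# Hypothetical two-bus example: a slack bus held at 1.0 p.u. feeds a single
# load through one line. All values are illustrative and in per-unit.

def solve_two_bus(p_load, q_load, r=0.01, x=0.1, tol=1e-10, max_iter=200):
    """Gauss-Seidel AC power flow for one load bus.

    Returns (converged, load_bus_voltage) where the voltage is complex.
    """
    v_slack = complex(1.0, 0.0)
    z = complex(r, x)                  # series impedance of the line
    s_load = complex(p_load, q_load)   # complex power drawn by the load
    v = v_slack                        # flat start
    for _ in range(max_iter):
        if v == 0:
            return False, v
        # Load current is conj(S / V); the voltage drop on the line is z * I.
        v_new = v_slack - z * (s_load / v).conjugate()
        if abs(v_new - v) < tol:
            return True, v_new
        v = v_new
    return False, v


# A moderate load has a solution; a load beyond the line's transfer
# capability has none, so the iteration never settles.
ok, v = solve_two_bus(0.5, 0.2)
print("moderate load converged:", ok, "|V| =", round(abs(v), 3))
heavy_ok, _ = solve_two_bus(5.0, 2.0)
print("heavy load converged:", heavy_ok)
```

A model that fails to converge under its nominal load is the kind of inconsistency a solvability check is meant to expose, although production studies use dedicated solvers rather than a hand-written loop.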
The dataset includes transmission models spanning 48 U.S. states and interconnection-scale networks, ranging from small systems with 11 buses to the full Eastern Interconnection covering 36 states and 21,697 buses. The researchers demonstrated that open-data-derived models support convergent AC-OPF solutions even at this largest scale, indicating the approach produces practically usable models rather than abstract benchmarks. This capability enables researchers to analyze transmission congestion, identify where new demand can be absorbed, and model how infrastructure changes propagate through realistic network topologies.
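The question of where new demand can be absorbed can be illustrated on the smallest possible case: one load fed from a fixed-voltage source over one line. The sketch below is an assumption-laden toy, not the method used in the study. The function name `max_load_pu` and the line parameters are hypothetical, and a real study would run AC-OPF on the full network. It uses the two-bus voltage equation, with U the squared load-bus voltage magnitude:

U^2 + (2(rP + xQ) - Vs^2) U + |z|^2 (P^2 + Q^2) = 0

A positive real root exists only while the load stays below a limit, which for a fixed Q/P ratio has a closed form.

```python
import math


def max_load_pu(r, x, tan_phi=0.0, v_source=1.0):
    """Largest active load (per-unit) one line can deliver at a fixed Q/P ratio.

    r, x     : series resistance and reactance of the line (p.u.)
    tan_phi  : Q/P of the load (0.0 means unity power factor; assumed >= 0)
    v_source : source-side voltage magnitude (p.u.)
    """
    a = r + x * tan_phi                             # so that rP + xQ = a * P
    b = (r * r + x * x) * (1.0 + tan_phi ** 2)      # so that |z|^2 |S|^2 = b * P^2
    denom = 2.0 * (a + math.sqrt(b))
    if denom <= 0.0:
        raise ValueError("line impedance must be non-zero")
    # The quadratic in U has a positive real root iff Vs^2 - 2aP >= 2*sqrt(b)*P.
    return v_source ** 2 / denom


# Two hypothetical candidate sites, differing only in line reactance.
for name, x in [("site A", 0.10), ("site B", 0.25)]:
    print(name, "max load (p.u.):", round(max_load_pu(0.0, x), 3))
```

For a lossless line at unity power factor this reduces to the familiar limit Vs^2 / (2x), so the electrically closer site in the example can take more load.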
Demonstrated applications include assessing transmission expansion potential, placing targeted line upgrades, and optimizing the location of large datacenter loads. Access to realistic grid data typically involves lengthy approval cycles or commercial licensing. By removing that barrier, the work opens grid modernization problems to a broader set of researchers and supports the development of AI and data-driven algorithms that need large volumes of physically plausible training data.