Course: Data, Environment and Society
Term: Fall 2025
Format: Team of four · 8 weeks
Discipline: Applied ML · Energy Policy

Predicting where California's grid will buckle next.

A lightweight machine-learning forecast for feeder-level congestion across PG&E's distribution system, built entirely on public data, with a test RMSE of about 240 kW.

Random Forest models trained on weather, load shapes, EV adoption, and DER data predicted feeder headroom with roughly one-sixth the RMSE of linear baselines (about 240 kW versus 1,540–1,580 kW), offering utilities a fast, scalable complement to PG&E's labor-intensive ICA studies.

240 kW
RMSE on regression task
0.99
R² on headroom forecast
98%
Accuracy classifying congested feeders
My contribution
Sourced and cleaned the EV-adoption pipeline from California Energy Commission registration data, and led the multi-class tiering prediction problem.
Team — Xiaoxi Cui (weather) · Brooke Eichenlaub (pipeline lead) · Phillip Healy (regression & binary classifier) · Parker Tikson
§ 01 — Problem

A planning tool that can't keep up with the grid it's meant to plan.

Context
↳ Distribution-level
↳ PG&E territory
↳ 2024 baseline

California's distribution grid is being asked to absorb electrification, rooftop solar, and EV charging far faster than its planning assumptions were ever designed for. The tool utilities lean on — PG&E's Integration Capacity Analysis maps — gives only static snapshots, takes weeks to regenerate, and is updated only quarterly.

The result is a planning gap: regulators know some feeders are heading toward congestion, but can't see which ones, when, or under what scenarios. Programs that depend on locational targeting — managed EV charging, behind-the-meter storage incentives — have historically underperformed for exactly this reason.

§ 02 — Approach

Four public datasets, three prediction problems, one pipeline.

Sources
↳ CIMIS
↳ CALMAC
↳ CEC ZEV
↳ PG&E GRIP

We built a pipeline that joins hourly weather data (CIMIS), residential load-shape granular profiles (CALMAC), ZIP-level EV adoption (CEC), and feeder-level Integration Capacity Analysis values (PG&E GRIP) into a single feature table at the feeder × month × hour grain.

Data pipeline → modeling targets
Inputs
↳ CIMIS: hourly weather · 145 stations
↳ CALMAC GP: 160 residential load shapes
↳ CEC ZEV: ZIP-level EV registrations
↳ PG&E GRIP / ICA: feeder hosting capacity
Joined output
↳ Cleaned feature table at feeder × month × hour
Modeling targets
↳ Regression: predict headroom (MW)
↳ Binary: below 1 MW headroom?
↳ Multi-class: low / med / high tier
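The join itself can be sketched in a few lines of pandas. This is a minimal illustration on toy data, not the project's actual pipeline: every column name (`station_id`, `zip`, `res_load_kw`, `ev_count`, `headroom_mw`) and the idea of a precomputed feeder-to-station mapping are assumptions made for the example.

```python
import pandas as pd

# Toy stand-ins for the four sources. All column names are hypothetical.
weather = pd.DataFrame({
    "station_id": ["S1", "S1", "S2", "S2"],
    "month": [7, 7, 7, 7],
    "hour": [17, 18, 17, 18],
    "temp_c": [34.0, 32.5, 29.0, 28.0],
})
load_shapes = pd.DataFrame({
    "month": [7, 7],
    "hour": [17, 18],
    "res_load_kw": [2.1, 2.4],  # average residential profile per month-hour
})
ev = pd.DataFrame({
    "zip": ["94601", "95814"],
    "ev_count": [1200, 800],
})
feeders = pd.DataFrame({
    "feeder_id": ["F001", "F002"],
    "zip": ["94601", "95814"],
    "station_id": ["S1", "S2"],  # nearest weather station, assumed precomputed
    "headroom_mw": [0.6, 4.2],
})


def build_feature_table(feeders, weather, load_shapes, ev):
    """Join the sources into one row per feeder x month x hour."""
    # Expanding each feeder by its station's month-hour rows sets the grain.
    table = feeders.merge(weather, on="station_id", how="left")
    # validate= makes pandas raise if a lookup table has duplicate keys,
    # which would otherwise silently multiply rows.
    table = table.merge(load_shapes, on=["month", "hour"], how="left",
                        validate="many_to_one")
    table = table.merge(ev, on="zip", how="left", validate="many_to_one")
    return table.sort_values(["feeder_id", "month", "hour"]).reset_index(drop=True)


features = build_feature_table(feeders, weather, load_shapes, ev)
print(features)
```

Checking for duplicate (feeder, month, hour) keys after each merge is a cheap way to catch join mistakes early.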

On top of that table we trained four model families — OLS, Ridge, Lasso, and Random Forest — across three reframings of the same underlying question: continuous headroom forecasting for short-term operational planning, binary classification for near-term overload risk, and multi-class tiering for multi-year capital planning.
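A comparison of this kind can be sketched with scikit-learn. The data below is synthetic and the hyperparameters are placeholders, so the numbers it prints say nothing about the project's results. It only illustrates why a tree ensemble can beat linear models when the target depends on an interaction (here, an assumed DER × midday effect).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.uniform(0, 1, n),     # stand-in for DER penetration
    rng.integers(0, 24, n),   # hour of day
    rng.uniform(10, 40, n),   # temperature (deg C)
])

# Synthetic headroom in MW: DER only matters during midday hours,
# an interaction that a purely additive linear model cannot represent.
midday = (X[:, 1] >= 10) & (X[:, 1] <= 15)
y = 5.0 - 4.0 * X[:, 0] * midday - 0.03 * X[:, 2] + rng.normal(0, 0.1, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "OLS": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.01),
    "Random Forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    scores[name] = (rmse, float(r2_score(y_test, pred)))
    print(f"{name:14s} RMSE={rmse:.3f} MW  R2={scores[name][1]:.3f}")
```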

Tier 0 · Low risk
> 5 MW headroom
Comfortable margin. Approve DER interconnections without concern.
Tier 1 · Medium risk
1–5 MW headroom
Watchlist. Candidate for storage or load-shifting programs.
Tier 2 · High risk
< 1 MW headroom
Stress zone. Triage for inspection or capital upgrade pipeline.
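The tier thresholds and the binary "below 1 MW" target can both be derived from the same continuous headroom value. A minimal sketch follows. The function names are hypothetical, and assigning the exact boundaries (1 MW and 5 MW) to Tier 1 is an assumption, since the tier definitions do not say which side the boundaries fall on.

```python
import numpy as np


def headroom_to_tier(headroom_mw):
    """Map headroom in MW to risk tier: 0 = low, 1 = medium, 2 = high."""
    h = np.asarray(headroom_mw, dtype=float)
    if np.isnan(h).any():
        # NaN fails every comparison and would fall through to Tier 0,
        # so missing values are rejected rather than silently labeled.
        raise ValueError("headroom contains missing values")
    # Assumption: exactly 1 MW and exactly 5 MW both count as Tier 1.
    return np.select([h < 1.0, h <= 5.0], [2, 1], default=0)


def is_congested(headroom_mw):
    """Binary target: 1 if headroom is below 1 MW, else 0."""
    return (np.asarray(headroom_mw, dtype=float) < 1.0).astype(int)


sample = [0.4, 1.0, 3.2, 5.0, 7.5]
print(headroom_to_tier(sample).tolist())  # [2, 1, 1, 1, 0]
print(is_congested(sample).tolist())      # [1, 0, 0, 0, 0]
```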
Python · pandas · scikit-learn · Random Forest · geospatial joins · public-data scraping · Jupyter · EDA · cross-validation
§ 03 — Results

Random Forest didn't just win. It wasn't close.

Test set
↳ 80/20 split
↳ stratified
↳ 5-fold CV

Across all three problems, the Random Forest ensemble captured non-linear interactions between customer mix, DER penetration, weather, and time-of-day that linear models could not represent. For the regression task, predicting continuous headroom, the gap was more than sixfold in RMSE.

Regression — feeder headroom (MW)

PROBLEM 01
Model          RMSE        R²    Notes
OLS            ~1,580 kW   0.71  Misses non-linear DER × time-of-day effects
Ridge          ~1,560 kW   0.72  Marginal gain over OLS
Lasso          ~1,540 kW   0.72  Useful for feature selection only
Random Forest  ~240 kW     0.99  Captures interaction structure cleanly

For the binary congested-vs-not task, the tuned Random Forest hit 97.95% accuracy with a ROC-AUC of 0.997. The metric that matters most operationally — recall on the congested class — came in above 95%, meaning the model rarely tells a planner "you're fine" when in fact the feeder is hitting its limit.
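The evaluation described here (stratified 80/20 split, 5-fold cross-validation, then accuracy, ROC-AUC, and recall on the congested class) can be sketched as below. The data is synthetic and the classifier is untuned, so the printed values are illustrative only. The labeling rule and hyperparameters are assumptions, not the project's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import (
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)

rng = np.random.default_rng(1)
n = 3000
X = rng.normal(size=(n, 4))

# Synthetic label: 1 = congested. The rule includes a feature interaction
# so the minority class is not linearly separable.
y = (X[:, 0] * X[:, 1] + 0.5 * X[:, 2] > 1.0).astype(int)

# Stratify so the congested share is the same in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

clf = RandomForestClassifier(n_estimators=200, random_state=1)

# 5-fold stratified CV on the training portion, scored on congested-class recall.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
cv_recall = cross_val_score(clf, X_train, y_train, cv=cv, scoring="recall")

clf.fit(X_train, y_train)
pred = clf.predict(X_test)
proba = clf.predict_proba(X_test)[:, 1]  # probability of the congested class

metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "roc_auc": roc_auc_score(y_test, proba),
    "recall_congested": recall_score(y_test, pred, pos_label=1),
}
print("CV recall per fold:", np.round(cv_recall, 3))
print({k: round(float(v), 3) for k, v in metrics.items()})
```

Reporting recall separately matters because accuracy can look strong on an imbalanced dataset even when many congested feeders are missed.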

The multi-class tiering model, which I led, performed strongly on the dominant medium-risk tier and meaningfully better than logistic regression on the high-risk extremes — exactly the feeders utilities most need to identify for capital planning.

§ 04 — So what

A complement to engineering studies, not a replacement.

Use cases
↳ DER siting
↳ Capital planning
↳ Program targeting

The honest framing matters here. These models don't enforce power-system physics. They smooth over peak events because they're trained at month-hour resolution. EV data is annual and tied to registration addresses, not actual charging locations. All of these limitations bias the model toward underestimating risk.
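The peak-smoothing limitation is easy to demonstrate. The sketch below uses an invented July load series with a single heat-wave spike and an assumed 8 MW feeder capacity, then compares the minimum headroom seen at hourly resolution with the minimum seen after averaging to the month × hour grain.

```python
import numpy as np
import pandas as pd

# One month of hourly timestamps (31 days x 24 hours).
idx = pd.date_range("2024-07-01", periods=31 * 24, freq=pd.Timedelta(hours=1))
hours = idx.hour.to_numpy()

# Invented daily load shape in MW, peaking at 18:00.
load = pd.Series(3.0 + 1.5 * np.cos((hours - 18) / 24 * 2 * np.pi), index=idx)

# A single heat-wave evening pushes load well above the usual peak.
load.loc[pd.Timestamp("2024-07-15 18:00")] = 7.0

capacity_mw = 8.0  # assumed feeder capacity, for illustration only

# Worst-case headroom at full hourly resolution.
hourly_min_headroom = capacity_mw - load.max()

# Worst-case headroom after averaging to the month x hour grain.
month_hour_mean = load.groupby([load.index.month, load.index.hour]).mean()
smoothed_min_headroom = capacity_mw - month_hour_mean.max()

print(f"hourly minimum headroom:     {hourly_min_headroom:.2f} MW")
print(f"month-hour minimum headroom: {smoothed_min_headroom:.2f} MW")
```

In this toy case the averaged series reports far more headroom than the feeder had on its worst hour, which is the direction of bias described above.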

But for the use case we set out to address — giving regulators and program administrators a fast, scenario-friendly first pass before commissioning full ICA studies — the models work. A feeder flagged as Tier 2 deserves engineering attention. A managed-charging pilot rolled out to flagged ZIPs gets meaningfully better locational targeting than the historical baseline.

Looking back

The hardest part wasn't the modeling — it was getting four messy public datasets to agree on what a "feeder" was. If I redid this, I'd start with the join logic and work outward.

Artifacts