Exploring energy demand data in light of recent electricity policy changes
Author
Affiliation
Colleen McCamy
MEDS
Published
December 3, 2022
the question
Did the Time-of-Use electricity rate transition have an effect on energy demand in the greater San Diego region?
introduction
California has ambitious clean energy and decarbonization goals. 1 To achieve these goals, the state will need to increase its current electricity grid capacity by about three times.2 Time-of-use electricity rates are a valuable strategy to help reduce the investment needed for expanding grid capacity and can help maximize the use of renewable resources.3
The Time-of-Use (TOU) energy rate referenced in this analysis establishes lower electricity prices for times when there is more renewable energy supply available and helps to encourage electricity use when generation is cleanest and lowest-cost. Additionally, energy rates are higher during the evening peak to help promote less energy use during times when renewable energy supply decreases and grid operators need to ramp up generation from fossil-fuel based power plants. For the default residential rate which is the assumed TOU rate in this analysis, the higher cost hours or “peak hours” are from 4:00 - 9:00 p.m. 4
While many utilities have piloted time-varying rates, the recent California TOU transition was the biggest test of time-based rates yet - automatically switching over 20 million electricity consumers to a TOU rate. 5 While there has been some initial research and analysis on time-based rates in electricity markets, there are limited results on mass-market transitions and how time-based rates affect total electricity consumption.[^6]
Answering the question, ‘did the Time-of-Use electricity rate transition have an effect on peak energy demand in the San Diego region?,’ can help provide insight on the policy’s effect and spur further investigations on time-based rates throughout the state of California and beyond.
the data
Electricity Demand Data
Electricity demand data used in this analysis are publicly available and provided by the US Energy Information Administration. The data were downloaded via the API dashboard 6 and were selected to include hourly electricity demand in megawatt hours (MWh) for the dates of July 1, 2018 to July 31, 2022, in the local time zone (Pacific), and for the San Diego Gas and Electric (SDGE) subregion. SDGE serves 3.7 million people through 1.5 million electric meters. Their service territory covers 4,100 square miles in San Diego and southern Orange counties and the energy demand data are an aggregate hourly electricity demand from all customers with SDGE service. 7
checkout the code
#loading the necessary librarieslibrary(dplyr)library(tidyverse)library(here)library(readr)library(gt)library(tufte)library(feasts)library(janitor)library(lubridate)library(broom)library(tsibble)library(ggpubr)library(ggiraph)library(ggiraphExtra)library(sjPlot)library(ggcorrplot)library(car)library(modelr)# setting my root directoryrootdir <- ("/Users/colleenmccamy/Documents/MEDS/EDS_222_Stats/final_project")# reading in the dataeia_data_raw <-read_csv(paste0(rootdir, "/data/eia_data.csv"))# cleaning the data to be the two variables of interesteia_df <- eia_data_raw |>select(date, hourly_energy_mwh) |>na.omit()# creating a time series dataframeeia_ts <- eia_df |>as_tsibble()
Temperature Data
In California, peak electricity demand and temperature is highly correlated. 8 As this investigation looks into energy demand, temperature data was added to the analysis. The temperature data used in the following analysis are publicly available through the NOWData Online Weather Data portal from the National Weather Service, a branch of the National Oceanic and Atmospheric Administration.9 The temperature data includes an average of daily maximum, minimum, and average temperature from numerous weather stations throughout San Diego County in degrees Fahrenheit (F). This analysis uses maximum daily temperature at the same temporal scale as the energy demand data.
The temperature data is an aggregate of multiple stations throughout San Diego County, which can cause sampling bias. The SDGE service territory covers multiple different climate zones 10 which may not be accurately represented in proportion to hourly energy demand within the aggregate average temperature for the stations. Also, the weather stations are more heavily concentrated towards the coast which may be a source of bias producing more cooler and moderate maximum temperature in the data.
checkout the code
# loading in the temperature datatemp_data <-read_csv(paste0(rootdir, "/data/sd_temp_data.csv"))# wrangling the datatemp_data <- temp_data |>mutate(temp_max =as.numeric(temp_max)) |>mutate(temp_min =as.numeric(temp_min)) |>mutate(temp_avg =as.numeric(temp_avg)) |>mutate(temp_dept =as.numeric(temp_dept)) |>mutate(date = lubridate::mdy(Date)) |>select(!Date)# restructuring the eia data to merge the dataset with the temperature data by dateeia_data <- eia_df |>mutate(time = (date)) |>mutate(date =as.Date(date))eia_data$time <-format(eia_data$time, format ="%H:%M:%S")# merging the data into one dataframeenergy_temp_df <-left_join(x = eia_data,y = temp_data,by ="date")
Exploratory Data Visualizations
The following figures outline the sum of peak daily energy demand hours and the daily max temperature from our datasets.
checkout the code
# exploring the data by plotting energy demand throughout timeenergy_demand_plot <-ggplot(data = eia_df,aes(x = date, y = hourly_energy_mwh)) +geom_line(col ="#b52b8c") +labs(title ="Hourly Energy Demand (MWh)",x ="Date",y ="MWh") +theme_minimal() +theme(plot.title =element_text(hjust =0.5))# exploring the data by plotting maximum temperature throughout timemax_temp_plot <-ggplot(temp_data, aes(x = date, y = temp_max)) +geom_line(col ="#52796f") +labs(title ="Maximum Temperature per day (°F)",x ="Date",y ="Max Temperature (°F)") +theme_minimal() +theme(plot.title =element_text(hjust =0.5))# creating dataframe for tou peak horustou_peak_hours_df <- energy_temp_df |>filter(time >=16& time <=21)# grouping it for daily peak hours to plot with daily maximum temperaturedaily_peak_hrs_df <- tou_peak_hours_df |>group_by(date) |>summarize(daily_energy_mwh =sum(hourly_energy_mwh))# plotting daily peak energy demand with daily max temperaturespeak_demand_plot <-ggplot(data = daily_peak_hrs_df,aes(x = date, y = daily_energy_mwh)) +geom_line(col ="#b52b8c") +labs(title ="Hourly Energy Demand (MWh) Over Time",x ="Date",y ="Houlry Electricity Demand (MWh) Durin Peak Hours") +theme_minimal() +theme(plot.title =element_text(hjust =0.5))# plotting along with daily temperatureggarrange(peak_demand_plot, max_temp_plot,ncol =2, nrow =1)
checkout the code
# restructuring the eia data to merge the dataset with the temperature data by dateeia_data <- eia_df |>mutate(time = (date)) |>mutate(date =as.Date(date))eia_data$time <-format(eia_data$time, format ="%H:%M:%S")# merging the data into one dataframeenergy_temp_df <-left_join(x = eia_data,y = temp_data,by ="date")
We can see a spike in the peak hourly electricity demand in late 2018. Upon further investigation, it is unclear what may be causing this demand. For a more accurate analysis, it would be important to understand this spike. Solutions could include confirming that this spike was also present in other utility energy demand data and identifying additional factors which could cause this spike.
analysis
A multi-linear regression model and time series decomposition analysis can help answer the question at hand. Prior to conducting the linear model, I also used summary statistics to a establish cutoff point in creating temperature as a dichotomous variable.
Linear Model
To investigate if the TOU policy influenced hourly energy demand, I used a multiple linear regression model. The equation for this model is: \[hwy_i =\beta_{0}+\beta_{1} \cdot TOUPolicy_i +\beta_{2} \cdot \text HotDay_i+ \beta_{3} \cdot \text PeakHour_i +\varepsilon_i\]
The ‘TOUPolicy’ predictor (‘tou_policy’ in the results) is a dichotomous variable indicating if the TOU Policy was in effect or not. The ‘PeakHour’ predictor is also a dichotomous variable which indicates whether or not the hour of the day falls within peak TOU pricing times from 4:00 - 9:00 p.m.
Lastly, the ‘HotDay’ variable (‘hot_day’ in the results) is a dichotomous variable indicating if the maximum temperature for the San Diego region was equal to or greater than 80°F or below. This cutoff temperature was determined by looking at the mean and standard deviation of the maximum temperature in San Diego during the time of interest. The average maximum temperature was about 72°F and the standard deviation about 7°F. Thus, 80°F was determined to be a ‘hot day’ for exploring the effect of heat and hourly electricity demand in the model.
checkout the code
### ---- Determining a "Hot Day" ---- # determining the mean and standard deviation for the time period of interestmean_max_temp <-mean(energy_temp_df$temp_max, na.rm =TRUE)sd_max_temp <-sd(energy_temp_df$temp_max, na.rm =TRUE)# preparing the data to plotbox_data <-as_tibble(energy_temp_df$temp_max)### ---- Adding a 'Hot Day' Indicator in the Dataframe ---- temp_demand_daily <- energy_temp_df |>group_by(date) |>summarize(daily_energy_mwh =sum(hourly_energy_mwh)) |>left_join(temp_data, by ="date") |>mutate(hot_day =case_when( (temp_max >=80) ~1, (temp_max <=79) ~0))### ----- Adding TOU Policy and Peak Hours to Dataframe -----# adding a year separate year column in the dataframeenergy_temp_df <- energy_temp_df |>mutate(year = date)energy_temp_df$year <-format(energy_temp_df$year, format ="%Y") # using variables to create dichotomous predictorsenergy_temp_df <- energy_temp_df |>mutate(tou_policy =case_when( (year >2020) ~1, (year <=2020) ~0)) |>mutate(time =as_datetime(time, format ="%H:%M:%S")) |>mutate(time = lubridate::hour(time)) |>mutate(tou_policy =case_when( (year >2020) ~1, (year <=2020) ~0)) |>mutate(peak_hours =case_when( (time <16) ~0, (time >=16& time <=21 ) ~1, (time >21) ~0)) |>mutate(hot_day =case_when( (temp_max >=80) ~1, (temp_max <=79) ~0))#### ----- Linear Regression on Hourly Energy Demand ---- ###model_tou_peak_demand <-lm(formula = hourly_energy_mwh ~ tou_policy + peak_hours + hot_day, data = energy_temp_df)
The previous regression model predicts energy demand given that each predictor doesn’t influence the relationship between the energy demand and other variables. However, thinking about the goal of the TOU policy, we can add an interaction model to the linear regression to see if the effect of peak hours on hourly electricity demand is affected by the TOU policy implementation. I would predict that there would be a greater energy demand decrease after the TOU policy was implemented than before during the peak hours. The new interaction model equation is: \[hwy_i =\beta_{0}+\beta_{1} \cdot TOUPolicy_i +\beta_{2} \cdot \text HotDay_i+ \beta_{3} \cdot \text PeakHour_i + \beta_{3} \cdot \text PeakHour_i \cdot TOUPolicy_i + \varepsilon_i\]
Diving deeper into additional effects on energy demand, I conducted a classical decomposition analysis to investigate if seasonality or any overall trends influence hourly electricity demand within the time frame we are interested in.
checkout the code
x =seq(from =ymd('2018-07-1'), length.out =1481,by='day')# preparing the dataframe for the time seriesdecom_df <- energy_temp_df |>group_by(date) |>summarize(daily_energy_mwh =sum(hourly_energy_mwh)) |>mutate(index = x)decom_ts <-as_tsibble(decom_df, index = index)decom_plot_annual <-model(decom_ts, classical_decomposition(daily_energy_mwh ~season(365), type ="additive")) |>components() |>autoplot(col ="#3d405b") +theme_minimal() +labs(title ="Classical Decomposition Model",subtitle ="Seasonality defined as 365 days",x ="Date",caption ="Figure 4")decom_plot_monthly <-model(decom_ts, classical_decomposition(daily_energy_mwh ~season(30), type ="additive")) |>components() |>autoplot(col ="#3d405b") +theme_minimal() +labs(title ="Classical Decomposition Model",subtitle ="Seasonality defined as 30 days",x ="Date",caption ="Figure 3")
results
Multiple Linear Regression:
From our model, we can interpret all of the predictors used in the regression are statistically significant in predicting hourly electricity demand at a significance level of 0.001 as they all had a p-value of 2 x e-16. The model indicates that when the daily maximum temperature is below 80°F, during non-peak hours prior to the Time-of-Use implementation in 2020, the average hourly electricity demand is about 2,164 MWh for the SDGE’s service territory. In addition, we expect to see on average a decrease in hourly electricity demand by about 108 MWh for years after the Time-of-Use energy policy was implemented, holding all other predictors constant. For days in which the maximum temperature is above 80°F, the model predicts that the average hourly electricity demand increases by about 409 MWh holding all other predictors constant.
Interestingly, the model predicts that the average hourly electricity demand decreases by about 360 MWh holding all other predictors constant. At first thought, we may expect too see hourly electricity demand to increase during peak times as these are times in which the TOU electricity policy calls out as times with higher energy demand. However, in this analysis we didn’t look at the amount of renewable electricity available on the grid. Thus, overall energy demand may be lower during peak times. However it is possible the percent of average hourly electricity demand in relation to hourly renewable electricity available on the grid may be higher during peak times than non-peak times.
Table 1 in the supporting figures section highlights these estimates, p-value and confidence interval for each of the predictors and intercept and the following equation for the linear regression model is:
Multiple Linear Regression with the Interaction Model:
This interaction model indicates that the impact of the TOU policy on hourly electricity demand is 98 MWh lower in times that fall within the peak hours compared to times that fall outside peak hours. This estimate is statistically significant at a significance level of 0.001 (p-value of 2 x e-16). Figure 2 illustrates the linear model with the interaction added. From the graph, the differences in slopes show that the relationship between energy demand and peak hours varies based on the implementation of the TOU policy.
The adjusted R2 values also increased slightly (from 0.213 to 0.214) when adding the interaction model. This demonstrates that the interaction model just slightly increases model fit.
However, this adjusted R2 value also indicates that only about 21% of the variability in hourly electricity demand is explained by the model. With this information, we can hypothesize that there are other factors that affect hourly electricity demand not included within this model. Therefore, we can state that the time-of-use policy implementation had a statically significant decrease in hourly electricity demand (at a significance level of 0.001) but overall hourly electricity demand is more greater affected by other predictors.
Time Series Analysis - Classical Decomposition:
To better understand the other factors that influence hourly energy demand, we can look at classical decomposition graphs for our time series data for both yearly and monthly seasonality (Figure 3 and 4).
Looking at the graphs there doesn’t appear to be evidence of a long-run trend in hourly energy demand over the time period analyzed as the trend seems to be mostly constant when seasonality is defined as 30 days and 365 days. It also appears that seasonality may be important in driving overall variation in electricity demand when seasonality is defined as 365 days since the gray bar is closer in height to gray bar for the overall time series graph. Anecdotally, this is intuitive as we can predict that the variance in hourly electricity could be affected by the month since month of the year and temperature are correlated and energy demand and temperature are also correlated. However, when seasonality is defined as 30 days, the seasonal effect appears to be not as important in driving overall variation in electricity demand.
discussion & conclusion
With this initial analysis we can conclude that the TOU energy rate policy had an effect on hourly electricity demand in SDGE’s service territory, and had a greater effect during peak hours. However, we can also see that other factors not addressed in the model account for more of the variation in hourly electricity demand.
Additional research and analysis should be done to determine how the TOU policy affects electricity demand in relation to the other factors not addressed in this analysis. This can include dividing electricity demand by customer type or rate class, as this analysis just uses an aggregated energy demand for all customers. Furthermore, this analysis was conducted with the assumption that all SDGE customers transitioned to a TOU rate that had peak hours from 4:00 - 9:00 p.m. (TOU-C rate) and were transitioned all at the same time. We know that this is not the case. SDGE conducted a rolling transition throughout the year 2020 and not all customers were transitioned to the TOU-C rate. Conducting this analysis with a more accurate indication of TOU implementation per customer is needed. However, this type information is not publicly available by utilities as it can include confidential customer information.
This analysis is also spatially limited for SDGE’s service area. Additional research can expand this investigation statewide. Lastly, this investigation just looks at electricity demand. However, as noted previously a key part of TOU policy is to reduce energy use when demand is high and renewable supply is low. Instead of solely using hourly electricity demand as the outcome variable, future research can look into how the TOU policy affects energy demand when renewable supply is low.
Given additional analysis is conducted, this information can be used to inform policy makers, energy providers (load-serving entities) and grid operators about the effectiveness of the TOU policy and how the policy can continue to support California’s clean energy goals.
tab_model(model_tou_peak_demand,pred.labels =c("Intercept", "TOU Policy In Effect", "During Peak Hours", "Max. Temp above 80 (°F)"),dv.labels =c("Hourly Electricity Demand (MWh)"),string.ci ="Conf. Int (95%)",string.p ="P-value",title ="Table 1. Linear Model Results for Predictors on Hourly Electricity Demand",digits =2)
Table 1. Linear Model Results for Predictors on Hourly Electricity Demand
Hourly Electricity Demand (MWh)
Predictors
Estimates
Conf. Int (95%)
P-value
Intercept
2164.03
2157.53 – 2170.52
<0.001
TOU Policy In Effect
-107.82
-116.89 – -98.75
<0.001
During Peak Hours
-360.18
-370.32 – -350.03
<0.001
Max. Temp above 80 (°F)
409.10
396.47 – 421.74
<0.001
Observations
36006
R2 / R2 adjusted
0.213 / 0.213
Table 2:
checkout the code
tab_model(model_int_tou_peak_demand,pred.labels =c("Intercept", "TOU Policy In Effect", "During Peak Hours", "Max. Temp above 80 (°F)","TOU Policy & Peak Hours"),dv.labels =c("Hourly Electricity Demand (MWh)"),string.ci ="Conf. Int (95%)",string.p ="P-value",title ="Table 2. Linear Model Results for Predictors on Hourly Electricity Demand with an Interaction Addition",digits =2)
Table 2. Linear Model Results for Predictors on Hourly Electricity Demand with an Interaction Addition
Hourly Electricity Demand (MWh)
Predictors
Estimates
Conf. Int (95%)
P-value
Intercept
2154.55
2147.77 – 2161.34
<0.001
TOU Policy In Effect
-83.11
-93.56 – -72.66
<0.001
During Peak Hours
-322.25
-335.15 – -309.35
<0.001
Max. Temp above 80 (°F)
409.12
396.50 – 421.73
<0.001
TOU Policy & Peak Hours
-98.96
-119.80 – -78.12
<0.001
Observations
36006
R2 / R2 adjusted
0.215 / 0.215
QQ Plot for hourly energy demand residuals: This supports that a linear model is an appropriate method in conducting our analysis as the residual from the model predictions appear to be mainly normal.
Figure 5:
checkout the code
aug <- energy_temp_df |>add_predictions(model_int_tou_peak_demand) |>mutate(residuals_energy = hourly_energy_mwh - pred)qqPlot(aug$residuals_energy)
[1] 16604 16605
Box plot for exploring Maximum Temperature Data:
Figure 6:
checkout the code
# plotting the mean and standard deviationtemp_box <-ggplot(box_data) +geom_boxplot(aes(x = value), col ="#300e2e",fill ="#8a6d88") +labs(x ="Maximum Daily Temperature (°F)") +theme_minimal()temp_box
references
Footnotes
California, State of. 2022. “California Releases World’s First Plan to Achieve Net Zero Carbon Pollution.” California Governor. November 16, 2022. https://www.gov.ca.gov/2022/11/16/california-releases-worlds-first-plan-to-achieve-net-zero-carbon-pollution/.↩︎
California, State of. 2022. “California Releases World’s First Plan to Achieve Net Zero Carbon Pollution.” California Governor. November 16, 2022. https://www.gov.ca.gov/2022/11/16/california-releases-worlds-first-plan-to-achieve-net-zero-carbon-pollution/.↩︎
“Time of Use.” n.d. SVCE (blog). Accessed December 3, 2022. https://svcleanenergy.org/time-of-use/.↩︎
James Sherwood et al., A Review of Alternative Rate Designs: Industry experience with time-based and demand charge rates for mass-market customers (Rocky Mountain Institute, May 2016), http://www.rmi. org/alternative_rate_designs.↩︎
“API Dashboard - U.S. Energy Information Administration (EIA).” n.d. Accessed December 3, 2022. https://www.eia.gov/opendata/browser/electricity/rto/region-sub-ba-data.↩︎
“Our Company | San Diego Gas & Electric.” n.d. Accessed December 3, 2022. https://www.sdge.com/more-information/our-company.↩︎
“Our Company | San Diego Gas & Electric.” n.d. Accessed December 3, 2022. https://www.sdge.com/more-information/our-company.↩︎
Miller, Norman L., Katharine Hayhoe, Jiming Jin, and Maximilian Auffhammer. 2008. “Climate, Extreme Heat, and Electricity Demand in California.” Journal of Applied Meteorology and Climatology 47 (6): 1834–44. https://doi.org/10.1175/2007JAMC1480.1.↩︎
US Department of Commerce, NOAA. n.d. “Climate.” NOAA’s National Weather Service. Accessed December 3, 2022. https://www.weather.gov/wrh/Climate?wfo=sgx.↩︎