The power to predict - Forecasting energy consumption in Buildings
Updated: Jan 8
Forecasting a building's energy consumption is important for predicting the loads that the electric grid needs to serve. A large share of global energy demand comes from buildings.
In this project, I worked with real data from an office building, recorded hourly for a year. I read the data from the CSV file, converted the timestamp information to datetime objects, and set the datetime column as the index of the dataframe. The data is then ready for some basic analysis.
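The loading step described above might look roughly like the sketch below. The inline CSV text and the column name timestamp are assumptions for illustration only; the real file and its layout are not shown in this post.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file; the rows and the
# "timestamp" column name are assumptions, not the actual data.
csv_text = """timestamp,consumption_rate
2017-01-01 00:00:00,12.5
2017-01-01 01:00:00,11.8
2017-01-01 02:00:00,11.2
2017-01-01 03:00:00,11.0
"""

# For a real file, pass its path instead of the StringIO buffer.
df = pd.read_csv(io.StringIO(csv_text))

# Convert the timestamp strings to datetime objects and use them as the index.
df["timestamp"] = pd.to_datetime(df["timestamp"])
df = df.set_index("timestamp").sort_index()

print(df.index.dtype)
print(df.head())
```

With a datetime index in place, pandas can slice by date and group by hour or weekday directly.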
Let's take a closer look at the entire data set through a plot.
We can clearly see a 24-hour seasonality in the data, caused by the daily pattern of low consumption during non-working hours. There is also a 7-day seasonality due to very low energy consumption during weekends.
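One way to check both seasonal patterns numerically, rather than by eye, is to average the consumption by hour of day and by day of week. The sketch below uses a synthetic series (an assumption, not the real building data) with high load during weekday working hours.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series (an assumption, not the real building data):
# high load on weekdays between 08:00 and 18:00, low load otherwise.
# 2017-01-02 is a Monday, so the series covers four full weeks.
index = pd.date_range("2017-01-02", periods=24 * 7 * 4, freq="h")
working = (index.dayofweek < 5) & (index.hour >= 8) & (index.hour < 18)
consumption = pd.Series(np.where(working, 50.0, 10.0),
                        index=index, name="consumption_rate")

# Average load for each hour of the day (0-23): reveals the daily cycle.
by_hour = consumption.groupby(consumption.index.hour).mean()

# Average load for each day of the week (0 = Monday): reveals the weekly cycle.
by_weekday = consumption.groupby(consumption.index.dayofweek).mean()

print(by_hour)
print(by_weekday)
```

In this synthetic example the midday averages sit well above the night-time ones, and Saturday and Sunday sit below the weekdays, which is the same shape the plot shows.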
Let's smooth the plot to see if there is a general trend through the year as well.
We plot the rolling mean over a window of 200 data points, and indeed we can now see a gradual upward trend in the data as the year progresses.
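A minimal sketch of the smoothing step, assuming a pandas Series with a datetime index. The synthetic data here (a slow upward trend plus a daily cycle) is an assumption standing in for the real measurements.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series (assumption): slow upward trend plus a daily cycle.
hours = np.arange(24 * 60)
index = pd.date_range("2017-01-01", periods=len(hours), freq="h")
values = 20.0 + 0.01 * hours + 5.0 * np.sin(2 * np.pi * hours / 24)
consumption = pd.Series(values, index=index, name="consumption_rate")

# Rolling mean over 200 points. The first 199 values are NaN because
# a full window is not yet available.
smoothed = consumption.rolling(window=200).mean()

# Compare the first and last smoothed values to see the trend.
print(smoothed.dropna().iloc[[0, -1]])
```

A window of 200 hours spans a little over eight days, so it averages out most of the daily and weekly cycles and leaves the slower trend visible.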
The data therefore has seasonality, and its mean has an upward trend, so the data is not stationary. The trend can be handled by the ARIMA (Autoregressive Integrated Moving Average) model, which takes differences between the data points, but the seasonal nature of the data calls for the SARIMA (Seasonal ARIMA) model.
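To illustrate what differencing does, the sketch below applies a first difference and a seasonal (lag 24) difference to a synthetic series made of a linear trend plus a perfect daily cycle. The series is an assumption chosen so that the effect is easy to see; real data would leave noisy residuals rather than exact constants.

```python
import numpy as np
import pandas as pd

# Synthetic series (assumption): linear trend plus an exact 24-hour cycle.
hours = np.arange(24 * 10)
trend = 0.5 * hours
daily = 10.0 * np.sin(2 * np.pi * hours / 24)
series = pd.Series(trend + daily)

# Seasonal difference: subtract the value from 24 hours earlier.
# The daily cycle cancels and the trend leaves a constant 0.5 * 24 = 12.
seasonal_diff = series.diff(24)

# Adding a first difference on top removes the remaining constant level.
both = seasonal_diff.diff(1)

print(seasonal_diff.dropna().round(6).unique())
print(both.dropna().abs().max())
```

This corresponds to the d and D terms of a seasonal model: one ordinary difference and one seasonal difference at period 24.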
The SARIMA model requires us to choose several parameters: p is the order (number of time lags) of the autoregressive model, d is the degree of differencing (the number of times the data have had past values subtracted), and q is the order of the moving-average model. The same three parameters also need to be chosen for the seasonal component of the data. These can be arrived at by analyzing autocorrelation and partial autocorrelation plots.
Additionally, we need to give the model the seasonal period. In this case I tackled the daily (24-hour) seasonality. Here is what I fed into the model:
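import statsmodels.api as sm

mod = sm.tsa.statespace.SARIMAX(ts_train_data['consumption_rate'],
                                order=(1, 1, 0),               # non-seasonal (p, d, q)
                                seasonal_order=(1, 1, 0, 24))  # seasonal (P, D, Q, s), s = 24 hours
results = mod.fit()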
After fitting the model, I plotted the predictions for the same data range, and here are the results.
The analysis continues...more updates shortly