
Introduction

This notebook presents an example of how to use the rhced module to disaggregate heating and cooling electricity consumption from net electricity consumption. Using sample data obtained from a smart thermostat and a power meter, we walk through a realistic example of how to apply this model.

Requirements

Several packages need to be installed. Please see this page for the required packages. In addition, change the working directory and add the module path so that the modules can be imported.

Change the working directory and insert the module path.
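A minimal sketch of this step is shown below; the repository location (~/rhced) follows the paths used later in this notebook and may need to be adjusted for your setup.

```python
import os
import sys

# Assumed location of the cloned rhced repository; adjust to your setup.
repo_dir = os.path.expanduser('~/rhced')

os.chdir(repo_dir)            # change the working directory
sys.path.insert(0, repo_dir)  # make the rhced modules importable
```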

Loading modules. Warnings related to theano can be neglected. If errors occur while importing the modules, please create a separate environment using conda or virtualenv and reinstall pymc3 (> 3.11).
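A sketch of the imports is given below, assuming the model is exposed through a unit_prediction function; the exact import path is an assumption and should be adjusted to the actual package layout of rhced.

```python
import warnings

import numpy as np
import pandas as pd
import pymc3 as pm

# Exact import path is an assumption; adjust to the rhced package layout.
from rhced import unit_prediction

# theano-related warnings can be neglected.
warnings.filterwarnings('ignore')
```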

Input_data

Input data consists of three parts: (1) meta data, (2) thermostat data, and (3) meter data.

(1) Meta data

Each housing unit is classified by a two-level hierarchy: bldg and unitcode. bldg is the unique identifier of the building or site, and unitcode is the unique identifier of the housing unit within a certain bldg. You need to specify start_date and end_date to access the specific data. Together, bldg, unitcode, start_date, and end_date form the unique identifier of (2) thermostat data and (3) meter data, so please follow the same format given below.

In addition, max_values gives the maximum values of net, heat pump heating, heat pump heating with defrost control, heat pump cooling, and auxiliary heating power, in watts. During the calculation, the results are strictly bounded by these maximum values, so please allow a certain amount of margin; 120-150% of the expected maximum is a good choice.

output_path is used for storing outputs, and time_interval is the model's time interval. We use a 15-minute interval because of defrost control; see Section 2.2 in the paper.
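For concreteness, a hypothetical meta data specification for the sample unit might look like the sketch below; the variable names, the dictionary keys of max_values, and the unit of time_interval are illustrative assumptions rather than the module's required format.

```python
# Sample unit described in this notebook.
bldg = 'sample_bldg'
unitcode = 'sample'
start_date = '2016-01-19'
end_date = '2016-01-25'

# Maximum power values in watts with a 120-150% margin over the expected
# maxima; the key names and numbers are placeholders for illustration only.
max_values = {
    'net': 12000,
    'heatpump_heating': 4000,
    'heatpump_heating_defrost': 6000,
    'heatpump_cooling': 4000,
    'aux_heating': 10000,
}

output_path = 'data/sample_bldg/sample/'  # where outputs are stored
time_interval = 15                        # minutes (15-min interval due to defrost control)
```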

(2) Thermostat data

Once you have specified all the meta data, all you need to do is place the training data in the specified data directory.

Then, the unit_prediction function below reads the data from that directory.
In this example, the meta data are bldg: sample_bldg, unitcode: sample, start_date: 2016-01-19, and end_date: 2016-01-25. In this case, the path of the thermostat data is ~/rhced/data/sample_bldg/sample/raw_data/thermostat_data_sample_2016-01-19_2016-01-25.csv, and the path of the meter data is ~/rhced/data/sample_bldg/sample/raw_data/meter_data_sample_2016-01-19_2016-01-25.csv.

To illustrate the data format, we load thermostat_data and meter_data, as shown below.
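A minimal loading sketch, assuming the working directory is the repository root so the relative paths match the layout described above:

```python
import pandas as pd

raw_dir = 'data/sample_bldg/sample/raw_data'

# Load the sample thermostat and meter data to inspect the expected columns.
thermostat_data = pd.read_csv(
    f'{raw_dir}/thermostat_data_sample_2016-01-19_2016-01-25.csv',
    parse_dates=['timestamp'],
)
meter_data = pd.read_csv(
    f'{raw_dir}/meter_data_sample_2016-01-19_2016-01-25.csv',
    parse_dates=['timestamp'],
)

thermostat_data.head()
```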

thermostat_data has 7 columns.

| Column | Unit | Type | Description |
| --- | --- | --- | --- |
| timestamp | - | str | Timestamp of the data point. The format should be %Y-%m-%d %H:%M:%S (e.g., 2016-01-02 13:15:00). |
| unitcode | - | str | Housing unit identifier. Any character is fine. |
| operation | - | str | Operation of the HC system. See Section 3.3 in the paper. |
| T_in | °C | numeric | Indoor air temperature. |
| rh_in | - | numeric | Indoor air relative humidity on a 0-1 scale. |
| T_out | °C | numeric | Outdoor air temperature. |
| rh_out | - | numeric | Outdoor air relative humidity on a 0-1 scale. |

meter_data requires 3 columns (timestamp, unitcode, and net). For validation purposes, you can also include the HC columns listed below.

| Column | Unit | Type | Description |
| --- | --- | --- | --- |
| timestamp | - | str | Timestamp of the data point. The format should be %Y-%m-%d %H:%M:%S (e.g., 2016-01-02 13:15:00). |
| unitcode | - | str | Housing unit identifier. Any character is fine. |
| net | W | numeric | Average net electric power during the sampling interval, in watts. |
| ahu | W | numeric | Average air handler electric power during the sampling interval, in watts (only needed for validation). |
| heatpump | W | numeric | Average heat pump electric power during the sampling interval, in watts (only needed for validation). |
| hvac | W | numeric | Average HC system electric power (i.e., ahu + heatpump) during the sampling interval, in watts (only needed for validation). |

Training_prediction

The disaggregation model uses unsupervised learning, so it is not strictly necessary to distinguish training from prediction. However, once the model is trained, it does not need to be retrained or updated for every batch of new data; we often just want the result. Therefore, training refers to the process of updating the model with new data, and prediction refers to the process of disaggregation using the data and the trained model parameters.

Both training and prediction are done with the unit_prediction function. The function takes all the meta data information (unitcode, bldg, start_date, end_date, max_values, output_path, and time_interval).

Other function arguments control the model update.

The remaining arguments control model inference. n_samples is the number of samples used to approximate the posterior distribution. n_training is the number of ADVI learning steps; it should be large enough to guarantee convergence of ADVI's stochastic optimization. n_inference is the number of repeated ADVI runs. Because ADVI uses stochastic optimization, the result may be a local optimum even if convergence is observed with a large n_training. To mitigate this, you can repeat the inference with different initial values; this argument controls how many inference runs to perform.
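Putting the pieces together, a call might look like the sketch below; the keyword-argument names and the (outputs, df) return pair follow the descriptions in this notebook, but the exact signature of unit_prediction should be checked against the module.

```python
# Sketch of a combined training + prediction run for the sample unit.
outputs, df = unit_prediction(
    bldg=bldg,
    unitcode=unitcode,
    start_date=start_date,
    end_date=end_date,
    max_values=max_values,
    output_path=output_path,
    time_interval=time_interval,
    n_samples=1000,     # samples used to approximate the posterior
    n_training=30000,   # ADVI steps; large enough for convergence
    n_inference=3,      # repeated ADVI runs with different initial values
)
```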

Results

The results of the unit_prediction function are given in two formats. outputs is a dictionary containing the full posterior distributions of the parameters and variables. df is a summary of outputs as a pandas DataFrame; it gives all the HC signals and the 2.5th (lower), 50th (mid), and 97.5th (upper) percentiles of the posterior distribution of each operation's power.

Since P_{HC_operation} is given as power (in watts), the median predicted HC energy (kWh) for the data period can be calculated as np.nanmean(df['P_hc_mid'])/1000*24*n_days. The mean of P_hc_mid is divided by 1000 to convert to kW, and then multiplied by the number of hours in the data period to obtain kWh.
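A short sketch of this calculation, assuming the sample period of 2016-01-19 to 2016-01-25 (7 days):

```python
import numpy as np

n_days = 7  # number of days in the data period; adjust to your data
hc_kwh_mid = np.nanmean(df['P_hc_mid']) / 1000 * 24 * n_days
print(f'Median predicted HC consumption: {hc_kwh_mid:.1f} kWh')
```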

In addition, you can visualize the timeseries prediction results, for example as sketched below.
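A minimal plotting sketch; the timestamp index and the percentile column names (P_hc_lower, P_hc_upper) are assumptions based on the summary described above, so check the actual columns of df.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(12, 4))
# Assumes df is indexed by timestamp; otherwise use the timestamp column.
ax.plot(df.index, df['P_hc_mid'], label='P_hc (median)')
ax.fill_between(df.index, df['P_hc_lower'], df['P_hc_upper'],
                alpha=0.3, label='2.5-97.5 percentile band')
ax.set_xlabel('Time')
ax.set_ylabel('Power [W]')
ax.legend()
plt.show()
```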

Update

As shown in Figure 11 in the paper, the model automatically updates its parameters when a new HC operation is observed. In the code below, the model predicts for several weeks of data and updates the parameters whenever necessary. The results are not explicitly visualized in this notebook; the purpose of the code below is to show how to do it.
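A sketch of such a loop over consecutive weekly periods is given below; the date ranges are illustrative, and the call mirrors the hedged unit_prediction sketch above.

```python
# Hypothetical consecutive weekly periods following the sample data.
weeks = [
    ('2016-01-19', '2016-01-25'),
    ('2016-01-26', '2016-02-01'),
    ('2016-02-02', '2016-02-08'),
]

for week_start, week_end in weeks:
    # Each call predicts for the week and updates the model parameters
    # whenever a new HC operation is observed.
    outputs, df = unit_prediction(
        bldg=bldg,
        unitcode=unitcode,
        start_date=week_start,
        end_date=week_end,
        max_values=max_values,
        output_path=output_path,
        time_interval=time_interval,
        n_samples=1000,
        n_training=30000,
        n_inference=3,
    )
```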