Applied Sciences, Vol. 13, Pages 314: Anomaly Detection of Consumption in Hotel Units: A Case Study Comparing Isolation Forest and Variational Autoencoder Algorithms

Conceptualization, T.M., P.J.S.C., J.M. and J.R.; methodology, T.M., P.J.S.C., J.M. and J.R.; software, T.M.; validation, J.R.; formal analysis, T.M., P.J.S.C. and J.M.; investigation, T.M., P.J.S.C. and J.M.; resources, J.R.; data curation, T.M. and J.R.; writing—original draft preparation, T.M., P.J.S.C. and J.M.; writing—review and editing, T.M., P.J.S.C. and J.M.; visualization, T.M., P.J.S.C., J.M. and J.R.; supervision, P.J.S.C., J.M. and J.R.; project administration, J.M. and J.R.; funding acquisition, J.M. and J.R. All authors have read and agreed to the published version of the manuscript.

Figure 1. Flow diagram of the adopted methodology.

Figure 2. An example of how an isolation tree is built to isolate points in a two-dimensional sample space.

Figure 3. Sketch of anomaly detection with IF.
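
As a rough illustration of the isolation idea behind Figures 2 and 3, the sketch below counts how many random axis-aligned splits are needed to isolate a point. It is a simplified, pure-Python stand-in and not the implementation used in the study: there is no subsampling and no normalized anomaly score. The names `isolation_depth` and `mean_isolation_depth`, and the synthetic data, are assumptions made for this example.

```python
import random


def isolation_depth(points, target, rng, depth=0, max_depth=50):
    """Count the random splits needed to isolate `target` within `points`.

    `target` must be one of `points`. Simplification: if the randomly chosen
    dimension has no spread, the recursion stops instead of trying another one.
    """
    if len(points) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(target))          # pick a random feature
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)               # pick a random split value
    # Keep only the side of the split that contains the target point.
    if target[dim] < split:
        side = [p for p in points if p[dim] < split]
    else:
        side = [p for p in points if p[dim] >= split]
    return isolation_depth(side, target, rng, depth + 1, max_depth)


def mean_isolation_depth(points, target, n_trees=200, seed=0):
    """Average the isolation depth over several random trees."""
    rng = random.Random(seed)
    total = sum(isolation_depth(points, target, rng) for _ in range(n_trees))
    return total / n_trees


# Hypothetical 2-D sample: a dense cluster plus one distant point.
data_rng = random.Random(42)
cluster = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]
inlier = (0.0, 0.0)
outlier = (8.0, 8.0)
points = cluster + [inlier, outlier]

inlier_depth = mean_isolation_depth(points, inlier)
outlier_depth = mean_isolation_depth(points, outlier)
print(f"inlier: {inlier_depth:.2f} splits, outlier: {outlier_depth:.2f} splits")
```

Points that need fewer splits on average are easier to isolate, which is the property an isolation forest uses to rank candidates as anomalous.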

Figure 4. Sketch of the AE base architecture.

Figure 5. Architecture of the used VAE. The encoder is represented in orange and the decoder in green, with bi-directional layers of Long Short-Term Memory cells in each direction and the latent space Z modeled according to an isotropic normal distribution (with parameters μz and σz).

Figure 6. Illustrative example of the evolution of the S1 metric as a function of the number of unknown points classified as abnormal (nva), with nup=8112.

Figure 7. An illustrative example of the evolution of the value of S2 as a function of the value of Rf, for an anomaly of a sequence of 24 readings.

Figure 8. Representation of the readings series (sampled hourly), associated with different hotel meters, where occasional consumption peaks are clearly visible.

Figure 9. Detail of technical failure in readings (hourly sampled) received by meter “Electricity 1”.

Figure 10. Yearly analysis with weekly sampled readings for meters “Electricity 1” and “Gas”.

Figure 11. Representations of daily consumption on different days of a week, for the four meters under study.

Figure 12. Histograms of the “Water”, “Electricity 1”, “Electricity 2” and “Gas” meters with the readings from January 2014 to October 2021.

Figure 13. Representation of corrected time series, associated with the different hotel meters.

Figure 14. Pre-processing of the data collected by meter “Electricity 1”: original data (top), after removal of peaks (middle), and after imputation of estimated values for the missing readings (bottom).

Figure 15. Graphical representation of the dataset of one meter used in the computational tests. The green line separates the train and test periods; the anomalies introduced in the test year are in red.

Figure 16. Proposal of architecture for the intelligent system to detect consumption anomalies in a hotel.

Figure 17. Real anomaly scenarios detected by the proposed system. The first scenario (top) is associated with a machinery failure, and the second one (bottom) is related to a water pump that began operating in a new, unnecessary regime.

Table 1. Statistical description of the meters’ readings in the analysis.

| Meters | Water (m³) | Electricity 1 (kWh) | Electricity 2 (kWh) | Gas (m³) |
|---|---|---|---|---|
| mean | 0.83 | 4.73 | 11.61 | 9.95 |
| std | 1.39 | 4.13 | 21.89 | 10.66 |
| min | 0.00 | 0.00 | 0.00 | 0.00 |
| 25% | 0.00 | 2.70 | 0.00 | 4.00 |
| 50% | 0.40 | 4.30 | 0.00 | 9.00 |
| 75% | 1.10 | 6.90 | 18.80 | 14.00 |
| max | 173.10 | 508.60 | 1064.60 | 800.00 |

Table 2. Correlations between readings at 12:00 P.M. (midday) and readings taken at the same time on previous days.

| Number of days before | Water | Electricity 1 | Electricity 2 | Gas |
|---|---|---|---|---|
| 1 | 0.66 | 0.86 | 0.91 | 0.86 |
| 2 | 0.65 | 0.83 | 0.88 | 0.83 |
| 3 | 0.63 | 0.82 | 0.86 | 0.81 |
| 4 | 0.62 | 0.79 | 0.84 | 0.80 |
| 5 | 0.61 | 0.78 | 0.83 | 0.79 |

Table 3. Correlations between readings at 12:00 A.M. (midnight) and readings taken at the same time on previous days.

| Number of days before | Water | Electricity 1 | Electricity 2 | Gas |
|---|---|---|---|---|
| 1 | 0.63 | 0.96 | 0.96 | 0.92 |
| 2 | 0.59 | 0.95 | 0.95 | 0.90 |
| 3 | 0.55 | 0.93 | 0.93 | 0.89 |
| 4 | 0.52 | 0.91 | 0.91 | 0.88 |
| 5 | 0.50 | 0.88 | 0.90 | 0.87 |
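
The same-hour correlations reported in Tables 2 and 3 can be reproduced in spirit with a few lines of code. The sketch below assumes a Pearson correlation on a gap-free hourly series whose first element is hour 0 of the first day. The function names and the synthetic series are hypothetical and are not taken from the study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def same_hour_lag_correlation(readings, hour, days_before):
    """Correlate readings at `hour` with those taken `days_before` days earlier."""
    if days_before < 1:
        raise ValueError("days_before must be at least 1")
    daily = readings[hour::24]                 # one value per day at that hour
    return pearson(daily[days_before:], daily[:-days_before])


# Synthetic hourly series: slow day-to-day drift plus a daily cycle and noise.
rng = random.Random(0)
hourly = []
level = 5.0
for day in range(60):
    level += rng.gauss(0, 0.5)
    for hour in range(24):
        cycle = 2.0 * math.sin(2 * math.pi * hour / 24)
        hourly.append(level + cycle + rng.gauss(0, 0.2))

for k in range(1, 6):
    print(k, round(same_hour_lag_correlation(hourly, 12, k), 2))
```

Passing `hour=12` corresponds to the midday case and `hour=0` to the midnight case.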

Table 4. Different sets of input variables of the algorithms used in the tests.

| Scenario | Features | Temporal Variables | Hotel and Environment Variables |
|---|---|---|---|
| 1 | Reading, Δ1h, Δ2h, Δ3h, Δ24h, Δ48h, Δ72h, min24h, Δmean24h | Hour, day of the week, month | None |
| 2 | Reading, Δ1hb, Δ2hb, Δ3hb, Δ24hb, Δ48hb, Δ72hb, min24h, Δmean24hb | Hour, day of the week, month | None |
| 3 | Reading, Δ1h, Δ2h, Δ3h, Δ24h, Δ48h, Δ72h, min24h, Δmean24h | Hour, day of the week, month | Temperature, occupancy, degree day, daily meals, rooms |
| 4 | Reading, Δ1hb, Δ2hb, Δ3hb, Δ24hb, Δ48hb, Δ72hb, min24h, Δmean24hb | Hour, day of the week, month | Temperature, occupancy, degree day, daily meals, rooms |
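
To make the scenario 1 feature list more concrete, here is a minimal sketch of how such features might be derived from an hourly series. The definitions are assumptions, since the table does not spell them out: Δkh is taken as the current reading minus the reading k hours earlier, min24h as the minimum of the previous 24 readings, and Δmean24h as the current reading minus the mean of those 24 readings. The "b" variants of scenarios 2 and 4 are not covered, and the function and key names are hypothetical.

```python
from statistics import mean

LAGS = (1, 2, 3, 24, 48, 72)  # lags in hours, as listed for scenario 1


def scenario1_features(readings, t):
    """Build the assumed scenario 1 feature set for the reading at index t."""
    if t < max(LAGS):
        raise ValueError("not enough history for the largest lag")
    window = readings[t - 24:t]                # previous 24 hourly readings
    features = {"reading": readings[t]}
    for lag in LAGS:
        # Assumed definition: difference to the reading `lag` hours earlier.
        features[f"delta_{lag}h"] = readings[t] - readings[t - lag]
    features["min_24h"] = min(window)
    features["delta_mean_24h"] = readings[t] - mean(window)
    return features


# Toy series that repeats the hour of the day (0.0 to 23.0) over four days.
hourly = [float(i % 24) for i in range(96)]
example = scenario1_features(hourly, 80)
print(example)
```

The temporal variables (hour, day of the week, month) would be appended from the timestamp of each reading, which this toy series does not carry.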

Table 5. Best results for each meter with the IF algorithm, according to the proposed metric, among all scenarios and considering the mentioned grid-search.

| Meters | Scenario | contamination | max_samples | bootstrap | S1 | S2 Anomaly 1 | S2 Anomaly 2 | S2 Anomaly 3 | S2 Anomaly 4 | Sfinal |
|---|---|---|---|---|---|---|---|---|---|---|
| Water | 4 | 0.008 | 'auto' | True | 0.99 | 0.99 | 0.78 | 0.99 | 0.99 | 0.94 |
| Electricity 1 | 2 | 0.05 | 0.3 | False | 0.98 | 0.99 | 0.00 | 0.97 | 0.96 | 0.73 |
| Electricity 2 | 3 | 0.1 | 'auto' | False | 0.94 | 0.99 | 0.99 | 0.99 | 0.99 | 0.94 |
| Gas | 3 | 0.05 | 0.1 | True | 0.96 | 0.99 | 0.90 | 0.98 | 0.99 | 0.93 |

Table 6. Best results for each meter with the VAE algorithm, according to the proposed metric, among all scenarios and considering the mentioned grid-search.

| Meters | Scenario | Sliding window | S1 | S2 Anomaly 1 | S2 Anomaly 2 | S2 Anomaly 3 | S2 Anomaly 4 | Sfinal |
|---|---|---|---|---|---|---|---|---|
| Water | 1 | 24 | 0.79 | 0.99 | 0.00 | 0.99 | 0.90 | 0.57 |
| Electricity 1 | 1 | 24 | 0.85 | 0.99 | 0.99 | 0.99 | 0.99 | 0.85 |
| Electricity 2 | 2 | 48 | 0.96 | 0.99 | 0.99 | 0.99 | 0.99 | 0.96 |
| Gas | 2 | 72 | 0.99 | 0.99 | 0.99 | 0.85 | 0.99 | 0.95 |
