Predicting water main failures using Bayesian model averaging and survival modelling approach
Introduction
The deterioration of water/wastewater infrastructure has become a major concern for water utilities throughout the world [60]. There have been more than 4 million water main breaks in the United States and Canada since January 2000, an average of 850 breaks every day, leading to annual repair costs of more than 3 billion U.S. dollars (watermainbreakclock.com). Moreover, water main failures affect nearby infrastructure such as pavements, roads, storm water, sewer, and gas pipes, which may lead to catastrophic failures [56]. Therefore, reliable break prediction models are important for water utilities in budgeting and prioritizing the maintenance, rehabilitation and replacement (M/R/R) of water mains [28], [18]. It is very hard to fully understand the processes that may cause failures in buried water mains [32]. Statistical models attempting to predict the behaviour of water pipes are affected not only by the quantity and quality of available data, but also by the adopted statistical techniques [30], [31]. Because multiple factors affect these failures and data are scarce, it is often challenging to develop statistical models for water main failures [60], [49]. Survival analysis is the most widely used statistical approach for modelling water main failures. It is a branch of statistics dealing with deterioration and failure over time and involves modelling the elapsed time between an initiating event and a terminal event [11], [31]. Survival analysis incorporates the fact that while some pipes break, others do not, and this information has a strong impact on pipe failure analysis [49]. The models use covariates (e.g., diameter, length, soil resistivity) to differentiate the pipe failure distributions without splitting the failure data, thereby giving a better understanding of how covariates influence the failure of the pipe [32].
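To illustrate how survival analysis uses pipes that have not yet broken, the following is a minimal sketch of the Kaplan–Meier estimator with right-censored observations. It is not taken from the study; the function name `kaplan_meier` and the five-pipe data set are hypothetical and serve only to show the mechanics.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct failure time.

    durations: time in service for each pipe (e.g. years)
    observed:  True if the pipe failed at that time, False if it was still
               intact when the record ended (right-censored)
    """
    failures = Counter(t for t, failed in zip(durations, observed) if failed)
    exits = Counter(durations)  # pipes leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if failures[t]:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - failures[t] / at_risk
            curve.append((t, survival))
        # Both failed and censored pipes leave the risk set after time t.
        at_risk -= exits[t]
    return curve


# Hypothetical data: five pipes, two of them unbroken at the end of the record.
durations = [10, 20, 20, 30, 40]
observed = [True, True, False, True, False]
km_curve = kaplan_meier(durations, observed)
```

The censored pipes never trigger a drop in the curve, but they stay in the risk set until their observation ends, which is how the estimator credits pipes that did not break.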
Substantial efforts have been made to develop pipe failure prediction models based on survival analysis. A brief summary of these models is presented in Table 1. Table 1 indicates that researchers have applied a range of survival analysis methods, including the Kaplan–Meier estimator [7], homogeneous Poisson process or Poisson regression [3], [5], nonhomogeneous Poisson process (NHPP) [47], [29], zero-inflated NHPP [45], [13], exponential/Weibull model [11], [12], [40], multivariate exponential model [30], [36], Cox proportional hazard model (Cox-PHM) [43], [41], [57], and Weibull proportional hazard model (WPHM) [57], [32]. To develop water main failure models, researchers considered different pipe-specific, site-specific, and environmental factors, and divided or grouped the data according to material [45], [38], [3], [5], number of previous breaks [11], [12], [42], [32], diameter [29], [40], installation year [36], and type of failure [17].
Because information is often incomplete or partial, data must be integrated from different sources, and human (expert) judgment is involved in interpreting data and observations, uncertainties become an integral part of water main failure prediction models [28], [18]. Moreover, the decision-making problem becomes more complex and uncertain when multiple experts are involved who have different levels of credibility and knowledge related to the problem [52]. Data quality also becomes a serious issue, as many data sets contain uncertainties, e.g. due to unreliable recording of failure times, inaccurate measurements of the confounding factors, or even the lack of actual failure times [14]. Therefore, some researchers have presented Bayesian inference for water main failure models, treating the model parameters as random variables and incorporating external information (e.g. elicited expert opinions, relevant historical information) into the model by constructing a probability distribution that describes the uncertainty in the model parameters prior to observing data from the experiment [11], [12], [14], [55].
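As a small illustration of how a prior distribution on a model parameter is combined with failure data, the sketch below uses the conjugate Gamma prior for the rate of an exponential time-to-failure model with right-censored pipes. This is a textbook simplification chosen for clarity, not the formulation used in the cited studies; the prior values and the data are hypothetical.

```python
def update_gamma_prior(shape, rate, durations, observed):
    """Posterior Gamma(shape, rate) for an exponential failure rate.

    With a Gamma(shape, rate) prior on the failure rate, the posterior is
    Gamma(shape + number of failures, rate + total time in service).
    Censored pipes add exposure time but no failure.
    """
    n_failures = sum(1 for failed in observed if failed)
    exposure = sum(durations)
    return shape + n_failures, rate + exposure


# Hypothetical expert prior: mean rate of 2 / 100 = 0.02 failures per year.
prior_shape, prior_rate = 2.0, 100.0

# Hypothetical record: five pipes, three observed failures, two censored.
durations = [10, 20, 20, 30, 40]
observed = [True, True, False, True, False]

post_shape, post_rate = update_gamma_prior(prior_shape, prior_rate, durations, observed)
posterior_mean_rate = post_shape / post_rate
```

The posterior mean sits between the prior mean and the rate suggested by the data alone, and moves towards the data as more pipe-years of record accumulate.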
Most of these Bayesian analyses considered the nonhomogeneous Poisson process [14], [55] or exponential/Weibull models [11], [12]. However, all these studies consider only pipe age, without other influential physical factors (e.g., length, diameter, and manufacturing period) or environmental factors (e.g., soil condition, temperature). Moreover, very few water main failure prediction studies mention any preliminary covariate or model selection method that accounts for these uncertainties.
The objective of this study is thus to develop an effective Bayesian analysis framework for failure rate prediction of water mains that formally takes uncertainties into consideration. Bayesian model averaging (BMA) is used to select influential covariates while accounting for model uncertainty, whereas the Bayesian Weibull proportional hazard model (BWPHM) is used to develop survival curves for the failure prediction of water mains. The proposed framework is intended to improve the predictive capability of pipe failure models and to help utility authorities address water main failures proactively. The utility of the proposed framework is illustrated with pipe failure data from the City of Calgary.
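The idea of covariate selection under model uncertainty can be sketched as follows. The text does not state how the BMA posterior model probabilities are computed, so this example assumes the common BIC approximation with equal prior model probabilities; the candidate models and their BIC values are hypothetical.

```python
import math


def bma_weights(bic_by_model):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every candidate model and uses
    weight proportional to exp(-0.5 * BIC).
    """
    best = min(bic_by_model.values())
    # Subtract the smallest BIC before exponentiating for numerical stability.
    raw = {m: math.exp(-0.5 * (bic - best)) for m, bic in bic_by_model.items()}
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}


def inclusion_probabilities(weights):
    """Sum the model weights over all models that contain each covariate."""
    probs = {}
    for covariates, weight in weights.items():
        for name in covariates:
            probs[name] = probs.get(name, 0.0) + weight
    return probs


# Hypothetical candidate models (tuples of covariate names) and BIC values.
bic_by_model = {
    ("length",): 210.0,
    ("length", "diameter"): 204.0,
    ("length", "diameter", "soil_resistivity"): 206.0,
}

weights = bma_weights(bic_by_model)
inclusion = inclusion_probabilities(weights)
```

A covariate with a high posterior inclusion probability appears in the models that the data support most strongly, so it is a natural candidate to carry forward into the survival model.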
Proposed methodology
The Cox proportional hazard model (Cox-PHM) is one of the most widely used semi-parametric survival analysis models for water main failure. The Cox-PHM was developed by [9] to examine the effects of different covariates on the time-to-failure of a system and is of the form:
h(t|X) = h0(t) exp(θ1x1 + θ2x2 + … + θpxp)
where t is the elapsed time from the last failure, h(t|X) is the hazard function, X=[x1, x2,…,xp] is the covariate vector, θ=[θ1, θ2,…,θp] is the vector of covariate coefficients, and h0(t) is the baseline
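
A proportional hazard model scales a baseline hazard h0(t) by an exponential function of the covariates, h(t|X) = h0(t) exp(θ·X). The sketch below evaluates that form with a Weibull baseline, as in a Weibull proportional hazard model. The shape/scale parameterization, the coefficient values, and the two example pipes are assumptions made for illustration and are not taken from the study.

```python
import math


def weibull_baseline_hazard(t, shape, scale):
    """Weibull baseline hazard h0(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def ph_hazard(t, x, theta, shape, scale):
    """Hazard h(t|X) = h0(t) * exp(theta . x)."""
    linear = sum(th * xi for th, xi in zip(theta, x))
    return weibull_baseline_hazard(t, shape, scale) * math.exp(linear)


def ph_survival(t, x, theta, shape, scale):
    """Survival S(t|X) = exp(-(t/scale)**shape * exp(theta . x))."""
    linear = sum(th * xi for th, xi in zip(theta, x))
    return math.exp(-((t / scale) ** shape) * math.exp(linear))


# Hypothetical coefficients and covariate vectors for two pipes that differ
# by one unit in the first covariate only.
theta = [0.3, -0.2]
pipe_a = [1.0, 0.5]
pipe_b = [2.0, 0.5]
shape, scale = 1.5, 60.0

hazard_ratio = ph_hazard(10.0, pipe_b, theta, shape, scale) / ph_hazard(
    10.0, pipe_a, theta, shape, scale
)
```

Because the baseline cancels in the ratio, `hazard_ratio` equals exp(0.3) at every age t, which is the proportionality assumption that gives the model its name.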
Case study: City of Calgary
The proposed methodology is applied to the water distribution network of the City of Calgary. The City of Calgary is located in Alberta, Canada and has a population of 1.1 million people. It is situated at approximately 1,048 m above sea level and has a humid continental climate. The average daytime high temperature ranges from 26 °C in July to −3 °C in mid-January, and the city receives annual precipitation of 412.6 mm, of which 320.6 mm falls as rain, according to Environment Canada [16].
Conclusions
Accurate quantification of uncertainty is necessary for improving our understanding of water mains’ failure processes. For this, the study has sought to develop Bayesian-based water main failure prediction models for water distribution systems considering uncertainty. In this study, BMA is conducted to bring insight on selecting influential and appropriate covariates and BWPHM is applied to develop survival curves for CI and DI pipes using 57 years of historical data collected for the City of
Acknowledgements
The financial support through Natural Sciences and Engineering Research Council of Canada (NSERC) Collaborative Research and Development Grant (Number: CRDPJ 434629-12) is acknowledged. The authors are also indebted to the three anonymous referees whose suggestions improved the original manuscript.
References (61)
- et al. Comparing risk of failure models in water supply networks using ROC curves. Reliab Eng Syst Saf (2010)
- et al. Bayesian Belief Networks for predicting drinking water distribution system pipe breaks. Reliab Eng Syst Saf (2014)
- et al. Evaluating risk of water mains failure using a Bayesian belief network model. Eur J Oper Res (2015)
- et al. Comprehensive review of structural deterioration of water mains: statistical models. Urban Water (2001)
- et al. Using maintenance records to forecast failures in water networks. Urban Water (2000)
- et al. A Bayesian statistical method for quantifying model form uncertainty and two model combination methods. Reliab Eng Syst Saf (2014)
- Bayesian model selection and model averaging. J Math Psychol (2000)
- et al. Statistical models for the analysis of water distribution system pipe break data. Reliab Eng Syst Saf (2009)
- Bayesian computation with R (2009)
- et al. Comparative analysis of two probabilistic pipe breakage models applied to a real water distribution system. Civil Eng Environ Syst (2010)