Limitations of the Exponential Distribution

Many years ago, science held and defended the theory that the earth was flat. Once that theory was overturned, great scientific strides were made, leading to new theories that better describe and model the physical world we live in. Today, the unsupported assumption that most reliability engineering problems can be modeled well by the exponential distribution is still widely held, even though it is not widely defended. In a quest for simplicity and for solutions that are easy to grasp, derive and communicate, many practitioners have embraced simple equations derived from the underlying assumption of an exponential distribution for reliability prediction, accelerated testing, reliability growth, maintainability and system reliability analyses. This practice is perpetuated by some reliability authors and lecturers, some reliability software makers, and most military standards that deal with reliability.

So what is wrong with the widespread use of the exponential distribution for reliability analysis? To answer that question, we need to understand the basic constant failure rate assumption of the exponential distribution and examine whether it is supported in most real-world applications. The exponential distribution models the behavior of units that fail at a constant rate, regardless of their accumulated age. Although this property greatly simplifies analysis, it makes the distribution inappropriate for most “good” reliability analyses because it does not apply to most real-world applications.

Inapplicability of the Constant Failure Rate Assumption

A simple analysis of human mortality data obtained from the Sunday newspaper illustrates the erroneous conclusions we can reach by assuming a constant failure rate when analyzing real-world data. Using the Weibull++ software to analyze the human mortality data with an exponential distribution, we find that if the human mortality rate (failure rate) were constant, a significant percentage of the population (10% based on the data sample used) would be dead by age 10, while another 10% would be alive and well beyond 175 years of age, and a lucky 1% of us would continue to live well past 350 years of age! These calculations clearly disagree with our observation of human mortality in the real world, and Figure 1 demonstrates the discrepancy. The graph displays the human mortality data analyzed with both the exponential and Weibull distributions. It shows that the Weibull distribution models the behavior better, while the exponential distribution overestimates the failure rate early in life and significantly underestimates it in the later stages of life.
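The percentages quoted for the exponential fit can be approximated with a few lines of arithmetic. The Python sketch below is only illustrative: the mean life of 76 years is an assumption, chosen here because it roughly reproduces the quoted figures (the article does not state the fitted mean), and the function names are hypothetical helpers, not part of Weibull++.

```python
import math

# Assumption: mean life in years. Picked only because it roughly reproduces
# the percentages quoted in the text; it is not the article's fitted value.
MEAN_LIFE_YEARS = 76.0


def reliability(age, mean):
    """Fraction of the population still alive at `age` under a constant failure rate 1/mean."""
    return math.exp(-age / mean)


def age_at_fraction_dead(fraction, mean):
    """Age by which `fraction` of the population has died (inverse of the exponential CDF)."""
    return -mean * math.log(1.0 - fraction)


def conditional_survival(age, extra_years, mean):
    """Probability of surviving `extra_years` more, given survival to `age`."""
    return reliability(age + extra_years, mean) / reliability(age, mean)


for fraction in (0.10, 0.90, 0.99):
    years = age_at_fraction_dead(fraction, MEAN_LIFE_YEARS)
    print(f"{fraction:.0%} of the population dead by age {years:.0f}")

# A constant failure rate has no memory of accumulated age: the chance of
# surviving ten more years is the same for a newborn and a 70-year-old.
print(conditional_survival(0, 10, MEAN_LIFE_YEARS))
print(conditional_survival(70, 10, MEAN_LIFE_YEARS))
```

With this assumed mean, the three ages come out at roughly 8, 175 and 350 years, in line with the figures above. The last two printed values are equal, which is exactly the property that makes a constant failure rate implausible for mortality data.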
Similar examples are abundant among manufactured products. If cars exhibited a constant failure rate, then a vehicle’s mileage would not be a factor in the price of a used car because it would not affect the subsequent reliability of the vehicle. In other words, if a product can wear out over time, then that product does not have a constant failure rate. Unfortunately, most items in this world do wear out, even electronic components and non-physical products such as computer software. Electronic components have been shown to degrade over time, and computer software also exhibits wear-out mechanisms. For example, a freshly rebooted PC is less likely to crash than a PC that has been running for a while, indicating an increasing software failure rate during each run.

Persistence in Reliability Analysis of the Exponential Assumption

What many people fail to understand is that the sole use of the MTBF (mean time between failures) reliability metric almost always implies that the exponential distribution was used to analyze the data. Under an exponential distribution assumption, the mean completely characterizes the distribution and is a sufficient metric. However, if the data are modeled by another distribution, then the mean is not sufficient to describe the data and is, in many cases, a poor reliability metric. In addition, the term itself has led to many erroneous assumptions and much confusion about its relationship to other terms, such as MTTF (mean time to failure, i.e., the mean of the data based on the assumed distribution). The reason for the confusion is that the two terms are equal if you assume a constant failure rate. In other words, if a product fails at a constant rate of one failure per hour, the MTTF is one hour and so is the MTBF. However, if the failure rate is not constant, then each metric has a different meaning and a different result. Thus, in the majority of cases, practitioners are really looking for and solving for the MTTF, regardless of what they choose to call it.

In the analysis of repairable systems, one might argue that the MTBF is a valuable metric because we record the times between failures for the system (the random variable) and compute the mean of these times. In reality, however, is this not the same as computing the distribution mean (i.e., the MTTF) using times between failures as our random variable instead of times-to-failure? [Note: Reliability Edge Volume 1, Issue 1 presents a more complete discussion of issues with the MTBF metric, on the Web at http://www.ReliaSoft.com/newsletter.]

Another reason for the extensive use of the exponential distribution is a reliance by some practitioners on antiquated techniques of reliability prediction, which are not based on actual life data for the products. Instead, they rely on compiled tables of generic failure rates (exponential failure rates) and simplistic multiplication factors (e.g., MIL-HDBK-217). These analyses provide little, if any, information and insight as to the true reliability of the products in the field. More often than not, this misuse averages out the true, variable failure rate; when the true failure rate is increasing, the constant rate overestimates it early in life and underestimates it later. This may result in reliability estimates that are too low in the early stages of life and too high in later stages, as demonstrated in the human mortality graph in Figure 1.

The exponential distribution is also widely, although inappropriately, used in the development of preventive maintenance strategies. In many cases, the MTBF is used to determine a preventive maintenance interval for a component. However, the use of the MTBF metric implies that the data were analyzed with an exponential distribution, since the mean fully describes the distribution only when the exponential distribution is used for analysis. The use of the exponential distribution, in turn, implies that the component has a constant failure rate. This raises the question of why anyone would preventively replace a component that has a constant failure rate and does not experience wear-out over time! Under a constant failure rate assumption, a replacement part is no more reliable than the part it replaces, so preventive maintenance actions do not improve the reliability of the component but rather waste time and parts, as illustrated in the exponential and preventive maintenance example. [Note: More accurate methods for determining the optimum replacement interval for components with non-constant failure rates are presented in Reliability Edge Volume 1, Issue 1.]

Exponential Distribution’s Contribution to Reliability