IMPORTANT TERMS AND CONCEPTS
Box plot
Frequency distribution and histogram
Median, quartiles and percentiles
Normal probability plot
Population mean
Population standard deviation
Population variance
Random sample
Sample mean
Sample standard deviation
Sample variance
Stem-and-leaf diagram
Time series plots

CD MATERIAL
Exponential probability plot
Goodness of fit
Weibull probability plot
Probability plots are extremely useful and are often the first technique used in an effort to determine which probability distribution is likely to provide a reasonable model for the data. We give a simple illustration of how a normal probability plot can be useful in distinguishing between normal and nonnormal data. Table S6-1 contains 50 observations generated at random from an exponential distribution with mean 20 (or $\lambda = 0.05$). These data were generated using the random number generation capability in Minitab. Figure S6-1 presents a normal probability plot of these data, constructed using Minitab. The observations do not even approximately lie along a straight line, giving a clear indication that the data do not follow a normal distribution. The strong curvature at both ends of the plot suggests that the data come from a distribution with right or positive skew. Compare Fig. S6-1 with Fig. 6-19c.

Minitab also provides estimates of the mean and standard deviation of the distribution using the method of maximum likelihood (abbreviated ML on the graph in Figure S6-1). We will discuss maximum likelihood estimation in Chapter 7. For the normal distribution, these are the familiar sample mean and sample standard deviation that we first presented in Chapter 1.

Minitab also presents a quantitative measure of how well the data are described by a normal distribution. This goodness-of-fit measure is called the Anderson-Darling statistic (abbreviated AD on the Minitab probability plot). The Anderson-Darling statistic is based on the probability integral transformation
that can be used to convert the data to a uniform distribution if the hypothesized distribution is correct. Thus, if $x_1, x_2, \ldots, x_n$ are independent and identically distributed random variables whose cumulative distribution function is $F(x)$, then $F(x_1), F(x_2), \ldots, F(x_n)$ are independent uniform (0, 1) random variables. The Anderson-Darling statistic essentially compares how close the $F(x_1), F(x_2), \ldots, F(x_n)$ values are to values from a uniform (0, 1) distribution. For
this reason, the Anderson-Darling test is sometimes called a distance test. The test is upper-tailed; that is, if the computed value exceeds a critical value, the hypothesis of normality is rejected. The 5% critical value of the Anderson-Darling statistic is 0.752 and the 1% value is 1.035. Because the Anderson-Darling statistic in Figure S6-1 is 1.904, and this exceeds the 1% critical value, we conclude that the assumption of normality would be inappropriate.

[Normal probability plot. Horizontal axis: Data. ML estimates: Mean 20.7362, St. Dev. 19.2616. Goodness of fit: AD* 1.904.]
Figure S6-1. Normal probability plot (from Minitab) of the data from Table S6-1.

Minitab can construct several other types of probability plots. An exponential probability plot of the data in Table S6-1 is shown in Figure S6-2. Notice that the data lie very close to the straight line in this plot, implying that the exponential is a good model for the data. Minitab also provides an estimate of the mean of the exponential distribution. This estimate is just the sample mean.

[Exponential probability plot. Vertical axis: Percentage. Horizontal axis: Data. ML estimates: Mean 20.7362. Goodness of fit: AD* 0.692.]
Figure S6-2. Exponential probability plot (from Minitab) of the data from Table S6-1.

Figure S6-3 is a Weibull probability plot of the data from Table S6-1, constructed using Minitab. The data lie approximately along a straight line, suggesting that the Weibull distribution is also a reasonable model for the data. Notice that Minitab provides maximum
[Weibull probability plot. ML estimates: Shape 1.01967, Scale 20.8955. Goodness of fit: AD* 0.679.]
Figure S6-3. Weibull probability plot (from Minitab) of the data from Table S6-1.
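
The Anderson-Darling comparison described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reproduction of Minitab's calculation. It applies the probability integral transformation using a normal distribution fitted with the sample mean and the sample standard deviation (divisor n - 1), and computes the usual Anderson-Darling statistic $A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln u_{(i)} + \ln\left(1-u_{(n+1-i)}\right)\right]$. It then multiplies by the small-sample factor $1 + 0.75/n + 2.25/n^2$, which is assumed here to be the adjustment that goes with the critical values 0.752 and 1.035. The function name `anderson_darling_normal` is hypothetical. The data are freshly simulated rather than taken from Table S6-1, so the value will not match the AD* of 1.904 shown on the Minitab plot.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Return (A2, A2_star) for a normality check with estimated mean and sd.

    Assumption: the fitted normal uses the sample mean and the sample
    standard deviation (divisor n - 1); Minitab's AD* may be computed
    differently.
    """
    n = len(data)
    xs = sorted(data)
    fitted = NormalDist(mean(xs), stdev(xs))
    # Probability integral transformation: u_(i) = F(x_(i)) under the fit.
    u = [fitted.cdf(x) for x in xs]
    # Sum of (2i - 1) * [ln u_(i) + ln(1 - u_(n+1-i))] for i = 1..n.
    total = sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    # Small-sample adjustment for estimated parameters (assumed here).
    a2_star = a2 * (1.0 + 0.75 / n + 2.25 / n**2)
    return a2, a2_star


# Simulate 50 exponential observations with mean 20 (rate 0.05).
rng = random.Random(1)  # arbitrary seed, only for repeatability
sample = [rng.expovariate(1 / 20) for _ in range(50)]

a2, a2_star = anderson_darling_normal(sample)
print(f"A^2 = {a2:.3f}, adjusted A^2 = {a2_star:.3f}")
print("exceeds 5% critical value 0.752:", a2_star > 0.752)
print("exceeds 1% critical value 1.035:", a2_star > 1.035)
```

Because the test is upper-tailed, only large values of the statistic count against normality. Calling the function on data that really are close to normal should give a value well below 0.752.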
