CONSTRUCTING AND CHECKING THE MODEL

Data plots for the fire data after deleting areas 7 and 24 and transforming Y to log Y. [Plots not reproduced; recoverable axis labels: Age, Theft, Income.]

$\beta_1 = 0$, $\beta_2 = 0$, and $\beta_3 = 0$. The test of the first hypothesis leads us to drop explanatory variable $X_1$, the proportion of older housing units, from the model. Consider the model

EXAMPLES

Figure 2.5 Standardized least-absolute-deviations residual plots for the fire data after deleting areas 7 and 24 and transforming Y to log Y. [Plots not reproduced; recoverable horizontal-axis labels: Fitted values, Theft, Income.]


The LAD estimate of the regression equation is $\hat{Y} = 4.252 + 0.01156X_2 - 0.2330X_3$ and $\hat{\sigma} = 0.5232$. Testing the two hypotheses $\beta_2 = 0$ and $\beta_3 = 0$ in this model, we find that we could try removing $X_2$ from the model.

Now consider the model

$Y = \beta_0 + \beta_3 X_3 + e$    (2.6)

The LAD estimate of the regression equation is $\hat{Y} = 4.988 - 0.2718X_3$ and $\hat{\sigma} = 0.5325$. The standardized residual plots look similar to those versus $\hat{Y}$ and $X_3$ in Figure 2.5. There is no evidence of nonlinearity, nonconstant variance, or outliers. Model (2.6) appears to be a satisfactory model for the fire data.
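To make the idea of an LAD fit with a single explanatory variable concrete, here is a minimal Python sketch. It is not the algorithm used in the text, and the data are hypothetical, not the fire data. It relies on the fact that some line minimizing the sum of absolute deviations passes through two of the data points, so for small data sets it is enough to try every pair. The function name `lad_line` is an assumption made for this illustration.

```python
from itertools import combinations


def lad_line(x, y):
    """Brute-force LAD fit of y = b0 + b1 * x.

    Tries the line through every pair of points with distinct x values
    and keeps the one with the smallest sum of absolute deviations.
    Returns (sum_abs_dev, b0, b1).
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a regression line
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        sad = sum(abs(yk - (b0 + b1 * xk)) for xk, yk in zip(x, y))
        if best is None or sad < best[0]:
            best = (sad, b0, b1)
    return best


# Hypothetical data: four points on y = 1 + 2x and one gross outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 100.0]
sad, b0, b1 = lad_line(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, sum |residual| = {sad:.3f}")
```

In this toy data set the fitted line goes through the four collinear points and leaves the outlier with a large residual, which is the kind of resistance to outliers that motivates LAD. The pairwise search costs on the order of n^3 operations, so it is only practical for small n.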

NOTES

2.1a. The process of model construction is fundamental to the scientific method. See Box, Hunter, and Hunter (1978, Chapter 1).

2.1b. In this chapter we have described the process of model construction as a sequential process in which a sequence of models is considered, one at a time, but other approaches are possible. In the examples in Section 2.4 we have followed a particular sequential procedure called backward elimination. One begins with model (2.1), containing all available explanatory variables, and then sequentially eliminates those variables that are unnecessary. Hypothesis testing is used to determine which variables are unnecessary. Other sequential strategies for model construction are forward selection, mentioned in the last sentence of the subsection on proposing a model, and the stepwise method, in which explanatory variables are added at some steps and deleted at other steps. The all-possible-regressions procedure is a nonsequential approach. To use this procedure one must choose a measure of the goodness of a model, such as Mallows' $C_p$ criterion (see Weisberg, 1985, p. 216, or Myers, 1990, p. 182). The criterion is calculated for all the linear regression models obtained by using all possible subsets of the $p$ available explanatory variables. Among these $2^p$ models, one could choose the model with the best value of the criterion. Or one might choose several models with good values of the criterion and investigate them in more detail. Of course, the feasibility of looking at all possible regressions depends on the size of $p$. See the topic of variable selection in the books suggested for additional reading at the end of the Notes in Chapter 3.

2.2a. Methods for checking the assumptions of a model are often called diagnostics. Most of these methods involve analysis of residuals.
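The all-possible-regressions procedure of Note 2.1b can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the procedure of any particular package. It uses least-squares fits, one common form of Mallows' criterion, $C_p = \mathrm{RSS}_k / s^2 - n + 2k$ (with $k$ the number of coefficients including the intercept and $s^2$ the residual mean square of the full model), and hypothetical toy data. The function names `solve`, `rss`, and `all_subsets_cp` are invented for this example.

```python
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(xs, y, subset):
    """Residual sum of squares for y on an intercept plus the chosen columns."""
    rows = [[1.0] + [row[j] for j in subset] for row in xs]
    k = len(subset) + 1
    xtx = [[sum(r[u] * r[v] for r in rows) for v in range(k)] for u in range(k)]
    xty = [sum(r[u] * yi for r, yi in zip(rows, y)) for u in range(k)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    fitted = [sum(bj * v for bj, v in zip(beta, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def all_subsets_cp(xs, y):
    """Return {subset: C_p} for all 2^p subsets of the p explanatory variables."""
    n, p = len(y), len(xs[0])
    s2 = rss(xs, y, tuple(range(p))) / (n - p - 1)  # full-model mean square
    result = {}
    for size in range(p + 1):
        for subset in combinations(range(p), size):
            k = size + 1
            result[subset] = rss(xs, y, subset) / s2 - n + 2 * k
    return result


# Hypothetical data: y depends mainly on the first column (index 0).
xs = [[1, 2, 5], [2, 1, 3], [3, 4, 8], [4, 3, 1],
      [5, 6, 7], [6, 5, 2], [7, 8, 6], [8, 7, 4]]
y = [5.3, 8.1, 10.6, 14.2, 16.9, 19.7, 23.4, 25.8]
cp = all_subsets_cp(xs, y)
best = min(cp, key=cp.get)
for subset in sorted(cp, key=cp.get):
    print(subset, round(cp[subset], 2))
```

With $p = 3$ there are only $2^3 = 8$ models to fit, but the count doubles with each added variable, which is the feasibility concern raised in the note. By construction the full model always has $C_p$ equal to its number of coefficients, so the criterion is informative only for comparing the smaller models.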


2.2b. See Note 2.4a for more about standardization of residuals.

2.3a. A transformation of the data can be used to improve their agreement with the linear regression model. A suitable transformation of $Y$ can sometimes be estimated from the data. The Box-Cox family of power transformations can be used for this purpose. One supposes that $Y^\lambda$ follows a linear regression model for some power $\lambda$, and then uses the data to estimate the value of $\lambda$. This family of transformations includes $1/Y$, $\sqrt{Y}$, $Y$, $Y^2$, $Y^3$, and $\log Y$. (For $\lambda = 0$ we let $Y^\lambda = \log Y$, which is sensible because $\log Y$ is the limit of $(Y^\lambda - 1)/\lambda$ as $\lambda$ approaches 0.) See Draper and Smith (1981, Section 5.3), Weisberg (1985, Sections 6.4 and 6.5), or Myers (1990, Section 7.3).

2.3b. A linear transformation of the data has no effect on the data's agreement with the linear regression model. Suppose we transform $Y$ to $Z = cY + d$. The model $Z = \gamma_0 + \gamma_1 X_1 + \cdots + \gamma_p X_p + e$ is equivalent to the model $Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + e$, where $\beta_0 = (\gamma_0 - d)/c$, $\beta_1 = \gamma_1/c$, ..., $\beta_p = \gamma_p/c$. Or suppose we transform $X_1$ to $W = cX_1 + d$. The model $Y = \gamma_0 + \gamma_1 W + \gamma_2 X_2 + \cdots + \gamma_p X_p + e$ is equivalent to the model $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + e$, where $\beta_0 = \gamma_0 + d\gamma_1$, $\beta_1 = c\gamma_1$, $\beta_2 = \gamma_2$, ..., $\beta_p = \gamma_p$. Therefore a linear change in units, such as from miles to kilometers or from degrees Fahrenheit to degrees Celsius, has no essential effect on a linear regression model. Results from a regression analysis using the changed model can easily be translated back in terms of the original model. A nonlinear change in units, however, such as from miles per gallon to gallons per mile or from centimeters to cubic centimeters, produces an essentially different linear regression model. Results from a regression analysis using the changed model do not necessarily imply anything about the original model.

2.3c. A discussion of the frequency of occurrence of outliers (or gross errors) in real data is given in Section 1.2c of Hampel et al. (1986).

2.4a. A residual is standardized by dividing by an estimate of its standard deviation. In Examples 1 and 2 we have standardized $\hat{e}_i$ by dividing it by $\hat{\sigma}$. This is sensible because the residuals are estimates of the random errors, and so $SD(\hat{e}_i) \approx SD(e_i) = \sigma$. But a more accurate estimate of $SD(\hat{e}_i)$ can be obtained. For least-squares residuals, $SD(\hat{e}_i) = \sigma\sqrt{1 - h_{ii}}$, where $h_{ii}$ is the $i$th diagonal entry in the matrix $X(X'X)^{-1}X'$ and $X$ is the $n \times (p + 1)$ matrix of explanatory variables $x_{ij}$ augmented by a column of 1's. Hence a more refined definition of a standardized least-squares residual would be $\hat{e}_i/(\hat{\sigma}\sqrt{1 - h_{ii}})$, rather than $\hat{e}_i/\hat{\sigma}$. This is sometimes called a studentized residual. In Example 1, there are three turnip plants having studentized