The Bayes Estimate. The Bayes estimate of the parameter vector $\beta$ is the expectation of $\beta$ under the posterior distribution. This expectation is
$$V^* = (V^{-1} + X'X)^{-1}$$
Justification of (7.14). The posterior distribution of $\beta$ conditional on $\sigma$ can be obtained as follows. Since our prior distribution is different, (7.10) is no longer valid for $f(\beta \mid y, \sigma)$, but it is still a valid expression for $f(y \mid \beta, \sigma)$. We multiply this by (7.11) in accordance with formula (7.8). The product can be manipulated, using the technique of "completing the square", to obtain
$$f(\beta \mid y, \sigma) \propto \exp\left[-\frac{1}{2\sigma^2}(\beta - b^*)'(V^*)^{-1}(\beta - b^*)\right]$$
where $b^*$ is the vector displayed in (7.14). This shows that the posterior distribution of $\beta$ conditional on $\sigma$ is multivariate normal with mean vector $b^*$ and variance-covariance matrix $\sigma^2 V^*$. Therefore the posterior expectation of $\beta$ conditional on $\sigma$ is $b^*$. Since it does not depend on $\sigma$, it is also the posterior expectation of $\beta$ unconditionally, and hence is the Bayes estimate.
Formula (7.14) can be interpreted as a weighted average of the mean of the prior distribution for $\beta$ and the least-squares estimate of $\beta$, where the "weights" are matrices. The weight matrices are $V^* V^{-1}$ and $V^* X'X$, respectively. Whereas ordinary weights are nonnegative numbers whose sum is 1, these weight matrices are matrices with nonnegative eigenvalues and their sum is the identity matrix. The Bayes estimate is weighted more heavily toward the least-squares estimate if $X'X$ is "large" or if $V^{-1}$ is "small". This makes sense, because $X'X$ is large when the variance-covariance matrix $\sigma^2 (X'X)^{-1}$ of $\hat{\beta}_{\mathrm{LS}}$ is small, that is, when the least-squares estimate is precise; and $V^{-1}$ is small when $\sigma^2 V$ is large, that is, when our prior information about $\beta$ is imprecise.
One way to express prior ignorance about the regression coefficients would be to let the prior variance-covariance matrix be infinitely large or, in other words, to let $V^{-1}$ be 0. Then $\hat{\beta}_{\mathrm{Bayes}} = \hat{\beta}_{\mathrm{LS}}$, which is the same estimate obtained using the noninformative prior distribution (7.7).
The Pavement Data. For the prior distribution we take the distribution specified by (7.11), (7.12), and (7.13). Using formula (7.14) we calculate the Bayes estimate of the regression equation to be
$$Y = -2.713 - 0.5748 X_1 + 0.4410 X_2 + 0.1706 X_3 + 0.002078 X_4 + 0.1636 X_5$$
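The matrix-weighted-average reading of the Bayes estimate can be checked numerically. The sketch below is illustrative only: it uses synthetic data rather than the pavement data, and the prior mean `b0`, the prior matrix `V`, and the helper name `bayes_estimate` are assumptions introduced here. It also assumes the posterior mean takes the form $b^* = V^*(V^{-1} b_0 + X'y)$, which is algebraically the same as the weighted average described in the text because $X'X \hat{\beta}_{\mathrm{LS}} = X'y$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical; NOT the pavement data from the text).
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # n x (p+1), first column of 1's
beta_true = np.array([1.0, -0.5, 0.4, 0.2])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Assumed prior: beta | sigma ~ N(b0, sigma^2 V). Both b0 and V are illustrative choices.
b0 = np.zeros(p + 1)
V = 4.0 * np.eye(p + 1)
V_inv = np.linalg.inv(V)


def bayes_estimate(X, y, b0, V_inv):
    """Posterior mean b* = V*(V^{-1} b0 + X'y), where V* = (V^{-1} + X'X)^{-1}."""
    XtX = X.T @ X
    return np.linalg.solve(V_inv + XtX, V_inv @ b0 + X.T @ y)


XtX = X.T @ X
V_star = np.linalg.inv(V_inv + XtX)
beta_ls = np.linalg.solve(XtX, X.T @ y)  # least-squares estimate

# Matrix "weights" on the prior mean and on the least-squares estimate.
weight_prior = V_star @ V_inv
weight_ls = V_star @ XtX

beta_weighted = weight_prior @ b0 + weight_ls @ beta_ls
beta_direct = bayes_estimate(X, y, b0, V_inv)

# Letting V^{-1} be 0 (prior ignorance) should recover least squares.
beta_flat_prior = bayes_estimate(X, y, b0, np.zeros_like(V_inv))

print("weights sum to identity:", np.allclose(weight_prior + weight_ls, np.eye(p + 1)))
print("weighted average equals direct form:", np.allclose(beta_weighted, beta_direct))
print("V^{-1} = 0 recovers least squares:", np.allclose(beta_flat_prior, beta_ls))
```

Solving the linear system with `np.linalg.solve` inside the helper avoids forming $V^*$ explicitly; the explicit inverse is computed afterwards only so that the two weight matrices can be inspected.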
TESTING $\beta_{q+1} = \cdots = \beta_p = 0$
Consider the general linear regression model with $p$ explanatory variables. We will perform a Bayesian test of the hypothesis that $\beta_{q+1} = \cdots = \beta_p = 0$.
The Prior Distribution. Suppose we have no prior knowledge about the parameters. This calls for a noninformative prior distribution. However, the noninformative prior distribution that is used in the preceding section for estimating $\beta$ is not suitable to use for testing it. The reasons for this are discussed in Section 7.5. An appropriate prior distribution is easier to describe if we first reparameterize so as to achieve orthogonality between the explanatory variables that are included in the hypothesized submodel and those that are excluded. Let $W$ be the $n \times (q+1)$ matrix of explanatory variables $x_{i1}, \ldots, x_{iq}$ included in the hypothesized submodel, together with a first column consisting of 1's, and let $Z$ be the $n \times (p-q)$ matrix of explanatory variables $x_{i,q+1}, \ldots, x_{ip}$ excluded from the hypothesized submodel. Thus $X = (W, Z)$. Let $\gamma = (\beta_0, \ldots, \beta_q)'$ and $\xi = (\beta_{q+1}, \ldots, \beta_p)'$. Then the model $y = X\beta + e$ can be written as
$$y = W\gamma + Z\xi + e \qquad (7.15)$$
The hypothesis is that $\xi = 0$. The columns of $W$ are not necessarily orthogonal to those of $Z$, but if we define $U = Z - W(W'W)^{-1}W'Z$, then $W$ and $U$ are orthogonal. Model (7.15) is equivalent to the model $y = W\rho + U\xi + e$, where $\rho = \gamma + (W'W)^{-1}W'Z\xi$. The following prior distribution, numbered (7.16a) through (7.16d), is a generalization of distribution (7.5):
Prob($H_0$) =
Prob($H_a$) =
$f(\rho, \sigma \mid H_0) = 1/\sigma$, $f(\rho, \sigma \mid H_a) = 1/\sigma$
$f(\xi \mid \rho, \sigma, H_a) = c \exp\left[-\frac{1}{2\sigma^2}\, \xi' V^{-1} \xi\right]$
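The orthogonalizing reparameterization used to set up this prior can be verified numerically. The following is a minimal sketch on synthetic matrices, not on any data from the text. The variable names `gamma`, `xi`, and `rho` are labels chosen here for the coefficient vector of the included variables, the coefficient vector of the excluded variables, and the reparameterized coefficient vector. The sketch checks that $W'U = 0$ and that the original and reparameterized models produce the same fitted mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic design (hypothetical sizes): q included and p - q excluded variables.
n, q, p = 20, 2, 5
W = np.column_stack([np.ones(n), rng.normal(size=(n, q))])  # n x (q+1), first column of 1's
Z = rng.normal(size=(n, p - q))                              # n x (p-q)

# coef = (W'W)^{-1} W'Z, the coefficients from regressing each column of Z on W.
coef = np.linalg.solve(W.T @ W, W.T @ Z)

# U = Z - W (W'W)^{-1} W'Z: the part of Z that is orthogonal to the columns of W.
U = Z - W @ coef

# Arbitrary coefficient vectors for the original parameterization.
gamma = rng.normal(size=q + 1)
xi = rng.normal(size=p - q)

# Reparameterized coefficients: rho = gamma + (W'W)^{-1} W'Z xi.
rho = gamma + coef @ xi

mean_original = W @ gamma + Z @ xi
mean_reparam = W @ rho + U @ xi

print("W and U orthogonal:", np.allclose(W.T @ U, 0.0, atol=1e-8))
print("same mean vector:", np.allclose(mean_original, mean_reparam))
```

Because the coefficient vector of the excluded variables is unchanged by the reparameterization, the hypothesis that it equals zero means the same thing in both forms of the model.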