PH1820- Statistical Analysis - Support Site
Mondays & Wednesdays 8:00AM-10:00AM - E305



(Last modified: Tuesday May 15, 2001.)


Application of Linear Algebra to Regression Analysis – The Normal Equations

With the introduction to concepts in linear algebra now behind us, we are ready to return to regression analysis. We can now write the regression model as

\[ y = X\beta + e \]

We will take this parameterization one term at a time. The vector y is now the dependent variable vector, i.e., y is an n x 1 vector containing the dependent variable measurements. Thus, we are collapsing n individual scalar measurements y_i into a single n-tuple vector y. The matrix X is the n x p design matrix, whose rows hold the independent variable values for each observation. The beta vector is the p-tuple vector of parameters. In the straight-line model, relating the dependent variable y to the independent variable x, the beta vector is 2 x 1, since it consists of the intercept and the slope. In the "through the origin" model, the beta vector is 1 x 1. The vector e is the n x 1 vector of errors.
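To make the dimensions concrete, the following sketch builds the design matrix for the straight-line and "through the origin" models in plain Python. The helper name `design_matrix` and the sample x values are illustrative assumptions, not part of the course materials.

```python
def design_matrix(x, intercept=True):
    """Return the n x p design matrix X as a list of n rows.

    With an intercept each row is [1, x_i] (p = 2, straight-line model);
    without one each row is [x_i] (p = 1, through-the-origin model).
    """
    if intercept:
        return [[1.0, float(xi)] for xi in x]
    return [[float(xi)] for xi in x]


# Hypothetical independent-variable values for n = 4 observations.
x = [1.0, 2.0, 3.0, 4.0]

X_line = design_matrix(x)                      # 4 x 2, so beta is 2 x 1
X_origin = design_matrix(x, intercept=False)   # 4 x 1, so beta is 1 x 1

print(len(X_line), "x", len(X_line[0]))
print(len(X_origin), "x", len(X_origin[0]))
```

The number of columns of X always matches the length of the beta vector, which is why dropping the intercept shrinks beta from 2 x 1 to 1 x 1.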

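We can now go through the following brief development to arrive at the solution of this equation for the estimate b (this is not a formal proof):

\[
\begin{aligned}
y &= Xb \\
X'y &= X'Xb \\
(X'X)^{-1}X'y &= (X'X)^{-1}X'Xb \\
(X'X)^{-1}X'y &= b
\end{aligned}
\]

and we have the solution to the normal equations

\[ b = (X'X)^{-1}X'y \]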

It is important to understand the motivation of this development. The purpose is to isolate the estimate b by itself. We could not do this by multiplying each side of the equation by X^-1, since X need not be square. However, if the design matrix X is of full (column) rank, then X'X is of full rank and (X'X)^-1 exists. So, assuming a full rank design matrix X, we can solve for the vector of parameter estimates b.
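As a rough illustration of computing b = (X'X)^-1 X'y, the sketch below solves the normal equations for a straight-line model using only plain Python lists. The helper names (`transpose`, `matmul`, `inverse_2x2`, `solve_normal_equations`) and the data are assumptions made for this example, and the inverse is written out only for the 2 x 2 case.

```python
def transpose(A):
    """Return A' for a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Return the matrix product AB."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def inverse_2x2(M):
    """Invert a 2 x 2 matrix; fails when the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        # X'X has no inverse when X is not of full rank.
        raise ValueError("X'X is singular: X is not of full rank")
    return [[d / det, -b / det], [-c / det, a / det]]


def solve_normal_equations(X, y):
    """Return b = (X'X)^-1 X'y for an n x 2 design matrix X."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)                       # 2 x 2
    Xty = matmul(Xt, [[yi] for yi in y])      # 2 x 1
    b = matmul(inverse_2x2(XtX), Xty)         # 2 x 1
    return [row[0] for row in b]


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
X = [[1.0, xi] for xi in x]

b = solve_normal_equations(X, y)
print(b)   # intercept and slope, approximately 1 and 2
```

Note that X itself is 4 x 2 and has no inverse, while X'X is 2 x 2 and can be inverted as long as the columns of X are not linearly dependent.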

Note also that this solution is linear in the y's (i.e., it is not a function of powers of the y's, ln(y), etc.). In fact the solution

\[ b = (X'X)^{-1}X'y \]

is the least squares solution obtained from the normal equations, and, invoking the Gauss-Markov theorem, we know that the vector of parameter estimates b is the best linear unbiased estimate of the beta vector. Using E[y] = X beta (the errors have mean zero), we can find the expected value of b as

\[ E[b] = E[(X'X)^{-1}X'y] = (X'X)^{-1}X'E[y] = (X'X)^{-1}X'X\beta = \beta \]

Thus b is unbiased for the beta parameters.
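The expectation argument rests on the identity (X'X)^-1 X'X = I, which gives b = beta + (X'X)^-1 X'e, where the second term has mean zero whenever E[e] = 0. The sketch below checks this numerically for a straight-line design. The function names, the x values, the "true" beta, and the error vector e are all assumptions chosen for illustration.

```python
def coefficient_matrix(x):
    """Return H = (X'X)^-1 X' (2 x n) for the design with rows [1, x_i]."""
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    det = n * sxx - sx * sx          # determinant of X'X
    if det == 0:
        raise ValueError("X is not of full rank")
    # (X'X)^-1 = (1/det) * [[sxx, -sx], [-sx, n]], applied to column (1, x_i) of X'.
    return [[(sxx - sx * xi) / det for xi in x],
            [(n * xi - sx) / det for xi in x]]


def apply(H, v):
    """Multiply the 2 x n matrix H by the n-vector v."""
    return [sum(h * vi for h, vi in zip(row, v)) for row in H]


x = [0.0, 1.0, 2.0, 3.0]
beta = [1.0, 2.0]                    # hypothetical true intercept and slope
e = [0.5, -1.0, 0.25, 0.25]          # one arbitrary error vector

H = coefficient_matrix(x)
X_beta = [beta[0] + beta[1] * xi for xi in x]
y = [m + ei for m, ei in zip(X_beta, e)]

b = apply(H, y)                      # estimate from the observed y
b_noise_free = apply(H, X_beta)      # H X beta, which should return beta
He = apply(H, e)                     # contribution of the errors to b

print(b_noise_free)
print([bk - hk for bk, hk in zip(b, He)])   # b - He, also approximately beta
```

Because b differs from beta only by the term He, which is linear in the errors, averaging over errors with mean zero leaves beta.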

To find the variance of b, we note that although the variance of a scalar is a scalar quantity, the variance of a p-tuple vector is a p x p matrix containing not just the p variances but all possible covariances as well.

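We will use the result that Var(Aw) = A Var(w) A', a linear algebra result not too distant from the familiar scalar result Var(aw) = a^2 Var(w). Assuming the errors are uncorrelated with common variance sigma^2, so that Var(y) = sigma^2 I, and noting that X'X is symmetric, we begin by writing

\[
\begin{aligned}
\operatorname{Var}(b) &= \operatorname{Var}\left((X'X)^{-1}X'y\right) \\
&= (X'X)^{-1}X'\,\operatorname{Var}(y)\,X(X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}X'X(X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}
\end{aligned}
\]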

We are now ready to apply these results to regression problems in general.
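As one small application, the standard result Var(b) = sigma^2 (X'X)^-1, which holds under the assumption Var(y) = sigma^2 I, can be evaluated directly for a straight-line design. The sketch below is illustrative only: the function name `covariance_of_b`, the x values, and the error variance `sigma2` are assumptions, and in practice sigma^2 would have to be estimated from the data.

```python
def covariance_of_b(x, sigma2):
    """Return sigma2 * (X'X)^-1 for the design with rows [1, x_i].

    For this design X'X = [[n, sum(x)], [sum(x), sum(x^2)]], so the
    2 x 2 inverse can be written out explicitly.
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X is not of full rank")
    return [[sigma2 * sxx / det, -sigma2 * sx / det],
            [-sigma2 * sx / det, sigma2 * n / det]]


x = [0.0, 1.0, 2.0, 3.0]   # hypothetical independent-variable values
sigma2 = 4.0               # assumed (known) error variance

V = covariance_of_b(x, sigma2)
print("Var(intercept)        =", V[0][0])
print("Var(slope)            =", V[1][1])
print("Cov(intercept, slope) =", V[0][1])
```

The diagonal entries are the variances of the intercept and slope estimates, and the off-diagonal entry is their covariance, which is the p x p structure described earlier with p = 2.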

Professor: Lemuel A. Moyé, M.D., Ph.D. Associate Professor of Biometry
Office Address: RAS Building E815
Voice Number (713) 500-9518
Fax Number (713) 500-9530
Office Email Lemuel.A.Moye@uth.tmc.edu

Home Email moyelaptop@email.msn.com
Teaching Assistant: Miriam Morales help4ph1820@yahoo.com
Secretary (713) 500-9562