Statistical Methods for Economists
Lecture 4: Multiple Linear Regression
David Bartl
Statistical Methods for Economists, INM/BASTE

Outline of the lecture
•Introduction: Simple Linear Regression & Least Squares Method
•Multiple Linear Regression: Introduction
•Multiple Linear Regression: Summary & Background
•The Classical Assumptions
•The Coefficient of Determination (R2)
•Further Theorems, Tests of Hypotheses and Confidence Intervals
•Two-sample t-test for the difference of the population means // σX=σY
•Simple linear regression without the intercept term

Introduction
•Simple Linear Regression
•Motivation
•Example
•Least Squares Method
•Generalization
•Multiple Linear Regression: Introduction
•Multiple Linear Regression: Notation

Simple Linear Regression: Motivation
Simple Linear Regression: Example
[Figure: scatter plot of the example data; both axes range from 0 to 10]
Simple Linear Regression: Least Squares Method (the normal equation)
Simple Linear Regression: Generalization
•We shall now study Multiple Linear Regression
Multiple Linear Regression: Introduction
Multiple Linear Regression: Notation

Random vectors
•Random variable
•Random vector
•Mean value
•Variance-covariance matrix
•Uncorrelated random variables
•Independent random variables

Random variable
Random vector
Random vector: Expected value
Random variables: Variance and Covariance
Random vector: Variance-covariance matrix
Random vector: Uncorrelated random variables
Random vector: Independent events
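The least squares method and the normal equation from the slides above can be sketched numerically. This is a minimal illustration with simulated data (the sample values, the seed, and the use of numpy are my own assumptions, not taken from the lecture): the estimates solve the normal equation X'X b = X'y, which says exactly that the residual vector is orthogonal to the columns of X.

```python
import numpy as np

# Hypothetical data: 10 observations (x_i, y_i), roughly on a line
rng = np.random.default_rng(0)
x = np.arange(1.0, 11.0)
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.3, size=10)

# Design matrix with an intercept column; the least-squares estimates
# (b0, b1) solve the normal equation  X'X b = X'y
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)

# The normal equation is equivalent to the residual vector being
# orthogonal to every column of X
residuals = y - X @ b
print("intercept, slope:", b)
print("X' residuals (should be ~0):", X.T @ residuals)
```

The same `np.linalg.solve(X.T @ X, X.T @ y)` call carries over unchanged to the multiple-regression case, where X simply has more columns.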
Random vector: Independent random variables

Multivariate normal distribution
Normal distribution
Variance-covariance matrix
Variance-covariance matrix: A decomposition
Standard multivariate normal distribution (of dim. k ≥ 1)
Multivariate normal distribution
Multivariate normal distribution: Density
Multivariate normal distribution: Another definition
Multivariate normal distribution: Linear transformation
Multivariate normal distribution: Theorem

Multiple Linear Regression: Summary & Background
•Summary
•Terminology
•Assumptions
•Random vectors
•The classical assumptions
•Notation

Multiple Linear Regression: Summary
Multiple Linear Regression: Terminology
•Regressand = predicand = explained variable = dependent variable = endogenous variable = controlled variable = response = outcome = predicted variable = measured variable
•Regressors = predictors = explanatory variables = independent variables = exogenous variables = control variables = stimuli = covariates
•Parameters = regression coefficients
•Deviation = error term = disturbance = noise
•The intercept term
Multiple Linear Regression: Assumptions
Multiple Linear Regression: Random vectors
Multiple Linear Regression: The Classical Assumptions
•homoscedasticity, i.e. the variance is the same
•linearity
Multiple Linear Regression: Notation

Basic Results (Theorems)
Multiple Linear Regression: The Normal Equation
•NO (perfect) multicollinearity
Multiple Linear Regression: the predicted values
Multiple Linear Regression: some properties of H
•(the orthogonal complement = the space of the residuals)
Multiple Linear Regression: Theorem 1
Multiple Linear Regression: Theorem 1: Corollary
Multiple Linear Regression: Theorem 2
Multiple Linear Regression: Theorem 3
Multiple Linear Regression: Theorem 4

Residual Sum of Squares, χ2-test for the variance σ2, and confidence intervals
Multiple Linear Regression: Residual Sum of Squares
Multiple Linear Regression: Mean Square Error
Multiple Linear Regression: Theorem 5
Test of hypothesis about the variance σ2
χ2-test for the variance σ2
Confidence interval for the variance σ2

t-test for a single linear combination of the parameters β0, β1, …, βk — e.g. an individual parameter βj — and confidence interval
Multiple Linear Regression: Theorem 6
Multiple Linear Regression: Prediction (Extrapolation)
Multiple Linear Regression: Theorem 6: Corollary
Tests of hypotheses about the individual parameters βj
t-test for the parameter βj // rank(X)=k+1
Confidence interval for the parameter βj // rank(X)=k+1

¡¡¡ WARNING !!!
•Never use the above t-test for the parameters β0, β1, …, βk consecutively!
•Never use the above construction of the confidence intervals consecutively!
•Use the following result (Theorem 7) instead!
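The χ2-based confidence interval for σ2 and the t-test for an individual parameter βj can both be sketched on simulated data (the sample, the seed, and the true parameter values below are hypothetical choices of mine; numpy and scipy are assumed). RSS/σ2 has the χ2(n-k-1) distribution, and the t statistic for βj uses the j-th diagonal element of (X'X)^{-1}.

```python
import numpy as np
from scipy import stats

# Simulated data under the classical assumptions (hypothetical values):
# n = 50 observations, k = 2 regressors plus the intercept term
rng = np.random.default_rng(1)
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, 0.0, 2.0])           # beta_1 is truly zero
y = X @ beta + rng.normal(0.0, 1.0, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                      # least-squares estimates
resid = y - X @ b
rss = resid @ resid                        # residual sum of squares
df = n - k - 1                             # requires rank(X) = k+1
s2 = rss / df                              # unbiased estimator of sigma^2

# chi^2 confidence interval for sigma^2, from RSS/sigma^2 ~ chi^2(df)
ci_sigma2 = (rss / stats.chi2.ppf(0.975, df),
             rss / stats.chi2.ppf(0.025, df))

# t statistic for H0: beta_j = 0, and 95% confidence intervals,
# using the diagonal of (X'X)^{-1}
se = np.sqrt(s2 * np.diag(XtX_inv))
t_stat = b / se
p_val = 2 * stats.t.sf(np.abs(t_stat), df)
t_crit = stats.t.ppf(0.975, df)
ci_beta = np.column_stack([b - t_crit * se, b + t_crit * se])

print("s^2:", s2, "CI for sigma^2:", ci_sigma2)
print("t:", t_stat, "p:", p_val)
print("CIs for beta_j:\n", ci_beta)
```

Consistent with the warning above, these per-parameter tests and intervals are marginal: applying them to several βj consecutively does not give a joint confidence statement, which is what Theorem 7's F-based region is for.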
[Illustration: the true confidence region versus the individual confidence intervals for β0 and β1]

F-test for the significance of the model and confidence region & F-test for a system of linear combinations of the parameters β0, β1, …, βk
•Theorem 7
•F-test for the significance of the model
•Confidence region
•Theorem 8

Multiple Linear Regression: Theorem 7
Multiple Linear Regression: Theorem 7*
Multiple Linear Regression: Theorem 7*: Corollary
•(the orthogonal complement = the space of the residuals)
F-test for the significance of the model // rank(X)=k+1 (degrees of freedom)
Confidence region for the parameters // rank(X)=k+1
Multiple Linear Regression: Theorem 8
Multiple Linear Regression: Theorem 8: Illustration
•(the orthogonal complement = the space of the residuals; an affine subspace)
Multiple Linear Regression: Theorem 8: Remark

The Coefficient of Determination (R2)
The Coefficient of Determination (R2): Assumption
The Coefficient of Determination (R2): Th. 8: Corollary
•(the orthogonal complement = the space of the residuals; the line is a subspace of dimension 1)
The Coefficient of Determination (R2): TSS = RSS + RegSS
•By the Pythagoras Theorem
The Coefficient of Determination (R2): Some facts
F-test for the null hypothesis H0: β1 = … = βk = 0
•population (cf. ANOVA)
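The decomposition TSS = RSS + RegSS, the coefficient of determination R2, and the F-test for H0: β1 = … = βk = 0 fit together in one short numerical sketch (the data, seed, and coefficient values are hypothetical choices of mine; numpy and scipy are assumed). The decomposition holds because, with the intercept term in the model, the residual vector is orthogonal to the centered fitted values (the Pythagoras theorem).

```python
import numpy as np
from scipy import stats

# Hypothetical data: n = 40 observations, k = 3 regressors + intercept
rng = np.random.default_rng(2)
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([0.5, 1.0, -1.0, 2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b

tss = np.sum((y - y.mean()) ** 2)           # total sum of squares
rss = np.sum((y - y_hat) ** 2)              # residual sum of squares
reg_ss = np.sum((y_hat - y.mean()) ** 2)    # regression sum of squares

# With the intercept in the model, TSS = RSS + RegSS (Pythagoras),
# so R^2 = RegSS/TSS = 1 - RSS/TSS lies in [0, 1]
r2 = reg_ss / tss

# F-test for H0: beta_1 = ... = beta_k = 0; under H0 the statistic
# follows the F(k, n-k-1) distribution
F = (reg_ss / k) / (rss / (n - k - 1))
p_value = stats.f.sf(F, k, n - k - 1)

print("TSS vs RSS + RegSS:", tss, rss + reg_ss)
print("R^2:", r2, "F:", F, "p:", p_value)
```

Note the algebraic link between the two quantities: F = (R2/k) / ((1-R2)/(n-k-1)), so the F-test for the significance of the model is equivalently a test of whether R2 is significantly greater than zero.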