Advanced Econometrics
Chapter 4: Estimation By Instrumental Variables
Chapter 4
ESTIMATION BY INSTRUMENTAL VARIABLES
(Instrumental Variable Estimators)
I. ENDOGENEITY:
Now suppose ε, X are not independently generated: Cov(ε,X)≠ 0 and E(ε|X) ≠ 0.
There are 4 sources of this problem:
1. Errors in measurement of independent variables:
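Suppose that the true regression equation is given by:
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $E(\varepsilon_i) = E(\varepsilon_i x_i) = 0$
Then:
$Cov(\varepsilon_i, x_i) = E[(x_i - E(x_i))\varepsilon_i] = E(x_i\varepsilon_i) - E(x_i)E(\varepsilon_i) = E(x_i\varepsilon_i) = 0$
Note: since $E(\varepsilon_i) = 0$, $Cov(\varepsilon_i, x_i) = 0 \leftrightarrow E(\varepsilon_i x_i) = 0$
Suppose we observe $x_i^* = x_i + e_i$ instead of $x_i$.
Assume: $E(e_i) = E(e_i x_i) = 0$
So if $x_i = x_i^* - e_i$ → estimate: $y_i = \beta_0 + \beta_1 x_i^* + u_i$, where: $u_i = \varepsilon_i - \beta_1 e_i$
→ $x_i^*$ is correlated with $u_i$ through the term $e_i$: $Cov(u_i, x_i^*) \neq 0$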
Nam T. Hoang University of New England - Australia
University of Economics - HCMC - Vietnam
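2. Variables on both sides of the regression equation are jointly determined (endogenous):
$h_i = \beta_0 + \beta_1 e_i + u_i$
$e_i = \alpha_0 + \alpha_1 h_i + \varepsilon_i$
→ RHS variables are endogenous. Substituting the first equation into the second and solving for $e_i$:
$e_i = \frac{\alpha_0 + \alpha_1\beta_0}{1 - \alpha_1\beta_1} + \frac{\alpha_1}{1 - \alpha_1\beta_1}u_i + \frac{1}{1 - \alpha_1\beta_1}\varepsilon_i$
→ $Cov(u_i, e_i) \neq 0$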
3. Omitted variables:
$w_i = \beta_0 + \beta_1 s_i + \beta_2 a_i + \varepsilon_i$
Estimate: $w_i = \beta_0 + \beta_1 s_i + u_i$
Where: $u_i = \beta_2 a_i + \varepsilon_i$; if $a_i$ and $s_i$ are correlated → $Cov(u_i, s_i) \neq 0$
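4. Lagged dependent variable ($Y_{t-1}$) as a regressor and autocorrelated errors:
$Y_t = \alpha + \beta X_t + \lambda Y_{t-1} + \varepsilon_t$
$\varepsilon_t = \rho\varepsilon_{t-1} + u_t$
→ $Cov(\varepsilon_t, Y_{t-1}) \neq 0$ because $Y_{t-1}$ and $\varepsilon_t$ both contain $\varepsilon_{t-1}$.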
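The measurement-error source (item 1 above) can be illustrated with a small simulation. This is only a sketch under assumed values: the parameters, the unit variances and the helper name `ols_slope` are illustrative choices, not part of the notes. With Var(x) = Var(e), the OLS slope on the mismeasured regressor should settle near half of the true slope.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta0, beta1 = 1.0, 2.0          # illustrative "true" parameters (assumed)

x = rng.normal(0.0, 1.0, n)      # true regressor, Var(x) = 1
eps = rng.normal(0.0, 1.0, n)    # structural error, independent of x
e = rng.normal(0.0, 1.0, n)      # measurement error, Var(e) = 1
y = beta0 + beta1 * x + eps
x_star = x + e                   # observed, mismeasured regressor


def ols_slope(regressor, response):
    """Slope from a simple regression of response on a constant and regressor."""
    c = np.cov(regressor, response)
    return c[0, 1] / c[0, 0]


slope_true = ols_slope(x, y)         # close to beta1
slope_noisy = ols_slope(x_star, y)   # close to beta1 * Var(x) / (Var(x) + Var(e))
print(slope_true, slope_noisy)
```

The second slope is pulled toward zero because the regressor x* is correlated with the composite error u = ε − β1·e.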
Model:
(1) $Y = X\beta + \varepsilon$, with $X$ of dimension $n \times k$
(2) X and ε are not generated independently
(3) E(ε|X) ≠ 0
(4) E(εε'|X) = σ²I
(5) X consists of stationary random variables with:
$\operatorname{plim}\left(\frac{1}{n}X'X\right) = E(X_i X_i') = \Sigma_{XX}$ (where $X_i$ is $k \times 1$) and $\operatorname{plim}\left(\frac{1}{n}X'\varepsilon\right) = \gamma \neq 0$
Now:
$\hat\beta = \beta + (X'X)^{-1}X'\varepsilon$
$\operatorname{plim}\hat\beta = \beta + \operatorname{plim}\left(\frac{1}{n}X'X\right)^{-1}\operatorname{plim}\left(\frac{1}{n}X'\varepsilon\right) = \beta + \Sigma_{XX}^{-1}\gamma \neq \beta$
→ $\hat\beta$ is an inconsistent estimator.
$\hat\beta$ is also no longer unbiased:
$E(\hat\beta|X) = \beta + (X'X)^{-1}X'E(\varepsilon|X) \neq \beta$, since $E(\varepsilon|X) \neq 0$
II. ESTIMATION BY INSTRUMENTAL VARIABLES:
Suppose we can find a set of k variables $W$ ($n \times k$) that have two properties:
1. Exogeneity (validity): They are uncorrelated with the disturbance ε.
2. Relevance: They are correlated with the independent variable X.
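Such that:
$E(\varepsilon|W) = 0$ → $E(w_i\varepsilon_i) = 0$ → $\operatorname{plim}\frac{1}{n}W'\varepsilon = 0$
$\operatorname{plim}\frac{1}{n}W'W = E(w_i w_i') = \Sigma_{WW}$
$\operatorname{plim}\frac{1}{n}W'X = \Sigma_{WX}$
(W & X are stationary random variables).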
Then W is a set of instrumental variables and we define the IV estimator:
$\hat\beta_{IV} = (W'X)^{-1}W'Y$
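$\hat\beta_{IV} = (W'X)^{-1}W'Y = (W'X)^{-1}W'(X\beta + \varepsilon) = \beta + (W'X)^{-1}W'\varepsilon$
Consistency: IV estimator is consistent:
$\hat\beta_{IV} = \beta + \left(\frac{W'X}{n}\right)^{-1}\frac{W'\varepsilon}{n}$
$\operatorname{plim}\hat\beta_{IV} = \beta + \left(\operatorname{plim}\frac{W'X}{n}\right)^{-1}\operatorname{plim}\frac{W'\varepsilon}{n} = \beta + \Sigma_{WX}^{-1}\cdot 0 = \beta$ (Slutsky theorem).
Unbiasedness:
$E(\hat\beta_{IV}|W) = \beta + E[(W'X)^{-1}W'\varepsilon\,|\,W]$
If $(W'X)^{-1}W'$ could be treated as fixed given W, then $E(\varepsilon|W) = 0$ would give $E(\hat\beta_{IV}|W) = \beta$ and the IV estimator would be unbiased. Because X is correlated with ε, this separation does not hold exactly: in finite samples $\hat\beta_{IV}$ is in general biased, and consistency is the property to rely on.
Summary of the conditions on $W$ ($n \times k$):
$E(\varepsilon|X) \neq 0$, but $E(\varepsilon|W) = 0$ → $\operatorname{plim}\frac{1}{n}W'\varepsilon = 0$
$\operatorname{plim}\frac{1}{n}W'W = \Sigma_{WW}$
$\operatorname{plim}\frac{1}{n}W'X = \Sigma_{WX}$, nonsingular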
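As a rough numerical illustration of the consistency argument, the sketch below compares OLS with the estimator (W'X)^{-1}W'Y on simulated data. The data-generating process (one instrument `z`, a common shock `v` that makes `x` endogenous, the coefficient 0.8) is entirely assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta = np.array([1.0, 2.0])             # assumed true coefficients

z = rng.normal(size=n)                  # instrument: drives x, unrelated to eps
v = rng.normal(size=n)                  # common shock creating endogeneity
eps = v + rng.normal(size=n)            # disturbance, correlated with x through v
x = 0.8 * z + v + rng.normal(size=n)    # endogenous regressor

X = np.column_stack([np.ones(n), x])    # n x k regressors
W = np.column_stack([np.ones(n), z])    # n x k instruments
Y = X @ beta + eps

b_ols = np.linalg.solve(X.T @ X, X.T @ Y)   # (X'X)^{-1} X'Y
b_iv = np.linalg.solve(W.T @ X, W.T @ Y)    # (W'X)^{-1} W'Y
print("OLS:", b_ols)
print("IV: ", b_iv)
```

In this design Cov(x, ε) = 1 and Var(x) = 2.64, so the OLS slope drifts to about 2 + 1/2.64, while the IV slope stays near 2.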
III. TWO-STAGE LEAST SQUARES ESTIMATION:
Now we have a set of instruments $Z$ ($n \times q$) that are unrelated to ε.
X consists of two parts:
$X = [X_1 \;\; X_2]$, where $X$ is $n \times k$, $X_1$ is $n \times (k - r)$ and $X_2$ is $n \times r$
X1: exogenous variables
X2: endogenous variables
Note: q must be ≥ r (if q < r, then $(W'W)^{-1}$ does not exist).
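Z includes X1. We can define reduced form equations for X2:
$X_2 = Z\Pi + V$, where $X_2$ is $n \times r$, $Z$ is $n \times q$, $\Pi$ is $q \times r$ and $V$ is $n \times r$
Column by column:
$X_2 = [X_2^{(1)} \; X_2^{(2)} \; \cdots \; X_2^{(r)}]$ (each column $n \times 1$)
$\Pi = [\Pi_1 \; \Pi_2 \; \cdots \; \Pi_r]$ (each column $q \times 1$)
$V = [V_1 \; V_2 \; \cdots \; V_r]$ (each column $n \times 1$)
So:
$X_2^{(1)} = Z\Pi_1 + V_1$
$X_2^{(2)} = Z\Pi_2 + V_2$
...
$X_2^{(r)} = Z\Pi_r + V_r$
Estimate this system by OLS; $\hat\Pi$ ($q \times r$) are the estimators:
$X_2 = Z\hat\Pi + \hat V$, with $\hat\Pi = [\hat\Pi_1 \; \hat\Pi_2 \; \cdots \; \hat\Pi_r]$ and $\hat V = [\hat V_1 \; \hat V_2 \; \cdots \; \hat V_r]$
Then we get: $\hat X_2 = Z\hat\Pi$ → $\hat X_2$ is a good instrument.
$\hat X_2$ is correlated with $X_2$ but not correlated with ε because $Cov(Z, \varepsilon) = 0$.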
• After the first stage we get the set $W = [X_1 \;\; \hat X_2]$. Apply OLS on $[Y, W]$ and we have:
$\hat\beta_{2SLS} = (W'W)^{-1}W'Y$ → two-stage least squares estimator.
• We can also use $W = [X_1 \;\; \hat X_2]$ as an instrumental variable, with $\hat X_2 = Z\hat\Pi$ and $\hat\Pi = (Z'Z)^{-1}Z'X_2$, and get:
$\hat\beta_{IV} = (W'X)^{-1}W'Y$
We can show that: $\hat\beta_{IV} = \hat\beta_{2SLS}$
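The equality of the two estimators can be checked numerically. The following is a minimal sketch on simulated data; the design (a constant as X1, one endogenous X2, two extra instruments in Z, the coefficient values) is assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

X1 = np.ones((n, 1))                          # exogenous part (a constant)
Z_extra = rng.normal(size=(n, 2))             # two additional instruments
Z = np.column_stack([X1, Z_extra])            # n x q, includes X1

v = rng.normal(size=n)                        # shock shared by X2 and eps
X2 = (Z_extra @ np.array([1.0, -0.5]) + v).reshape(-1, 1)   # endogenous, n x 1
eps = 0.7 * v + rng.normal(size=n)

X = np.column_stack([X1, X2])
Y = X @ np.array([1.0, 2.0]) + eps

# First stage: regress X2 on Z, keep the fitted values.
Pi_hat = np.linalg.solve(Z.T @ Z, Z.T @ X2)
X2_hat = Z @ Pi_hat
W = np.column_stack([X1, X2_hat])

# Second stage OLS of Y on W versus IV with W as the instrument set.
b_2sls = np.linalg.solve(W.T @ W, W.T @ Y)    # (W'W)^{-1} W'Y
b_iv = np.linalg.solve(W.T @ X, W.T @ Y)      # (W'X)^{-1} W'Y
print(b_2sls, b_iv)
```

The two vectors agree up to rounding because W is the projection of X onto the column space of Z, so W'X = W'W.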
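IV. ASYMPTOTIC DISTRIBUTION OF $\hat\beta_{IV}$:
$\sqrt{n}(\hat\beta_{IV} - \beta) = \left(\frac{1}{n}W'X\right)^{-1}\frac{1}{\sqrt{n}}W'\varepsilon$, where $\left(\frac{1}{n}W'X\right)^{-1} \to \Sigma_{WX}^{-1}$
$\frac{1}{\sqrt{n}}W'\varepsilon = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}W_i\varepsilon_i$, with $W_i = (w_{i1}, w_{i2}, \ldots, w_{ik})'$
$E(w_i\varepsilon_i) = E[w_i E(\varepsilon_i|w_i)] = 0$
$Var(w_i\varepsilon_i) = E(\varepsilon_i^2 w_i w_i') = \sigma_\varepsilon^2 E(w_i w_i') = \sigma_\varepsilon^2\Sigma_{WW}$
So by the central limit theorem:
$\frac{1}{\sqrt{n}}W'\varepsilon \xrightarrow{d} N(0, \sigma_\varepsilon^2\Sigma_{WW})$
$\sqrt{n}(\hat\beta_{IV} - \beta) \xrightarrow{d} \Sigma_{WX}^{-1}N(0, \sigma_\varepsilon^2\Sigma_{WW}) = N\left(0, \sigma_\varepsilon^2\Sigma_{WX}^{-1}\Sigma_{WW}(\Sigma_{WX}')^{-1}\right)$
$\hat\beta_{IV} \overset{asy}{\sim} N\left(\beta, \frac{\sigma_\varepsilon^2}{n}\Sigma_{WX}^{-1}\Sigma_{WW}(\Sigma_{WX}')^{-1}\right)$, where the variance term is $AsyVarCov(\hat\beta_{IV})$
Note: the limiting distribution is centred at β, so $\hat\beta_{IV}$ is asymptotically unbiased. When OLS is itself consistent, $\hat\beta_{OLS}$ is asymptotically efficient relative to $\hat\beta_{IV}$.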
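A sample analogue of the asymptotic covariance matrix replaces the Σ terms by W'X and W'W and σ² by the mean squared IV residual. The function below is a hedged sketch of that plug-in calculation under homoskedasticity; the name `iv_with_cov` and the simulated data are assumptions made for the example.

```python
import numpy as np


def iv_with_cov(Y, X, W):
    """IV estimate and plug-in asymptotic covariance, assuming E(ee'|W) = s2*I."""
    n = X.shape[0]
    WX = W.T @ X
    b = np.linalg.solve(WX, W.T @ Y)            # (W'X)^{-1} W'Y
    resid = Y - X @ b                           # residuals use X, not W
    sigma2 = resid @ resid / n
    WX_inv = np.linalg.inv(WX)
    # sigma2 * (W'X)^{-1} (W'W) (X'W)^{-1}: sample version of
    # (sigma2 / n) * S_WX^{-1} S_WW (S_WX')^{-1}
    cov = sigma2 * WX_inv @ (W.T @ W) @ WX_inv.T
    return b, cov


# Illustrative data (assumed design): one endogenous regressor, one instrument.
rng = np.random.default_rng(4)
n = 20_000
z = rng.normal(size=n)
v = rng.normal(size=n)
x = 0.8 * z + v + rng.normal(size=n)
eps = v + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
W = np.column_stack([np.ones(n), z])
Y = X @ np.array([1.0, 2.0]) + eps

b_iv, cov_iv = iv_with_cov(Y, X, W)
se_iv = np.sqrt(np.diag(cov_iv))
print(b_iv, se_iv)
```

The factors of n cancel in the plug-in formula, which is why no explicit division by n appears in `cov`.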
V. HAUSMAN SPECIFICATION TEST AND AN APPLICATION TO IV
ESTIMATION:
1. Theorem:
Let $Z \sim N(0, \Sigma)$, with $Z$ of dimension $r \times 1$ and $\Sigma$ of dimension $r \times r$; then: $Z'\Sigma^{-1}Z \sim \chi^2[r]$
Proof:
Recall: for $\Sigma$, let $\lambda_j$ be the eigenvalues and $C_j$ ($r \times 1$) the eigenvectors:
$\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_r)$, $C = [C_1 \; C_2 \; \cdots \; C_r]$
Note: $C' = C^{-1}$, $CC' = I$
$\Lambda^{1/2} = \operatorname{diag}(\lambda_1^{1/2}, \lambda_2^{1/2}, \ldots, \lambda_r^{1/2})$, so that $\Lambda = \Lambda^{1/2}\Lambda^{1/2}$
We have: $C'\Sigma C = \Lambda$
→ $(\Lambda^{-1/2})'C'\Sigma C\Lambda^{-1/2} = (\Lambda^{-1/2})'\Lambda^{1/2}\Lambda^{1/2}\Lambda^{-1/2} = I$
→ $D'\Sigma D = I$ with $D = C\Lambda^{-1/2}$
→ $\Sigma = (D')^{-1}D^{-1}$ → $\Sigma^{-1} = DD'$
Let $W = D'Z$ ($r \times 1$) → $W \sim N(0, D'\Sigma D) = N(0, I)$
→ $W'W \sim \chi^2[r]$
→ $(D'Z)'(D'Z) = Z'DD'Z \sim \chi^2[r]$
Finally: $Z'\Sigma^{-1}Z \sim \chi^2[r]$
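The steps of the proof can be checked numerically: build D from the eigen-decomposition of a covariance matrix, confirm D'ΣD = I and DD' = Σ^{-1}, and compare the simulated mean and variance of Z'Σ^{-1}Z with those of a χ²[r] variable (r and 2r). The matrix `Sigma` below is an arbitrary positive definite example, not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(3)
r = 3
A = rng.normal(size=(r, r))
Sigma = A @ A.T + r * np.eye(r)        # an arbitrary symmetric positive definite matrix

lam, C = np.linalg.eigh(Sigma)         # Sigma = C diag(lam) C', with C'C = I
D = C @ np.diag(lam ** -0.5)           # D = C Lambda^{-1/2}

# Draw many Z ~ N(0, Sigma) and form the quadratic form Z' Sigma^{-1} Z per draw.
Z = rng.multivariate_normal(np.zeros(r), Sigma, size=200_000)
q = np.einsum("ij,jk,ik->i", Z, np.linalg.inv(Sigma), Z)

print(np.round(D.T @ Sigma @ D, 6))    # should be close to the identity
print(q.mean(), q.var())               # should be close to r and 2r
```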
2. Hausman Test:
$Y = X\beta + \varepsilon = X_1\beta_1 + X_2\beta_2 + \varepsilon$, where $X_2$ is $n \times r$
H0: $E(\varepsilon|X_2) = 0$ ($r \times 1$)
HA: $E(\varepsilon|X_2) \neq 0$
Two alternative estimators:
$\hat\beta_{OLS}$: consistent under H0 but not under HA
$\hat\beta_{OLS} = \beta + (X'X)^{-1}X'\varepsilon$
$\hat\beta_{IV}$: consistent under both H0, HA (but inefficient compared to $\hat\beta_{OLS}$)
$\hat\beta_{IV} = \beta + (W'X)^{-1}W'\varepsilon$
Under H0: $\operatorname{plim}\hat\beta_{IV} = \operatorname{plim}\hat\beta_{OLS}$
Construct the Hausman's test statistic:
$H = (\hat\beta_{IV} - \hat\beta_{OLS})'\left[VarCov(\hat\beta_{IV} - \hat\beta_{OLS})\right]^{-1}(\hat\beta_{IV} - \hat\beta_{OLS}) \sim \chi^2[r]$
(Note: $Z \sim N(0, \Sigma)$ → $Z'\Sigma^{-1}Z \sim \chi^2[r]$)
$VarCov(\hat\beta_{IV} - \hat\beta_{OLS}) = VarCov(\hat\beta_{IV}) + VarCov(\hat\beta_{OLS}) - 2Cov(\hat\beta_{IV}, \hat\beta_{OLS})$
$Cov(\hat\beta_{IV}, \hat\beta_{OLS}) = E\{[(\hat\beta_{IV} - \beta)(\hat\beta_{OLS} - \beta)']\,|\,W, X\}$
$= E\{[(W'X)^{-1}W'\varepsilon\varepsilon'X(X'X)^{-1}]\,|\,W, X\}$
$= (W'X)^{-1}W'E(\varepsilon\varepsilon')X(X'X)^{-1}$, with $E(\varepsilon\varepsilon') = \sigma_\varepsilon^2 I$
$= \sigma_\varepsilon^2(W'X)^{-1}W'X(X'X)^{-1}$
$= \sigma_\varepsilon^2(X'X)^{-1} = VarCov(\hat\beta_{OLS})$
So:
$VarCov(\hat\beta_{IV} - \hat\beta_{OLS}) = VarCov(\hat\beta_{IV}) - VarCov(\hat\beta_{OLS})$
Then, the Hausman's test statistic is:
$H = (\hat\beta_{IV} - \hat\beta_{OLS})'\left[VarCov(\hat\beta_{IV}) - VarCov(\hat\beta_{OLS})\right]^{-1}(\hat\beta_{IV} - \hat\beta_{OLS})$
Under H0: $H \sim \chi^2[r]$
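The statistic can be computed along the following lines. This is a sketch, not a definitive implementation: the function name `hausman_statistic`, the use of the OLS residual variance for both covariance matrices, and the restriction of the contrast to the coefficients on X2 (passed as `endog_idx`, which keeps the matrix being inverted nonsingular) are all choices made for this example rather than steps spelled out in the notes.

```python
import numpy as np


def hausman_statistic(Y, X, W, endog_idx):
    """Hausman contrast between IV and OLS on the coefficients listed in endog_idx."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    WX_inv = np.linalg.inv(W.T @ X)
    b_ols = XtX_inv @ X.T @ Y
    b_iv = WX_inv @ W.T @ Y
    resid = Y - X @ b_ols
    s2 = resid @ resid / (n - k)                  # sigma^2 estimated under H0
    V_ols = s2 * XtX_inv                          # VarCov(b_OLS)
    V_iv = s2 * WX_inv @ (W.T @ W) @ WX_inv.T     # VarCov(b_IV)
    idx = np.asarray(endog_idx)
    d = (b_iv - b_ols)[idx]
    V_d = (V_iv - V_ols)[np.ix_(idx, idx)]        # VarCov(b_IV) - VarCov(b_OLS)
    return float(d @ np.linalg.solve(V_d, d))


# Illustrative data (assumed design) in which x is endogenous, so H0 is false.
rng = np.random.default_rng(5)
n = 20_000
z = rng.normal(size=n)
v = rng.normal(size=n)
x = 0.8 * z + v + rng.normal(size=n)
eps = v + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
W = np.column_stack([np.ones(n), z])
Y = X @ np.array([1.0, 2.0]) + eps

H = hausman_statistic(Y, X, W, endog_idx=[1])
print(H, "reject H0 at 5%" if H > 3.841 else "do not reject H0")   # 3.841: chi2[1] critical value
```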
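3. Wu's approach:
$Y = X_1\beta_1 + X_2\beta_2 + \varepsilon$
Do we have: $E(\varepsilon|X_2) = 0$?
In the first stage of IV estimation:
$X_2 = Z\hat\Pi + \hat V = \hat X_2 + \hat V$, where $X_2$, $\hat X_2$, $\hat V$ are $n \times r$, $Z$ is $n \times q$, $\hat\Pi$ is $q \times r$
r ≤ q → we get $\hat X_2$
$Y = X_1\beta_1 + X_2\beta_2 + \hat X_2\gamma + \varepsilon^*$
→ $Y = X_1\beta_1 + X_2\beta_2 + (X_2 - \hat V)\gamma + \varepsilon^*$
→ $Y = X_1\beta_1 + X_2(\beta_2 + \gamma) - \hat V\gamma + \varepsilon^*$
Test: H0: $\gamma = 0$ ($r \times 1$)
If reject H0 → $E(\varepsilon|X_2) \neq 0$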
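The regression form of the test can be sketched as follows for the case r = 1: run the first stage, add the first-stage residual to the original regression, and look at the t-statistic on that residual (its coefficient corresponds to −γ in the last equation above). The function name `wu_test_t`, the plain homoskedastic standard errors and the simulated design are assumptions of this example.

```python
import numpy as np


def wu_test_t(Y, X1, X2, Z):
    """t-statistic on the first-stage residual in the augmented regression (r = 1)."""
    Pi_hat = np.linalg.lstsq(Z, X2, rcond=None)[0]     # first stage by OLS
    V_hat = X2 - Z @ Pi_hat                            # first-stage residuals
    R = np.column_stack([X1, X2, V_hat])               # augmented regressors
    coef = np.linalg.lstsq(R, Y, rcond=None)[0]
    resid = Y - R @ coef
    n, p = R.shape
    s2 = resid @ resid / (n - p)
    cov = s2 * np.linalg.inv(R.T @ R)
    return coef[-1] / np.sqrt(cov[-1, -1]), coef


# Illustrative data (assumed design): X2 is endogenous through the shock v.
rng = np.random.default_rng(6)
n = 5_000
X1 = np.ones((n, 1))
Z_extra = rng.normal(size=(n, 2))
Z = np.column_stack([X1, Z_extra])
v = rng.normal(size=n)
X2 = Z_extra @ np.array([1.0, -0.5]) + v
eps = 0.7 * v + rng.normal(size=n)
Y = 1.0 + 2.0 * X2 + eps

t_stat, coef = wu_test_t(Y, X1, X2, Z)
print(t_stat, coef)
```

A large |t| points against H0. For r > 1 the analogous check would be a joint F-test on all r residual columns, which this sketch does not implement.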
VI. CHOOSING THE INSTRUMENTS:
1. If we are working with time-series data, lagged values of regressors will generally provide appropriate instruments.
EX: $y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \varepsilon_t$
$X = \begin{pmatrix} 1 & x_{12} & x_{13} \\ 1 & x_{22} & x_{23} \\ \vdots & \vdots & \vdots \\ 1 & x_{n2} & x_{n3} \end{pmatrix}$, $W = \begin{pmatrix} 1 & x_{02} & x_{03} \\ 1 & x_{12} & x_{13} \\ \vdots & \vdots & \vdots \\ 1 & x_{n-1,2} & x_{n-1,3} \end{pmatrix}$
2. Choice of Z affects asymptotic efficiency of $\hat\beta_{IV}$.
Generally we want to choose instruments that are highly correlated with the regressors (but uncorrelated with the errors).
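3. With cross-section data, this is not always easy. One option is to use the ranks of the data to form Z.
Example: $y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$
$X = \begin{pmatrix} 1 & 14 \\ 1 & 2 \\ 1 & 1 \\ 1 & 5 \\ 1 & 8 \\ 1 & 3 \\ 1 & 10 \end{pmatrix}$, $Z = \begin{pmatrix} 1 & 7 \\ 1 & 2 \\ 1 & 1 \\ 1 & 4 \\ 1 & 5 \\ 1 & 3 \\ 1 & 6 \end{pmatrix}$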
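Both constructions in this section (lagged regressors for time series, ranks for cross-section data) amount to a few lines of array manipulation. The sketch below uses the seven x values from the rank example; the time-series array `x_ts` is made-up toy data.

```python
import numpy as np

# Item 1: lagged regressors as instruments (toy data, rows t = 0..5).
x_ts = np.arange(12, dtype=float).reshape(6, 2)     # columns play the role of x_t2, x_t3
X_ts = np.column_stack([np.ones(5), x_ts[1:]])      # rows t = 1..5: [1, x_t2, x_t3]
W_ts = np.column_stack([np.ones(5), x_ts[:-1]])     # rows t = 1..5: [1, x_{t-1,2}, x_{t-1,3}]

# Item 3: ranks of the data as the instrument.
x = np.array([14, 2, 1, 5, 8, 3, 10])
ranks = x.argsort().argsort() + 1                   # rank 1 = smallest value (no ties here)
X = np.column_stack([np.ones_like(x), x])
Z = np.column_stack([np.ones_like(ranks), ranks])
print(ranks)   # [7 2 1 4 5 3 6]
```

The double `argsort` trick assumes there are no ties; with tied values a rule for breaking or averaging ties would have to be chosen.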
Appendix: Measurement Error in Linear Regression
$Y = X\beta + \varepsilon$ (1), where $Y$ is $n \times 1$, $X$ is $n \times k$ and $\beta$ is $k \times 1$
We don't observe X, but observe X*:
$X^* = X + V$ (2), where $X^*$, $X$ and $V$ are $n \times k$
Where: $E(\varepsilon|X) = 0$ ($n \times 1$) and $E(\varepsilon\varepsilon'|X) = \sigma^2 I$ ($n \times n$)
Put (2) into (1) yields:
$Y = (X^* - V)\beta + \varepsilon = X^*\beta + (\varepsilon - V\beta) = X^*\beta + u$
The error term u = ε - Vβ is correlated with the regressor X* through the measurement
error V.
$\operatorname{plim}\frac{1}{n}X^{*\prime}u = \operatorname{plim}\frac{1}{n}X^{*\prime}(\varepsilon - V\beta) = \operatorname{plim}\frac{1}{n}(X + V)'(\varepsilon - V\beta)$
$= \operatorname{plim}\frac{1}{n}X'\varepsilon - \operatorname{plim}\frac{1}{n}X'V\beta + \operatorname{plim}\frac{1}{n}V'\varepsilon - \operatorname{plim}\frac{1}{n}V'V\beta$
$= 0 - 0 + 0 - \Sigma_{VV}\beta = -\Sigma_{VV}\beta \neq 0$
Formally, we have:
$\operatorname{plim}\frac{1}{n}X^{*\prime}X^* = \operatorname{plim}\frac{1}{n}(X + V)'(X + V)$
$= \operatorname{plim}\frac{1}{n}X'X + \operatorname{plim}\frac{1}{n}X'V + \operatorname{plim}\frac{1}{n}V'X + \operatorname{plim}\frac{1}{n}V'V$
$= \Sigma_{XX} + \Sigma_{VV}$
An OLS regression of Y on X* will lead to an inconsistent estimate of β:
$\hat\beta_{OLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}Y = \beta + (X^{*\prime}X^*)^{-1}X^{*\prime}u$
$\operatorname{plim}\hat\beta = \beta + \left(\operatorname{plim}\frac{1}{n}X^{*\prime}X^*\right)^{-1}\operatorname{plim}\frac{1}{n}X^{*\prime}u = \beta - (\Sigma_{XX} + \Sigma_{VV})^{-1}\Sigma_{VV}\beta$
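As a worked special case (a sketch assuming a single regressor, k = 1, with scalar second moments written here as $\sigma_x^2 = \Sigma_{XX}$ and $\sigma_v^2 = \Sigma_{VV}$, notation introduced only for this illustration), the probability limit reduces to the familiar attenuation factor:

```latex
\operatorname{plim}\hat\beta
  = \beta - \frac{\sigma_v^2}{\sigma_x^2 + \sigma_v^2}\,\beta
  = \beta\,\frac{\sigma_x^2}{\sigma_x^2 + \sigma_v^2},
\qquad\text{e.g. } \sigma_x^2 = \sigma_v^2 = 1 \;\Rightarrow\; \operatorname{plim}\hat\beta = \tfrac{1}{2}\beta .
```

The estimate is pulled toward zero, and more strongly so the larger the measurement-error variance is relative to the variance of the true regressor.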
Clearly, OLS is inconsistent as long as there are measurement errors and $\Sigma_{VV} \neq 0$.
1) $\hat\beta$ ($k \times 1$) is inconsistent as long as $\Sigma_{VV} \neq 0$.
2) If there are some variables which are correctly measured
→ Their coefficient estimators are also inconsistent. A badly measured variable contaminates all the least squares estimates.
→ The effect of measurement errors is also called "contamination bias".
3) For example, if only one regressor is measured with errors:
$\Sigma_{VV} = \begin{pmatrix} 0 & \cdots & 0 & 0 \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & 0 & 0 \\ 0 & \cdots & 0 & \sigma_v^2 \end{pmatrix}$
→ the bias and inconsistency of all correctly measured variables depend on the form of $\Sigma_{XX}$ → unknown.
4) In practice, it seems that the coefficients of the correctly measured variables are consistent, but this depends on the special form of $\Sigma_{XX}$.
Research questions: In practice
→ For what kind of $\Sigma_{XX}$ can we count on the coefficients of the correctly measured variables?
→ If we cannot find a good instrumental variable: should we omit the wrongly measured variables or not? For which form of $\Sigma_{XX}$?
Computer programs could answer these questions (I guess). The form of $\Sigma_{XX}$ can be tested by simulations.
5) There are other cases in which endogeneity is a problem → what is the role of $\Sigma_{XX}$ in affecting the inconsistency of the coefficients in those cases?
6) Endogeneity by measurement errors is a serious problem.
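The role of Σ_XX raised in points 3) to 5) can be explored directly from the probability-limit formula plim β̂ = β − (Σ_XX + Σ_VV)^{-1} Σ_VV β, without simulating data. The sketch below evaluates it for two assumed forms of Σ_XX when only the second of two regressors is mismeasured; the function name `plim_ols` and all numerical values are illustrative.

```python
import numpy as np


def plim_ols(beta, Sigma_XX, Sigma_VV):
    """Probability limit of OLS on X* = X + V: beta - (S_XX + S_VV)^{-1} S_VV beta."""
    return beta - np.linalg.solve(Sigma_XX + Sigma_VV, Sigma_VV @ beta)


beta = np.array([1.0, 1.0])
Sigma_VV = np.diag([0.0, 1.0])                 # only the second regressor is mismeasured

uncorrelated = np.array([[1.0, 0.0],
                         [0.0, 1.0]])          # regressors uncorrelated with each other
correlated = np.array([[1.0, 0.5],
                       [0.5, 1.0]])            # regressors correlated with each other

print(plim_ols(beta, uncorrelated, Sigma_VV))  # first coefficient stays at 1
print(plim_ols(beta, correlated, Sigma_VV))    # first coefficient is contaminated
```

With a diagonal Σ_XX the correctly measured coefficient keeps its true value, while with correlated regressors it moves to 9/7, so in this small example the contamination depends entirely on the off-diagonal terms of Σ_XX.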