Advanced Econometrics - Part II

Chapter 5: Limited - Dependent Variable Models

Chapter 5

LIMITED - DEPENDENT VARIABLE MODELS: TRUNCATION, CENSORING (TOBIT) AND SAMPLE SELECTION.

I. TRUNCATION:

The effect of truncation occurs when sample data are drawn from a subset of a larger

population of interest.

1. Truncated distributions:

A truncated distribution is the part of an untruncated distribution that lies above or below some specified value.

• Density of a truncated random variable:

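If a continuous random variable $x$ has pdf $f(x)$ and $a$ is a constant, then:

$$f(x \mid x > a) = \frac{f(x)}{\text{Prob}(x > a)}$$

o Truncated normal distribution:

If $x \sim N(\mu, \sigma^2)$, then

$$P(x > a) = 1 - \Phi\left(\frac{a - \mu}{\sigma}\right) = 1 - \Phi(\alpha), \qquad \alpha = \frac{a - \mu}{\sigma}$$

$$f(x \mid x > a) = \frac{f(x)}{1 - \Phi(\alpha)} = \frac{(2\pi\sigma^2)^{-1/2}\, e^{-(x - \mu)^2/(2\sigma^2)}}{1 - \Phi(\alpha)} = \frac{\frac{1}{\sigma}\,\phi\left(\frac{x - \mu}{\sigma}\right)}{1 - \Phi(\alpha)}$$

where $\phi(\cdot) = \Phi'(\cdot)$ is the standard normal density.

The mean and variance of the truncated variable are obtained by integrating against this density:

$$E[x \mid x > a] = \int_a^{\infty} x\, f(x \mid x > a)\, dx$$

$$V[x \mid x > a] = \int_a^{\infty} \left(x - E[x \mid x > a]\right)^2 f(x \mid x > a)\, dx$$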
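As a quick numerical check of the truncated normal density, the following Python sketch evaluates $f(x \mid x > a)$ and confirms that it integrates to (approximately) one. The function names and the parameter values are illustrative assumptions, not part of the notes.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_normal_pdf(x, mu, sigma, a):
    """Density of x ~ N(mu, sigma^2) conditional on x > a."""
    if x <= a:
        return 0.0
    alpha = (a - mu) / sigma
    return norm_pdf((x - mu) / sigma) / (sigma * (1.0 - norm_cdf(alpha)))

def integrate_truncated_pdf(mu, sigma, a, steps=20000):
    """Midpoint-rule integral of the truncated density over (a, mu + 10*sigma)."""
    upper = mu + 10.0 * sigma
    h = (upper - a) / steps
    return h * sum(truncated_normal_pdf(a + (k + 0.5) * h, mu, sigma, a)
                   for k in range(steps))

# Assumed illustrative values: mu = 1, sigma = 2, truncation point a = 0.
total_mass = integrate_truncated_pdf(1.0, 2.0, 0.0)
print(total_mass)
```

Dividing by $1 - \Phi(\alpha)$ rescales the part of the density above $a$ so that it is again a proper density.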

Nam T. Hoang UNE Business School

University of New England


2. Moments of truncated distributions:


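o Truncated mean and truncated variance: If $x \sim N(\mu, \sigma^2)$ and $a$ is a constant, then

$$E[x \mid \text{truncation}] = \mu + \sigma\lambda(\alpha)$$

$$Var[x \mid \text{truncation}] = \sigma^2[1 - \delta(\alpha)]$$

Where $\alpha = \frac{a - \mu}{\sigma}$, $\phi(\cdot)$ is the standard normal density, and

$$\lambda(\alpha) = \frac{\phi(\alpha)}{1 - \Phi(\alpha)} \quad \text{if the truncation is } x > a$$

$$\lambda(\alpha) = \frac{-\phi(\alpha)}{\Phi(\alpha)} \quad \text{if the truncation is } x < a$$

And

$$\delta(\alpha) = \lambda(\alpha)[\lambda(\alpha) - \alpha]$$

$0 < \delta(\alpha) < 1$ for all values of $\alpha$ ⇒ $\sigma^2_{truncated} < \sigma^2$.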
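The truncated mean and variance formulas can be evaluated directly. The sketch below is a minimal illustration; the function name `truncated_normal_moments` and its `lower` flag are hypothetical choices made for this example.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_normal_moments(mu, sigma, a, lower=True):
    """Mean and variance of x ~ N(mu, sigma^2) under truncation.

    lower=True  means the sample keeps x > a;
    lower=False means the sample keeps x < a.
    Returns (mean, variance, lambda, delta).
    """
    alpha = (a - mu) / sigma
    if lower:
        lam = norm_pdf(alpha) / (1.0 - norm_cdf(alpha))
    else:
        lam = -norm_pdf(alpha) / norm_cdf(alpha)
    delta = lam * (lam - alpha)
    return mu + sigma * lam, sigma ** 2 * (1.0 - delta), lam, delta

# Assumed example: standard normal truncated at a = 0, from below and from above.
mean_up, var_up, lam_up, delta_up = truncated_normal_moments(0.0, 1.0, 0.0, lower=True)
mean_dn, var_dn, lam_dn, delta_dn = truncated_normal_moments(0.0, 1.0, 0.0, lower=False)
print(mean_up, var_up, mean_dn, var_dn)
```

In both directions the truncated variance is smaller than $\sigma^2$, as implied by $0 < \delta(\alpha) < 1$.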

3. The truncated regression model:

$$Y_i = X_i\beta + \varepsilon_i$$

Assume now:

$$\varepsilon_i \mid X_i \sim N(0, \sigma^2)$$

So that

$$Y_i \mid X_i \sim N(X_i\beta, \sigma^2)$$

Where: $\mu_i = X_i\beta$.

We are interested in the distribution of $Y_i$ given that $Y_i$ is greater than the truncation point $a$:

$$E[Y_i \mid Y_i > a] = X_i\beta + \sigma\,\frac{\phi[(a - X_i\beta)/\sigma]}{1 - \Phi[(a - X_i\beta)/\sigma]}$$

$$\frac{\partial E[Y_i \mid Y_i > a]}{\partial X_i} = \beta + \sigma\,\frac{d\lambda_i}{d\alpha_i}\,\frac{\partial\alpha_i}{\partial X_i} = \beta + \sigma(\lambda_i^2 - \alpha_i\lambda_i)\left(\frac{-\beta}{\sigma}\right) = \beta(1 - \lambda_i^2 + \alpha_i\lambda_i) = \beta(1 - \delta_i)$$

Where: $\alpha_i = \frac{a - X_i\beta}{\sigma}$, $\lambda_i = \lambda(\alpha_i)$, $\delta_i = \delta(\alpha_i)$.

$1 - \delta_i$ is between zero and 1 ⇒ for every element of $X_i$, the marginal effect is less than the corresponding coefficient.
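To illustrate the attenuation factor $1 - \delta_i$, the following sketch compares the analytic marginal effect with a finite-difference derivative of the truncated mean. It assumes a single regressor without an intercept; the function names and numbers are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_mean(xb, sigma, a):
    """E[Y | Y > a] when Y ~ N(xb, sigma^2)."""
    alpha = (a - xb) / sigma
    lam = norm_pdf(alpha) / (1.0 - norm_cdf(alpha))
    return xb + sigma * lam

def marginal_effect(xb, beta, sigma, a):
    """Analytic dE[Y | Y > a]/dx = beta * (1 - delta_i) for a scalar beta."""
    alpha = (a - xb) / sigma
    lam = norm_pdf(alpha) / (1.0 - norm_cdf(alpha))
    delta = lam * (lam - alpha)
    return beta * (1.0 - delta)

# Assumed illustrative values.
beta, sigma, a, x = 0.8, 1.5, 0.0, 1.0
analytic = marginal_effect(x * beta, beta, sigma, a)

# Central finite difference of the truncated mean with respect to x.
h = 1e-6
numeric = (truncated_mean((x + h) * beta, sigma, a)
           - truncated_mean((x - h) * beta, sigma, a)) / (2.0 * h)
print(analytic, numeric)
```

The two numbers should agree closely, and both lie strictly between zero and `beta`.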


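$$Var[Y_i \mid Y_i > a] = \sigma^2(1 - \delta_i)$$

o Estimate:

$$Y_i \mid Y_i > a = E[Y_i \mid Y_i > a] + u_i = X_i\beta + \sigma\lambda_i + u_i, \qquad Var[u_i] = \sigma^2(1 - \delta_i)$$

If we use OLS on $(Y_i, X_i)$ ⇒ we omit $\lambda_i$ ⇒ all the biases that arise because of an omitted variable can be expected.

o If $E(X \mid Y)$ in the full population is a linear function of $Y$, then the OLS estimator satisfies $b = \tau\beta$ (in the probability limit) for some constant $\tau$.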

II. CENSORED DATA

• A very common problem in microeconomic data is censoring of the dependent variable.

• When the dependent variable is censored, values in a certain range are all transformed to (or reported as) a single value.

4. The censored normal distribution:

When data are censored, the distribution that applies to the sample data is a mixture of discrete and continuous distributions.

Define a new random variable $Y$ transformed from the original one, $Y^*$, by:

$$Y = \begin{cases} 0 & \text{if } Y^* \le 0 \\ Y^* & \text{if } Y^* > 0 \end{cases}$$

If $Y^* \sim N(\mu, \sigma^2)$:

$$\text{Prob}(Y = 0) = \text{Prob}(Y^* \le 0) = \Phi\left(\frac{-\mu}{\sigma}\right) = 1 - \Phi\left(\frac{\mu}{\sigma}\right)$$

If $Y^* > 0$ then $Y$ has the density of $Y^*$. This is the mixture of discrete and continuous parts.

Moments: If $Y^* \sim N(\mu, \sigma^2)$ and $Y = a$ if $Y^* \le a$ or else $Y = Y^*$, then:

$$E[Y] = \Phi\, a + (1 - \Phi)(\mu + \sigma\lambda)$$

$$Var[Y] = \sigma^2(1 - \Phi)\left[(1 - \delta) + (\alpha - \lambda)^2\,\Phi\right]$$

Where:

$$\Phi = \Phi\left(\frac{a - \mu}{\sigma}\right) = \Phi(\alpha) = \text{Prob}(Y^* \le a)$$

$$\lambda = \frac{\phi}{1 - \Phi}, \qquad \delta = \lambda^2 - \lambda\alpha$$

For $a = 0$ ⇒

$$E[Y \mid a = 0] = \Phi\left(\frac{\mu}{\sigma}\right)(\mu + \sigma\lambda), \qquad \lambda = \frac{\phi(\mu/\sigma)}{\Phi(\mu/\sigma)}$$
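The censored-normal moments can be checked on a case with a known answer. For $Y^* \sim N(0, 1)$ censored at $a = 0$, direct integration gives $E[Y] = \phi(0)$ and $Var[Y] = 1/2 - \phi(0)^2$. The sketch below is illustrative only; `censored_normal_moments` is a hypothetical helper name.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_normal_moments(mu, sigma, a):
    """Mean and variance of Y = max(Y*, a) with Y* ~ N(mu, sigma^2)."""
    alpha = (a - mu) / sigma
    big_phi = norm_cdf(alpha)                 # Prob(Y* <= a)
    lam = norm_pdf(alpha) / (1.0 - big_phi)
    delta = lam ** 2 - lam * alpha
    mean = big_phi * a + (1.0 - big_phi) * (mu + sigma * lam)
    var = sigma ** 2 * (1.0 - big_phi) * ((1.0 - delta) + (alpha - lam) ** 2 * big_phi)
    return mean, var

# Assumed check case: standard normal latent variable censored at zero.
mean0, var0 = censored_normal_moments(0.0, 1.0, 0.0)
print(mean0, var0)
```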

5. The censored Regression Model: (Tobit Model)

a. Model:

$$Y_i^* = X_i\beta + \varepsilon_i$$

$$Y_i = \begin{cases} 0 & \text{if } Y_i^* \le 0 \\ Y_i^* & \text{if } Y_i^* > 0 \end{cases}$$

We only know $Y_i$; the variable $Y_i^*$ is unobservable.

$$E[Y_i \mid X_i] = \Phi\left(\frac{X_i\beta}{\sigma}\right)(X_i\beta + \sigma\lambda_i)$$

Where:

$$\lambda_i = \frac{\phi[(0 - X_i\beta)/\sigma]}{1 - \Phi[(0 - X_i\beta)/\sigma]} = \frac{\phi(X_i\beta/\sigma)}{\Phi(X_i\beta/\sigma)}$$

Note: $\mu = E[Y_i^* \mid X_i] = X_i\beta$.

For the $Y^*$ variable,

$$\frac{\partial E[Y_i^* \mid X_i]}{\partial X_i} = \beta$$

but $Y^*$ is unobservable.

b. Marginal Effects:

$$Y_i^* = X_i\beta + \varepsilon_i$$

$$Y_i = \begin{cases} a & \text{if } Y_i^* \le a \\ Y_i^* & \text{if } a < Y_i^* < b \\ b & \text{if } Y_i^* \ge b \end{cases}$$

Let $f(\varepsilon)$ & $F(\varepsilon)$ denote the density and cdf of $\varepsilon$; assume $\varepsilon \sim iid\,(0, \sigma^2)$ and $f(\varepsilon \mid X) = f(\varepsilon)$. Then

$$\frac{\partial E(Y \mid X)}{\partial X} = \beta \cdot \text{Prob}[a < Y^* < b]$$

This result does not assume $\varepsilon$ is normally distributed. For the standard case with censoring at zero and normally distributed disturbances, $\varepsilon \sim N(0, \sigma^2)$:

$$\frac{\partial E(Y_i \mid X_i)}{\partial X_i} = \beta\,\Phi\left(\frac{X_i\beta}{\sigma}\right)$$

As a rule of thumb, the OLS estimates are usually close to the MLE estimates multiplied by the proportion of non-limit observations in the sample.

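o A useful decomposition of $\partial E(Y_i \mid X_i)/\partial X_i$:

$$\frac{\partial E(Y_i \mid X_i)}{\partial X_i} = \beta \cdot \left\{\Phi_i[1 - \lambda_i(\alpha_i + \lambda_i)] + \phi_i(\alpha_i + \lambda_i)\right\}$$

Where: $\alpha_i = \frac{X_i\beta}{\sigma}$, $\Phi_i = \Phi(\alpha_i)$, and $\lambda_i = \frac{\phi_i}{\Phi_i}$.

Taking the two parts separately:

$$\frac{\partial E(Y_i \mid X_i)}{\partial X_i} = \text{Prob}[Y_i > 0] \cdot \frac{\partial E[Y_i \mid X_i, Y_i > 0]}{\partial X_i} + E[Y_i \mid X_i, Y_i > 0] \cdot \frac{\partial\,\text{Prob}[Y_i > 0]}{\partial X_i}$$

Thus, a change in $X_i$ has two effects: it affects the conditional mean of $Y_i^*$ in the positive part of the distribution, and it affects the probability that the observation will fall in that part of the distribution.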
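The two parts of the decomposition can be computed separately and should add up to the overall scale factor $\Phi(X_i\beta/\sigma)$ that multiplies $\beta$. The sketch below assumes censoring at zero and normal disturbances; the function name and the input values are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_marginal_effect_parts(xb, sigma):
    """Scale factors (per unit of beta) of the two-part decomposition.

    mean_part: Prob[Y > 0] * dE[Y | Y > 0]/dX, divided by beta
    prob_part: E[Y | Y > 0] * dProb[Y > 0]/dX, divided by beta
    """
    alpha = xb / sigma
    big_phi = norm_cdf(alpha)
    small_phi = norm_pdf(alpha)
    lam = small_phi / big_phi
    mean_part = big_phi * (1.0 - lam * (alpha + lam))
    prob_part = small_phi * (alpha + lam)
    return mean_part, prob_part

# Assumed illustrative index X_i * beta = 0.5 and sigma = 1.2.
mean_part, prob_part = tobit_marginal_effect_parts(0.5, 1.2)
total_scale = mean_part + prob_part
print(mean_part, prob_part, total_scale)
```

The share `mean_part / total_scale` indicates how much of the total effect comes from observations that are already above the limit.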

6. Estimation and Inference with Censored Tobit:

Estimation of the Tobit model and of the truncated regression model is similar: both use MLE.

The likelihood for the censored regression model is

$$L = \prod_{y_i = 0}\left[1 - \Phi\left(\frac{X_i\beta}{\sigma}\right)\right] \prod_{y_i > 0} \frac{1}{\sigma}\,\phi\left(\frac{Y_i - X_i\beta}{\sigma}\right)$$

so the log-likelihood is

$$\ln L = \sum_{y_i > 0} -\frac{1}{2}\left[\ln(2\pi) + \ln\sigma^2 + \frac{(Y_i - X_i\beta)^2}{\sigma^2}\right] + \sum_{y_i = 0} \ln\left[1 - \Phi\left(\frac{X_i\beta}{\sigma}\right)\right]$$

The two parts correspond to the classical regression for the non-limit observations and the relevant probabilities for the limit observations. This likelihood is a mixture of discrete and continuous distributions ⇒ MLE produces an estimator with all the familiar desirable properties attained by MLEs.

o With $\gamma = \frac{\beta}{\sigma}$ and $\theta = \frac{1}{\sigma}$:

$$\ln L = \sum_{Y_i > 0} -\frac{1}{2}\left[\ln(2\pi) - \ln\theta^2 + (\theta Y_i - X_i\gamma)^2\right] + \sum_{Y_i = 0} \ln\left[1 - \Phi(X_i\gamma)\right]$$

⇒ The Hessian is always negative definite. The Newton-Raphson method is simple to use and usually converges quickly.
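The censored-regression log-likelihood can be written as a short function in either parameterization. The sketch below is a minimal illustration that only evaluates the log-likelihood (it does not maximize it); the data and parameter values are made up for the example, and the function names are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def dot(coefs, row):
    """Inner product X_i * coefs for one observation."""
    return sum(c * v for c, v in zip(coefs, row))

def tobit_loglik(beta, sigma, X, y):
    """Log-likelihood in the (beta, sigma) parameterization, censoring at zero."""
    total = 0.0
    for row, yi in zip(X, y):
        xb = dot(beta, row)
        if yi > 0:   # non-limit observation: normal density term
            total += -0.5 * (math.log(2.0 * math.pi) + math.log(sigma ** 2)
                             + ((yi - xb) / sigma) ** 2)
        else:        # limit observation: probability of censoring
            total += math.log(1.0 - norm_cdf(xb / sigma))
    return total

def tobit_loglik_reparam(gamma, theta, X, y):
    """Same log-likelihood with gamma = beta/sigma and theta = 1/sigma."""
    total = 0.0
    for row, yi in zip(X, y):
        xg = dot(gamma, row)
        if yi > 0:
            total += -0.5 * (math.log(2.0 * math.pi) - math.log(theta ** 2)
                             + (theta * yi - xg) ** 2)
        else:
            total += math.log(1.0 - norm_cdf(xg))
    return total

# Made-up data: intercept plus one regressor, two censored observations.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
y = [1.2, 0.0, 3.1, 0.0]
beta, sigma = [0.5, 1.0], 1.5

ll_original = tobit_loglik(beta, sigma, X, y)
ll_reparam = tobit_loglik_reparam([b / sigma for b in beta], 1.0 / sigma, X, y)
print(ll_original, ll_reparam)
```

Both functions return the same value at corresponding parameter points, which is why maximizing over $(\gamma, \theta)$ and transforming back is equivalent to maximizing over $(\beta, \sigma)$.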

After convergence, the original parameters can be recovered using $\sigma = \frac{1}{\theta}$ and $\beta = \frac{\gamma}{\theta}$.

o By contrast, for the truncated model:

$$L = \prod_{i=1}^{n} f(Y_i \mid Y_i > a) = \prod_{i=1}^{n} \frac{\frac{1}{\sigma}\,\phi\left(\frac{Y_i - X_i\beta}{\sigma}\right)}{1 - \Phi\left(\frac{a - X_i\beta}{\sigma}\right)}$$

$$\ln L = \sum_{i=1}^{n}\left\{-\frac{1}{2}\left[\ln(2\pi) + \ln(\sigma^2) + \frac{(Y_i - X_i\beta)^2}{\sigma^2}\right] - \ln\left[1 - \Phi\left(\frac{a - X_i\beta}{\sigma}\right)\right]\right\}$$

Asymptotic covariance matrix of $(\hat\beta, \hat\sigma)$:

$$VarCov(\hat\beta, \hat\sigma) = \left[\sum_{i=1}^{n} A(X_i)\right]^{-1}$$

Where:

$$A(X_i) = \begin{bmatrix} a_i X_i'X_i & b_i X_i' \\ b_i X_i & c_i \end{bmatrix}$$

$$a_i = -\sigma^{-2}\left\{X_i\gamma\,\phi_i - \frac{\phi_i^2}{1 - \Phi_i} - \Phi_i\right\}$$

$$b_i = \sigma^{-3}\left\{(X_i\gamma)^2\phi_i + \phi_i - \frac{(X_i\gamma)\,\phi_i^2}{1 - \Phi_i}\right\}/2$$

$$c_i = -\sigma^{-4}\left\{(X_i\gamma)^3\phi_i + (X_i\gamma)\,\phi_i - \frac{(X_i\gamma)^2\phi_i^2}{1 - \Phi_i} - 2\Phi_i\right\}/4$$

$\gamma = \frac{\beta}{\sigma}$, and $\phi_i$ and $\Phi_i$ are evaluated at $X_i\gamma$.

o Researchers often compute least squares estimates despite their inconsistency.


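o Empirical regularity: MLE estimates can be approximated by dividing OLS estimates by the proportion of non-limit observations in the sample:

$$\text{Prob}(Y_i^* > 0) = \text{Prob}(X_i\beta + \varepsilon_i > 0) = \text{Prob}(\varepsilon_i > -X_i\beta) = \text{Prob}\left(\frac{\varepsilon_i}{\sigma} > -\frac{X_i\beta}{\sigma}\right) = \Phi\left(\frac{X_i\beta}{\sigma}\right)$$

$$\hat\beta_{OLS} \approx \hat\beta_{MLE}\,\Phi\left(\frac{X_i\beta}{\sigma}\right)$$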

o Another strategy is to discard the limit observations, but that just trades the censoring problem for the truncation problem.

III. SOME ISSUES IN SPECIFICATION

Heteroskedasticity and Non-normality:

o Both heteroskedasticity & non-normality result in the Tobit estimator $\hat\beta$ being inconsistent for $\beta$.

o Note that in OLS we don't need normality: consistency requires only $E(\varepsilon \mid X) = 0$ (exogeneity), and asymptotic inference is based on the CLT ⇒ data censoring can be costly.

o The presence of heteroskedasticity or non-normality in the Tobit or truncated model entirely changes the functional forms for $E(Y \mid X)$ and $E(Y \mid X, Y > 0)$.

IV. SAMPLE SELECTION MODEL:

7. Incidental Truncation in a Bivariate Distribution:

o Suppose that $y$ & $Z$ have a bivariate distribution with correlation $\rho$.

o We are interested in the distribution of $y$ given that $Z$ exceeds a particular value.

⇒ If $y$ & $Z$ are positively correlated, then the truncation of $Z$ should push the distribution of $y$ to the right.

o The truncated joint density of $y$ and $Z$ is

$$f(y, Z \mid Z > a) = \frac{f(y, Z)}{\text{Prob}(Z > a)}$$

For the bivariate normal distribution:

Theorem: If $y$ and $Z$ have a bivariate normal distribution with means $\mu_y$ and $\mu_Z$, standard deviations $\sigma_y$ and $\sigma_Z$, and correlation $\rho$, then:

$$E(y \mid Z > a) = \mu_y + \rho\sigma_y\lambda(\alpha_Z)$$

$$Var(y \mid Z > a) = \sigma_y^2[1 - \rho^2\delta(\alpha_Z)]$$

Where:

$$\alpha_Z = \frac{a - \mu_Z}{\sigma_Z}, \qquad \lambda(\alpha_Z) = \frac{\phi(\alpha_Z)}{1 - \Phi(\alpha_Z)}, \qquad \delta(\alpha_Z) = \lambda(\alpha_Z)[\lambda(\alpha_Z) - \alpha_Z]$$

If the truncation is $Z < a$ ⇒ $\lambda(\alpha_Z) = \frac{-\phi(\alpha_Z)}{\Phi(\alpha_Z)}$.

For the standard bivariate normal: $(y, Z) \sim N(0, 0, 1, 1, \rho)$

$$E(y \mid Z = a) = \rho a$$

$$V(y \mid Z = a) = 1 - \rho^2$$

$$E(y \mid Z < a) = -\rho\,\frac{\phi(a)}{\Phi(a)}$$

$$E(y \mid Z > a) = \rho\,\frac{\phi(a)}{1 - \Phi(a)}$$

$$Var(y \mid Z > a) = 1 - \rho^2\delta(a)$$

General case: Let $y \sim N(\mu, \Sigma)$ and partition $y$, $\mu$ and $\Sigma$ into:

$$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \qquad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$$

Then the marginal distribution of $y_1$ is $N(\mu_1, \Sigma_{11})$, and that of $y_2$ is $N(\mu_2, \Sigma_{22})$.

Conditional distribution of $y_1 \mid y_2$ is:

$$y_1 \mid y_2 \sim N\left[\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(y_2 - \mu_2),\; \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right]$$
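The incidental truncation theorem for the bivariate normal case can be evaluated numerically. The sketch below is illustrative; the function name `incidental_truncation_moments` and the parameter values are assumptions made for the example.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def incidental_truncation_moments(mu_y, sigma_y, mu_z, sigma_z, rho, a):
    """E(y | Z > a) and Var(y | Z > a) for bivariate normal (y, Z)."""
    alpha_z = (a - mu_z) / sigma_z
    lam = norm_pdf(alpha_z) / (1.0 - norm_cdf(alpha_z))
    delta = lam * (lam - alpha_z)
    mean = mu_y + rho * sigma_y * lam
    var = sigma_y ** 2 * (1.0 - rho ** 2 * delta)
    return mean, var

# Assumed example: standard bivariate normal, rho = 0.5, selection on Z > 0.
mean_sel, var_sel = incidental_truncation_moments(0.0, 1.0, 0.0, 1.0, 0.5, 0.0)

# With rho = 0 the truncation of Z carries no information about y.
mean_ind, var_ind = incidental_truncation_moments(3.0, 2.0, 0.0, 1.0, 0.0, 0.0)
print(mean_sel, var_sel, mean_ind, var_ind)
```

With positive correlation the mean of $y$ shifts to the right and its variance shrinks; with $\rho = 0$ both moments are unchanged.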

8. The Sample Selection Model:

a) Wage equation:

$$Z_i^* = W_i\gamma + u_i$$

$Z_i^*$: difference between a person's market wage and her reservation wage, the wage rate necessary to make her choose to participate in the labour force.

$Z_i^* > 0$ ⇒ participate

$Z_i^* \le 0$ ⇒ do not participate

$W_i$: education, age, …

b) Hours equation:

$$Y_i = X_i\beta + \varepsilon_i$$

$Y_i$: number of hours supplied

$X_i$: wage, # children, marital status.

$Y_i$ is observed only when $Z_i^* > 0$. Suppose $\varepsilon_i$ & $u_i$ have a bivariate normal distribution with zero mean and correlation $\rho$.

$$E(Y_i \mid Y_i \text{ is observed}) = E(Y_i \mid Z_i^* > 0) = E(Y_i \mid u_i > -W_i\gamma)$$

$$= X_i\beta + E(\varepsilon_i \mid u_i > -W_i\gamma) = X_i\beta + \rho\sigma_\varepsilon\lambda_i(\alpha_u) = X_i\beta + \beta_\lambda\lambda_i(\alpha_u)$$

Where:

$$\alpha_u = \frac{-W_i\gamma}{\sigma_u}, \qquad \lambda(\alpha_u) = \frac{\phi(W_i\gamma/\sigma_u)}{\Phi(W_i\gamma/\sigma_u)}, \qquad \beta_\lambda = \rho\sigma_\varepsilon$$

⇒

$$Y_i \mid Z_i^* > 0 = E(Y_i \mid Z_i^* > 0) + v_i = X_i\beta + \beta_\lambda\lambda_i(\alpha_u) + v_i$$

OLS estimation produces inconsistent estimates of $\beta$ because of the omission of the relevant variable $\lambda_i(\alpha_u)$. Even if $\lambda_i$ were observed, OLS would be inefficient. The disturbance $v_i$ is heteroskedastic.

We reformulate the model as follows:

Binary choice model:

$$Z_i^* = W_i\gamma + u_i, \qquad Z_i = \begin{cases} 1 & \text{if } Z_i^* > 0 \\ 0 & \text{otherwise} \end{cases}$$

$$\text{Prob}(Z_i = 1 \mid W_i) = \Phi(W_i\gamma)$$

$$\text{Prob}(Z_i = 0 \mid W_i) = 1 - \Phi(W_i\gamma)$$

Regression model:

$$Y_i = X_i\beta + \varepsilon_i, \quad \text{observed only if } Z_i = 1$$

$$(u_i, \varepsilon_i) \sim \text{bivariate normal } [0, 0, 1, \sigma_\varepsilon, \rho]$$

Suppose that, as in many of these studies, $Z_i$ & $W_i$ are observed for a random sample of individuals but $Y_i$ is observed only when $Z_i = 1$:

$$E[Y_i \mid Z_i = 1, X_i, W_i] = X_i\beta + \rho\sigma_\varepsilon\lambda(W_i\gamma)$$

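c) Estimation

The parameters of the sample selection model can be estimated by maximum likelihood estimation. However, Heckman's (1979) two-step estimation procedure is usually used instead:

o 1. Estimate the probit equation by MLE to obtain estimates of $\gamma$. For each observation in the selected sample, compute

$$\hat\lambda_i = \frac{\phi(W_i\hat\gamma)}{\Phi(W_i\hat\gamma)} \qquad \text{and} \qquad \hat\delta_i = \hat\lambda_i(\hat\lambda_i + W_i\hat\gamma)$$

o 2. Estimate $\beta$ and $\beta_\lambda = \rho\sigma_\varepsilon$ by least-squares regression of $Y$ on $X$ & $\hat\lambda$.

o Asymptotic covariance matrix of $[\hat\beta, \hat\beta_\lambda]$:

$$(Y_i \mid Z_i = 1, X_i, W_i) = X_i\beta + \rho\sigma_\varepsilon\lambda_i + v_i$$

Heteroskedasticity:

$$Var[v_i \mid Z_i = 1, X_i, W_i] = \sigma_\varepsilon^2(1 - \rho^2\delta_i)$$

Let $X_i^* = [X_i, \lambda_i]$ and $\hat\beta^* = [\hat\beta, \hat\beta_\lambda]$:

$$VarCov(\hat\beta^*) = \sigma_\varepsilon^2\,[X^{*\prime}X^*]^{-1}[X^{*\prime}(I - \rho^2\Delta)X^*][X^{*\prime}X^*]^{-1}$$

Where $I - \rho^2\Delta$ is a diagonal matrix with $1 - \rho^2\delta_i$ on the diagonal.

The unknown quantities are estimated by:

$$\hat\sigma_\varepsilon^2 = \frac{1}{n}e'e + \hat{\bar\delta}\,\hat\beta_\lambda^2, \qquad \hat{\bar\delta} = \frac{1}{n}\sum_{i}\hat\delta_i, \qquad \hat\rho = \frac{\hat\beta_\lambda}{\hat\sigma_\varepsilon}$$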
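The quantities built in the first step of the two-step procedure are simple functions of the fitted probit index. The sketch below only illustrates that construction and the implied heteroskedastic variance; it assumes the probit estimates are already available, and the index values, `rho`, and `sigma_eps` are made-up numbers. It does not perform the probit MLE or the second-step regression.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def selection_terms(index):
    """Return (lambda_hat, delta_hat) for one fitted probit index W_i * gamma_hat."""
    lam = norm_pdf(index) / norm_cdf(index)    # inverse Mills ratio
    delta = lam * (lam + index)
    return lam, delta

def disturbance_variance(delta, rho, sigma_eps):
    """Var[v_i | selected] = sigma_eps^2 * (1 - rho^2 * delta_i)."""
    return sigma_eps ** 2 * (1.0 - rho ** 2 * delta)

# Hypothetical fitted indices for four selected observations.
indices = [-1.0, 0.0, 0.5, 2.0]
terms = [selection_terms(idx) for idx in indices]

# Assumed values of rho and sigma_eps, for illustration only.
variances = [disturbance_variance(d, 0.6, 2.0) for _, d in terms]
for idx, (lam, delta), var in zip(indices, terms, variances):
    print(idx, lam, delta, var)
```

Observations with a low selection index receive a large $\hat\lambda_i$, and the disturbance variance differs across observations, which is the source of the heteroskedasticity noted above.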

d) Model:

$$Y_1 = X\beta_1 + \varepsilon_1$$

$$Y_2^* = X\beta_2 + \varepsilon_2$$

$$Y_2 = \begin{cases} 1 & \text{if } Y_2^* > 0 \\ 0 & \text{if } Y_2^* \le 0 \end{cases}$$

Here $X = [X_i, W_i]$ and $Y_2^* = Z^*$ in the earlier notation, and $Y_1$ is observed only when $Y_2 = 1$.

Assume

$$\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix} \sim N(0, \Sigma)$$

With

$$\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & 1 \end{bmatrix}$$

so that $\sigma_1 = \sqrt{\sigma_{11}}$ and $\sigma_2 = 1$.

Heckman's two steps estimation:

In the subsample for which $Y_1 \ne 0$ we have

$$E(Y_{1i} \mid Y_{2i}^* > 0) = X_i\beta_1 + E(\varepsilon_{1i} \mid Y_{2i}^* > 0) = X_i\beta_1 + E(\varepsilon_{1i} \mid \varepsilon_{2i} > -X_i\beta_2) = X_i\beta_1 + \rho\sigma_1\lambda(\alpha_i)$$

Where

$$\alpha_i = -\frac{X_i\beta_2}{\sigma_2}$$

Therefore, in the subsample for which $Y_1 \ne 0$:

$$Y_{1i} \mid Y_{2i}^* > 0 = E(Y_{1i} \mid Y_{2i}^* > 0) + v_i = X_i\beta_1 + \rho\sigma_1\lambda_i(\alpha_i) + v_i = X_i\beta_1 + \beta_\lambda\lambda_i + v_i$$

This is a proper regression equation in the sense that:

$$E(v_i \mid x_i, \lambda_i, Y_{2i}^* > 0) = 0$$

Note that:

$$\rho = \frac{\sigma_{12}}{\sigma_1}$$

Regression of $Y_1$ on $X$ is subject to the omitted variable bias.

o Heckman's two steps estimation (Heckit) procedure:

1. Estimate the probit equation by MLE to get $\hat\beta_2$. Use this estimate to construct:

$$\hat\lambda_i = \frac{\phi(-X_i\hat\beta_2)}{1 - \Phi(-X_i\hat\beta_2)}$$

2. Regress $Y_{1i}$ on $X_i$ and $\hat\lambda_i$.

o Maximum likelihood:

There are two data regimes: $Y_2 = 0$ and $Y_2 = 1$.

Construct the Likelihood Function:

Regime       $Y_1$           What is known about $\varepsilon$
$Y_2 = 0$    not observed    $\varepsilon_2 < -X\beta_2$
$Y_2 = 1$    observed        $\varepsilon_1 = Y_1 - X\beta_1$, $\varepsilon_2 > -X\beta_2$

Regime 1: likelihood element:

$$\int_{-\infty}^{-X\beta_2} f(\varepsilon_2)\, d\varepsilon_2 = \Phi(-X\beta_2)$$

Regime 2:

$$\int_{-X\beta_2}^{+\infty} f(Y_1 - X\beta_1, \varepsilon_2)\, d\varepsilon_2$$

$$L(\beta_1, \beta_2, \Sigma) = \prod_{1} \int_{-\infty}^{-X\beta_2} f(\varepsilon_2)\, d\varepsilon_2 \;\prod_{2} \int_{-X\beta_2}^{+\infty} f(Y_1 - X\beta_1, \varepsilon_2)\, d\varepsilon_2$$

where the first product runs over the observations in regime 1 and the second over the observations in regime 2.

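V. THE DOUBLE SELECTION MODEL:

$$Y_1 = \begin{cases} X\beta_1 + \varepsilon_1 & \text{if } Y_2^* > 0 \text{ or (and) } Y_3^* > 0, \text{ or both} \\ 0 & \text{otherwise} \end{cases}$$

$$Y_2^* = X\beta_2 + \varepsilon_2$$

$$Y_3^* = X\beta_3 + \varepsilon_3$$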

VI. REGRESSION ANALYSIS OF TREATMENT EFFECTS:

$$E_i = X_i\beta + \delta C_i + \varepsilon_i$$

$E_i$ denotes earnings (written $Y_i$ in the conditional expectations below), and $C_i$ is a dummy variable indicating whether or not the individual attended college.

Does $\delta$ measure the value of a college education? (Assume the rest of the regression model is correctly specified.)

The answer is no if the typical individual who chooses to go to college would have relatively high earnings whether or not he or she went to college ⇒ The problem is one of self-selection (sample selection).

⇒ $\delta$ will overestimate the treatment effect.

⇒ The same problem arises in other settings in which the individuals themselves decide whether or not they will receive the treatment.

$$C_i^* = W_i\gamma + u_i, \qquad C_i = \begin{cases} 1 & \text{if } C_i^* > 0 \\ 0 & \text{otherwise} \end{cases}$$

$$E(Y_i \mid C_i = 1, X_i, W_i) = X_i\beta + \delta + E(\varepsilon_i \mid C_i = 1, X_i, W_i) = X_i\beta + \delta + \rho\sigma_\varepsilon\lambda(-W_i\gamma)$$

⇒ Estimate this model using the two-step estimator. For non-participants:

$$E(Y_i \mid C_i = 0, X_i, W_i) = X_i\beta + \rho\sigma_\varepsilon\left[\frac{-\phi(W_i\gamma)}{1 - \Phi(W_i\gamma)}\right]$$

The difference in expected earnings between participants and non-participants is then:

$$E(Y_i \mid C_i = 1, X_i, W_i) - E(Y_i \mid C_i = 0, X_i, W_i) = \delta + \rho\sigma_\varepsilon\left[\frac{\phi_i}{\Phi_i(1 - \Phi_i)}\right]$$

⇒ When $\rho > 0$, the least squares estimate of $\delta$ overestimates the treatment effect.
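The size of the selection term in the participant/non-participant difference can be illustrated numerically. The sketch below computes the two conditional means and their difference; all parameter values (`xb`, `delta`, `rho`, `sigma_eps`, and the index $W_i\gamma$) are made up for the example, and the function names are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mean_participant(xb, delta, index, rho, sigma_eps):
    """E(Y | C = 1): X*beta + delta + rho*sigma_eps*phi(index)/Phi(index)."""
    return xb + delta + rho * sigma_eps * norm_pdf(index) / norm_cdf(index)

def mean_nonparticipant(xb, index, rho, sigma_eps):
    """E(Y | C = 0): X*beta - rho*sigma_eps*phi(index)/(1 - Phi(index))."""
    return xb - rho * sigma_eps * norm_pdf(index) / (1.0 - norm_cdf(index))

def selection_gap(index, rho, sigma_eps):
    """rho*sigma_eps*phi/(Phi*(1 - Phi)), the part of the difference not due to delta."""
    big_phi = norm_cdf(index)
    return rho * sigma_eps * norm_pdf(index) / (big_phi * (1.0 - big_phi))

# Made-up values: index W_i*gamma = 0.3, true treatment effect delta = 1.0.
xb, delta, index, rho, sigma_eps = 5.0, 1.0, 0.3, 0.4, 2.0
difference = (mean_participant(xb, delta, index, rho, sigma_eps)
              - mean_nonparticipant(xb, index, rho, sigma_eps))
gap = selection_gap(index, rho, sigma_eps)
print(difference, delta + gap)
```

With a positive `rho` the observed difference exceeds `delta`, which is the overstatement described in the text.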