Lecture notes: Chapter 1 - Classical Linear Regression
These lecture notes for "Chapter 1: Classical linear regression" present the basics of the model, the assumptions of the classical regression model, and least squares estimation. You are invited to explore and consult the material.
Advanced Econometrics, Part I: Basic Econometric Models
Nam T. Hoang, University of New England (Australia) / University of Economics, HCMC (Vietnam)

Chapter 1: CLASSICAL LINEAR REGRESSION

I. MODEL:

Population model:

$$Y = f(X_1, X_2, \dots, X_k) + \varepsilon$$

where $Y$ is the dependent variable, the $X_j$ are explanatory variables (regressors), and $\varepsilon$ is the disturbance (error).

- $f$ may be of any kind (linear, non-linear, parametric, non-parametric, ...).
- We will focus on models that are parametric and linear in the parameters.

Sample information:

- We have a sample $\{Y_i, X_{i2}, \dots, X_{ik}\}_{i=1}^{n}$.
- Assume that these observed values are generated by the population model:

$$Y_i = \beta_1 + \beta_2 X_{i2} + \beta_3 X_{i3} + \dots + \beta_k X_{ik} + \varepsilon_i$$

- Objectives:
  i. Estimate the unknown parameters.
  ii. Test hypotheses about the parameters.
  iii. Predict values of $Y$ outside the sample.

- Note that $\beta_k = \partial Y_i / \partial X_{ik}$, so the parameters are the marginal effects of the $X$'s on $Y$, with other factors held constant.

Example: $C_i = \beta_1 + \beta_2 Y_i + \varepsilon_i$
$$\beta_2 = \frac{\partial C_i}{\partial Y_i} = \text{M.P.C.} \;\Rightarrow\; \text{require } 0 \le \beta_2 \le 1$$

Notation: stack the observations as

$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & X_{12} & X_{13} & \cdots & X_{1k} \\ 1 & X_{22} & X_{23} & \cdots & X_{2k} \\ \vdots & & & & \vdots \\ 1 & X_{n2} & X_{n3} & \cdots & X_{nk} \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

So we have:

$$\underset{(n \times 1)}{Y} = \underset{(n \times k)}{X}\,\underset{(k \times 1)}{\beta} + \underset{(n \times 1)}{\varepsilon}$$

II. ASSUMPTIONS OF THE CLASSICAL REGRESSION MODEL:

Models are simplifications of reality. We will make a set of simplifying assumptions for the model. The assumptions relate to:

- Functional form.
- Regressors.
- Disturbances.

Assumption 1: Linearity. The model is linear in the parameters: $Y = X\beta + \varepsilon$.

Assumption 2: Full rank. The $X_{ij}$ are not random variables, or they are random variables that are uncorrelated with $\varepsilon$. There are no EXACT linear dependencies among the columns of $X$. This assumption is necessary for estimation of the parameters. $\text{Rank}(X) = k$ implies $n \ge k$, since $\text{Rank}(A) \le \min(\text{rows}, \text{columns})$.
Assumption 3: Exogeneity of the independent variables.

$$E[\varepsilon_i \mid X_{j1}, X_{j2}, \dots, X_{jk}] = 0 \quad \text{for } i = j \text{ and also } i \ne j$$

This means that the independent variables carry no useful information for prediction of $\varepsilon_i$:

$$E[\varepsilon_i \mid X] = 0 \;\; \forall i = \overline{1,n} \;\Rightarrow\; E(\varepsilon_i) = 0$$

Assumption 4: Homoskedasticity and no autocorrelation.

$$\text{Var}(\varepsilon_i) = \sigma^2, \;\; i = \overline{1,n}; \qquad \text{Cov}(\varepsilon_i, \varepsilon_j) = 0 \;\; \forall i \ne j$$

For any random vector $Z = (z_1, z_2, \dots, z_m)'$, we can express its variance-covariance matrix as

$$\text{VarCov}(Z) = E[(Z - E(Z))(Z - E(Z))'] = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1m} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2m} \\ \vdots & & \ddots & \vdots \\ \sigma_{m1} & \sigma_{m2} & \cdots & \sigma_m^2 \end{bmatrix}$$

where the $j$th diagonal element is $\text{Var}(z_j) = \sigma_{jj} = \sigma_j^2$ and the $(i,j)$th element ($i \ne j$) is $\text{Cov}(z_i, z_j) = \sigma_{ij}$.

So we have a "covariance matrix" for the vector $\varepsilon$; since $E(\varepsilon) = 0$,

$$\text{VarCov}(\varepsilon) = E[(\varepsilon - E(\varepsilon))(\varepsilon - E(\varepsilon))'] = E(\varepsilon\varepsilon')$$
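A short simulation makes this definition concrete. The sketch below (numpy assumed; the covariance values are made up) computes the sample analogue of $E[(Z - E(Z))(Z - E(Z))']$ and recovers the variances on the diagonal and the covariances off it:

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw R replications of a 3 x 1 random vector Z with a known (made-up) covariance
true_cov = np.array([[2.0, 0.5, 0.0],
                     [0.5, 1.0, 0.3],
                     [0.0, 0.3, 1.5]])
R = 200_000
Z = rng.multivariate_normal(mean=[0.0, 0.0, 0.0], cov=true_cov, size=R)  # R x 3

# Sample analogue of E[(Z - EZ)(Z - EZ)']: average of outer products of deviations
dev = Z - Z.mean(axis=0)
varcov = dev.T @ dev / R

print(np.round(varcov, 2))  # close to true_cov: variances on the diagonal, covariances off it
```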
Assumption 4 is therefore equivalent to

$$E(\varepsilon\varepsilon') = \sigma^2 I = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix}
\;\Leftrightarrow\;
\begin{cases} \text{Var}(\varepsilon_i) = \sigma^2, \; i = \overline{1,n} & \text{(homoskedasticity)} \\ \text{Cov}(\varepsilon_i, \varepsilon_j) = 0, \; \forall i \ne j & \text{(no autocorrelation)} \end{cases}$$

Assumption 5: Data generating process for the regressors (non-stochastic $X$).

- The $X_{ij}$ are not random variables.

Note: this assumption is different from Assumption 3. $E[\varepsilon_i \mid X] = 0$ speaks only about the mean (which has to be 0).

Assumption 6: Normality of errors.

$$\varepsilon \sim N[0, \sigma^2 I]$$

- Normality is not necessary to obtain many results in the regression model.
- It will be possible to relax this assumption and retain most of the statistical results.

SUMMARY: The classical linear regression model is

$$Y = X\beta + \varepsilon, \qquad \varepsilon \sim N[0, \sigma^2 I], \qquad \text{Rank}(X) = k, \qquad X \text{ non-stochastic.}$$
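Before turning to estimation, a minimal simulation sketch can make these assumptions concrete. Everything here is illustrative: numpy is assumed, and the coefficient values, sample size, and error variance are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 100, 3                        # n observations, k columns (intercept + 2 regressors)
beta = np.array([1.0, 0.5, -2.0])    # made-up "true" parameter vector
sigma = 1.0                          # disturbance standard deviation

# Assumption 2: X has full column rank (a constant plus two non-collinear regressors)
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])

# Assumptions 3, 4, 6: eps ~ N(0, sigma^2 I), generated independently of X
eps = rng.normal(0.0, sigma, size=n)

# Assumption 1: the model is linear in the parameters
Y = X @ beta + eps

print(np.linalg.matrix_rank(X) == k)   # True: Rank(X) = k
```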
III. LEAST SQUARES ESTIMATION: (Ordinary Least Squares Estimation - OLS)

Our first task is to estimate the parameters of the model $Y = X\beta + \varepsilon$ with $\varepsilon \sim N[0, \sigma^2 I]$.

There are many possible procedures for doing this. The choice should be based on the "sampling properties" of the estimates. Let's consider one possible estimation strategy: least squares.

Denote by $\hat{\beta}$ the estimator of $\beta$:

- True relation: $Y = X\beta + \varepsilon$.
- Estimated relation: $Y = X\hat{\beta} + e$, where $e = (e_1, e_2, \dots, e_n)'$ is the vector of residuals ($e_i$ is the estimate of $\varepsilon_i$).

For the $i$th observation:

$$Y_i = \underbrace{X_i'\beta + \varepsilon_i}_{\text{unobserved (population)}} = \underbrace{X_i'\hat{\beta} + e_i}_{\text{observed (sample)}}$$

Sum of squared residuals:

$$e'e = \sum_{i=1}^{n} e_i^2 = (Y - X\hat{\beta})'(Y - X\hat{\beta}) = Y'Y - \hat{\beta}'X'Y - Y'X\hat{\beta} + \hat{\beta}'X'X\hat{\beta} = Y'Y - 2\hat{\beta}'X'Y + \hat{\beta}'X'X\hat{\beta}$$

(using $\hat{\beta}'X'Y = Y'X\hat{\beta}$, both scalars).

We need $\hat{\beta}$ to satisfy $\min_{\hat{\beta}}\,(Y'Y - 2\hat{\beta}'X'Y + \hat{\beta}'X'X\hat{\beta})$. The necessary condition for a minimum:

$$\frac{\partial [e'e]}{\partial \hat{\beta}} = 0 \;\Leftrightarrow\; \frac{\partial [Y'Y - 2\hat{\beta}'X'Y + \hat{\beta}'X'X\hat{\beta}]}{\partial \hat{\beta}} = 0$$

Writing out $\hat{\beta}'X'Y$:

$$\hat{\beta}'X'Y = [\hat{\beta}_1 \; \hat{\beta}_2 \; \cdots \; \hat{\beta}_k]
\begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_{12} & X_{22} & \cdots & X_{n2} \\ \vdots & & & \vdots \\ X_{1k} & X_{2k} & \cdots & X_{nk} \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$$
$$= [\hat{\beta}_1 \; \hat{\beta}_2 \; \cdots \; \hat{\beta}_k]
\begin{bmatrix} \sum Y_i \\ \sum X_{i2} Y_i \\ \vdots \\ \sum X_{ik} Y_i \end{bmatrix}$$

Take the derivative with respect to each $\hat{\beta}_j$:

$$\frac{\partial [\hat{\beta}'X'Y]}{\partial \hat{\beta}} = \begin{bmatrix} \sum Y_i \\ \sum X_{i2} Y_i \\ \vdots \\ \sum X_{ik} Y_i \end{bmatrix} = X'Y$$

Next, $X'X$ is the symmetric matrix of sums of squares and cross products:

$$X'X = \begin{bmatrix} n & \sum X_{i2} & \sum X_{i3} & \cdots & \sum X_{ik} \\ \sum X_{i2} & \sum X_{i2}^2 & \sum X_{i2}X_{i3} & \cdots & \sum X_{i2}X_{ik} \\ \vdots & & & \ddots & \vdots \\ \sum X_{ik} & \sum X_{ik}X_{i2} & \sum X_{ik}X_{i3} & \cdots & \sum X_{ik}^2 \end{bmatrix}$$

$\hat{\beta}'X'X\hat{\beta}$ is a quadratic form:

$$\hat{\beta}'X'X\hat{\beta} = \sum_{i=1}^{k} \sum_{j=1}^{k} (X'X)_{ij}\,\hat{\beta}_i \hat{\beta}_j$$

Take the derivatives with respect to each $\hat{\beta}_i$:

- For $j = i$: $\dfrac{\partial [(X'X)_{ii}\hat{\beta}_i^2]}{\partial \hat{\beta}_i} = 2(X'X)_{ii}\hat{\beta}_i$.
- For $j \ne i$: the pair $(X'X)_{ij}\hat{\beta}_i\hat{\beta}_j + (X'X)_{ji}\hat{\beta}_j\hat{\beta}_i$ contributes $2(X'X)_{ij}\hat{\beta}_j$ (by symmetry of $X'X$).
Then

$$\frac{\partial [\hat{\beta}'(X'X)\hat{\beta}]}{\partial \hat{\beta}} = 2(X'X)\hat{\beta}$$

So

$$\frac{\partial [e'e]}{\partial \hat{\beta}} = 0 \;\Leftrightarrow\; -2X'Y + 2(X'X)\hat{\beta} = 0 \quad \text{(called the "normal equations")}$$

$$\Rightarrow\; (X'X)\hat{\beta} = X'Y \;\Rightarrow\; \hat{\beta} = (X'X)^{-1}X'Y$$

Note: for the existence of $(X'X)^{-1}$ we need the assumption that $\text{Rank}(X) = k$.

IV. ALGEBRAIC PROPERTIES OF LEAST SQUARES:

1. "Orthogonality condition":

$$-2X'Y + 2(X'X)\hat{\beta} = 0 \;\Leftrightarrow\; X'\underbrace{(Y - X\hat{\beta})}_{e} = 0 \;\Leftrightarrow\; X'e = 0$$

$$\Leftrightarrow\;
\begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_{12} & X_{22} & \cdots & X_{n2} \\ \vdots & & & \vdots \\ X_{1k} & X_{2k} & \cdots & X_{nk} \end{bmatrix}
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\;\Leftrightarrow\;
\begin{cases} \sum_{i=1}^{n} e_i = 0 \\ \sum_{i=1}^{n} X_{ij} e_i = 0, \quad j = \overline{2,k} \end{cases}$$
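The estimator and the orthogonality condition are easy to verify numerically. A sketch on simulated data (numpy assumed; in practice one solves the normal equations with `np.linalg.solve` rather than forming $(X'X)^{-1}$ explicitly):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
beta = np.array([1.0, 0.5, -2.0])                 # made-up true parameters
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ beta + rng.normal(size=n)

# Normal equations (X'X) beta_hat = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ beta_hat                              # residual vector

print(beta_hat)                                   # close to the true beta
print(np.allclose(X.T @ e, 0.0))                  # orthogonality condition X'e = 0
print(np.isclose(e.sum(), 0.0))                   # first row of X'e = 0: residuals sum to zero
```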
2. Deviation-from-mean form (the fitted regression passes through $(\bar{X}, \bar{Y})$):

$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_{i2} + \hat{\beta}_3 X_{i3} + \dots + \hat{\beta}_k X_{ik} + e_i, \quad i = \overline{1,n}$$

Sum over all $n$ observations and divide by $n$ (using $\sum_{i=1}^{n} e_i = 0$):

$$\bar{Y} = \hat{\beta}_1 + \hat{\beta}_2 \bar{X}_2 + \hat{\beta}_3 \bar{X}_3 + \dots + \hat{\beta}_k \bar{X}_k$$

Subtracting, the intercept drops out:

$$Y_i - \bar{Y} = \hat{\beta}_2 (X_{i2} - \bar{X}_2) + \hat{\beta}_3 (X_{i3} - \bar{X}_3) + \dots + \hat{\beta}_k (X_{ik} - \bar{X}_k) + e_i, \quad i = \overline{1,n}$$

In the model in deviation form, the intercept is put aside and can be found later.

3. The mean of the fitted values $\hat{Y}_i$ is equal to the mean of the actual $Y_i$ values in the sample:

$$Y_i = X_i'\hat{\beta} + e_i = \hat{Y}_i + e_i \;\Rightarrow\; \sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} \hat{Y}_i + \underbrace{\sum_{i=1}^{n} e_i}_{0} \;\Rightarrow\; \bar{Y} = \bar{\hat{Y}}$$

Note that these results use the fact that the regression model includes an intercept term.

V. PARTITIONED REGRESSION: FRISCH-WAUGH THEOREM:

1. Note: the fundamental idempotent matrix $M$:

$$e = Y - X\hat{\beta} = Y - X(X'X)^{-1}X'Y = \underbrace{[I - X(X'X)^{-1}X']}_{M \;(n \times n)}\,Y$$

Substituting $Y = X\beta + \varepsilon$:

$$e = (X\beta + \varepsilon) - X(X'X)^{-1}X'(X\beta + \varepsilon) = (X\beta + \varepsilon) - X\beta - X(X'X)^{-1}X'\varepsilon = [I - X(X'X)^{-1}X']\varepsilon$$

So the residuals vector $e$ has two alternative representations:
$$e = MY, \qquad e = M\varepsilon$$

$M$ is the "residual maker" in the regression of $Y$ on $X$. $M$ is symmetric and idempotent, that is, $M = M'$ and $MM = M$:

$$M' = [I - X(X'X)^{-1}X']' = I - [X(X'X)^{-1}X']' = I - X(X'X)^{-1}X' = M$$

(using $(AB)' = B'A'$ and the symmetry of $(X'X)^{-1}$), and

$$MM = [I - X(X'X)^{-1}X'][I - X(X'X)^{-1}X'] = I - X(X'X)^{-1}X' - X(X'X)^{-1}X' + X\underbrace{(X'X)^{-1}X'X}_{I}(X'X)^{-1}X' = I - X(X'X)^{-1}X' = M$$

Also we have:

$$MX = [I - X(X'X)^{-1}X']X = X - X(X'X)^{-1}(X'X) = X - X = 0$$
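These properties of $M$ can be checked directly on a small simulated design (a sketch, numpy assumed; the equalities hold up to floating-point error, hence the `allclose` checks):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = rng.normal(size=n)

# Residual maker M = I - X (X'X)^{-1} X'
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

print(np.allclose(M, M.T))       # symmetric: M = M'
print(np.allclose(M @ M, M))     # idempotent: MM = M
print(np.allclose(M @ X, 0.0))   # MX = 0

# e = MY reproduces the OLS residuals Y - X beta_hat
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(M @ Y, Y - X @ beta_hat))
```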
2. Partitioned regression:

Suppose that our matrix of regressors is partitioned into two blocks:

$$\underset{n \times k}{X} = [\underset{n \times k_1}{X_1} \;\; \underset{n \times k_2}{X_2}], \qquad k_1 + k_2 = k$$

$$Y = X_1\beta_1 + X_2\beta_2 + \varepsilon, \qquad Y = [X_1 \; X_2]\begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix} + e$$

The normal equations $(X'X)\hat{\beta} = X'Y$ become

$$\begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}\begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix} = \begin{bmatrix} X_1'Y \\ X_2'Y \end{bmatrix}
\;\Leftrightarrow\;
\begin{cases} (X_1'X_1)\hat{\beta}_1 + (X_1'X_2)\hat{\beta}_2 = X_1'Y & (a) \\ (X_2'X_1)\hat{\beta}_1 + (X_2'X_2)\hat{\beta}_2 = X_2'Y & (b) \end{cases}$$

From (a):

$$\hat{\beta}_1 = (X_1'X_1)^{-1}X_1'(Y - X_2\hat{\beta}_2) \quad (c)$$

Put (c) into (b):

$$(X_2'X_1)(X_1'X_1)^{-1}X_1'(Y - X_2\hat{\beta}_2) + (X_2'X_2)\hat{\beta}_2 = X_2'Y$$

$$\Leftrightarrow\; X_2'\underbrace{[I - X_1(X_1'X_1)^{-1}X_1']}_{M_1 \;(n \times n)}X_2\,\hat{\beta}_2 = X_2'[I - X_1(X_1'X_1)^{-1}X_1']Y$$

We have $(X_2'M_1X_2)\hat{\beta}_2 = X_2'M_1Y$, so

$$\hat{\beta}_2 = (X_2'M_1X_2)^{-1}X_2'M_1Y$$

Because $M_1 = M_1'$ and $M_1M_1 = M_1$, this can be written as $(X_2'M_1'M_1X_2)\hat{\beta}_2 = X_2'M_1'M_1Y$, i.e.

$$\hat{\beta}_2 = (X_2^{*\prime}X_2^{*})^{-1}X_2^{*\prime}Y^{*}, \qquad \text{where } X_2^{*} = M_1X_2, \;\; Y^{*} = M_1Y$$

Interpretation:

- $Y^{*} = M_1Y$ = residuals from the regression of $Y$ on $X_1$.
- $X_2^{*} = M_1X_2$ = matrix of residuals from the regressions of the $X_2$ variables (each column of $X_2$) on $X_1$.

Suppose we regress $Y$ on $X_1$ and keep the residuals, and also regress each column of $X_2$ on $X_1$ and keep the matrix of residuals:

- Regressing $Y$ on $X_1$, the residuals are $e_1 = Y - \hat{Y} = Y - X_1[(X_1'X_1)^{-1}X_1'Y] = M_1Y = Y^{*}$.
- Regressing each column of $X_2$ on $X_1$ ($X_2 = X_1\beta + \varepsilon$), the residuals are $E = X_2 - \hat{X}_2 = X_2 - X_1\hat{\beta} = [I - X_1(X_1'X_1)^{-1}X_1']X_2 = M_1X_2 = X_2^{*}$.

If we now take the residuals $e_1$ and fit the regression of $e_1$ on $E$:

$$e_1 = E\tilde{\beta} + u$$

then $\tilde{\beta}_2 = \hat{\beta}_2$: we get the same result as if we just ran the whole regression. This result is called the "Frisch-Waugh" theorem.

Example:
- $Y$ = wages
- $X_2$ = education (years of schooling)
- $X_1$ = ability (test scores)

$$Y = X_1\beta_1 + X_2\beta_2 + \varepsilon$$

$\beta_2$ = effect of one extra year of schooling on wages, controlling for ability.
- $Y^{*}$ = residuals from the regression of $Y$ on $X_1$ (= variation in wages when controlling for ability).
- $X^{*}$ = residuals from the regression of $X_2$ on $X_1$.

Then regress $Y^{*}$ on $X^{*}$ to get $\hat{\beta}_2$:

$$Y^{*} = X_2^{*}\beta_2 + u$$

Example: de-trending and de-seasonalizing data. With a time trend $t = (1, 2, \dots, n)'$:

$$Y = t\beta_1 + X_2\beta_2 + \varepsilon$$

either include $t$ in the model, or "de-trend" the $X_2$ and $Y$ variables by regressing them on $t$ and taking residuals. Note: by this theorem, including a trend in the regression is an effective way of de-trending the data.
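The theorem is easy to confirm numerically. The sketch below (numpy assumed; data and block sizes are made up, loosely following the wages/ability/schooling example) compares $\hat{\beta}_2$ from the full regression with the coefficient from regressing $Y^{*}$ on $X_2^{*}$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])   # n x k1 block (e.g. intercept + ability)
X2 = rng.normal(size=(n, 1))                             # n x k2 block (e.g. schooling)
Y = X1 @ np.array([1.0, 0.3]) + X2 @ np.array([0.7]) + rng.normal(size=n)

def ols(A, b):
    return np.linalg.solve(A.T @ A, A.T @ b)

# Full regression of Y on [X1 X2]: the last coefficient is beta2_hat
beta_full = ols(np.hstack([X1, X2]), Y)

# Partialling out: M1 = I - X1 (X1'X1)^{-1} X1'
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
Y_star, X2_star = M1 @ Y, M1 @ X2         # residuals from the regressions on X1

beta2_fwl = ols(X2_star, Y_star)
print(beta_full[-1], beta2_fwl[0])        # identical up to rounding error
```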
VI. GOODNESS OF FIT:

One way of measuring the "quality of the fitted regression line" is to measure the extent to which the sample variability of the $Y$ variable is explained by the model.

- The sample variability of $Y$ is $\frac{1}{n}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$, or we could just use $\sum_{i=1}^{n}(Y_i - \bar{Y})^2$.
- Our fitted regression: $Y = X\hat{\beta} + e = \hat{Y} + e$, with $\hat{Y} = X\hat{\beta} = X(X'X)^{-1}X'Y$.

Note that if the model includes an intercept, then $\bar{Y} = \bar{\hat{Y}}$.

Now consider the following matrix:

$$\underset{n \times n}{M^0} = I - \frac{1}{n}\mathbf{1}\mathbf{1}', \qquad \text{where } \underset{n \times 1}{\mathbf{1}} = (1, 1, \dots, 1)'$$

Note that

$$M^0 Y = Y - \frac{1}{n}\mathbf{1}\mathbf{1}'Y = \begin{bmatrix} Y_1 - \bar{Y} \\ Y_2 - \bar{Y} \\ \vdots \\ Y_n - \bar{Y} \end{bmatrix}$$

so $M^0$ transforms a vector into deviations from its mean. We have:

- $M^0$ is symmetric and idempotent.
- $M^0\mathbf{1} = 0$.
- $(M^0Y)'(M^0Y) = Y'M^{0\prime}M^0Y = Y'M^0Y = \sum_{i=1}^{n}(Y_i - \bar{Y})^2$.
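A quick numerical check of these properties of $M^0$ (a sketch, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
Y = rng.normal(size=n)

ones = np.ones((n, 1))
M0 = np.eye(n) - ones @ ones.T / n   # M0 = I - (1/n) 11'

print(np.allclose(M0 @ Y, Y - Y.mean()))                   # M0 Y = deviations from the mean
print(np.allclose(M0 @ M0, M0))                            # idempotent
print(np.allclose(M0 @ ones, 0.0))                         # M0 annihilates the constant vector
print(np.isclose(Y @ M0 @ Y, ((Y - Y.mean())**2).sum()))   # Y'M0Y = sum of squared deviations
```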
So, from $Y = X\hat{\beta} + e = \hat{Y} + e$, we get $M^0Y = M^0X\hat{\beta} + M^0e$. Recall that $X'e = e'X = 0$, and $\sum e_i = 0$ implies $M^0e = e$, hence $e'M^0X = e'X = 0$. Therefore

$$Y'M^0Y = (X\hat{\beta} + e)'(M^0X\hat{\beta} + M^0e) = \hat{\beta}'X'M^0X\hat{\beta} + \underbrace{\hat{\beta}'X'M^0e}_{0} + \underbrace{e'M^0X\hat{\beta}}_{0} + e'M^0e = \hat{\beta}'X'M^0X\hat{\beta} + e'e$$

So:

$$\underbrace{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}_{SST} = \underbrace{\sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2}_{SSR} + \underbrace{\sum_{i=1}^{n} e_i^2}_{SSE}$$

(since $\bar{\hat{Y}} = \bar{Y}$ and $\hat{Y} = X\hat{\beta}$, so $\hat{\beta}'X'M^0X\hat{\beta} = \hat{Y}'M^0\hat{Y} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$).

- SST: total sum of squares.
- SSR: regression sum of squares.
- SSE: error sum of squares.

Coefficient of determination:

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

(only if an intercept is included in the model). Note that $R^2 = SSR/SST \ge 0$ and $R^2 = 1 - SSE/SST \le 1$, so $0 \le R^2 \le 1$.
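The decomposition and the two equivalent formulas for $R^2$ can be verified on simulated data (a sketch, numpy assumed; the model includes an intercept, as the decomposition requires):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat
e = Y - Y_hat

SST = ((Y - Y.mean())**2).sum()
SSR = ((Y_hat - Y.mean())**2).sum()   # uses Y_hat_bar = Y_bar (intercept included)
SSE = (e**2).sum()

print(np.isclose(SST, SSR + SSE))     # SST = SSR + SSE
print(SSR / SST, 1 - SSE / SST)       # the two R^2 formulas agree, and 0 <= R^2 <= 1
```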
What happens if we add any regressor(s) to the model?

$$Y = X_1\beta_1 + \varepsilon \quad (1)$$
$$Y = X_1\beta_1 + X_2\beta_2 + u = X\beta + u \quad (2)$$

(A) Applying OLS to (2): $\min_{(\hat{\beta}_1, \hat{\beta}_2)} \hat{u}'\hat{u}$.
(B) Applying OLS to (1): $\min_{\hat{\beta}_1} e'e$.

Problem (B) is just problem (A) subject to the restriction $\beta_2 = 0$. The minimized value in (A) must therefore be $\le$ that in (B), so $\hat{u}'\hat{u} \le e'e$.

→ Adding any regressor(s) to the model cannot increase (and will typically decrease) the sum of squared residuals, so $R^2$ must increase (or at worst stay the same). For this reason, $R^2$ is not really a very interesting measure of the quality of a regression, and we often use the "adjusted" $R^2$, adjusted for degrees of freedom:

$$\bar{R}^2 = 1 - \frac{e'e/(n-k)}{Y'M^0Y/(n-1)} \qquad \left(\text{compare } R^2 = 1 - \frac{e'e}{Y'M^0Y}\right)$$

Note: $e'e = Y'MY$ with $\text{rank}(M) = n - k$, while $Y'M^0Y = \sum_{i=1}^{n}(Y_i - \bar{Y})^2$ has $n - 1$ degrees of freedom.

$\bar{R}^2$ may rise or fall when variables are added; it may even be negative (see the sketch below).

Note: if the model does not include an intercept, the equation $SST = SSR + SSE$ does not hold, and we no longer have $0 \le R^2 \le 1$.

We must also be careful in comparing $R^2$ across different models. For example:

$$(1) \;\; \hat{C}_i = 0.5 + 0.8\,Y_i, \quad R^2 = 0.85$$
$$(2) \;\; \log C_i = 0.2 + 0.7\log Y_i + u, \quad R^2 = 0.7$$

In (1), $R^2$ relates to the sample variation of the variable $C$; in (2), $R^2$ relates to the sample variation of the variable $\log(C)$.

Reading: Greene, chapters 3 & 4.
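Finally, a numerical illustration of the two points above (a sketch, numpy assumed): adding a pure-noise regressor never lowers $R^2$, while adjusted $R^2$ typically falls.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
Y = X1 @ np.array([1.0, 0.5]) + rng.normal(size=n)

def r2_and_adj(X, Y):
    k = X.shape[1]
    e = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)
    sst = ((Y - Y.mean())**2).sum()
    r2 = 1 - (e @ e) / sst
    r2_adj = 1 - ((e @ e) / (n - k)) / (sst / (n - 1))   # adjusted for degrees of freedom
    return r2, r2_adj

X2 = np.column_stack([X1, rng.normal(size=n)])   # add an irrelevant (noise) regressor
print(r2_and_adj(X1, Y))
print(r2_and_adj(X2, Y))   # R^2 rises (weakly); adjusted R^2 typically falls
```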