Economics 6302, Econometrics II

Dr. Philip Rothman

Assignment #4

On Thursday, 4/24/97, you are required to hand in a short "paper" reporting your results in the exercises detailed below. Your paper should be typed/word processed. You may attach any relevant EVIEWS output to your paper. It is imperative, however, that you organize your attached output so that it can easily be matched with the relevant text. You will also be expected to discuss your results in class.

Part I: Linear Probability, Probit, and Logit Models

The cross-sectional data you will analyze are contained in an EXCEL file called "pr10.xls". First download this file, then set up an EVIEWS workfile with the following specifications: undated frequency with "Start observation" equal to 1 and "End observation" equal to 95. To "import" the data into this workfile you need to know that: (1) the data are ordered "By observation"; (2) the "Upper-left data cell" is A2; and (3) the "Number of series" is 9 (the names of the series are in the EXCEL file).

The data set contains observations on the characteristics of 95 individuals and whether they voted for or against an increase in school taxes in a local school millage referendum in Troy, Michigan, in 1973. The data are:

YESVM: = 1 if the individual voted "yes," = 0 otherwise
PUB12: = 1 if 1 or 2 children attend public school, = 0 otherwise
PUB34: = 1 if 3 or 4 children attend public school, = 0 otherwise
PUB5: = 1 if 5 or more children attend public school, = 0 otherwise
PRIV: = 1 if 1 or more children attend a private school, = 0 otherwise
SCHL: = 1 if the individual is a school teacher, = 0 otherwise
INCCON: log of annual household income
PTCON: log of property taxes (a proxy for the price of education)
TYEAR: number of years of residency in Troy

Using "YESVM" as the (limited) dependent variable and all the other data as your explanatory variables, your primary task is to estimate a: (1) linear probability model; (2) probit model; and (3) logit model. Compare your estimation results across the different models. The "fitted values" from these estimated models are the estimated probabilities that the ith individual will vote yes. Are any of these estimated probabilities greater than 1 or less than 0 for the linear probability model? Calculate and compare these estimated probabilities, for all three models, for the first ten individuals in the sample. Finally, suppose we used the following rule of thumb to predict voting behavior:

if P_i > 0.5, then the ith person would vote "yes,"
if P_i < 0.5, then the ith person would vote "no."
Then, predict the voting decisions of the first 10 individuals with each of the three estimated models.
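
As a rough illustration of how each model turns a fitted index into an estimated probability and then into a predicted vote, here is a minimal Python sketch. The function names and all index values are hypothetical and made up for illustration; they are not estimates from pr10.xls. Each model has its own coefficient vector, so each gets its own index values.

```python
import math

def lpm_prob(index):
    """Linear probability model: the fitted value is the index itself,
    so nothing keeps it inside [0, 1]."""
    return index

def probit_prob(index):
    """Probit: standard normal CDF evaluated at the index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def logit_prob(index):
    """Logit: logistic CDF evaluated at the index."""
    return 1.0 / (1.0 + math.exp(-index))

def predict_vote(p, cutoff=0.5):
    """Rule of thumb: 'yes' if P_i > 0.5, 'no' if P_i < 0.5."""
    if p > cutoff:
        return "yes"
    if p < cutoff:
        return "no"
    return "undetermined"  # the rule is silent when P_i equals 0.5 exactly

# Made-up fitted indexes x_i'b for three individuals (NOT from the data).
lpm_index = [-0.05, 0.48, 1.10]
probit_index = [-1.60, -0.05, 1.40]
logit_index = [-2.70, -0.08, 2.40]

for i in range(3):
    p_lpm = lpm_prob(lpm_index[i])
    p_probit = probit_prob(probit_index[i])
    p_logit = logit_prob(logit_index[i])
    outside = p_lpm < 0.0 or p_lpm > 1.0  # flags LPM values outside [0, 1]
    print(i + 1,
          round(p_lpm, 3), outside, predict_vote(p_lpm),
          round(p_probit, 3), predict_vote(p_probit),
          round(p_logit, 3), predict_vote(p_logit))
```

The probit and logit probabilities always lie strictly between 0 and 1, while the linear probability model's fitted values can fall outside that range, which is what the `outside` flag checks.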

Part II: a "Heckit" Exercise

First download the ASCII file called "mroz.dat". It contains data from Mroz, Thomas A. (1987), "The Sensitivity of an Empirical Model of Married Women's Hours of Work to Economic and Statistical Assumptions," ECONOMETRICA, 55, July, 765-799. The following data are contained in this ASCII file:

LFP: A dummy variable = 1 if the married woman worked in 1975, else 0
WHRS: Wife's hours of work in 1975
KL6: Number of children less than 6 years old in household
K618: Number of children between ages 6 and 18 in household
WA: Wife's age
WE: Wife's educational attainment, in years
WW: Wife's average hourly earnings, in 1975 dollars
RPWG: Wife's wage reported at the time of the 1976 interview (not the same as the 1975 estimated wage). To use the subsample with this wage, one needs to select 1975 workers with LFP=1, then select only those women with non-zero RPWG. Only
HHRS: Husband's hours worked in 1975
HA: Husband's age
HE: Husband's educational attainment, in years
HW: Husband's wage, in 1975 dollars
FAMINC: Family income, in 1975 dollars. This variable is used to construct the property income variable.
MTR: This is the marginal tax rate facing the wife, and is taken from published federal tax tables (state and local income taxes are excluded). The taxable income on which this tax rate is calculated includes Social Security, if applicable t
WMED: Wife's mother's educational attainment, in years
WFED: Wife's father's educational attainment, in years
UN: Unemployment rate in county of residence, in percentage points. This is taken from bracketed ranges.
CIT: Dummy variable = 1 if live in large city (SMSA), else 0
AX: Actual years of wife's previous labor market experience

To "import" this file into EVIEWS, create a workfile with the following specifications: undated frequency with "Start observation" equal to 1 and "End observation" equal to 753. To "import" the data into EVIEWS you need to know that: (1) the data are ordered "By observation;" (2) the "Text for NA" is NA; and (3) the "Number of series" is 19 (and the names of the series are in the ASCII file).
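
For readers who want to inspect the file outside EVIEWS, the import settings above can be mimicked in a few lines of Python. This is only a sketch. It assumes the file is whitespace-delimited with the series names on the first line, which is not confirmed here. The function name `read_series` and the sample values are made up for illustration.

```python
def read_series(text, na_token="NA"):
    """Parse text ordered by observation into {series_name: [values]}.

    Assumes (hypothetically) a header line of series names followed by one
    whitespace-delimited line per observation; na_token marks missing data.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    names = lines[0].split()
    series = {name: [] for name in names}
    for line_no, ln in enumerate(lines[1:], start=2):
        fields = ln.split()
        if len(fields) != len(names):
            raise ValueError("line %d has %d fields, expected %d"
                             % (line_no, len(fields), len(names)))
        for name, field in zip(names, fields):
            # Missing values become None; everything else is read as a number.
            series[name].append(None if field == na_token else float(field))
    return series

# Made-up two-observation sample using three of the nineteen series names.
sample = """LFP WHRS WW
1 1200 4.5
0 0 NA
"""

data = read_series(sample)
print(data["WW"])
```

With the real file, the same function would be called on the file's contents, and each of the 19 series should then hold 753 values.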

Once you have read the data into EVIEWS, you need to carry out the following transformations (with the "GENERATE" command):

LWW = LOG(WW); click the "ok" button when EVIEWS responds with the message "Log of negative number- Missing data generated"
PRIN = FAMINC - (WHRS*WW); this is a measure of "property income"
AX2 = AX*AX
WA2 = WA*WA
WE2 = WE*WE
WA3 = WA2*WA
WE3 = WE2*WE
WAWE = WA*WE
WA2WE = WA2*WE
WAWE2 = WA*WE2
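
The same transformations can be written out for a single observation as a check on the algebra. This is a minimal Python sketch, not EVIEWS syntax. The function name and the example values are hypothetical, and a missing log wage is represented by `None`.

```python
import math

def add_generated_series(row):
    """Return a copy of one observation (a dict) with the generated series added."""
    out = dict(row)
    ww, wa, we, ax = row["WW"], row["WA"], row["WE"], row["AX"]
    # The log is undefined for non-positive wages, so mark those as missing.
    out["LWW"] = math.log(ww) if ww > 0 else None
    # Property income: family income net of the wife's labor earnings.
    out["PRIN"] = row["FAMINC"] - row["WHRS"] * ww
    out["AX2"] = ax * ax
    out["WA2"] = wa * wa
    out["WE2"] = we * we
    out["WA3"] = out["WA2"] * wa
    out["WE3"] = out["WE2"] * we
    out["WAWE"] = wa * we
    out["WA2WE"] = out["WA2"] * we
    out["WAWE2"] = wa * out["WE2"]
    return out

# Made-up observation (NOT taken from mroz.dat).
example = {"WW": 4.0, "WHRS": 1500.0, "FAMINC": 20000.0,
           "WA": 40.0, "WE": 12.0, "AX": 10.0}
result = add_generated_series(example)
print(result["PRIN"], result["WA2WE"], result["WAWE2"])
```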

In the first stage of the Heckit procedure, estimate a probit model in which LFP is the dependent variable and the explanatory variables include a constant term, KL6, K618, WA, WE, WA2, WE2, WAWE, WA3, WE3, WA2WE, WAWE2, WFED, WMED, UN, CIT, and PRIN. Comment on the statistical significance of parameters estimated in this probit equation. The residual from this regression, called "RESID" by EVIEWS, contains the inverse Mills ratio needed for the second stage of the Heckit procedure. Store this residual series under the name INVR.
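
For reference, the textbook inverse Mills ratio for an observation with LFP = 1 is the standard normal density divided by the standard normal distribution function, both evaluated at the fitted probit index. The Python sketch below computes that formula directly. It assumes you already have the fitted index, the function names are hypothetical, and it is not the EVIEWS procedure described above.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """Inverse Mills ratio phi(z) / Phi(z) at the fitted probit index z."""
    return norm_pdf(z) / norm_cdf(z)

# Made-up fitted index values (NOT estimates from mroz.dat).
for z in (-1.0, 0.0, 1.0):
    print(z, inverse_mills(z))
```

The ratio falls as the index rises: the more likely an individual is to work, the smaller the selectivity adjustment for that observation.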

Next, restricting your sample to those who work for pay, i.e., only those observations for which LFP = 1 (in EVIEWS set the sample equal to "1 to 753 if LFP = 1"), estimate by OLS a wage determination equation allowing for sample selectivity and compare results to an equation that does not take into account the sample selectivity. In particular, let LWW be a linear function of a constant term, KL6, K618, WA, WE, WA2, WE2, WAWE, WA3, WE3, WA2WE, WAWE2, WMED, WFED, UN, CIT, and PRIN. Estimate this equation by OLS and employ White's robust standard error procedure. Comment on the signs and statistical significance of the estimated parameters. Next estimate the same equation by OLS (again using White's robust standard error procedure) but allow for sample selectivity by including the variable INVR created above. Comment on the sensitivity of the estimated parameters to inclusion of the sample selectivity adjustment. Is sample selectivity significant?
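
The logic of the second stage can be summarized with the standard two-step selection model. The notation below is assumed for illustration and does not appear in the assignment: z_i collects the probit regressors, x_i the wage-equation regressors, and the errors (epsilon_i, u_i) are taken to be jointly normal with covariance sigma_{epsilon u}.

```latex
\begin{align*}
\mathrm{LFP}_i &= \mathbf{1}\{ z_i'\gamma + u_i > 0 \}, \qquad u_i \sim N(0,1), \\
\mathrm{LWW}_i &= x_i'\beta + \varepsilon_i, \\
E[\mathrm{LWW}_i \mid \mathrm{LFP}_i = 1] &= x_i'\beta + \sigma_{\varepsilon u}\,\lambda(z_i'\gamma),
\qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}.
\end{align*}
```

Under these assumptions, the coefficient on INVR estimates sigma_{epsilon u}. A coefficient of zero corresponds to no sample selectivity, so the t-statistic on INVR gives a simple test of the question posed above.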
