Economics 6302, Econometrics II
Dr. Philip Rothman
Assignment #4
On Thursday, 4/24/97, you are required to hand in a short "paper"
reporting your results in the exercises detailed below. Your paper
should be typed/word processed. You may attach any relevant EVIEWS
output to your paper. It is imperative, however, that you organize your attached
output so that it can easily be matched with the relevant text. You
will also be expected to discuss your results in class.
Part I:   Linear Probability, Probit, and Logit Models
The cross-sectional data you will analyze are contained in an EXCEL
file called "pr10.xls". First download this file and "import" the data into
EVIEWS after setting up an EVIEWS workfile with the following
specifications: undated frequency with "Start observation" equal to 1
and "End observation" equal to 95. To "import" the data into EVIEWS you
need to know that: (1) the data are ordered "By observation;" (2) the
"Upper-left data cell" is A2; and (3) the "Number of series" is 9 (and
the names of the series are in the EXCEL file).
The data set contains observations on the characteristics of 95
individuals and whether they voted for or against an increase in school
taxes in Troy, Michigan in a local school millage referendum in 1973. The
data are:
- YESVM:   = 1 if the individual voted "yes," = 0 otherwise
- PUB12:   = 1 if 1 or 2 children attend public school, = 0 otherwise
- PUB34:   = 1 if 3 or 4 children attend public school, = 0 otherwise
- PUB5:   = 1 if 5 or more children attend public school, = 0 otherwise
- PRIV:   = 1 if 1 or more children attend a private school, = 0 otherwise
- SCHL:   = 1 if the individual is a school teacher, = 0 otherwise
- INCCON: log of annual household income
- PTCON:   log of property taxes (a proxy for the price of education)
- TYEAR:   number of years of residency in Troy
Using "YESVM" as the (limited) dependent variable and all the other data
as your explanatory variables, your primary task is to estimate a: (1)
linear probability model; (2) probit model; and (3) logit model. Compare
your estimation results across the different models. The "fitted
values" from these estimated models are the estimated probabilities that
the ith individual will vote yes. Are any of these estimated
probabilities greater than 1 or less than 0 for the linear probability
model? Calculate and compare these
estimated probabilities, for all three models, for the first ten
individuals in the sample. Finally, suppose we used the following rule of
thumb to predict voting behavior:
     if Pi > 0.5, then the ith person would vote "yes,"
     if Pi < 0.5, then the ith person would vote "no."
Then predict the voting decisions of the first 10 individuals with each
of the three estimated models.
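
The mapping from a fitted index to an estimated probability, and the rule
of thumb above, can be sketched in Python as follows. This is only an
illustration, not part of the EVIEWS workflow: the function names and the
index values are hypothetical, and in practice each model has its own
estimated coefficients and therefore its own index for each individual.

```python
import math

def lpm_prob(index):
    """Linear probability model: the fitted value is the index itself,
    so it can fall outside the interval [0, 1]."""
    return index

def probit_prob(index):
    """Probit: standard normal CDF evaluated at the index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def logit_prob(index):
    """Logit: logistic CDF evaluated at the index."""
    return 1.0 / (1.0 + math.exp(-index))

def predict_vote(p):
    """Rule of thumb from the assignment. The rule does not say what to do
    when P equals 0.5 exactly, so that case is labeled 'undetermined'
    (an assumption made here)."""
    if p > 0.5:
        return "yes"
    if p < 0.5:
        return "no"
    return "undetermined"

# Hypothetical index values x_i'b, NOT estimates from pr10.xls.
example_indexes = [-0.3, 0.2, 1.4]
models = [("LPM", lpm_prob), ("probit", probit_prob), ("logit", logit_prob)]
for xb in example_indexes:
    for name, prob in models:
        p = prob(xb)
        print(name, xb, round(p, 4), predict_vote(p))
```

Note that the linear probability model returns the index unchanged, which
is why its fitted values can be negative or exceed 1, while the probit and
logit transformations always stay strictly between 0 and 1.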
Part II:   a "Heckit" Exercise
First download the ASCII file called "mroz.dat". It contains data from
Mroz, Thomas A. (1987), "The
Sensitivity of an Empirical Model of Married Women's Hours of Work
to Economic and Statistical Assumptions," ECONOMETRICA,
55, July, 765-799. The following data are contained in this ASCII file:
- LFP:   A dummy variable = 1 if the married woman worked in
1975, else 0
- WHRS:   Wife's hours of work in 1975
- KL6:   Number of children less than 6 years old in household
- K618:   Number of children between ages 6 and 18 in household
- WA:   Wife's age
- WE:   Wife's educational attainment, in years
- WW:   Wife's average hourly earnings, in 1975 dollars
- RPWG:   Wife's wage reported at the time of the 1976 interview (not the same as the 1975 estimated wage). To use the subsample with this wage, one needs to select 1975 workers with LFP=1, then select only those women with non-zero RPWG. Only
- HHRS:   Husband's hours worked in 1975
- HA:   Husband's age
- HE:   Husband's educational attainment, in years
- HW:   Husband's wage, in 1975 dollars
- FAMINC:   Family income, in 1975 dollars. This variable is used to construct the property income variable.
- MTR:   This is the marginal tax rate facing the wife, and is taken from published federal tax tables (state and local income taxes are excluded). The taxable income on which this tax rate is calculated includes Social Security, if applicable t
- WMED:   Wife's mother's educational attainment, in years
- WFED:   Wife's father's educational attainment, in years
- UN:   Unemployment rate in county of residence, in percentage points. This is taken from bracketed ranges.
- CIT:   Dummy variable = 1 if live in large city (SMSA), else 0
- AX:   Actual years of wife's previous labor market experience
To "import" this file into EVIEWS, create a workfile with the following
specifications: undated frequency with "Start observation" equal to 1
and "End observation" equal to 753. To "import" the data into EVIEWS you
need to know that: (1) the data are ordered "By observation;" (2) the
"Text for NA" is NA; and (3) the "Number of series" is 19 (and
the names of the series are in the ASCII file).
Once you have read the data into EVIEWS, you need to carry out the
following transformations (with the "GENERATE" command):
- LWW = LOG(WW); click the "ok" button when EVIEWS responds with
the message "Log of negative number- Missing data generated"
- PRIN = FAMINC - (WHRS*WW); this is a measure of "property income"
- AX2 = AX*AX
- WA2 = WA*WA
- WE2 = WE*WE
- WA3 = WA2*WA
- WE3 = WE2*WE
- WAWE = WA*WE
- WA2WE = WA2*WE
- WAWE2 = WA*WE2
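
For reference, the same transformations can be sketched in Python for a
single observation. This is only an illustration of the arithmetic, not
EVIEWS syntax: the function name and the example values are hypothetical,
and a wage that is not positive is assumed to produce a missing LWW
(represented here by None).

```python
import math

def add_generated_series(row):
    """Return a copy of one observation (a dict keyed by series name)
    with the generated series from the list above added."""
    out = dict(row)
    ww, wa, we, ax = row["WW"], row["WA"], row["WE"], row["AX"]
    # The log is undefined for a non-positive wage, so mark it as missing.
    out["LWW"] = math.log(ww) if ww > 0 else None
    # Property income: family income minus the wife's labor earnings.
    out["PRIN"] = row["FAMINC"] - row["WHRS"] * ww
    out["AX2"] = ax * ax
    out["WA2"] = wa * wa
    out["WE2"] = we * we
    out["WA3"] = out["WA2"] * wa
    out["WE3"] = out["WE2"] * we
    out["WAWE"] = wa * we
    out["WA2WE"] = out["WA2"] * we
    out["WAWE2"] = wa * out["WE2"]
    return out

# Hypothetical observation, NOT a row from mroz.dat.
example = {"WW": 4.0, "WHRS": 1500.0, "FAMINC": 20000.0,
           "WA": 40.0, "WE": 12.0, "AX": 10.0}
generated = add_generated_series(example)
print(generated["PRIN"], generated["WA3"], generated["WAWE2"])
```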
In the first stage of the Heckit procedure, estimate a probit model in
which LFP is the dependent variable and the explanatory variables include
a constant term, KL6, K618, WA, WE, WA2, WE2, WAWE, WA3, WE3, WA2WE,
WAWE2, WFED, WMED, UN, CIT, and PRIN. Comment on the statistical
significance of parameters estimated in this probit equation.
The residual from this regression, called "RESID" by EVIEWS, contains
the inverse Mills ratio needed for the second stage of the Heckit
procedure. Store this residual series under the name INVR.
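
As a cross-check on what this series represents, the textbook inverse
Mills ratio for an observation with LFP = 1 is the standard normal density
divided by the standard normal CDF, both evaluated at the fitted probit
index. A minimal Python sketch follows, assuming that textbook definition.
The function names are hypothetical, and how EVIEWS itself fills "RESID"
after a probit is not verified here.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills_ratio(z):
    """Textbook inverse Mills ratio at the fitted probit index z,
    for an observation that is selected into the sample (LFP = 1)."""
    return normal_pdf(z) / normal_cdf(z)

# Hypothetical fitted index values, NOT estimates from mroz.dat.
for z in [-1.0, 0.0, 1.0]:
    print(z, inverse_mills_ratio(z))
```

The ratio is always positive and falls as the index rises, so women with a
low predicted probability of working receive a larger selectivity
adjustment.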
Next, restricting your sample to those who work for pay, i.e., only those
observations for which LFP = 1 (in EVIEWS set the sample equal to
"1 to 753 if LFP = 1"), estimate by OLS a wage determination
equation allowing for sample selectivity and compare results to an equation
that does not take into account the sample selectivity. In particular, let
LWW be a linear function of a constant term, KL6, K618, WA, WE, WA2, WE2,
WAWE, WA3, WE3, WA2WE, WAWE2, WMED, WFED, UN, CIT, and PRIN. Estimate
this equation by OLS and employ White's robust standard error procedure.
Comment on the signs and statistical significance of the estimated
parameters. Next estimate the same equation by OLS (again using White's
robust standard error procedure) but allow for sample
selectivity by including the variable INVR created above. Comment on the
sensitivity of the estimated parameters to inclusion of the sample
selectivity adjustment. Is sample selectivity significant?
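
The two wage equations being compared can be written compactly as below.
This is a sketch in generic notation that is not part of the original
assignment: x_i collects the constant term and the explanatory variables
listed above, and the symbols beta, gamma, u_i, and v_i are assumptions
introduced here for illustration.

```latex
% Without the selectivity adjustment (sample restricted to LFP = 1):
\mathrm{LWW}_i = x_i'\beta + u_i

% With the selectivity adjustment:
\mathrm{LWW}_i = x_i'\beta + \gamma\,\mathrm{INVR}_i + v_i

% Sample selectivity is judged by testing
H_0:\ \gamma = 0 \quad \text{against} \quad H_1:\ \gamma \neq 0
```

Under this notation, the question of whether sample selectivity is
significant amounts to checking the t-statistic on INVR, computed with the
White robust standard error.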