Econ7810: Econometrics for Economic Analysis, Fall 2024
Homework #2
Due date: 28 October 2024, 9am.
Do not copy and paste answers from your classmates. Two identical homework submissions will be treated as cheating. Do not copy and paste the entire output of your statistical package; report only the relevant part of the output. Please also submit your R script for the empirical part. Please put all your work in one single file and upload it via Moodle.
Part I Multiple Choice (3 points each, 30 points in total)
Please choose the answer that you think is appropriate.
1. The slope estimator, β1, has a smaller standard error, other things equal, if
a. there is more variation in the explanatory variable, X.
b. there is a large variance of the error term, u.
c. the sample size is smaller.
d. the intercept, β0, is small.
2. The reason why estimators have a sampling distribution is that
a. economics is not a precise science.
b. individuals respond differently to incentives.
c. the values of the explanatory variable and the error term differ across samples.
d. in real life you typically get to sample many times.
3. To decide whether or not the slope coefficient is large or small,
a. the slope coefficient must be statistically significant.
b. the slope coefficient must be larger than one.
c. you should analyze the economic importance of a given increase in X.
d. you should change the scale of the X variable if the coefficient appears to be too small.
4. The p-value for a one-sided right-tail test is given by
a. Pr(Z > t^act) < 1.645
b. Pr(Z < t^act) = Φ(t^act)
c. Pr(Z > t^act) = 1 - Φ(t^act)
d. cannot be calculated, since probabilities must always be positive.
5. Imagine you regressed earnings of individuals on a constant, a binary variable (“Male”) which takes on the value 1 for males and is 0 otherwise, and another binary variable (“Female”) which takes on the value 1 for females and is 0 otherwise. Because females typically earn less than males, you would expect
a. the coefficient for Male to have a positive sign, and for Female a negative sign.
b. both coefficients to be the same distance from the constant, one above and the other below.
c. none of the OLS estimators to exist because there is perfect multicollinearity.
d. this to yield a difference in means statistic.
6. Using the textbook example of 420 California school districts and the regression of test scores on the student-teacher ratio, you find that the standard error on the slope coefficient is 0.51 when using the heteroskedasticity-robust formula, while it is 0.48 when employing the homoskedasticity-only formula. When calculating the t-statistic, the recommended procedure is to
a. use the homoskedasticity-only formula because the t-statistic becomes larger
b. first test for homoskedasticity of the errors and then make a decision
c. use the heteroskedasticity-robust formula
d. make a decision depending on how much different the estimate of the slope is under the two procedures
7. When there are omitted variables in the regression, which are determinants of the dependent variable, then
a. you cannot measure the effect of the omitted variable, but the estimator of your included variable(s) is (are) unaffected.
b. this has no effect on the estimator of your included variable because the other variable is not included.
c. the OLS estimator is biased if the omitted variable is correlated with the included variable.
d. this will always bias the OLS estimator of the included variable.
8. If you had a two regressor regression model, then omitting one variable which is relevant
a. will have no effect on the coefficient of the included variable if the correlation between the excluded and the included variable is negative.
b. will always bias the coefficient of the included variable upwards.
c. can result in a negative value for the coefficient of the included variable, even though the coefficient will have a significant positive effect on Y if the omitted variable were included.
d. makes the sum of the product between the included variable and the residuals different from 0.
9. Imperfect multicollinearity
a. implies that it will be difficult to estimate precisely one or more of the partial effects using the data at hand
b. violates one of the four Least Squares assumptions in the multiple regression model
c. means that you cannot estimate the effect of at least one of the Xs on Y
d. suggests that a standard spreadsheet program does not have enough power to estimate the multiple regression model
10. If you reject a joint null hypothesis using the F-test in a multiple hypothesis setting, then
a. a series of t-tests may or may not give you the same conclusion.
b. the regression is always significant.
c. all of the hypotheses are always simultaneously rejected.
d. the F-statistic must be negative.
Part II Short Questions (29 points)
Please limit your answer (except for tables or figures) to 5 lines or fewer per sub-question.
(13 points) 2.1 In the linear consumption function
$$\widehat{cons} = \hat{\beta}_0 + \hat{\beta}_1 \, inc$$
the (estimated) marginal propensity to consume (MPC) out of income is simply the slope, $\hat{\beta}_1$. Using observations for 200 families on annual income (ranging from $10,000 to $100,000) and consumption (both measured in dollars), the following equation is obtained:
$$\widehat{cons} = -80.47 + 0.783 \, inc$$
n = 200, R^2 = 0.63
(2 points) (i) Interpret the intercept in this equation, and comment on its sign and magnitude.
(2 points) (ii) Interpret the coefficient of inc.
(2 points) (iii) What is the predicted consumption when family income is $50,000?
(2 points) (iv) What is the meaning of the regression R^2?
(3 points) (v) Will the regression give reliable predictions for an individual with an annual income of $200,000? Why or why not?
(2 points) (vi) The average income in this sample is $40,000 per year. What is the average value of consumption in the sample?
(16 points) 2.2 The Solow growth model suggests that countries with identical saving rates and population growth rates should converge to the same per capita income level. This result has been extended to include investment in human capital (education) as well as investment in physical capital. This hypothesis is referred to as the “conditional convergence hypothesis,” since the convergence is dependent on countries obtaining the same values in the driving variables. To test the hypothesis, you collect data from the Penn World Tables on the average annual growth rate of GDP per worker (g6090) for the 1960-1990 sample period, and regress it on (i) the initial level of GDP per worker relative to the United States in 1960 (RelProd60), (ii) the average population growth rate of the country (n), and (iii) the average investment share of GDP from 1960 to 1990 (sK; remember that investment equals savings). The results for close to 100 countries are as follows:
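$$\widehat{g6090} = \underset{(0.001)}{0.005} - \underset{(0.09)}{0.34} \times n + \underset{(0.102)}{0.266} \times s_K - \underset{(0.015)}{0.02} \times \mathit{RelProd60} \qquad (1)$$
(standard errors in parentheses below the coefficients)
R^2 = 0.437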
(4 points) (i) Interpret the coefficients of n and sK. Do the coefficients have the expected signs?
(4 points) (ii) Why does a negative coefficient on the initial level of per capita income indicate conditional convergence (“beta-convergence”)? Do we observe the conditional convergence in equation (1)?
(8 points) (iii) You remember that human capital in addition to physical capital also plays a role in determining the GDP growth rate. You therefore collect additional data on the average educational attainment in years for 1985, and add this variable (Educ) to the above regression. This results in the modified regression
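$$\widehat{g6090} = \underset{(0.001)}{0.004} - \underset{(0.08)}{0.172} \times n + \underset{(0.062)}{0.133} \times s_K + \underset{(0.001)}{0.002} \times \mathit{Educ} - \underset{(0.021)}{0.044} \times \mathit{RelProd60} \qquad (2)$$
(standard errors in parentheses below the coefficients)
R^2 = 0.52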
When the previously omitted variable Educ is added, what happens to the coefficient estimates of n, sK and RelProd60? Explain the reason and mechanism in detail.
(How has the inclusion of Educ affected your previous results?)
Part III Empirical exercise (41 points)
For all regressions, please report the heteroskedasticity-robust standard errors. Please limit your answer (except for tables or figures) to 8 lines or fewer per sub-question. You may use appropriate tables in answering the questions. Please hand in your R script file with the problem set.
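As a reminder of what "heteroskedasticity-robust" means in practice, the following is a minimal sketch in base R that computes HC1 robust standard errors by hand on simulated data. The data, the variable names (x, y, fit, robust_se) and the choice of the HC1 small-sample correction are assumptions made for illustration only; they are not taken from the homework datasets. In practice, add-on packages such as sandwich (vcovHC) and lmtest (coeftest) are commonly used for the same purpose.

```r
# Simulated data whose error variance grows with x (illustrative only).
set.seed(42)
n <- 200
x <- runif(n, 1, 10)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3 * x)

fit <- lm(y ~ x)
X   <- model.matrix(fit)      # n x k regressor matrix, including the intercept
e   <- residuals(fit)         # OLS residuals
k   <- ncol(X)

bread <- solve(crossprod(X))  # (X'X)^{-1}
meat  <- crossprod(X * e)     # sum over i of e_i^2 * x_i x_i'
V_hc1 <- (n / (n - k)) * bread %*% meat %*% bread  # HC1 covariance matrix

robust_se <- sqrt(diag(V_hc1))
cbind(estimate = coef(fit), robust_se = robust_se)
```

The same three-step "bread, meat, bread" structure applies to any OLS regression; only the regressor matrix and residuals change.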
(21 points) 3.1 This question deals with the estimation of betas of the Capital Asset Pricing Model (CAPM), and it is a relatively straightforward application of a simple linear regression.
$$R_t^e = Q + \beta R_{mt} + u_t$$
$R_t^e$ is the expected return (return), $R_{mt}$ is the market return (market). You are given data on monthly stock returns for 15 companies in 7 industries for the period from January 1978 to December 1987. They are:
Industries | Companies
Oil | Mobil (11), Texaco (14)
Computers | IBM (10), DEC (Digital Equipment Corporation) (6), DataGen (Data General) (5)
Electric Utilities | ConEd (Consolidated Edison) (3), PSNH (Public Service of New Hampshire) (13)
Forest Products | Weyer (Weyerhaeuser) (15), Boise (1)
Airlines | PanAm (Pan American Airways) (12), Delta (7)
Banks | Contil (Continental Illinois) (4), Citcrp (Citicorp) (2)
Foods | Gerber (9), GenMil (General Mills) (8)
Table 1: Companies in the dataset capm3__2024.dta
These data are contained in the file capm3__2024.dta. The file also contains information on the market monthly return (market, a value-weighted average of returns on stocks listed on the New York Stock Exchange) and information on the risk-free rate of return (return, the return on 30-day U.S. Treasury Bills). The stock and market returns in the file are excess returns over the risk-free rate of return.
From the list of industries, choose IBM from the computer industry (comparatively “risky”) and PSNH (Public Service of New Hampshire) from the electric utilities industry (relatively “safe”). (Hint: The variable ncomp runs from 1 to 15 and identifies the company in each observation; the corresponding number for each company is listed in parentheses in the table. You can use the subset() command in R to select the sample to run the regression with. For example, IBM <- subset(capm, capm$ncomp == 10) uses the data from IBM only.)
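The subsetting step in the hint can be sketched as follows. This is only an illustration on made-up data: the data frame capm_demo, its simulated values, and the object names firm10 and fit10 are hypothetical stand-ins, while the column names ncomp, return and market follow the description above.

```r
# Hypothetical stand-in for the homework data: 3 companies, 24 months each.
set.seed(1)
capm_demo <- data.frame(
  ncomp  = rep(c(10, 13, 5), each = 24),
  market = rnorm(72, mean = 0.01, sd = 0.05)
)
capm_demo$return <- 0.002 + 0.8 * capm_demo$market + rnorm(72, sd = 0.03)

# Keep only the rows for company number 10, then regress return on market.
firm10 <- subset(capm_demo, ncomp == 10)
fit10  <- lm(return ~ market, data = firm10)
coef(fit10)  # intercept and slope estimates for this company only
```

Repeating the last three lines with a different company number gives the second firm's regression, so the two sets of estimates can be placed side by side in one table.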
(6 points) (i) Estimate Q and β in the CAPM by OLS for each of the two firms. Please report the results in one table. How do the estimates of Q and β differ between the two firms? Does this accord with your expectation?
(6 points) (ii) The monthly stock and market returns (return and market) are in decimals. Convert them into percentages and re-estimate Q and β. Report the results in the same table as above. Are the new estimates different from the estimates you got in part (i)? Explain.
(5 points) (iii) For each company, compute the proportion of total risk that is market risk. Are the results consistent with your expectations?
(4 points) (iv) Do large estimates of β correspond to higher R2 values? Do you expect this to be the case? Why or why not?
(20 points) 3.2 Use the data teachingrating__2024.dta for this exercise and check TeachingRatings_Description.doc for the data description. Dr. Qin would like to explore the relationship between course_eval and beauty.
(Hint: The geom_point() command can be used to plot the data. The subset() or filter() functions in R might also be useful.)
(8 points) (i) Construct a scatter plot of average course evaluations (Course_Eval) on the professor’s beauty (Beauty). Does there appear to be a relationship between the variables? Run a regression of course evaluation on beauty, female and minority, report it in a table and interpret the slope of Beauty. Is there any particular problem with the data that might drive your regression results?
(8 points) (ii) Correct the data problem you found in (i) and justify your correction. Re-run the regression of course evaluation on beauty, female and minority, report it in the same table and interpret the coefficients of the three variables. Comment on the size of the slope. Is the estimated effect of Beauty on Course_Eval large or small? Explain what you mean by “large” and “small”.
(4 points) (iii) Dr. Qin has an average value of Beauty, while Dr. Pretty’s value of Beauty is one standard deviation above the average. Both of them are female and non-white. Predict Dr. Qin and Dr. Pretty’s course evaluations.