CFA® charterholders know how to interpret regression output to judge whether their estimates and forecasts are reliable enough to support their investment recommendations, decisions or actions.

This RapidDigest only includes what is covered in the 2021 CFA® Curriculum Readings (Readings 4 and 5).

Trademark Disclaimer: CFA Institute does not endorse, promote, or warrant the accuracy or quality of knowell.rapidquizzer.com. CFA® and Chartered Financial Analyst® are registered trademarks owned by CFA Institute.

#### Regression Analysis

Refer to Linear Regression (Assumptions) for the sample data underlying this regression analysis.

[Figure: Microsoft Excel-generated regression output of the Application to Real-World Data. Top panel: simple linear regression. Bottom panel: multiple linear regression.]

#### Simple Linear Regression

Y = inflation | X = M3 money supply
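The t-statistics, confidence intervals and F-statistic reported for this model all follow from the two coefficient estimates and their standard errors. The sketch below recomputes them in plain Python from the figures quoted in this section. It assumes the rounded critical t of 2.447 (5% significance, two-tailed, 6 degrees of freedom), so the interval bounds can differ from Excel's in the fourth decimal place. The function and variable names are illustrative only.

```python
# Inputs copied from the Excel output quoted in this section.
T_CRIT = 2.447  # rounded two-tailed critical t at 5% with n - 2 = 8 - 2 = 6 df

COEFFICIENTS = {
    # name: (estimate, standard error)
    "intercept (b0)": (-12.42268596, 3.382329853),
    "slope (b1)": (0.121941844, 0.028943608),
}


def t_statistic(estimate, std_error, hypothesized=0.0):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


def confidence_interval(estimate, std_error, t_crit=T_CRIT):
    """Two-sided interval: estimate -/+ critical t * standard error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


def summarize(coefficients=COEFFICIENTS, t_crit=T_CRIT):
    """Apply the t-test and the confidence-interval test to each coefficient."""
    rows = {}
    for name, (estimate, std_error) in coefficients.items():
        t = t_statistic(estimate, std_error)
        low, high = confidence_interval(estimate, std_error, t_crit)
        rows[name] = {
            "t": t,
            "ci": (low, high),
            # Reject H0: coefficient = 0 when |t| exceeds the critical t ...
            "reject_by_t": abs(t) > t_crit,
            # ... or, equivalently, when 0 lies outside the interval.
            "reject_by_ci": not (low <= 0.0 <= high),
        }
    return rows


results = summarize()
for name, row in results.items():
    print(name, round(row["t"], 2), row["ci"], row["reject_by_t"], row["reject_by_ci"])

# In simple linear regression the F-statistic is the squared slope t-statistic.
f_statistic = results["slope (b1)"]["t"] ** 2
print("F-statistic:", round(f_statistic, 2))
```

The two rejection flags always agree, because the t-test and the confidence-interval test use the same critical value.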

Model Specification

Slope Coefficient
• A positive coefficient is expected, as you learned in CFA® Exam Level I Economics, from the monetary policy transmission mechanism that runs from money supply to inflation.
• The sign also reflects the quantity theory of money, $M \times V = P \times Y$, where V (velocity of money) is constant and Y (real output) is unaffected because of money neutrality. That leaves M (money supply) and P (prices, i.e. inflation), the two variables in our regression model.

Predicted 2018 inflation

Prediction Error

Observed Values
• Israel lies below the regression line.
• Russia lies on the regression line.

Model Hypothesis
• Null hypothesis: Y (inflation) has no relationship with X (M3 money supply).
• Alternative hypothesis: Y (inflation) has a positive relationship with X (M3 money supply).
• Microsoft Excel regression output reports a two-tailed hypothesis test.
• Set up the alternative hypothesis as the "hoped for" or suspected condition (e.g. Ha: slope > 0) if it is strongly believed.

t-test statistics on regression coefficients (intercept and slope)
• $t = (\hat{b}_j - b_j) / s_{\hat{b}_j}$, with hypothesized value $b_j = 0$

t-test decision at 5% significance level
• $t_{\alpha/2,\,n-2} = t_{0.05/2,\,8-2} = t_{0.025,\,6} = 2.447$
• reject (statistically significant) if | t-statistic | > critical t value
• reject b0 = 0: | -3.67 | > 2.447
• reject b1 = 0: | 4.21 | > 2.447

95% Confidence Interval (5% significance level)
• critical $t_{\alpha/2,\,n-2} = t_{0.05/2,\,8-2} = t_{0.025,\,6} = 2.447$
• specifies the range of values expected to contain the true parameter value at the given confidence level
• intercept: -12.42268596 ± (2.447)(3.382329853) = -20.69894895 to -4.146422975
• slope: 0.121941844 ± (2.447)(0.028943608) = 0.051119385 to 0.192764302
• the bounds are Excel's, computed with the unrounded critical t, so they differ slightly from a hand calculation with 2.447
• reject (statistically significant) if the hypothesized value 0 is outside the interval

p-value test
• the p-value is the lowest significance level at which the null hypothesis can be rejected
• reject (statistically significant) if p-value < α
• reject b0 = 0: 0.010419245 < 0.05
• reject b1 = 0: 0.005603983 < 0.05

Coefficient of Determination ($R^2$)
• measures the variation in Y explained by X

Correlation Coefficient (R)
• measures the degree of linear relationship between X and Y

ANOVA F-test statistic (one-tailed only)
• F-statistic = (t-statistic of slope coefficient)$^2$
• 17.75007386 = (4.213083652)$^2$
• redundant to the t-test in simple linear regression

F-test decision at 5% significance level
• $F_{\alpha,\,k,\,n-2} = F_{0.05,\,1,\,8-2} = F_{0.05,\,1,\,6} = 5.99$
• reject (statistically significant) if F-statistic > critical F value
• reject b1 = 0: 17.75 > 5.99

Significance F at 5% alpha
• the lowest significance level at which to reject the null hypothesis that the regression model is not statistically significant
• identical to the p-value of the slope coefficient = 0.005603983
• reject if p-value < α: 0.005603983 < 0.05

Standard Error of Regression
• the standard deviation of the residuals
• assessed relative to the units of Y (smaller is better)

Model Conclusion
• subject to breaches of regression assumptions and their correction as discussed in Linear Regression (Heteroskedasticity and Serial Correlation)
• reliable regression model (significant at alpha levels down to about 0.56%)
• Caveat (not in CFA® Curriculum Reading): Sample size should be at least 10 for every independent variable.

[Figures: Manual Regression | Manual ANOVA | Manual Standard Errors (optional)]

#### Multiple Linear Regression

Y = inflation | X1 = M3 money supply | X2 = GDP per hour worked

Model Specification

Slope Coefficients
• Positive coefficient for M3 money supply (see Simple Linear Regression above).
• Negative coefficient for GDP per hour worked because, as a measure of productivity, it reduces costs and therefore prices (inflation), as you learned in CFA® Exam Level I Economics.
• Each partial regression coefficient is the average change in Y (inflation) for a unit change in that independent variable, holding all other X's constant.

Predicted 2018 inflation

Prediction Error

Observed Values
• Israel lies below the regression line.
• Russia lies above the regression line.

Model Hypothesis (F-test)
• Null hypothesis: Y (inflation) has no dependent relationship with X1 (M3 money supply) and X2 (GDP per hour worked) simultaneously (i.e. all slope coefficients jointly = 0).
• Alternative hypothesis: Y (inflation) has a dependent relationship with X1 (M3 money supply) and/or X2 (GDP per hour worked), simultaneously and/or individually (i.e. at least one slope coefficient ≠ 0).

F-test statistic (one-tailed only)
• refer to Simple Linear Regression for the F-statistic formula
• reformulate the model if the F-test fails

F-test decision at 5% significance level
• $F_{\alpha,\,k,\,n-(k+1)} = F_{0.05,\,2,\,8-(2+1)} = F_{0.05,\,2,\,5} = 5.79$
• reject (statistically significant) if F-statistic > critical F value
• reject b1 = b2 = 0: 8.17 > 5.79

Significance F at 5% alpha
• reject if p-value < α: 0.026545006 < 0.05

Model Hypotheses (t-test)
• Null hypotheses: Y (inflation) has no dependent relationship with X1 (M3 money supply) and/or X2 (GDP per hour worked) individually and/or with the intercept (i.e. each coefficient = 0).
• Alternative hypotheses: Y (inflation) has a dependent relationship with X1 (M3 money supply) and/or X2 (GDP per hour worked) individually and/or with the intercept (i.e. each coefficient ≠ 0).

t-test decision at 5% significance level
• $t_{\alpha/2,\,n-(k+1)} = t_{0.05/2,\,8-(2+1)} = t_{0.025,\,5} = 2.571$ (the degrees of freedom fall to 5 because two slope coefficients and an intercept are estimated)
• reject (statistically significant) if | t-statistic | > critical t value
• refer to Simple Linear Regression for the t-statistic formula
• the t-statistic is still significant for M3 money supply, but not for GDP per hour worked and no longer for the intercept

Coefficient of Determination ($R^2$)
• never decreases when a new X is added, and increases whenever the new X explains any of the remaining variation in Y

Adjusted $R^2$
• $\text{Adjusted } R^2 = 1 - \frac{n-1}{n-k-1}(1 - R^2)$, which is always below $R^2$ when $k \geq 1$ and $R^2 < 1$
• it scales up the unexplained variation by removing the k independent variables from the degrees of freedom
• adding a new independent variable raises $R^2$, but the larger k shrinks the denominator $n-k-1$, which inflates the unexplained-variation term and can lower adjusted $R^2$

Standard Error of Regression
• 0.460575399, slightly higher than the 0.436673353 of the simple linear regression but still relatively small compared with the units of Y

Model Conclusion
• subject to breaches of regression assumptions and their correction as discussed in Linear Regression (Heteroskedasticity, Serial Correlation and Multicollinearity)
• reliable regression model (significant at alpha levels down to about 2.65%)
• Caveat (not in CFA® Curriculum Reading): Sample size should be at least 10 for every independent variable.
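To make the trade-off between R2 and adjusted R2 concrete, the sketch below backs out R2 for both models from the F-statistics quoted above and then applies the adjusted R2 formula. The R2 values themselves are not quoted in the text: they are derived here from the identity F = (R2 / k) / ((1 - R2) / (n - k - 1)), and the multiple-regression F is only available rounded to 8.17, so the results are approximate. Function and variable names are illustrative.

```python
N_OBS = 8  # sample size n used in both regressions

# F-statistics as quoted in the text (the multiple-regression value is rounded).
MODELS = {
    "simple (M3 only)": {"k": 1, "f_stat": 17.75007386},
    "multiple (M3 + GDP per hour worked)": {"k": 2, "f_stat": 8.17},
}


def r_squared_from_f(f_stat, n, k):
    """Invert F = (R2 / k) / ((1 - R2) / (n - k - 1)) to recover R2."""
    return k * f_stat / (k * f_stat + (n - k - 1))


def adjusted_r_squared(r2, n, k):
    """Adjusted R2 = 1 - [(n - 1) / (n - k - 1)] * (1 - R2)."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)


def compare(models=MODELS, n=N_OBS):
    """Return {model name: (R2, adjusted R2)} for each model."""
    out = {}
    for name, spec in models.items():
        r2 = r_squared_from_f(spec["f_stat"], n, spec["k"])
        out[name] = (r2, adjusted_r_squared(r2, n, spec["k"]))
    return out


comparison = compare()
for name, (r2, adj_r2) in comparison.items():
    print(f"{name}: R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```

Under these assumptions, adding GDP per hour worked nudges R2 up while adjusted R2 falls, which is consistent with the higher standard error of regression reported for the multiple model.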
Key to Learning

#### Understand the Why

The regression model and its estimators must be statistically significant to rely on them for estimation and forecasting in business and investment.

#### Real-World Practical Application

https://blog.thinknewfound.com/2016/07/alphas-measurement-problem/

Legal Notice: no copyright (public domain) or copyright exception (free use) under fair dealing/fair use laws (i.e. educational use, critique, not substantial quote) and proper attribution with link to source

#### RapidInsight: First Thing First

• In multiple linear regression, the first thing to look at is the Significance F (the p-value of the F-test), which removes the need to obtain the F-statistic. The F-test evaluates the overall reliability of the regression model.
• In simple linear regression, the F-statistic is the square of the t-statistic of the slope coefficient, and the Significance F equals the p-value of the slope coefficient, so the F-test is redundant to the t-test. Therefore, the first thing to look at is the p-value of the slope coefficient, without the need to calculate the t-statistic or specify the significance level. The p-value is 0 so the model is impossibly reliable (i.e. cannot reject the null hypothesis that there is zero alpha for value stocks at whatever level of confidence: 95%, 90% or 99%, for example). This conclusion terminates the regression analysis.
• Notwithstanding that the regression model is not reliable, the intercept is significant only at a confidence level below 90% (1 - 0.1115836 = 0.8884164, i.e. about 88.8%).
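The p-value rule used in these bullets can be checked mechanically. The sketch below is a minimal illustration that takes the intercept p-value quoted in the last bullet (0.1115836) and tests it against the usual confidence levels. The function names are illustrative and not part of any library.

```python
INTERCEPT_P_VALUE = 0.1115836  # intercept p-value quoted in the last bullet


def is_significant(p_value, alpha):
    """Reject the null hypothesis (coefficient = 0) when p-value < alpha."""
    return p_value < alpha


def decisions(p_value, confidence_levels=(0.90, 0.95, 0.99)):
    """Map each confidence level to the reject / fail-to-reject decision."""
    return {level: is_significant(p_value, 1.0 - level) for level in confidence_levels}


def highest_confidence(p_value):
    """Upper bound on the confidence level at which the null can be rejected: 1 - p-value."""
    return 1.0 - p_value


intercept_decisions = decisions(INTERCEPT_P_VALUE)
for level, reject in intercept_decisions.items():
    print(f"{level:.0%} confidence: {'reject' if reject else 'fail to reject'} H0")
print(f"Significant only below {highest_confidence(INTERCEPT_P_VALUE):.1%} confidence")
```

Because the p-value is the lowest significance level at which the null can be rejected, a single comparison per level is enough and no t-statistic or critical value is needed.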

#### RapidInsight: Know Thy Number

• The article mentions two regression assumptions: an uncorrelated error term (independent from month to month) and a zero-mean error term. See Linear Regression (Assumptions).
• It further mentions that alpha (the intercept) is a constant that can be regarded as a random variable. While the intercept is a constant in the regression formula ($Y_i = \alpha + \beta X_i$), its estimate still has its own standard error (i.e. 0.0011569, or 11 bps as the article put it), which is not the same as the standard error of the residuals (i.e. 0.03773, or 377 bps).
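The difference between the standard error of the intercept and the standard error of the residuals is easier to see in a manual calculation. The sketch below fits $Y_i = \alpha + \beta X_i$ by ordinary least squares in plain Python. The return series are hypothetical numbers made up for illustration; they are not the article's data and will not reproduce its 0.0011569 or 0.03773 figures.

```python
import math

# Hypothetical monthly excess returns, invented for illustration only.
MARKET = [0.02, -0.01, 0.03, 0.015, -0.025, 0.04, 0.005, -0.03]    # X
VALUE = [0.025, -0.005, 0.028, 0.02, -0.03, 0.035, 0.01, -0.028]   # Y


def simple_ols(x, y):
    """Manual simple linear regression with standard errors and test statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    beta = sxy / sxx                 # slope
    alpha = y_bar - beta * x_bar     # intercept

    residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)        # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)    # total variation

    ser = math.sqrt(sse / (n - 2))              # standard error of regression (residuals)
    se_beta = ser / math.sqrt(sxx)              # standard error of the slope estimate
    se_alpha = ser * math.sqrt(1.0 / n + x_bar ** 2 / sxx)  # standard error of the intercept

    return {
        "alpha": alpha, "beta": beta,
        "ser": ser, "se_alpha": se_alpha, "se_beta": se_beta,
        "t_alpha": alpha / se_alpha, "t_beta": beta / se_beta,
        "f_stat": (sst - sse) / (sse / (n - 2)),
        "r2": 1.0 - sse / sst,
    }


fit = simple_ols(MARKET, VALUE)
for key, value in fit.items():
    print(f"{key}: {value:.6f}")
```

The intercept's standard error is the residual standard error scaled by a factor that depends on the sample size and the spread of X, which is why the two numbers can differ by an order of magnitude, as they do in the article.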