Multiple Regression and Model Building
15.1 The Multiple Regression Model and the Least Squares Point Estimate
15.2 Model Assumptions and the Standard Error
15.3 $R^2$ and Adjusted $R^2$ (This section can be read anytime after reading Section 15.1)
15.4 The Overall F Test
15.5 Testing the Significance of an Independent Variable
15.6 Confidence and Prediction Intervals


Chapter 15
Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin

Multiple Regression and Model Building Continued
15.7 The Sales Territory Performance Case
15.8 Using Dummy Variables to Model Qualitative Independent Variables
15.9 Using Squared and Interaction Variables
15.10 Model Building and the Effects of Multicollinearity
15.11 Residual Analysis in Multiple Regression
15.12 Logistic Regression

15.1 The Multiple Regression Model and the Least Squares Point Estimate
Simple linear regression uses one independent variable to explain the dependent variable
Some relationships are too complex to be described using a single independent variable
Multiple regression uses two or more independent variables to describe the dependent variable
This allows multiple regression models to handle more complex situations
There is no limit to the number of independent variables a model can use
Multiple regression has only one dependent variable
LO15-1: Explain the multiple regression model and the related least squares point estimates.

15.2 Model Assumptions and the Standard Error
The model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
Assumptions for multiple regression are stated about the model error terms, the $\varepsilon$'s
LO15-2: Explain the assumptions behind multiple regression and calculate the standard error.

15.3 $R^2$ and Adjusted $R^2$
Total variation is given by the formula $\sum (y_i - \bar{y})^2$
Explained variation is given by the formula $\sum (\hat{y}_i - \bar{y})^2$
Unexplained variation is given by the formula $\sum (y_i - \hat{y}_i)^2$
Total variation is the sum of explained and unexplained variation
This section can be covered anytime after reading Section 15.1
LO15-3: Calculate and interpret the multiple and adjusted multiple coefficients of determination.

$R^2$ and Adjusted $R^2$ Continued
The multiple coefficient of determination, $R^2$, is the ratio of explained variation to total variation
$R^2$ is the proportion of the total variation that is explained by the overall regression model
The multiple correlation coefficient $R$ is the square root of $R^2$

15.4 The Overall F Test
To test $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$ versus $H_a$: at least one of $\beta_1, \beta_2, \ldots, \beta_k \neq 0$
The test statistic is $F(\text{model}) = \dfrac{(\text{Explained variation})/k}{(\text{Unexplained variation})/[n-(k+1)]}$
Reject $H_0$ in favor of $H_a$ if $F(\text{model}) > F_\alpha$ or p-value $< \alpha$
$F_\alpha$ is based on $k$ numerator and $n-(k+1)$ denominator degrees of freedom
LO15-4: Test the significance of a multiple regression model by using an F test.

15.5 Testing the Significance of an Independent Variable
A variable in a multiple regression model is not likely to be useful unless there is a significant relationship between it and $y$
To test significance, we use the null hypothesis $H_0: \beta_j = 0$
Versus the alternative hypothesis $H_a: \beta_j \neq 0$
LO15-5: Test the significance of a single independent variable.

15.6 Confidence and Prediction Intervals
The point on the regression line corresponding to particular values $x_{01}, x_{02}, \ldots, x_{0k}$ of the independent variables is $\hat{y} = b_0 + b_1 x_{01} + b_2 x_{02} + \cdots + b_k x_{0k}$
It is unlikely that this value will equal the mean value of $y$ for these $x$ values
Therefore, we need to place bounds on how far the predicted value might be from the actual value
We can do this by calculating a confidence interval for the mean value of $y$ and a prediction interval for an individual value of $y$
LO15-6: Find and interpret a confidence interval for a mean value and a prediction interval for an individual value.

15.8 Using Dummy Variables to Model Qualitative Independent Variables
So far, we have only looked at including quantitative data in a regression model
However, we may wish to include descriptive qualitative data as well
For example, we might want to include the gender of respondents
We can model the effects of different levels of a qualitative variable by using what are called dummy variables
Dummy variables are also known as indicator variables
LO15-7: Use dummy variables to model qualitative independent variables.

15.9 Using Squared and Interaction Variables
The quadratic regression model relating $y$ to $x$ is $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
Where:
  $\beta_0 + \beta_1 x + \beta_2 x^2$ is the mean value of the dependent variable $y$
  $\beta_0$, $\beta_1$, and $\beta_2$ are regression parameters relating the mean value of $y$ to $x$
  $\varepsilon$ is an error term that describes the effects on $y$ of all factors other than $x$ and $x^2$
LO15-8: Use squared and interaction variables.

15.10 Model Building and the Effects of Multicollinearity
Multicollinearity is the condition where the independent variables are dependent on, related to, or correlated with each other
Effects
  Hinders the ability to use t statistics and p-values to assess the relative importance of predictors
  Does not hinder the ability to predict the dependent (or response) variable
Detection
  Scatter plot matrix
  Correlation matrix
  Variance inflation factors (VIF)
LO15-9: Describe multicollinearity and build a multiple regression model.

15.11 Residual Analysis in Multiple Regression
For an observed value $y_i$, the residual is $e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_{i1} + \cdots + b_k x_{ik})$
If the regression assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance $\sigma^2$
LO15-10: Use residual analysis to check the assumptions of multiple regression.

15.12 Logistic Regression
Logistic regression and least squares regression are very similar
Both produce prediction equations
The $y$ variable is what makes logistic regression different
With least squares regression, the $y$ variable is a quantitative variable
With logistic regression, it is usually a dummy 0/1 variable
LO15-11: Use a logistic model to estimate probabilities and odds ratios.
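
Worked sketch for Sections 15.1 through 15.6
The quantities defined in Sections 15.1 through 15.6 can be computed by hand for a small data set. The sketch below is one possible illustration in plain Python, not the textbook's own procedure: it solves the normal equations for the least squares point estimates, then computes the three variations, the standard error, R-squared, adjusted R-squared, F(model), and a point prediction. The data values and the helper names (solve, fit_multiple_regression, predict) are made up for this example and do not come from the chapter.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_multiple_regression(rows, y):
    """rows: one tuple of k independent variable values per observation."""
    n, k = len(y), len(rows[0])
    design = [[1.0] + [float(v) for v in row] for row in rows]  # 1 = intercept column
    p = k + 1
    # Normal equations (X'X) b = X'y give the least squares point estimates.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    b = solve(xtx, xty)

    y_hat = [sum(bj * dj for bj, dj in zip(b, d)) for d in design]
    y_bar = sum(y) / n
    total = sum((yi - y_bar) ** 2 for yi in y)
    explained = sum((yh - y_bar) ** 2 for yh in y_hat)
    unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

    df_error = n - (k + 1)  # denominator degrees of freedom
    r2 = explained / total
    return {
        "b": b, "n": n, "k": k,
        "total": total, "explained": explained, "unexplained": unexplained,
        "s": math.sqrt(unexplained / df_error),         # standard error
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / df_error,    # adjusted R-squared
        "f_model": (explained / k) / (unexplained / df_error),
    }


def predict(b, x0):
    """Point estimate b0 + b1*x01 + ... + bk*x0k."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x0))


# Made-up data: 8 observations, k = 2 independent variables.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5), (7, 8), (8, 7)]
y = [8.1, 6.9, 17.2, 15.8, 27.1, 24.9, 36.0, 34.2]

fit = fit_multiple_regression(rows, y)
print("b        =", [round(v, 4) for v in fit["b"]])
print("s        =", round(fit["s"], 4))
print("R^2      =", round(fit["r2"], 4))
print("adj R^2  =", round(fit["adj_r2"], 4))
print("F(model) =", round(fit["f_model"], 2))
print("y-hat at (4, 5) =", round(predict(fit["b"], (4, 5)), 4))
```

To finish the tests in Sections 15.4 through 15.6, F(model) would be compared with the F point for k and n-(k+1) degrees of freedom from a table or a statistics package. The t statistics and interval half-widths also need the standard errors of the estimates, which this sketch leaves out to stay short.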