Background

The dataset includes 9 baseline numeric variables for each of n = 442 diabetes patients: age, body mass index, average blood pressure, and six blood serum measurements. The response of interest is a quantitative measure of diabetes disease progression one year after baseline. The dataset is obtained from sklearn.datasets. We will fit multiple linear regression models to the training dataset and make predictions on the test dataset.

Attribute Information:
age: age in years
bmi: body mass index
bp: average blood pressure
s1: tc, total serum cholesterol
s2: ldl, low-density lipoproteins
s3: hdl, high-density lipoproteins
s4: tch, total cholesterol / HDL
s5: ltg, possibly log of serum triglycerides level
s6: glu, blood sugar level
Target: quantitative measure of disease progression one year after baseline (response variable)

Note: None of the features have been standardized.
Question 3: Stepwise Regression – 14 points

For this question, use trainData.

Perform forward stepwise regression using BIC. Let the minimum model be the one with only an intercept, and the maximum model be model1. Call the final model forward_model and display its model summary. (2 points)

Which variables were selected in forward_model? Which regression coefficients are significant at the 99% confidence level in forward_model? (1 point)

Perform backward stepwise regression using AIC. Let the minimum model be the one with only an intercept, and the maximum model be model1. Call the final model backward_model and display its model summary. (2 points)

Which regression coefficients are significant at the 99% confidence level in backward_model? Are the selected variables different in the forward and backward models? (2 points)

Perform two partial F-tests: one comparing backward_model with the full model (model1), and one comparing forward_model with model1. What is your interpretation at the 95% confidence level? (2 points)

Perform forward-backward stepwise regression using AIC, starting with the minimum model. Call it both_model. (2 points)

Which variables are selected in both_model? Are all the selected variables significant at the 99% level? Explain the reason. (2 points)

What is the difference in variable selection between forward, backward, and forward-backward stepwise regression? (1 point)
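As an illustration of the mechanics behind this question, the following is a minimal sketch of forward stepwise selection and a partial F statistic, written with NumPy only. It is not the assignment's solution: it runs on synthetic data rather than trainData, and the helper names (rss, criterion, forward_stepwise, partial_f) are hypothetical. It also assumes the criterion has the form n*log(RSS/n) + k*p, where p counts the intercept and k = 2 gives an AIC-type penalty while k = log(n) gives a BIC-type penalty. That form matches the Gaussian AIC/BIC only up to an additive constant.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit: intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

def criterion(X, y, cols, k):
    """Assumed criterion: n*log(RSS/n) + k*p, with p including the intercept."""
    n = len(y)
    return n * np.log(rss(X, y, cols) / n) + k * (len(cols) + 1)

def forward_stepwise(X, y, k):
    """Start from the intercept-only model; add one variable at a time
    while the criterion keeps decreasing."""
    selected, remaining = [], list(range(X.shape[1]))
    best = criterion(X, y, selected, k)
    while remaining:
        score, j = min((criterion(X, y, selected + [j], k), j) for j in remaining)
        if score >= best:  # no candidate improves the criterion: stop
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected

def partial_f(X, y, reduced, full):
    """Partial F statistic for a reduced model nested in a full model."""
    n = len(y)
    rss_r, rss_f = rss(X, y, reduced), rss(X, y, full)
    df_num = len(full) - len(reduced)   # number of extra coefficients tested
    df_den = n - len(full) - 1          # residual degrees of freedom, full model
    return ((rss_r - rss_f) / df_num) / (rss_f / df_den)

# Synthetic data (an assumption for illustration): only columns 0 and 2 matter.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=n)

bic_selected = forward_stepwise(X, y, k=np.log(n))  # BIC-type penalty
aic_selected = forward_stepwise(X, y, k=2.0)        # AIC-type penalty
print("BIC path:", bic_selected)
print("AIC path:", aic_selected)
print("Partial F, adding column 2 to column 0:", partial_f(X, y, [0], [0, 2]))
```

Because the BIC-type penalty log(n) exceeds 2 whenever n > 7, the forward path under BIC stops no later than the path under AIC, which is one reason the two criteria can select different numbers of variables. The partial F statistic would be compared against an F distribution with (df_num, df_den) degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch dependency-free.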