Question 1: EDA and Full Model – 4 points
For this question, use the trainData.
Build a correlation matrix plot on the trainData dataset. Interpret it. (1 point)
Fit a multiple linear regression with the variable ‘Target’ as the response and the other variables as predictors. Call it model1. Display the model summary. (2 points)
What are the Mallows's Cp, AIC, and BIC values for the full model (model1)? (1 point)
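The quantities named in Question 1 can be sketched as follows. This is a minimal illustration on synthetic data, not the exam solution: the data, the variable names, and the choice of NumPy are all assumptions, and the actual trainData is not used. AIC and BIC conventions differ between packages; the version below counts the error variance as a parameter, which is one common convention. Mallows's Cp is computed as RSS/sigma^2 - n + 2p, with sigma^2 estimated from the full model, so for the full model itself it reduces to p.

```python
import numpy as np

# Synthetic stand-in for trainData (hypothetical; not the exam data).
rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
target = 1.0 + X @ np.array([0.5, -0.3, 0.0]) + rng.normal(scale=0.5, size=n)

# Correlation matrix of the predictors and the response (last row/column).
corr = np.corrcoef(np.column_stack([X, target]), rowvar=False)

# Full model: ordinary least squares with an intercept column.
design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(design, target, rcond=None)
resid = target - design @ beta
rss = float(resid @ resid)
p = design.shape[1]  # number of regression coefficients, intercept included

# Gaussian log-likelihood evaluated at the maximum-likelihood estimates.
loglik = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)

# Assumed convention: parameter count = coefficients + error variance.
n_params = p + 1
aic = -2 * loglik + 2 * n_params
bic = -2 * loglik + np.log(n) * n_params

# Mallows's Cp with sigma^2 taken from the full model.
sigma2_full = rss / (n - p)
mallows_cp = rss / sigma2_full - n + 2 * p

print(np.round(corr, 2))
print(f"AIC={aic:.2f}  BIC={bic:.2f}  Cp={mallows_cp:.2f}")
```

Because the full model supplies its own variance estimate, its Cp always equals the number of coefficients; the statistic only becomes informative when comparing reduced models against the full one.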
A spring is attached to the hook of the IOLab. The following data were collected, showing the spring force (in N) on the vertical axis plotted as a function of position (in meters) on the horizontal axis. From these data, determine the approximate equilibrium position of the spring. In case you find it hard to read the summary statistics on the right, the slope of the best-fit line is approximately -2 and the intercept is approximately 0.4.
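One way to read the fit is sketched below. It assumes the equilibrium position is the point where the fitted force crosses zero, which holds only if the force sensor reads zero at equilibrium; the function name is hypothetical.

```python
# Best-fit line read from the plot: F(x) = slope * x + intercept
slope = -2.0     # N/m, approximate value quoted in the question
intercept = 0.4  # N, approximate value quoted in the question


def equilibrium_position(slope: float, intercept: float) -> float:
    """Return the position x (in m) at which the fitted force is zero."""
    if slope == 0:
        raise ValueError("A flat fit never crosses zero force.")
    # Solve slope * x + intercept = 0 for x.
    return -intercept / slope


x_eq = equilibrium_position(slope, intercept)
print(f"Equilibrium position is approximately {x_eq:.2f} m")
```

With the quoted values, the zero crossing falls at 0.4 / 2 = 0.2 m.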
Instructions
The R Markdown/Jupyter Notebook file includes the questions, the empty code chunk sections for your code, and the text blocks for your responses. Answer the questions below by completing the R Markdown/Jupyter Notebook file. You may make slight adjustments to get the file to knit/convert, but otherwise keep the formatting the same. Once you’ve finished answering the questions, submit your responses in a single knitted file as HTML only. Partial credit may be given if your code is correct but your conclusion is incorrect, or vice versa.

Next Steps:
1. Save the .Rmd/.ipynb file in your working directory, the same directory into which you will download the “diabetes_dataset.csv” data file. Having both files in the same directory will help in reading the “diabetes_dataset.csv” file.
2. Read the question and write the necessary code within the code chunk section immediately below each question. Knitting this file will generate the output and insert it into the section below the code chunk.
3. Type your answer to the questions in the text block provided immediately after the response prompt.
4. Once you’ve finished answering all questions, knit this file and submit the knitted file as HTML on Canvas.

Mock Example Question
This will be the exam question. Each question is already copied from Canvas and inserted into individual text blocks below; you do not need to copy/paste the questions from the online Canvas exam.

```{r}
# Example code chunk area. Enter your code below the comment
```

Mock Response to Example Question: This is the section where you type your written answers to the question. Depending on the question asked, your typed response may be a number, a list of variables, a few sentences, or a combination of these elements.

Ready? Let’s begin. We wish you the best of luck!
Data Set
diabetes_dataset.csv

Starter Templates
You may use either the R Markdown or Jupyter Notebook Starter Template:
R Markdown Starter Template: Final_Exam_starter_template_Fall24_R.Rmd
Jupyter Notebook Python Starter Template: Final Exam_starter_template_Fall24_Python.ipynb
Jupyter Notebook R Starter Template: Final_Exam_starter_template_Fall24_R.ipynb
Question 6: Prediction – 9 points
For this question, use the testData.
Using testData and the models previously built in Questions 2, 3, and 5, predict the Target, output the average of the predicted probabilities for each of the models below, and summarize the results:
i) Full linear regression model from question 2b (model1)
ii) Reduced model from question 2b (model2)
iii) Stepwise forward model from question 3a (forward_model)
iv) Stepwise backward model from question 3c (backward_model)
v) Stepwise forward-backward model from question 3f (both_model)
vi) Ridge regression model from question 5a (ridge.model)
vii) Regular Lasso model from question 5c (lasso.model)
viii) Group Lasso model from question 5f (group_lasso)
ix) Elastic Net model from question 5i (enet.model)
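The averaging step in Question 6 might be organized as in the sketch below. Everything here is an illustrative assumption: the data are synthetic, only two least-squares models stand in for the nine listed, and the helper names `fit_ols` and `predict` are hypothetical. The point is the pattern of fitting on training data, predicting on test data, and collecting one mean per model.

```python
import numpy as np

rng = np.random.default_rng(1)


def make_data(n):
    """Generate a synthetic stand-in for trainData/testData (hypothetical)."""
    X = rng.normal(size=(n, 3))
    y = 0.3 + X @ np.array([0.4, -0.2, 0.0]) + rng.normal(scale=0.3, size=n)
    return X, y


def fit_ols(X, y):
    """Least-squares coefficients with an intercept in position 0."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict(beta, X):
    """Apply fitted coefficients to new predictor rows."""
    return np.column_stack([np.ones(len(X)), X]) @ beta


X_train, y_train = make_data(150)
X_test, y_test = make_data(50)

# Each model is described by the predictor columns it uses.
column_sets = {"model1": [0, 1, 2], "model2": [0, 1]}

mean_predictions = {}
for name, cols in column_sets.items():
    beta = fit_ols(X_train[:, cols], y_train)    # fit on training data only
    preds = predict(beta, X_test[:, cols])       # predict on held-out data
    mean_predictions[name] = float(preds.mean())

for name, value in mean_predictions.items():
    print(f"{name}: mean predicted Target = {value:.4f}")
```

Keeping the results in one dictionary makes it straightforward to add further fitted models and print a single comparison table at the end.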
Question 2: Statistical Significance – 6 points
For this question, use the trainData.
In model1, which regression coefficients are significant at the 95% confidence level? Are these the exact same regression coefficients that are significant at the 90% confidence level? (2 points)
Build a new model using only the variables whose coefficients were found to be statistically significant at the 95% confidence level. Call it model2. Display the model summary. (2 points)
Perform a Partial F-test to compare this new model with the full model (model1) and interpret it at the 95% confidence level. Which one would you prefer? Is it good practice to select variables based on the statistical significance of individual coefficients? Why or why not? (2 points)
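The partial F-test in Question 2 compares the residual sums of squares of the reduced and full models. The sketch below computes only the F statistic, on synthetic data with a hypothetical helper `rss_and_df`; it is an illustration of the formula under those assumptions rather than the exam solution.

```python
import numpy as np

# Synthetic data (hypothetical): the third predictor has no true effect.
rng = np.random.default_rng(2)
n = 120
X = rng.normal(size=(n, 3))
y = 1.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.6, size=n)


def rss_and_df(X, y):
    """Fit least squares with an intercept; return (RSS, residual df)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid), len(y) - design.shape[1]


rss_full, df_full = rss_and_df(X, y)               # all three predictors
rss_reduced, df_reduced = rss_and_df(X[:, :2], y)  # third predictor dropped

# Partial F statistic: extra RSS per dropped coefficient, scaled by the
# full model's residual mean square.
q = df_reduced - df_full
f_stat = ((rss_reduced - rss_full) / q) / (rss_full / df_full)

print(f"F = {f_stat:.3f} on ({q}, {df_full}) degrees of freedom")
```

Under the null hypothesis that the dropped coefficients are all zero, the statistic follows an F distribution with (q, df_full) degrees of freedom, so a p-value can be read from its upper tail, for example with `scipy.stats.f.sf(f_stat, q, df_full)` if SciPy is available.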