Data Set Background
“Personal Financial Wellness Dataset”

This dataset contains 1,000 individuals and is designed to study how demographic characteristics, income, expenses, debt, investments, and financial habits influence a person’s savings rate.

Savings_Rate_Percent: Percentage of income that an individual saves. (Response Variable; Numerical, Continuous)
Age: Age of the individual in years. (Numeric)
Employment_Status: Current employment situation of the individual. (Categorical)
Education_Level: Highest level of education completed. (Categorical)
Marital_Status: Current marital status of the individual. (Categorical)
Housing_Type: Type of housing arrangement the individual lives in. (Categorical)
Annual_Income_USD: Total yearly income earned in U.S. dollars.
Monthly_Expenses_USD: Average monthly spending on living and household expenses.
Debt_Amount_USD: Total outstanding debt owed by the individual.
Investment_Amount_USD: Total amount invested in financial assets and accounts.
Credit_Score: Numerical measure of the individual’s creditworthiness.
Financial_Literacy_Score: Score representing the individual’s financial knowledge and skills.
Monthly_Discretionary_Spending_USD: Monthly spending on non-essential goods and services.
Emergency_Fund_Months: Number of months of expenses covered by emergency savings.
Question 1 Exploratory Data Analysis (6 points)
Use dataset “Personal_Financial_Wellness” for this question.

a) (2 points)
i) (1 point) Which category of Employment_Status is most common, and what percentage of the dataset does it represent?
ii) (1 point) Using a scatterplot and the correlation coefficient, does a higher Financial_Literacy_Score consistently correspond to a higher savings rate?

b) (2 points)
i) (1 point) How does the average Financial_Literacy_Score vary across Employment_Status categories?
ii) (1 point) Which combination of Employment_Status and Education_Level has the highest average income?

c) (2 points)
i) (1 point) After dividing individuals into income quartiles, which quartile has the highest median savings rate and the largest spread in savings?
ii) (1 point) Using boxplots, which Housing_Type has the highest debt burden, and are debt levels more variable within some housing categories than others?
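The two quantities asked for in part a) can be illustrated with a minimal Python sketch using only the standard library. The records below are invented toy values, not rows from the dataset, and the category labels ("Employed", "Self-Employed", "Unemployed") and helper names are assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt

# Toy records, invented for illustration only (NOT values from the dataset).
# The category labels are hypothetical; the real dataset may use others.
employment = ["Employed", "Employed", "Self-Employed", "Unemployed", "Employed"]
literacy = [60.0, 75.0, 50.0, 40.0, 85.0]   # stand-in for Financial_Literacy_Score
savings = [10.0, 18.0, 9.0, 4.0, 20.0]      # stand-in for Savings_Rate_Percent


def most_common_share(values):
    """Return (category, percentage of rows) for the most frequent category."""
    category, count = Counter(values).most_common(1)[0]
    return category, 100.0 * count / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


category, share = most_common_share(employment)
print(f"Most common category: {category} ({share:.1f}% of rows)")
print(f"Correlation (literacy vs. savings): {pearson_r(literacy, savings):.3f}")
```

A correlation coefficient only summarizes linear association, so the scatterplot is still needed to judge whether the relationship holds consistently across the range of scores.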
Question 5 Prediction (8 points)
Use testData for this question.

a) (4 points) Using testData, predict the Savings_Rate_Percent with both model1 and model2.
i) Show the predictions of both models along with the true values.
ii) Calculate the mean squared prediction error (MSPE) of each model. Which model predicts better on the test data?

b) (4 points) Consider a new individual who owns their home (Housing_Type = “Own”), has monthly expenses of 4000 USD and a credit score of 700. Using model1:
i) Compute the 95% confidence interval for the mean response and the 95% prediction interval for this new individual.
ii) Provide an interpretation of each interval in the context of the problem. Why is the prediction interval wider than the confidence interval?
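As a reminder of what part a) asks for, the MSPE is the average of the squared differences between the true test values and a model's predictions. The sketch below uses invented numbers in place of real predictions, since model1 and model2 are not specified here; the names `pred_model1` and `pred_model2` are placeholders, not the output of the actual models.

```python
def mspe(y_true, y_pred):
    """Mean squared prediction error: the average of (true - predicted)^2."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented values for illustration only (NOT results from the dataset or models).
y_true = [10.0, 20.0, 15.0]        # stand-in for Savings_Rate_Percent in testData
pred_model1 = [12.0, 18.0, 15.0]   # hypothetical predictions from model1
pred_model2 = [13.0, 16.0, 17.0]   # hypothetical predictions from model2

# Show the predictions of both models next to the true values.
print(f"{'true':>8}{'model1':>8}{'model2':>8}")
for t, p1, p2 in zip(y_true, pred_model1, pred_model2):
    print(f"{t:>8.1f}{p1:>8.1f}{p2:>8.1f}")

mspe1 = mspe(y_true, pred_model1)
mspe2 = mspe(y_true, pred_model2)
print(f"MSPE model1: {mspe1:.3f}")
print(f"MSPE model2: {mspe2:.3f}")
```

The model with the smaller MSPE predicts better on the test data; with these toy numbers that would be the first set of predictions.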