Q4 PREDICTION (11 points)
A. (6 points) Using the test_churn dataset, predict the probability that each customer churns with the following models: i) model1 ii) model2 (full model) iii) Naive Bayes model iv) decision tree model
B. (5 points) Calculate the mean absolute prediction error for all four models using a threshold of 0.6. Which model has the smallest mean absolute prediction error?
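The error calculation in part B can be sketched as follows. This is a minimal illustration, not the assignment's solution: the helper name `mean_abs_pred_error` is hypothetical, the data are synthetic stand-ins for the churn data, and a single logistic regression stands in for model1 and model2.

```r
# Hypothetical helper: mean absolute prediction error at a given threshold.
# `prob` holds predicted churn probabilities, `actual` the observed 0/1 labels.
mean_abs_pred_error <- function(prob, actual, threshold = 0.6) {
  pred_class <- as.integer(prob > threshold)  # classify as churn when prob > threshold
  mean(abs(pred_class - actual))              # share of misclassified customers
}

# Synthetic stand-in data (NOT the churn dataset), used only to make this runnable.
set.seed(123)
n <- 200
toy <- data.frame(x = rnorm(n))
toy$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * toy$x))
train <- toy[1:150, ]
test  <- toy[151:200, ]

# Logistic regression standing in for model1 / model2.
fit <- glm(y ~ x, data = train, family = binomial)
prob_glm <- predict(fit, newdata = test, type = "response")

err_glm <- mean_abs_pred_error(prob_glm, test$y, threshold = 0.6)
```

With the real data, Churn is stored as a factor, so the labels would first need converting, for example with `as.numeric(as.character(test_churn$Churn))`. If the Naive Bayes model comes from `e1071::naiveBayes` and the tree from `rpart`, class probabilities are typically obtained with `predict(..., type = "raw")` and `predict(..., type = "prob")` respectively; which packages the assignment uses is an assumption here.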
Generalized Linear Models refer to a collection of models where the response variable, Y, is normally distributed and its transformed expectation, g(E[Y | x]), has a linear relationship with the predicting variables via a “link” function.
Q6 FITTING THE POISSON MODEL (9 points)
i) (3 points) Fit a Poisson regression model using all the predictors and “incidents” as the response variable. Display the summary.
ii) (3 points) Interpret the coefficients of “typeD”, “operation1975-79”, and “service” with respect to the log expected incident count.
iii) (3 points) Cook’s distance is not appropriate for Poisson regression. What other approaches can we use to detect outliers? Implement your approach to get the number of outliers.
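The three parts of this question can be sketched on synthetic data. Everything below is an assumption made for illustration: the toy data frame, its column names, and the outlier rule (flagging deviance residuals beyond a normal quantile) are not taken from the assignment, and other rules such as Pearson residuals would be equally defensible.

```r
# Synthetic stand-in data (NOT the assignment's dataset).
set.seed(1)
n <- 100
toy <- data.frame(
  type    = factor(sample(c("A", "B"), n, replace = TRUE)),
  service = runif(n, 1, 10)
)
toy$incidents <- rpois(
  n,
  lambda = exp(0.2 + 0.5 * (toy$type == "B") + 0.1 * toy$service)
)

# i) Poisson regression with the (default) log link.
fit <- glm(incidents ~ type + service, data = toy, family = poisson)
print(summary(fit))

# ii) Coefficients are on the log scale: a one-unit increase in a predictor
# changes the log expected count by its coefficient, holding the others fixed.
# Exponentiating gives the multiplicative effect on the expected count.
rate_ratios <- exp(coef(fit))

# iii) One possible outlier rule (an assumption, not the only choice):
# flag observations whose deviance residual is extreme under a normal reference.
dev_res    <- residuals(fit, type = "deviance")
cutoff     <- qnorm(0.9995)
n_outliers <- sum(abs(dev_res) > cutoff)
```

For a factor level such as `typeB`, the coefficient is the difference in log expected count relative to the baseline level (`A` in this toy example), with the other predictors held fixed.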
Customer Churn Dataset
This dataset is part of a data science project focused on customer churn prediction for a subscription-based service. Customer churn, the rate at which customers cancel their subscriptions, is a vital metric for businesses offering subscription services. Predictive analytics techniques are employed to anticipate which customers are likely to churn, enabling companies to take proactive measures for customer retention.
SubscriptionType: Type of subscription plan chosen by the customer (e.g., Basic, Premium, Deluxe)
PaymentMethod: Method used for payment (e.g., Credit Card, Electronic Check, PayPal)
PaperlessBilling: Whether the customer uses paperless billing (Yes/No)
ContentType: Type of content accessed by the customer (e.g., Movies, TV Shows, Documentaries)
MultiDeviceAccess: Whether the customer has access on multiple devices (Yes/No)
DeviceRegistered: Device registered by the customer (e.g., Smartphone, Smart TV, Laptop)
GenrePreference: Genre preference of the customer (e.g., Action, Drama, Comedy)
Gender: Gender of the customer (Male/Female)
ParentalControl: Whether parental control is enabled (Yes/No)
SubtitlesEnabled: Whether subtitles are enabled (Yes/No)
AccountAge: Age of the customer’s subscription account (in months)
MonthlyCharges: Monthly subscription charges
TotalCharges: Total charges incurred by the customer
ViewingHoursPerWeek: Average number of viewing hours per week
SupportTicketsPerMonth: Number of customer support tickets raised per month
AverageViewingDuration: Average duration of each viewing session
ContentDownloadsPerMonth: Number of content downloads per month
UserRating: Customer satisfaction rating (1 to 5)
WatchlistSize: Size of the customer’s content watchlist
Churn (response variable): 1 if the customer has cancelled the subscription, 0 if not.
Read the data and answer the questions below:
NOTE: The categorical variables have already been converted into factors in the code below.
The dataset has been divided into train and test datasets.
# Loading of the data
churn = read.csv("Customer churn.csv", header=TRUE, sep=",")
churn$SubscriptionType = as.factor(churn$SubscriptionType)
churn$PaymentMethod = as.factor(churn$PaymentMethod)
churn$PaperlessBilling = as.factor(churn$PaperlessBilling)
churn$ContentType = as.factor(churn$ContentType)
churn$MultiDeviceAccess = as.factor(churn$MultiDeviceAccess)
churn$DeviceRegistered = as.factor(churn$DeviceRegistered)
churn$GenrePreference = as.factor(churn$GenrePreference)
churn$Gender = as.factor(churn$Gender)
churn$ParentalControl = as.factor(churn$ParentalControl)
churn$SubtitlesEnabled = as.factor(churn$SubtitlesEnabled)
churn$Churn = as.factor(churn$Churn)
set.seed(123) # Setting seed for reproducibility
nrows