
Hello Community Members!
A solution to last week’s challenge can be found here.
Who is up for a predictive challenge? With Black Friday just around the corner, it is the perfect time to put your predictive skills to the test. This week, take on the task of predicting how many customers are likely to switch their telecommunications provider, also known as churning.
This challenge, submitted by Ollie Clarke (@OllieClarke), is designed to take your abilities with the Designer Predictive palette to the next level.
You have a training dataset (Training Data.yxdb) containing information about a telecommunication company’s customers and whether they have churned. You also have a testing dataset (Testing Data.yxdb) with information about new customers. Your task is to predict how many of the new customers from the testing dataset are likely to churn from the company.
Just follow the steps and you will be able to tackle this challenge, even if it is your first time building a predictive model!
Here is how to get started:
- Split the Training Data: Using a random seed of 1, divide the training data into two samples: Estimation (70%) and Validation (30%).
- Build Four Models: Using the Estimation output, create four models: Boosted Model, Decision Tree, Forest Model, and Logistic Regression. Keep each tool's default configuration and predict churn from all variables (columns) except ID.
- Compare Models: Use the Validation output to compare the four models and identify the one with the highest F1 score. (The E anchor of the Model Comparison tool reports the F1 score for each model.)
- Score the Testing Data: Retrain the best model on the entire training dataset, then score the testing data to calculate each customer's likelihood of churn. Count how many customers in the testing dataset are likely to churn (Score_yes > 0.5).
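
If it helps to see the logic behind these steps outside Designer, here is a minimal plain-Python sketch of the seeded 70/30 split, the F1 comparison, and the Score_yes > 0.5 count. It is an illustration only, not how the Designer tools are implemented: the function names, the toy labels, and the toy scores are all hypothetical, and Python's random generator will not reproduce the exact rows that Designer selects with seed 1. The toy numbers say nothing about which model wins the actual challenge.

```python
import random


def split_estimation_validation(rows, estimation_share=0.7, seed=1):
    """Shuffle with a fixed seed, then cut into Estimation and Validation samples."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * estimation_share)
    return shuffled[:cut], shuffled[cut:]


def f1_score(actual, predicted, positive="yes"):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def count_likely_churners(scores, threshold=0.5):
    """Count customers whose churn score is strictly above the threshold."""
    return sum(score > threshold for score in scores)


# Step 1: seeded 70/30 split (ten placeholder row IDs stand in for customers).
estimation, validation = split_estimation_validation(range(10))

# Step 3: compare models on validation data (made-up labels and predictions).
actual = ["yes", "no", "yes", "no", "yes", "no"]
validation_predictions = {
    "Boosted Model": ["yes", "no", "yes", "no", "no", "no"],
    "Decision Tree": ["yes", "yes", "no", "no", "no", "no"],
    "Forest Model": ["yes", "no", "yes", "yes", "yes", "no"],
    "Logistic Regression": ["no", "no", "yes", "no", "no", "yes"],
}
f1_by_model = {
    name: f1_score(actual, predicted)
    for name, predicted in validation_predictions.items()
}
best_model = max(f1_by_model, key=f1_by_model.get)

# Step 4: count likely churners from made-up Score_yes values.
score_yes = [0.91, 0.12, 0.50, 0.67, 0.08]
likely_churners = count_likely_churners(score_yes)

print(len(estimation), len(validation))
print(best_model, round(f1_by_model[best_model], 3))
print(likely_churners)
```

Note that the threshold comparison is strict, so a customer with a score of exactly 0.5 is not counted as likely to churn.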
Need a refresher? Review the following lessons in Academy to gear up:
Happy solving!
Source: https://kaggle.com/competitions/customer-churn-prediction-2020
The Academy Team