Feature Engineering - Interaction Timing

phottovy
13 - Pulsar

Hi Alteryx Community,

 

I am currently building a data set of customer interactions to use in a predictive model. I have been experimenting with the "Build Features" tool from the Intelligence Suite, and I noticed quite a few options related to Date fields, such as "Avg Time Between" and "Time Since Previous". These are great ideas for features to include in my data set, but I have a more fundamental feature engineering question that I'm hoping the community can offer ideas, suggestions, or best practices for. Here's a very simplified example:

 

I am using the last six months of interactions with my customers to predict who will churn in the upcoming month. Here is some information and a sample table for several different customers along with whether they churned or not:

  • Customer_A - 6 total interactions over six months, on average they were 3 days apart and all occurred during the most recent month
  • Customer_B - 6 total interactions over six months, on average they were 3 days apart and all occurred 5+ months ago
  • Customer_C - 6 total interactions over six months, 5 occurred in the last week and one several months ago, on average they were 15 days apart
  • Customer_D - 6 total interactions over six months, one interaction occurred during each month during the window
  • Customer_E - New customer with only 1 interaction that happened 30 days ago
  • Customer_F - No interactions in the past 12 months 

 

Customer | Total Interactions | Avg Time Between (days) | Time Since Previous (days) | Churn (Target)
A        | 6                  | 3                       | 10                         | N
B        | 6                  | 3                       | 150                        | Y
C        | 6                  | 15                      | 2                          | Y
D        | 6                  | 25                      | 15                         | N
E        | 1                  | ?                       | 30                         | N
F        | 0                  | ?                       | ?                          | Y
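
To make it concrete how the three columns fall out of a raw list of interaction dates, here is a minimal sketch in plain Python (not Alteryx tooling). The function name `interaction_features` and the `as_of` scoring date are made up for illustration. It also shows where the question marks come from: with fewer than two interactions there are no gaps to average, and with none there is no previous interaction to measure from, so the sketch returns `None` instead of guessing a number.

```python
from datetime import date
from statistics import mean

def interaction_features(dates, as_of):
    """Return (total, avg_days_between, days_since_previous) for one customer.

    Values that cannot be computed come back as None rather than 0, so they
    are not mistaken for "0 days apart" or "0 days ago".
    """
    ordered = sorted(dates)
    total = len(ordered)
    # A gap needs two consecutive interactions.
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    avg_between = mean(gaps) if gaps else None
    # Recency needs at least one interaction.
    since_previous = (as_of - ordered[-1]).days if ordered else None
    return total, avg_between, since_previous

as_of = date(2024, 7, 1)          # hypothetical scoring date
customer_e = [date(2024, 6, 1)]   # one interaction, 30 days before as_of
customer_f = []                   # no interactions at all

print(interaction_features(customer_e, as_of))  # (1, None, 30)
print(interaction_features(customer_f, as_of))  # (0, None, None)
```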

 

My basic questions are:

  1. How should I prepare my data to let the model know which interactions are more recent? The first four customers are all relatively similar if you look at the table, but their timelines are clearly very different in reality.
  2. For the last two customers, how do I fill in the question marks? How do you calculate an average time between interactions from one or no interactions, and what do you use for time since previous when there are no interactions at all? Plugging in a zero would make it seem like they had an interaction 0 days ago and that their interactions were 0 days apart.
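
To illustrate what I mean by both questions, here is a rough sketch of one possible direction, again in plain Python and not something I know any tool to do out of the box. It counts interactions in consecutive 30-day windows counting back from the scoring date, and it pairs any missing value with an explicit "is missing" flag instead of a bare zero. The helper names, the 30-day window, and the fill value of 999 are all assumptions picked for the example, not recommendations.

```python
from datetime import date, timedelta

def recency_bucket_counts(dates, as_of, bucket_days=30, n_buckets=6):
    """Count interactions per window, counting back from as_of.

    Index 0 is the most recent window (0 to bucket_days - 1 days ago).
    Interactions after as_of or older than the last window are ignored.
    """
    counts = [0] * n_buckets
    for d in dates:
        age = (as_of - d).days
        idx = age // bucket_days
        if age >= 0 and idx < n_buckets:
            counts[idx] += 1
    return counts

def with_missing_flag(value, fill):
    """Return (filled_value, is_missing) so the gap stays visible to the model."""
    return (fill, 1) if value is None else (value, 0)

as_of = date(2024, 7, 1)  # hypothetical scoring date
# Customer_A-like: 6 interactions, 3 days apart, most recent 10 days ago.
customer_a = [as_of - timedelta(days=10 + 3 * i) for i in range(6)]
# Customer_B-like: same spacing, but most recent 150 days ago.
customer_b = [as_of - timedelta(days=150 + 3 * i) for i in range(6)]

print(recency_bucket_counts(customer_a, as_of))  # [6, 0, 0, 0, 0, 0]
print(recency_bucket_counts(customer_b, as_of))  # [0, 0, 0, 0, 0, 6]
print(with_missing_flag(None, fill=999))         # (999, 1)
```

With something like this, Customers A and B no longer look alike even though their totals and average gaps match, and Customers E and F carry a flag rather than a misleading zero. Whether that is a sensible approach is exactly what I'm unsure about.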

 

I ultimately want to include information such as the type of interaction or who the interaction was with (e.g. a fancy dinner with our top salesperson vs. an unsolicited voicemail from a summer intern), but to start I'm just looking for any advice people have on the more straightforward example above. I'm sure there are many approaches, and any suggestions will be greatly appreciated.
