Here is this week’s challenge. I would like to thank everyone for playing along and for your feedback. The link to the solution for last week’s challenge, #10, is HERE. For this exercise, let’s look at some simple text mining that can be performed with Alteryx. There are several ways to approach this challenge; I will provide one solution that uses a batch macro and one that does not. It is a great example of how batch macros can simplify a workflow.
The use case:
A manufacturing company receives daily customer complaint data from its call centers about the medical parts it distributes to its customers. The company monitors these comments to understand which parts and part groups have the highest complaint rates, which helps it prioritize which parts to focus on from a development standpoint.
In this exercise, take the customer complaint data (Field_6 in the Test2 data) and identify which bucket each complaint falls into. A complaint can fall into multiple buckets and must be flagged accordingly, as these complaints take the highest priority. Then create an aggregate view showing which buckets or bucket pairings have the highest number of complaints.
This is only a subset of the data, so not all records will be assigned to buckets; unassigned records can be ignored.
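The challenge itself is meant to be solved in Alteryx, but the underlying logic can be sketched in a few lines of Python. In this sketch the bucket names and keyword lists are made-up placeholders (the real buckets come from the challenge's input file); it shows how a complaint can land in multiple buckets and how bucket pairings can be counted.

```python
from collections import Counter
from itertools import combinations

# Hypothetical buckets and search terms -- the actual ones
# come from the challenge's input data, not from this list.
BUCKETS = {
    "Packaging": ["package", "seal", "label"],
    "Sterility": ["sterile", "contamination"],
    "Fit":       ["fit", "size", "loose"],
}

def assign_buckets(complaint: str) -> list[str]:
    """Return every bucket whose keywords appear in the complaint text."""
    text = complaint.lower()
    return [bucket for bucket, words in BUCKETS.items()
            if any(w in text for w in words)]

def aggregate(complaints: list[str]) -> Counter:
    """Count complaints per bucket and per bucket pairing."""
    counts = Counter()
    for c in complaints:
        hits = assign_buckets(c)
        counts.update(hits)                           # single-bucket counts
        counts.update(combinations(sorted(hits), 2))  # bucket pairings
    return counts
```

A complaint such as "The seal on the package was broken and it was not sterile" would increment both the Packaging and Sterility buckets plus the (Packaging, Sterility) pairing, mirroring the multi-bucket flagging described above.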
This exercise is a packaged Alteryx module due to the size of the input file. Double-click it after saving the attachment and it should extract and open in Alteryx.
This article has been updated with two solutions.
I love comparing all these solutions and learning more about how to utilize different tools. I'd like to see an explanation of what the search macro is doing. I've attached a screenshot of my workflow in the spoiler window. It works great if you know how many words you are looking for. On my machine, it finished in under 3 seconds, compared to 10 seconds for Solution 1 and 12 seconds for Solution 2. :manhappy:
In looking at Solution 1, I learned that I could have eliminated the Sort and Multi-Row Formula tools I used by using the concatenation option in the first Summarize tool. I tested that approach in my workflow; it also worked and took the same amount of time. I found the tricky part to be how to introduce the search items when you don't know how many you will have. To be on the safe side, I also included a test to ensure each complaint ticket had only one row of data. I'm happy to send a packaged workflow to anyone who wants it.
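The tricky part mentioned above, handling an arbitrary number of search items, can be sketched outside Alteryx as well. Assuming the search terms arrive as a plain list loaded at runtime (names and data here are hypothetical), the idea is to emit one row per complaint-term match, roughly what an Append Fields plus Filter combination, or a batch macro, does in an Alteryx workflow:

```python
import re

def flag_matches(complaints, search_terms):
    """Emit one (ticket_id, term) row per matching search term.

    complaints   -- iterable of (ticket_id, text) pairs
    search_terms -- list of any length; nothing is hard-coded,
                    so the term list can grow without workflow changes.
    """
    rows = []
    for ticket_id, text in complaints:
        lowered = text.lower()
        for term in search_terms:
            # Whole-word match so "fit" does not match "benefit".
            if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered):
                rows.append((ticket_id, term))
    return rows
```

Because the terms are data rather than hard-coded conditions, adding a new search item means appending to the list, not editing the logic, which is exactly the flexibility the batch-macro solution provides.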
Similar solution to the two provided; the differences in approach are in the spoiler below.