

Two Unstructured columns need to be organized

Sriram369
5 - Atom

I have a very large data set (more than 3 lakh rows) in which two columns are unstructured, meaning the same component is mentioned in various formats. The aim is to filter the data by individual component. A sample is attached for reference.

Luke_C
17 - Castor

@Sriram369 What would the desired output look like? 

KarolinaRoza
11 - Bolide

Hi,

I would start with the Summarize tool (group by Remarks, count of Item Name), followed by a Sort tool on Count, descending.

This will let you look at the most common Remarks and maybe come up with some Filter tools, for example Contains([Remarks],"STORE"), to create subsets of the original data that you can then group by specific category.
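
For anyone who wants to prototype the same idea outside Designer, here is a minimal Python sketch of the count, sort, and filter steps. The sample rows are made up (only the column names Remarks and Item Name come from the thread), and the filter ignores case, which is a choice made here rather than something the post specifies.

```python
from collections import Counter

# Hypothetical sample rows; only the column names come from the thread.
rows = [
    {"Item Name": "Bolt M8", "Remarks": "STORE ROOM A"},
    {"Item Name": "Bolt M8", "Remarks": "store room a"},
    {"Item Name": "Nut M8", "Remarks": "MAIN STORE"},
    {"Item Name": "Washer", "Remarks": "WORKSHOP"},
    {"Item Name": "Washer", "Remarks": "WORKSHOP"},
]

# Summarize + Sort: count rows per Remarks value, most frequent first.
remark_counts = Counter(row["Remarks"] for row in rows).most_common()

# Filter: keep rows whose Remarks contain "STORE", ignoring case.
store_rows = [row for row in rows if "store" in row["Remarks"].lower()]

for remark, count in remark_counts:
    print(remark, count)
```

Note that "STORE ROOM A" and "store room a" are counted separately here, which is exactly the kind of inconsistency the frequency list is meant to surface.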

 

It depends on what you need and how much detail you need.

Karolina

DawnDuong
13 - Pulsar

Hi @Sriram369,

It feels to me that you want to solve a classification problem. Based on my personal experience, you need to get the "keyword" list from a domain expert to narrow the field down into the key categories, and then iteratively whittle down the "residual" unmatched records.
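
The keyword approach can be sketched roughly as follows in Python. The categories, keywords, and sample remarks are all hypothetical; in practice the list would come from the domain expert, and each pass over the residual would suggest new keywords to add.

```python
# Hypothetical keyword list; in practice it would come from a domain expert.
KEYWORDS = {
    "Store": ["store", "warehouse"],
    "Workshop": ["workshop", "w/shop"],
}

def classify(remark):
    """Return the first category whose keyword appears in the remark, else None."""
    text = remark.lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return None

# Made-up remarks written in inconsistent formats.
remarks = ["MAIN STORE", "W/Shop bay 2", "Warehouse 3", "Yard"]

labelled = {remark: classify(remark) for remark in remarks}

# The residual is whatever no keyword matched; review it, extend KEYWORDS, repeat.
residual = [remark for remark, category in labelled.items() if category is None]
print(residual)
```

Substring matching is the simplest option; it can over-match (for example "restore" contains "store"), so a real list may need word-boundary checks.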

If you have access to the Word Cloud tool, that may be one way to get the initial key word list.
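
If the Word Cloud tool is not available, a plain word-frequency count gives a similar starting point. This is a stand-in sketch in Python, not what the Word Cloud tool does internally, and the sample remarks are invented.

```python
import re
from collections import Counter

# Made-up remarks; real data would come from the unstructured column.
remarks = ["Main Store - rack 4", "STORE/room A", "workshop bay 2", "Store (main)"]

# Split each remark into lowercase alphabetic words and pool them.
tokens = []
for remark in remarks:
    tokens.extend(re.findall(r"[a-z]+", remark.lower()))

# The most frequent words are candidates for the initial keyword list.
top_words = Counter(tokens).most_common(3)
print(top_words)
```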

dawn 
