Converting a concatenated string variable into columns with booleans
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hello,
I'm kind of new to Alteryx and haven't yet gotten the hang of manipulated tables in the program
I was wondering if anyone knows of an efficient way to convert a variable that is a concatenation of non-exclusive attributes (such as a variable named keywords that looks like this: "1920's|antique|american") into columns with a binary variable (variables named keyword_1920's , keyword_antique and keyword_american that are either TRUE or FALSE.
First, I would need to find all the unique keywords currently concatenated in the various keywords strings
Second, I would need to turn them into columns and append them somehow to the table.
Third I would need to populate the fields of the newly created columns with TRUE or FALSE based on the keyword appearing in the concatenated string of that row.
Any Ideas?
Thank you,
Talks2Rocks
Solved! Go to Solution.
- Labels:
- Best Practices
- Preparation
- Udacity
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Here's my take: I grouped all of the keyword combinations together then added a lookup value (TRUE) for each incoming record. Using a FIND REPLACE tool, I append the keyword and lookup value to each record. Next I crosstab the data and then fill in the Null values as false. It is likely easier for you to open the workflow, run it and watch each tool create output to see how this could apply. I limit the data to ID and Keywords on input. If you have more fields, I'd join back to that data after this process.
Cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Mark,
Thank you so much for the quick reply!
I understand your solution, but I wanted to make every keyword have its own separate column.
In the case you shared the columns would be:
keyword_american, keyword_canadian, keyword_1920's and keyword_antiques
(without combinations like american_antiques etc. )
Any ideas on how to do that?
Cheers!
Talks2Rocks
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
This isn't dynamic (new keywords require some handling), but it does the trick.
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Great!
This works. Thank you so much!
Any idea on how to make it dynamic with the addition of new keywords?
Cheers,
Talks2Rocks
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
That may require some macro development...
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
The output (as I understand it) needs to be one record per ID. You need to achieve a dynamic summary for all boolean fields getting the maximum (true > false) result.
Is it Sunday yet?
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Yes exactly,
@AdamR
@MarqueeCrew has the right idea. The question is how to do that with any number of keywords and rows as an input and not have to fiddle around manually.
Cheers,
Talks2Rocks
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Once again late to the party, but try this one if you're still looking for some tweaking :)
1. Text to columns (split to rows) by delimiter |
2. Summarize by keyword
3. Add "keyword_" to the front of each keyword (and remove any punctuation using RegEx), plus add a dummy field for "value" (needed for next step)
4. Cross-Tab tool to bring keywords in as headers, "Value" field for values
5. Append keyword columns to original data
6. Multi-Field formula for all fields (plus dynamic/unknown) except original keyword field using Contains formula & some more RegEx cleanup for punctuation
This should be dynamic for any number of new/unidentified keywords. Let me know! :)
Cheers!
NJ
