Split field and value from one cell into multiple columns
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi - I have a .csv file I need to parse and one of the fields contains a number of attributes (called ExtendedAttributes in the sample .csv file). The attributes have a pattern of fieldname:fieldvalue, The issue is that I do not know how many of these attributes will be in the field and I cannot split by comma as some of the field values contain commas within them.
I would like the result to have the initial columns and then a column for each of the attributes. For instance country, firstname, etc.Some of the new columns will have null because there isn't an attribute in the field for them.
ID | First Name | Last Name | Display Name | HireDate | Country | Firstname | iplanet-am-user-alias-list | nextcol | etc |
1JK5L | John | Jones | John Jones | 20190103 | US | John | jjones@myemail.com | ||
201938 | Beth | Becker | Beth Becker | 20180607 | US | null | bethbecker@myemail.com | ||
fe1305982309834 | Rose | Roth | Rose Roth | 20181011 | US | Rose r | rroth@myemail.com |
I have only done simple manipulation in Alteryx so I think I need to somehow incorporate REGEX and text to rows and possibly pivot the data but I am not really sure how to do this. Any help pointing me in the right direction would be great. Thank you again!
Solved! Go to Solution.
- Labels:
- Datasets
- Expression
- Parse
- Preparation
- Transformation
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
The general strategy I take in scenarios like this is to isolate that field so it can be parsed and transformed, then rejoin it to the original data. A RecordID tool at the beginning will make this possible, along with some parsing tools.
I have put together the attached example to get you started. It works for many fields and shows how this can be achieved. That being said, there are still some inconsistent punctuation and characters that need to be handled in further testing.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank you Charlie! I will download and take a look at your solution and will post how it works. I really appreciate the help!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @amw1 ,
solved without RegEx ...
I use the Text to Columns tool (separator "," and splitting "to rows") and "re-build" the original attribute using a Multi-Row formula tool. Sample workflow attached.
Best,
Roland
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Charlie,
I am not sure this will work because one of the attributes has commas within it:
dn: CN=John,OU=Temp Users,OU=Remote,OU=Users,OU=People,DC=us,DC=ad,DC=lfg,DC=com,
so this one attribute has the name of dn and has commas within its value. Splitting on the comma to start will split each of these values apart from what I can see. Also when I run this some of the values are column names so something else isn't quite working.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Agreed. Check out the solution by @RolandSchubert. It's brilliant and does a better join of handling those scenarios.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Roland,
Thank you for posting this! Downloading and taking a look today.
Thanks again!!
Anne
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Roland,
Wow this is great! It looks like it works on the data set and if there are any anomalies I'll address them separately. Thank you again this was a huge help!
Best,
Anne
