parsing raw data
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
I have a .csv file including billions of records like these in one column 'accounts':
accounts
{"com.viber.voip":" 000000000","org.securetelegram.messenger":"33333333","com.sgiggle.production.account":"Sync Tango friends","com.whatsapp":"WhatsApp","com.google":"xxxxx@gmail.com","com.imo.android.imoim":"imo"}
{"org.securetelegram.messenger":"55555"}
{"com.viber.voip":" 00000000","com.sgiggle.production.account":"Sync Tango friends","com.whatsapp":"WhatsApp","com.google":"xxx@gmail.com","com.imo.android.imoim":"imo","org.securetelegram.messenger":"0000"}
{"org.securetelegram.messenger":"666666"}
{"com.sonyericsson.localcontacts":"Phone contacts","com.whatsapp":"WhatsApp","org.securetelegram.messenger":"9999999"}
{"org.abbasnaghdi.messenger":"111111","org.securetelegram.messenger":"7777"}
sorting of every record is different from each other and every {} is in one row.
how can I parse this file completely automatic using Alteryx Designer like these:
com.viber.voip, org.securetelegram.messenger, com.sgiggle.production.account, com.whatsapp, com.google, com.imo.android.imoim, com.sonyericsson.localcontacts, org.abbasnaghdi.messenger
000000000, 33333333, Sync Tango friends, WhatsApp, xxxxx@gmail.com, imo, Null, Null
Null, 55555, Null, Null, Null, Null, Null, Null
00000000, 0000, Sync Tango friends, WhatsApp, xxx@gmail.com, 000000, Null, Null
Null, 666666, Null, Null, Null, Null, Null, Null
Null, 9999999, Null, WhatsApp, Null, Null, Phone contacts, Null
Null, 7777, Null, Null, Null, Null, Null, 111111
.....
.......
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks for your help.
Could you give me more information about each step?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
The JSON Parse tool is splitting the contents of each cell into multiple rows:
Some of the results have leading/trailing spaces, which get removed by the Data Cleansing tool. Then the results are turned around into your desired format using the Crosstab tool:
Crosstab replaces any spaces or special characters in the headers with an underscore, so the final Dynamic Rename tool simply replaces _ with .
