Hello,
I'm having trouble cleaning this data into the format shown below (second table).
Entity;"Alternative Name";"Story";"Year";"Records Lost";"Sector";"Method of Leak";"1st source";"2nd source";"3rd source";"Source name" |
River City Media;"Spam operator";"A dodgy backup has allegedly resulted in over a billion leaked email addresses |
Could someone please help me clean this kind of dirty data?
Also attached is the file I'm trying to clean.
The output should look like this:
Entity | Alternative Name | Story | Year | Records Lost | Sector | Method of Leak | 1st source | 2nd source | 3rd source | Source Name |
River City Media | Spam Operator | A dodgy backup has allegedly resulted in over a billion leaked email addresses | 2017 | 1.37e+09 | Web | Accidentally Published | https://betanews.com/2017/03/06/river-city-media-spam-database-leak/ | Beta News |
Thanks in advance.
Hi @Esha_Reddy
What I'd do is grab your input data and read it as a CSV, using "\0" as the delimiter, and untick "First Row Contains Field Names".
Once the 278 records (including the headers) are in Alteryx, just use a Cleanse tool to get rid of the punctuation in the values. (If you want, you can keep Field8 out of this cleaning step, since all the ":" and "/" characters in the URLs would otherwise be removed; you can strip the quotes from that field later with a Formula tool: ReplaceChar([Field], "char to find", "char to replace with").)
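As an aside, the ReplaceChar quote-stripping idea can be sketched with a plain string replacement in Python (the field value below is taken from your sample; this is an illustration, not Alteryx syntax):

```python
# Strip the quote characters from a single field value,
# analogous to ReplaceChar([Field], '"', "") in the Formula tool.
field_value = '"https://betanews.com/2017/03/06/river-city-media-spam-database-leak/"'
cleaned = field_value.replace('"', "")
print(cleaned)  # the URL with the surrounding quotes removed
```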
Next, use a Text to Columns tool to separate the 11 columns using ";" as the delimiter.
Finally, with the Dynamic Rename Tool, connect your data to the top input and select Take Field Names from the First Row of Data.
Your data will probably need further cleaning, but this should help you get started. I'm attaching the previous steps in a workflow to give you a better idea of the whole process.
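If it helps to see the idea outside Alteryx, the same steps can be sketched in Python (a rough sketch only, assuming semicolon-delimited input with quoted fields as in your sample; the RAW string below is a trimmed stand-in for the real file):

```python
import csv
import io

# Trimmed stand-in for the raw file (semicolon-delimited, quoted fields).
RAW = '''Entity;"Alternative Name";"Year"
River City Media;"Spam operator";"2017"
'''

def clean_rows(text):
    """Split semicolon-delimited text into rows, stripping the quotes --
    roughly what Text to Columns plus the quote cleanup do in the workflow."""
    reader = csv.reader(io.StringIO(text), delimiter=";", quotechar='"')
    return [row for row in reader if row]

rows = clean_rows(RAW)
# Dynamic Rename equivalent: promote the first row to field names.
header, records = rows[0], rows[1:]
print(header)   # ['Entity', 'Alternative Name', 'Year']
print(records)  # [['River City Media', 'Spam operator', '2017']]
```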
Best,
_AG_
PS: As a best practice, use a Select tool after the Text to Columns and remove the Original field.