Join the Alteryx Community’s Maveryx Summer Cup event! Compete, network with others, and earn your gold through a series of challenges from July 24th to August 11th. Learn more about the event here.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Best way to approach cleaning up a file with multiple similar columns

irmb
7 - Meteor

HI newbie here. Hello, team members. I am working on a file that should be easy to clean up but I am trying to find the most efficient way to do so.  The file has multiple columns but this are the critical ones (in this format):

Employee idlast namefirst nameCourse A title- statusCourse A title- completion dateCourse B title - statCourse B title- completion date Course C title- stCourse C- completion date
1234blueivycompleted12/8/19In progress   
1234blueivy  completed12/8/19  
1234blueivy    In progress 

 

 

 

I want it cleaned like this:

Employee idlast namefirst nameCourse titlestatuscompletion date

 

The biggest issue are these:

a) there are  about 10-15 columns with the completion status and an additional 10-15 columns with the title of the course and the completion date. The course title columns are not uniform. For example, sometimes the status column is named "Course title" stat, sometimes "Course title" status", sometimes the column has the course title but no "status" in the column's name. Same with the "completion" heading. I tried using Regex to first cleaning up the titles so they would be uniform and then join every thing using the column name. But that did not work because I could not figure out how to use regex to remove all unwanted characters and have all columns the same name.

 

b) The other issue is, there are several records for the one person. So, sometimes we end up with having about 30 records for the person to indicate their status for the 10 courses, because each record indicates the completion of the course.

 

I hope this is clear. I tried using transpose and formula using regex but I am a bit stuck now.

 

Thanks

 

15 REPLIES 15
MichalM
Alteryx Alumni (Retired)

You'd need to use RegEx Parse to do that. If you could send over couple of scenarios I can give you an idea about how to do this.

irmb
7 - Meteor

Here is a sample

 

Quiz: Species - Humans - Gender -Nationality - StatusQuiz: Species - Humans - Gender -Nationality - CompletionAssessment:  Customer - Review - Networks - StatusAssessment:  Customer - Review - Networks TeleNetworks - Status
MichalM
Alteryx Alumni (Retired)

Do you want to check the attached?

 

(.+)(- st[a-zA-Z]+\>|- com[a-zA-Z]+\>)

 

regex.png

irmb
7 - Meteor

@MichalM Awesome thanks! I tried the formula and overall it worked great, except in the cases where the word status ended with a 2 like so "status2" in those cases, the regex out ended up with "null".

 

Also, is there a place where I can learn how to form expressions in Alteryx? I have no idea what you sent means. I want to be able to create those on my own.

 

Thanks for your help,

 

 

MichalM
Alteryx Alumni (Retired)

@irmb 

 

This is because the expression only expects - st or - com followed by any number of letters. If you wanted this to match both status and status2 as well as an eventuality where a number is added to completion, it would look like the below

 

(- st\w+\>|- com\w+\>)

 

The expressions we use here are called Regular Expressions and are not Alteryx specific. It is a language of sort that helps you express patterns in a piece of text. Having said that, there are number of great resources available on Alteryx Community that will help you learn more about them

 

 

irmb
7 - Meteor

YOu are awesome! I combined both expressions and it worked! Thanks

Labels