Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Text to Columns/Reg Ex help to clean a list of company names

novawaly
7 - Meteor

Hi, 

I've been extracting names using a screen grab tool and getting them into Excel and now attempting to clean them up. 

 

For some reason, some names extract fine and have one company per row, other times, it'll have 5-6 companies that'll be listed in one row. 

 

Another strange thing is looks it'll end a lot of rows with a 0 or a CJ or Cî (not sure why) but I'm attempting to use these characters to clean my list so that each company has it's own row. 

 

When I've tried to use text to columns and I split on 0, CJ - it seems to behave strangely and I cant quite figure out what/how to accomplish this. 

 

Is this something I'd need to use reg ex for?

12 REPLIES 12
binu_acs
21 - Polaris

@novawaly 

binuacs_0-1646750005979.png

 

novawaly
7 - Meteor

Thanks for the quick reply - however, when I look at this solution - its separating out the Inc. or Ltd to their own rows. I could just clean that up, however it's also separating our the foreign filing designations (like s.r.o) etc

 

novawaly_0-1646750258046.png

 

binu_acs
21 - Polaris

@novawaly 

binuacs_0-1646750529857.png

 

novawaly
7 - Meteor

Thanks again for the quick response! @binu_acs 

 

Now the only problem is certain names are missing the first letter? like this company is called Continental. The first "C" dropped off

novawaly_0-1646751219811.png

 

binu_acs
21 - Polaris

@novawaly I used Trim function instead of TrimRight. please check now

 

binuacs_0-1646751628217.png

 

novawaly
7 - Meteor

I see - I'm having a hard to time following so I cant make adjustments. I'm realizing some of the cells can be quite long - should changing the data type at the beginning fix that?

 

Also, is there anway to also replace the CI at the end?

 

novawaly_0-1646752324660.png

 

novawaly
7 - Meteor

I was trying to do this by first, replacing all the "CJ"s and "CI"s with 0 then using parsing to rows on the 0.

 

Unfort, when I'm using ReplaceChar([Names],"CJ","0"), it's putting two 00s for CJ instead of just 0. 

 

So i can learn as I go, is there a reason why that path to solution is less optimal and do you have any ideas why it's replacing with double 0?

binu_acs
21 - Polaris

@novawaly use TrimRight Function 

binuacs_0-1646752668285.png

 

binu_acs
21 - Polaris

@novawaly ReplaceChar([Names],"CJ","0") woks on each character thats why it was replacing '00' for CJ, 

Labels
Top Solution Authors