Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Cleaning up Excel input with Nickname in Quotes

JPSeagull
8 - Asteroid

Hi. I have to ingest an Excel file where the send insists on adding quotations around a manager's name. This causes join issues when summarizing later.

 

I have this:

 

IF ([MANAGER ] = 'John Jack Jones' THEN 'John "Jack" Jones' ELSE [MANAGER ] ENDIF

 

Note that the nickname of Jack coming in from the Excel file has the inverted quotation marks. Whereas Alteryx is a different format. I tried different suggestions searching around here, but nothing comes close to solving. Any suggestions?

5 REPLIES 5
AGilbert
11 - Bolide

Here are two examples of regex expressions which could be helpful. The first will identify where a nickname exists and the second accomplishes the removal of the nickname.

 

Screenshot 2024-05-13 113203.png

 

Side note: your if statement above identifies a string, then returns it without any change and returns all other non-matching strings without changes. So, I'm unsure what your expected output is. If my answer is off, then we can move the capturing groups (the parenthesis) in the regex replace expression to get the right answer. 

JPSeagull
8 - Asteroid

@AGilbert This is pretty cool and helpful. I also had an epiphany after posting my question - I have a database table I pull each month and the manager Jack is listed with clean quotes around his middle name. The Excel file we get each month updating all the newly promoted managers is manually managed and we use that to check for changes. Every month, the names with nicknames or alias in quotes causes us fits. So, I was trying to just hard code individual names and that is why I asked the community. However, I realized, it would be easier to clean the database name like this:

 

IF [MANAGER] = 'John “Jack” Jones' THEN 'John "Jack" Jones' ELSE [MANAGER] ENDIF

 

And for the Excel input, I cleaned up the name like this: 

 

IF Contains([MANAGER ], 'Jack') THEN 'John "Jack" Jones' ELSE [MANAGER ] ENDIF

 

Using contains made it easier to work with the Excel input and now both Manager name outputs are the same and joins perfectly (I wish they would use employee IDs). This only is the solution if we only have one Jack in our manager pool. So for now, we have to work with this until there is another Jack joining the company.

AGilbert
11 - Bolide

What happens when you have a Jackson or Jackie in the excel input? 😬

 

Preprocessing (cleaning) data is a good strategy, but you have to anticipate some edge cases which may not exist yet. Also, I would advise against hardcoding values in an expression. That is going to become very tedious to keep updated. If you share a workflow we can get something more robust in place. 

JPSeagull
8 - Asteroid

@AGilbert - 100% agree and I thought of that and right now we have no Jackie or Jackson in the management team. I have asked them to let me create a new column for Aliases or nicknames, but no one is budging. Meanwhile when I run the monthly report, I have to reconcile the issues of mismatches because the quotes around the nickname are not the same format as the quotes around the Excel file nicknames.

 

Here is a mock up of the challenge. Somehow, the font the person sending me the Excel file forces those upside down quotes so it comes over weird. I took your advice about the chances of a Jackie or Jackson and decided to just replace each side of the quotes. The attached workflow seems to do the trick, but I would feel better addressing any chance of any version of quotes can be cleaned up and use the Alteryx version.

AGilbert
11 - Bolide

I think that replacing the smart/curly quotes with straight ones would work just fine. That seems to be the only difference in the two formats. Just wrap this up as a formatting macro and have it available each time you have to use data from that source. 

Labels