SOLVED

Is this a job for Fuzzy Matching? I need to extract common values from a string.

jackdaniels
8 - Asteroid

 

I need to extract each University from the String field and place it in new column Institution.

 

The University could be named in many different ways throughout all the records.

 

Is this a job for fuzzy matching or something else?

 

String | Institution
University of Limerick - Centre for Robotics & Intelligent Systems< | University of Limerick
University of Limerick - Synthesis & Solid State Pharmaceutical Centre (SSPC)< | University of Limerick
University College Dublin - UCD School of Languages, Cultures & Linguistics<br | University College Dublin
University of Leiden - Institute of Political Science of the Faculty of Social and Behavioural Sciences < | University of Leiden
Dublin City University - School of Applied Language and Intercultural Studies<br | Dublin City University
University of British Columbia<b | University of British Columbia
NHTV Breda University of Applied Sciences - The Academy for Digital Entertainment<b | NHTV Breda University of Applied Sciences
Warwick Business School, The University of Warwick | University of Warwick
University College Dublin - UCD School of Mathematics and Statistics<br | University College Dublin
6 REPLIES
NickC
Alteryx Alumni (Retired)

Hello,

 

I have used a parsing tool to pull the relevant string from the text.  The challenge is identifying the logic to break it out. 

 

I used the RegEx tool with the Output Method set to Parse and the following regular expression: (.*-|.*&|,.*). I'm sure somebody in the community can put together a shorter expression, but this got me to the goal.

 

Please see completed workflow attached.

 

Thanks,

Nick

 

jackdaniels
8 - Asteroid

Thanks

 

MarqueeCrew
20 - Arcturus

Hey @NickC,

 

I do have an alternative to your wonderful formula:

 

(.*?)\s*[^a-z\s].*

It is a bit wordy, but here it goes:

 

(.*?) = Create a group of any characters, as few as possible, up until the first time that you encounter whatever comes next.

\s* = zero or more whitespace characters, followed by

[^a-z\s] = a single character that is neither a letter nor whitespace (for example a hyphen, a comma or <)

.* = followed by anything

 

The RegEx tool defaults to case insensitive, so a-z covers capital letters as well.  Based upon the examples provided, this will work for you, @jackdaniels.  The extra benefit is that there will be no trailing spaces in the output.
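For anyone who wants to try the expression outside Alteryx, here is a minimal sketch in Python. It is only an illustration, not part of any attached workflow: the function name extract_institution is made up for this example, and re.IGNORECASE is an assumption standing in for the RegEx tool's case-insensitive setting.

```python
import re

# The pattern from the post above. re.IGNORECASE is assumed here to mimic
# the case-insensitive setting of the Alteryx RegEx tool.
PATTERN = re.compile(r"(.*?)\s*[^a-z\s].*", re.IGNORECASE)


def extract_institution(text):
    """Hypothetical helper: return the text that precedes the first character
    that is neither a letter nor whitespace, or None if there is no such
    character in the string."""
    match = PATTERN.match(text)
    return match.group(1) if match else None


# A few of the sample strings from the question.
samples = [
    "University of Limerick - Centre for Robotics & Intelligent Systems<",
    "University College Dublin - UCD School of Mathematics and Statistics<br",
    "University of British Columbia<b",
]

for sample in samples:
    print(repr(extract_institution(sample)))
```

Because \s* sits outside the group, the space before the hyphen is not captured, which is where the "no trailing spaces" benefit comes from. Two things to watch for: a string with no punctuation at all does not match, and a row such as "Warwick Business School, The University of Warwick" is cut at the comma, so it yields "Warwick Business School" rather than "University of Warwick".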

 

Cheers,

Mark

 

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
jackdaniels
8 - Asteroid

That is amazing. I really wish I properly understood regex. 

 

Thanks @NickC and @MarqueeCrew for the explanation.

MarqueeCrew
20 - Arcturus

Here's a video to start you off...

 

 

Cheers,

Mark

bharti_dalal
10 - Fireball

hi @jackdaniels,

 

While you are learning regex, you can also get the same results using Text To Columns tools. I am attaching the solved workflow.
