Join the Alteryx Community’s Maveryx Summer Cup event! Compete, network with others, and earn your gold through a series of challenges from July 24th to August 11th. Learn more about the event here.

Alteryx Server Discussions

Find answers, ask questions, and share expertise about Alteryx Server.
SOLVED

Group by similar name

JAM9210
5 - Atom

Hi everyone,

 

I found a couple similar posts but couldn't find one that had a solution for my specific scenario. I am also fairly new to creating workflows from scratch, so I apologize in advance.

 

I'm trying to identify IDs that have multiple company names and addresses linked to them from a large data set of thousands of records. I was able to group by ID and Company name, however the data could use some cleansing as some of the company names are being counted more than once due to the different naming conventions (see sample attached). How can I clean the data to group by similar company name under the same ID?

 

I hope my question makes sense.

 

StateIDCompany
WA123456789BIG POPPA
AZ123456789BIG POPPA INC
AZ123456789BIG POPPA INCORPORATED
NY123456987AYE AND BEE INC
TX123456987AYE & BEE INC
FL254170270BEYONCE
FL254170270SOLANGE
CA313076766PARKWOOD WAY
WA313076766PARKWOOD WY
AZ678912345STONEBRICK PL
AZ678912345STONERICK PL
TX876543219NICKI MONET
AZ876543219NICKY MONET
CA987654321MACY GRAY
AZ987654321MACY GREY
CA987654321RIH STUDIOS
6 REPLIES 6
PanPP
Alteryx Alumni (Retired)

Hi @JAM9210 

 

Can you provide what the output should look like?

 

We will need to utilize the Fuzzy match tool in this scenario.

 

TimN
13 - Pulsar

Try Fuzzy Match and Make Group.  Something like this.

TimN
13 - Pulsar

Here's a better example if I'm following your problem.  I haven't done this in a while ;).

 

TimN_0-1670631488000.png

 

JAM9210
5 - Atom

The output should look something like the below. I want to be able to group similar company names (i.e. BIG POPPA INC and BIG POPPA INCORPORATED) and then count how many actual different named companies fall under the same ID. State2, ID2, and Company2 would be the result columns after cleansing the company name field. I also added the last line to have a different company name under that same ID to add to the count:

 

StateIDCompanyState2ID2Company2Count
WA123456789BIG POPPAWA123456789BIG POPPA2
AZ123456789BIG POPPA INCAZ123456789BIG POPPA2
AZ123456789BIG POPPA INCORPORATEDAZ123456789BIG POPPA2
NY123456987AYE AND BEE INCNY123456987AYE AND BEE INC1
TX123456987AYE & BEE INCTX123456987AYE AND BEE INC1
FL254170270BEYONCEFL254170270BEYONCE2
FL254170270SOLANGEFL254170270SOLANGE2
CA313076766PARKWOOD WAYCA313076766PARKWOOD WAY1
WA313076766PARKWOOD WYWA313076766PARKWOOD WAY1
AZ678912345STONEBRICK PLAZ678912345STONEBRICK PL1
AZ678912345STONERICK PLAZ678912345STONEBRICK PL1
TX876543219NICKI MONETTX876543219NICKI MONET1
AZ876543219NICKY MONETAZ876543219NICKI MONET1
CA987654321MACY GRAYCA987654321MACY GRAY1
AZ987654321MACY GREYAZ987654321MACY GRAY1
CA987654321RIH STUDIOSCA987654321RIH STUDIOS1
CA123456789XYZ STUDIOSCA123456789XYZ STUDIOS2
JAM9210
5 - Atom

Thanks Tim! I think we're on the right track here :)

TimN
13 - Pulsar

Here's how I understand it...

 

If this works, please mark my answer as the resolution.

 

Thanks.