Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

A Find & Join combination - how to accomplish a complex join?

Glycogenolysis
6 - Meteoroid

Hi there community!

 

I have some financial data that I am trying to analyse, unfortunately I can't seem to find a way to combine the join and find replace tools in a way that will accomplish what I am trying to do. My current solution (see screenshot below) is very messy and manual whenever it needs to be updated.

 

My actual data I can't upload, but I have created a small sample dataset using different data to show what I am trying to achieve.

 

Data table; 

 

CityShop Name
BerlinKaufCentrum
BerlinMax Werk NER
BerlinEDEKA Markt
BerlinLIDL
LondonBilly Corner Store
LondonBorough Market
LondonTescos Supermarket
LondonLeasure Centre

 

Reference Table; 

CityText in Shop Name findShop Category
BerlinKaufGeneral
BerlinnerHardware
BerlinMarktSupermarket
BerlinSupermarketSupermarket
LondonSupermarketSupermarket
LondonMarkMarket
LondonCorner

Convinence

 

Desired Results Table ; 

 

CityShop NameShop Category
BerlinKaufCentrumGeneral
BerlinMax Werk NERHardware
BerlinEDEKA MarktSupermarket
BerlinLIDL-
LondonBilly Corner StoreConvinence
LondonBorough MarketMarket
LondonTescos SupermarketSupermarket
LondonLeasure Centre-

 

So what I want is to enrich the data table using my reference table, I want to use a join on the [City] to [City] and then simultaneously a findstring to match some amount of the [Shop Name] to the [Text in Shop Name find], returning the [Shop Category]. My actual data is ofcourse much much larger in scale and complexity than this so the solution needs to scale to ~100k Shop Names rows in the data table and ~1k [Text in Shop Name find] rows in the reference table.

 

The problems with my current approach is that I can either join on the City or do the findstring, but doing both at once I can't find a way, my current solution splits the data by city using 20+ filters so that I can then apply the findstring, but this is horrible to maintain.

 

There might be details I am missing out here, but I hope I have been clear in what I need to achieve, many thanks to anyone who helps 🙂

 

Glycogenolysis_0-1628526184243.png

 

13 REPLIES 13
atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

Here is how you can do it.

 

Workflow:

atcodedog05_0-1628526824749.png

 

1. Using find and replace to get required columns.

2. Using filter to match cities.

 

Hope this helps : )

 

Glycogenolysis
6 - Meteoroid

Thanks that is so simple I am annoyed I didnt think of it - I'll give it a go and report back if it works for my full usecase, but I expect it to. Many thanks

atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

I added some new data. I have updated my workflow to improve the city join case and keep all rows. Please take a look into it.

 

Workflow:

atcodedog05_0-1628527053428.png

 

Hope this helps : )

 

Glycogenolysis
6 - Meteoroid

Hi @atcodedog05 

Thanks a lot for your help, but even with that ammendment I still have some issues due to substrings not matching when a full string matches instead - I've slightly adjusted my input data to match some of my financial data where I am seeing large holes of missing matches.  In the below "Tescos Supermarket" is matching to "Supermarket" in Berlin instead of "Super" in London, hense I miss this match after the formula result and the Find Replace completely misses this field - I need a more matrix way of matching these so that I can get multiple matches.... and im back to being a bit puzzled.

 

Data Table;

CityShop Name
BerlinKaufCentrum
BerlinMax Werk NER
BerlinEDEKA Markt
BerlinLIDL
LondonBilly Corner Store
LondonBorough Market
LondonTescos Supermarket
LondonLeasure Centre

 

Ref Table;

 

CityText in Shop Name findShop Category
BerlinKaufGeneral
BerlinnerHardware
BerlinMarktSupermarket
BerlinSupermarketSupermarket
LondonSuperSupermarket
LondonMarkMarket
LondonCornerConvinence

 

Output Data;

 

CityShop NameShop CategoryCities
BerlinKaufCentrumGeneralBerlin
BerlinMax Werk NERHardwareBerlin
BerlinEDEKA MarktSupermarketBerlin
BerlinLIDL- 
LondonBilly Corner StoreConvinenceLondon
LondonBorough MarketMarketLondon
LondonTescos Supermarket-Berlin
LondonLeasure Centre- 

 

atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

I have posted an updated workflow above. It works for this scenario.

 

Workflow:

atcodedog05_0-1628614460492.png

 

Hope this helps : )

Glycogenolysis
6 - Meteoroid

@atcodedog05  The Output data has >[London] [Tescos Supermarket] as "-" for the [Shop Category]

atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

Your output and my output is matching. Is there any issue?

Glycogenolysis
6 - Meteoroid

Yea out ouputs are matching. The issue is that I have "Tescos Supermarket" which matches to "Supermarket" in Berlin and does not match to "Super" in London, the City of the "Tescos Supermarket" is London, so I only get the 1 match back from Berlin leading to a "-". For my full workflow I need to get that match on the "Super" and then I can also see the London results hense this "Tescos Supermarket" should get a category, but in the current process it does not due to matching to Berlin instead of the partial string in London. 

Does that make sense? I have tried my best to explain the issue - in this example this looks like a relatively trivial problem, but in my data this is thousands and thousands of rows 😞

atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

I understood now 😅 Let me see what can be done. How big your ref store list?

Labels