Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

A Find & Join combination - how to accomplish a complex join?

Glycogenolysis
6 - Meteoroid

Hi there community!

 

I have some financial data that I am trying to analyse, unfortunately I can't seem to find a way to combine the join and find replace tools in a way that will accomplish what I am trying to do. My current solution (see screenshot below) is very messy and manual whenever it needs to be updated.

 

My actual data I can't upload, but I have created a small sample dataset using different data to show what I am trying to achieve.

 

Data table; 

 

CityShop Name
BerlinKaufCentrum
BerlinMax Werk NER
BerlinEDEKA Markt
BerlinLIDL
LondonBilly Corner Store
LondonBorough Market
LondonTescos Supermarket
LondonLeasure Centre

 

Reference Table; 

CityText in Shop Name findShop Category
BerlinKaufGeneral
BerlinnerHardware
BerlinMarktSupermarket
BerlinSupermarketSupermarket
LondonSupermarketSupermarket
LondonMarkMarket
LondonCorner

Convinence

 

Desired Results Table ; 

 

CityShop NameShop Category
BerlinKaufCentrumGeneral
BerlinMax Werk NERHardware
BerlinEDEKA MarktSupermarket
BerlinLIDL-
LondonBilly Corner StoreConvinence
LondonBorough MarketMarket
LondonTescos SupermarketSupermarket
LondonLeasure Centre-

 

So what I want is to enrich the data table using my reference table, I want to use a join on the [City] to [City] and then simultaneously a findstring to match some amount of the [Shop Name] to the [Text in Shop Name find], returning the [Shop Category]. My actual data is ofcourse much much larger in scale and complexity than this so the solution needs to scale to ~100k Shop Names rows in the data table and ~1k [Text in Shop Name find] rows in the reference table.

 

The problems with my current approach is that I can either join on the City or do the findstring, but doing both at once I can't find a way, my current solution splits the data by city using 20+ filters so that I can then apply the findstring, but this is horrible to maintain.

 

There might be details I am missing out here, but I hope I have been clear in what I need to achieve, many thanks to anyone who helps 🙂

 

Glycogenolysis_0-1628526184243.png

 

13 REPLIES 13
atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

Here is how you can do it.

 

Workflow:

atcodedog05_0-1628526824749.png

 

1. Using find and replace to get required columns.

2. Using filter to match cities.

 

Hope this helps : )

 

Glycogenolysis
6 - Meteoroid

Thanks that is so simple I am annoyed I didnt think of it - I'll give it a go and report back if it works for my full usecase, but I expect it to. Many thanks

atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

I added some new data. I have updated my workflow to improve the city join case and keep all rows. Please take a look into it.

 

Workflow:

atcodedog05_0-1628527053428.png

 

Hope this helps : )

 

Glycogenolysis
6 - Meteoroid

Hi @atcodedog05 

Thanks a lot for your help, but even with that ammendment I still have some issues due to substrings not matching when a full string matches instead - I've slightly adjusted my input data to match some of my financial data where I am seeing large holes of missing matches.  In the below "Tescos Supermarket" is matching to "Supermarket" in Berlin instead of "Super" in London, hense I miss this match after the formula result and the Find Replace completely misses this field - I need a more matrix way of matching these so that I can get multiple matches.... and im back to being a bit puzzled.

 

Data Table;

CityShop Name
BerlinKaufCentrum
BerlinMax Werk NER
BerlinEDEKA Markt
BerlinLIDL
LondonBilly Corner Store
LondonBorough Market
LondonTescos Supermarket
LondonLeasure Centre

 

Ref Table;

 

CityText in Shop Name findShop Category
BerlinKaufGeneral
BerlinnerHardware
BerlinMarktSupermarket
BerlinSupermarketSupermarket
LondonSuperSupermarket
LondonMarkMarket
LondonCornerConvinence

 

Output Data;

 

CityShop NameShop CategoryCities
BerlinKaufCentrumGeneralBerlin
BerlinMax Werk NERHardwareBerlin
BerlinEDEKA MarktSupermarketBerlin
BerlinLIDL- 
LondonBilly Corner StoreConvinenceLondon
LondonBorough MarketMarketLondon
LondonTescos Supermarket-Berlin
LondonLeasure Centre- 

 

atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

I have posted an updated workflow above. It works for this scenario.

 

Workflow:

atcodedog05_0-1628614460492.png

 

Hope this helps : )

Glycogenolysis
6 - Meteoroid

@atcodedog05  The Output data has >[London] [Tescos Supermarket] as "-" for the [Shop Category]

atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

Your output and my output is matching. Is there any issue?

Glycogenolysis
6 - Meteoroid

Yea out ouputs are matching. The issue is that I have "Tescos Supermarket" which matches to "Supermarket" in Berlin and does not match to "Super" in London, the City of the "Tescos Supermarket" is London, so I only get the 1 match back from Berlin leading to a "-". For my full workflow I need to get that match on the "Super" and then I can also see the London results hense this "Tescos Supermarket" should get a category, but in the current process it does not due to matching to Berlin instead of the partial string in London. 

Does that make sense? I have tried my best to explain the issue - in this example this looks like a relatively trivial problem, but in my data this is thousands and thousands of rows 😞

atcodedog05
22 - Nova
22 - Nova

Hi @Glycogenolysis 

 

I understood now 😅 Let me see what can be done. How big your ref store list?

Labels