Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Designer| Dynamic summarize and v-lookup

Mj9715
8 - Asteroid

Hi All, 

 

I have multiple data sets like below which I have created  using workflows in Alteryx. All datasets are firm and corresponding values. 

 

I have a final output sheet with a master database of unique firm names , against which I need to populate the value corresponding to it. 

 

For eg., 

Scenario 1 : the firm name Adv is only found once in all input datasets so the value 300,000 is the value pasted in the master output file. 

Scenario 2: a firm is not found in any dataset, it needs to be left blank 

Scenario 3 : firm is found in multiple datasets, the final value needs to be a sum of all values in the multiple dataset corresponding to its name. Eg., App final value needs t0 be (2500000 + 1600000)

 

Please note that the naming is a bit messy, i.e., one input may have "App limited" and another "App Ltd". This poses a problem when using "group by" in summarize tool. Attached file with input and desired output. Please can someone guide?

 

Input Dataset 1 
Firm Value 
Adv Limited300,000
App Ltd2500000
F2F700000
FD3500000
GFPL200000
Smith1100000
Input Dataset 2
FirmValue 
App Limited1600000
Craf Ltd 500000
Smith 10,100
6 REPLIES 6
patrick_mcauliffe
14 - Magnetar
14 - Magnetar

@Mj9715 Can you show us what you've tried so far; so we aren't duplicating any work?  Maybe something you've tried is just one step away.

Christina_H
14 - Magnetar

If you have many datasets you might want to use a dynamic input, but basically the process would be to input all your datasets, union them together, group and sum by firm name, then join to the master list.  Union any unmatched firms from the master list and you should have what you need.

 

(Data cleansing tools are included to deal with leading/trailing whitespace, you might not need them for the final data)

Mj9715
8 - Asteroid

Hi Christina, @Christina_H  I had already built till the summarize tool, however there was one issue I forgot to mentioned. For summarize tool to work, the group by field needs to be exactly the same. For eg it could be App Limited or App Ltd. Any way for summarize tool to do a partial match with just the first word for the group by?

Mj9715
8 - Asteroid

@patrick_mcauliffe Patrick, I have union the dataset to input all fields and tried to summarize using group by firm name. The issue I am having as explained below is varying name formats. For eg., one dataset calls it AP Ltd while another one says AP Limited. The master dataset may match one of the two.. however summarize isnt considering them as the same to group by and sum it up. 

Christina_H
14 - Magnetar

@Mj9715 This is really a data cleaning problem then.  If you're confident that you will get correct matches by cutting down to the first word only you could do that in a formula tool before matching.  Or you could create a list of common variations and either remove them all before matching or replace with a standardized form (e.g. replace Limited / LTD / LTD. with LTD)

 

It all comes down to what works best with your data.

patrick_mcauliffe
14 - Magnetar
14 - Magnetar

@Mj9715 Check out the Fuzzy Match tool. This does require some tuning on your part though.

You'll want to add a field identifying each of the sources so when you union them you can still identify where they came from:

 

patrick_mcauliffe_0-1633517849244.png

Generally, after the Fuzzy Match tool, a Make Groups tool is warranted.  This consolidates A:B, B:A matches so there's only one group identifier for many keys.

 

Also, a good idea to change the case of all text to be the same before running.

Labels