Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
RithiS
Alteryx
Alteryx

I remember the first serious reason I used Excel. It was autumn 2005, I was a college freshman, and had major aspirations: becoming a professional poker player. That was around the peak of poker. Chris Moneymaker and Greg Raymer won in 2003 and 2004 respectively. The first place prize for 2006 World Series of Poker Main Event was $12,000,000. I was engrossed in it playing with friends, family, and strangers both online and in-person.

 

As a video gamer, I’m used to having my “stats” tracked so I wanted the same with poker. Excel was my first choice after seeing a friend create his own. It was designed to track session dates, times, and monetary returns while blending online and offline results. It was nothing spectacular since all it had were basic functions and a pretty format.

 

The spreadsheet revealed a few important findings. My monetary return over time was terrible. I couldn’t control the variance and wasn’t mature enough to handle it. Essentially, I wasn’t very good. Nothing hurts more than the shattered dreams of becoming a professional poker player (totally not hyperbolic).

 

MFW I realize I'm not good at poker.MFW I realize I'm not good at poker.

MFW I realize I’m not good at poker.

 

It was time to move on and evolve my dreams. I used to love sports (I still do, but used to too), so I set my goal to join the front office of the Denver Nuggets or Broncos. Neither gave me the time of day, so I moved on to the next best thing: fantasy basketball and football.

 

I painstakingly copied and pasted NBA and NFL data into spreadsheets to create my own draft rankings. Then used them to manage my rosters with great results. At the time, the overhead was too much so I scraped them and decided to use my amazing and flawless judgement to manage fantasy teams going forward (I’ve won about two fantasy leagues since).

 

Semesters into my collegiate career, my finance and accounting professors began assigning projects and homework that required Excel. Saying I was excited is an understatement.

 

Why does this all matter? For better or worse, Excel is the number one BI tool and may stay that way for a long time. According to Business Insider, Microsoft CEO Satya Nadella says their best consumer product is Excel. For many including myself, Excel is their first experience with data preparation, blending, and analytics. I was a business student learning finance and accounting where spreadsheets are the backbone of those industries. So how did Excel lay this foundation for me? It starts with two functions: VLOOKUP and IF.

 

The IF function is outside the scope of this post, so we’ll explore the VLOOKUP function. Let’s do a quick refresher. The function finds a specified value in a table and returns a value from a specified column.

 

Here’s the rundown of a VLOOKUP example below:

  • The VLOOKUP arguments box shows for cell H3, which contains =VLOOKUP(G3,A:E,4,FALSE). (green box)
  • The Lookup_value is G3, which is the name Lee. (red boxes)
  • The Table_array is columns A through E. (blue boxes)
  • The Col_index_num is 4, which is the Sales column. (orange boxes)
  • The argument FALSE means the match must be exact to return the desired value.
  • The function will look for Lee in the leftmost column of the table array, which is First Name. If Lee is found, then Sales is returned. Since Lee does exist in Table 1, the function returns the Sales value $170,000. (purple boxes)

 

vlookup_example.png

 

As other articles have stated, Alteryx can duplicate VLOOKUP with the Join, Join Multiple, and Find Replace tools. Let’s look at how to duplicate this simple VLOOKUP with a Find Replace tool. As shown below, both tables are loaded with Input Data tools and connect to a Find Replace tool.

 

 findreplace_input.png

 

The tables as viewed in Alteryx:Table1_Table2.png

 

The Find Replace tool does the following:

  1. Looks for Name from Table 2 in the field First Name from Table 1.
  2. If the Name is found in First Name, then Sales is returned.

 

The output of the Find Replace tool has Sales appended to each name:

 findreplace_output.png

 

Let’s review the configuration of the Find Replace tool.

 

 findreplace_config.png

 The Find  section:

 

The tool takes each record from the Name field of Table 1 and looks for it in the First Name field of Table 2.

 

 

The Replace  section:

 

If found, append Sales record to the Name.

 

 

But wait! There is something wrong with the data that affects both the VLOOKUP and Find Replace. Table 2 will look for Angelina, but there are two Angelinas in Table 1. The VLOOKUP function returns $130,000, which is the first Sales for both Angelinas. The Find Replace tool returns the second Sales number of $150,000 for Angelina. Neither is wrong, but they don’t tell the whole story. To get all the data, Last Name will be included. Table 4 is an updated version of Table 2 with both Angelinas. We will perform a lookup on First Name and Last Name.

 

table4.png

 

So how do we perform a VLOOKUP on multiple columns? And what is the Alteryx equivalent?

 

A new column for each table is created by concatenating the First Name and Last Name fields. Either the CONCATENATE function or the “&” symbol may be used. The VLOOKUP function and Find Replace tool must point to Concatenated Name as the field to lookup and find.

 

Table 3 shows an example of the new Concatenated Name field in cell A3.

Table 6 is an updated version of Table 4 containing the Concatenated Name field. It also shows the updated VLOOKUP function in cell K3.

Table3_Table6.png

 

While this can be used for Find Replace, I don’t recommend it. It’s not scalable when additional lookup columns are needed later. Concatenating so many fields is time consuming and can lead to many mistakes. Instead, let’s use Find Replace’s cousin the Join tool.

 

The Alteryx workflow looks nearly identical as above.

join_input_wf.png

 

Table 1 is identical as before and will be joined to Table 4. Table 6 is not used since the Join tool doesn’t need a concatenated field to merge datasets.

Table1_Table4.png

 

The Join tool will blend Table 1 and 4 together based on the selected fields. Here is the configuration of the Join tool:

 

 join_config.png

 

The section Join by Specific Fields merges the two tables together based on First Name and Last Name. This does exactly what we want without having to create an additional field.

 

Joining by multiple fields in the Join tool is exactly like using a VLOOKUP function with multiple columns.

 

The Join tool is easier to scale and manage as dashboard and data requirements evolve.

 

 

Each sales amount outputs correctly with each name. Here’s the output of the Join tool:

 

join_output.png

 

Let’s do a quick recap of what we just did with VLOOKUPs, the Find Replace tool, and the Join tool:

  • Looked up Name from Table 2 to find in Table 1 to return Sales for each person.
  • Showed how the lookup works in the VLOOKUP function and the Find Replace tool.
  • Create a concatenated field using First Name and Last Name. This shows how to use multiple fields for the VLOOKUP function. This can be applied to the Find Replace tool as well.
  • Showed how to configure the Join tool to merge data on multiple fields. In this context, it is equivalent to the VLOOKUP field with multiple criteria.

 

The Alteryx Join tools are a natural progression from Excel’s VLOOKUP. As great as the VLOOKUP function is, its limitations keep it from easy scalability and automation. If you’re familiar with combining the INDEX and MATCH functions, then you might’ve realized that the Join tools cover this as well.

 

While the VLOOKUP function played a major role in shaping my Alteryx foundation, it didn’t do so alone. Other concepts include Excel’s IF function and SQL database concepts such as groups, joins, unions, and dimension and fact tables. I won’t touch them in this post, but may do so in the future. I’m 50/50 on starting my 99-part blog series discussing Excel and Alteryx, so be on the lookout!

 

What helped build your foundation for using Alteryx? Everyone has a different path to enlightenment and I would love to hear yours! Smiley HappyWoman HappyMan Happy

Rithi Son
Product Manager

Rithi started at Alteryx in March 2016 as a product engineer before becoming a product manager in 2019. He has worked as a business and data analyst in ecommerce and health care business intelligence utilizing Excel and SQL. Rithi lives in Denver enjoying life in the Colorado front range.

Rithi started at Alteryx in March 2016 as a product engineer before becoming a product manager in 2019. He has worked as a business and data analyst in ecommerce and health care business intelligence utilizing Excel and SQL. Rithi lives in Denver enjoying life in the Colorado front range.

Comments