Background
Most if not all of us have been guilty of writing large, hard-coded statements to filter our data, flooding our expression editors with an unreasonable amount of AND/OR operators, even if - at the back of our minds - we know it’s not best practice. Luckily, there are other ways of achieving the same result that can make things easier to implement and maintain. All three of these tips/methods involve the use of a lookup table in order to store our filtering values and so this can also help to make the process more dynamic, especially if we can populate these tables via a database or well-governed form/spreadsheet. It’s worth noting that, although we’re only looking at these examples in Alteryx, the concepts hold true regardless of the tool, meaning we can use the exact same techniques in SQL/Python and so on.
Filtering exact matches via joins
The first approach we’ll cover here is filtering for multiple discrete values in a column. An example of this would be isolating records relating to an origin airport of ‘Heathrow’ and ‘JFK’ from a dataset covering airline journeys. Whilst we could do this using a fairly simple filter expression along the lines of - [orig_airport] IN (‘Heathrow’, ‘JFK’) - what happens when we want to select records from, say, 75 airports? That starts to become a lot more hassle, right? Right. So that’s where our first method comes into play. Instead of typing out an enormous ‘IN’ statement, we can simply set up a lookup table and then conduct a Join on <field in main table> = <field in lookup table>.
Using a dataset of New York City restaurant inspections, let’s have a look at how that looks in practice. First we start with our main dataset: