This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
on 12-02-201601:35 PM - edited on 06-19-201910:46 AM by CristonS
You have multiple fields in your data that correspond to customer addresses. Some customers may have more than one address listed in their record. However, you want to whittle that list to one address per customer. That one address is the first, non-null address found when moving sequentially across a set of fields that contain address data.
For our example, we have three fields of data related to addresses: Address1, Address2 and Address3. The preferred field of data to use is Address1. However, if Address1 does not contain data, then we’ll use the data in Address2. If both fields of Address1 and Address2 do not have data, then we’ll use the data in Address3. The final output should contain a single list of the addresses highlighted in yellow in Figure 1.
Figure 1: The final output will contain a single list of the data highlighted in yellow.
Method 1: Write a Conditional Statement
The most common approach to this type of problem is to craft a conditional statement to select data based on a set of criteria (in this case, order). In the case of our data, it would look something like this:
IF IsNull([Address1]) AND IsNull([Address2]) THEN [Address3]
ELSEIF IsNull([Address1]) THEN [Address2]
ELSE [Address1] ENDIF
However, what if I had 20 fields of addresses instead of 3? Just getting that statement with three fields took me too long to write out! If you do have 20 fields, you might want to start typing that Formula now….
IF IsNull([Address1] AND IsNull([Address2]) AND IsNull([Address3]) AND IsNull([Address4]) AND IsNull([Address5]) AND IsNull([Address6]) AND IsNull([Address7]) AND IsNull([Address8])...
You get the idea. And now you’re thinking, “You’re going to tell me there’s a better way, right?!?” Well, yes...I am!
Method 2: Data Manipulation
An alternative method of solving this problem is to manipulate the data using the Transpose, Filter and Sample tools. I’ll share some advice from @RodL here: “If you want to make something…truly dynamic, then the ‘best practice’ when you are dealing with an unknown number of columns is to ‘verticalize’ the data”. In our case, we may know the total number of columns of address data we have; what we don’t know is which column the data we want is actually in.
Following @RodL’s suggestion, we’ll ‘verticalize’ the addresses using the Transpose tool. This stacks the addresses for each customer in order of the fields in the table (Figure 2). We’ll use the Client ID (or Record ID, if you’ve chosen to add one) as our Key Field and the fields that contain address data as our Data Fields.
Figure 2: All address fields per Record ID (or Client ID) are stacked vertically in order of the field sequence.
Since Null values are not usable records for our purposes, we’ll use a Filter to identify all the usable data (non-Null values). Now that our usable data is stacked vertically in order of field selection, we can Sample the first record from each Record ID (or Client ID) group. We’ll configure the Sample tool to identify the First N (where N = 1) Records from a group (Group by Record ID or Client ID).
Figure 3: Sample the first record from every Record ID or Client ID group.
After some data clean-up with a Select tool, we're left with a column of the selected address for each of our customers: