Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Data Padding

bfoll
6 - Meteoroid

Hello,

 

I'm brand new to Alteryx and have spent the last couple of hours going through the intro paint by numbers and trying to solve what I believe is a basic data prep use case. I'm not used to the tool and am having trouble googling as I'm not sure what I should be look at in the tool. I'll explain my use case and I would appreciate it if anyone could be kind enough to point me to some knowledge articles that would be relevant! 

 

I have two data sets that contain two different activities a person performs. Each data set only have a record for the persons NAME if they've performed that activity. I need to pad the data so that each dataset has rows for every person and has a record for three types of activity they could perform (we'll call it A, B and C) and if that person didn't perform that activity yet, it'd be blank. 

 

So if I have 3 people (Sarah, Mike, Tiffany):

 

Current Data:

PersonActivityCount
SarahA1
SarahB1
SarahC1
MikeA1

 

Desired Output:

PersonActivityCount
SarahA1
SarahB1
SarahC1
Mike

A

1

Mike

B

0

MikeC0
TiffanyA0

Tiffany

B0

Tiffany

C0

 

I saw on article mention a constant formula but that looked like it was only allowing me to insert one item. I then saw the "Generate Rows" icon but similarly couldn't figure out how it wanted the formula set up. I'm sure this is a very novice question but I appreciate any direction as I get my footing. :) 

 

Thanks!

4 REPLIES 4
Joe_Mako
12 - Quasar

I assume that there is some data source that has at least a person value of "Tiffany" to draw from, I called this data source "Other Person list". Using these two Inputs, the workflow Unions them together keeping only common subset, then Summarizes the result grouping by Person, another Summarize to group by Activity from your "Current Data" input, then Append the results of the two Summarize tools with the option Allow All Appends, then a Join Multiple tool with the result of the Append as the first connection, and Current Data as the second input, and then a Formula tool to replace Null values with Zero.

 

Densify Data.png

bfoll
6 - Meteoroid

Hi Joe - thanks so much for your fast reply. That looks like what I need. I am curious as to if there's a reason you didn't use the "Generate Rows" option? As a beginner it sounded like what I would need, at some point, in the workflow. I do need to add in all calendar dates to my dataset as well so would you reccomend that as a good option for that task?

 

Many thanks!

Joe_Mako
12 - Quasar

Yes, to generate all calendar dates with a beginning and end date, the Generate Rows tool would be a good choice.

 

The Generate Rows tool would be appreciate if you had a beginning, end, and interval. For example monthly or daily date range with a start and end date, or if you wanted a sequence of numbers.

 

The Summarize tool with the Group by option is useful to generate a unique list of individual values or combinations of values that exist in a data set.

bfoll
6 - Meteoroid

Awesome, Joe. Thanks - I'll keep this in mind!

Labels