Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Creating a list of all the values in a column

rachel9989
5 - Atom

Hi,

 

I'm working with some data and from looking at a sample it looks like there are a few values where someone has manually inputted "N/A" and similar values. I would usually open data in excel and look at what values are in the column using the filter option, to do the relevant find and replace, but the main data that I'm using is too big for my computer to open in excel.

 

Is there any way to create a list of the unique values in each column? e.g. - 

 

StringsNumbers
a1
b1
b2
a3
c3

Would return - 

 

StringsNumbers
a1
b2
c3

 

Thanks!

9 REPLIES 9
estherb47
15 - Aurora
15 - Aurora

Hi @rachel9989 

 

You can use the Summarize tool once for each column, using Group By as the method, and then (as long as there are the same number of rows), you can join together by row position.

 

Let me know if that helps!

 

Cheers,

Esther

rachel9989
5 - Atom

Hi Esther,

 

Thanks for this, do you know if there's a way to do this in one step? I have about 30 fields so this would be helpful.

 

Thank you!

seven
12 - Quasar

Hi @rachel9989 

 

your question can be taken in one of two ways. @estherb47 has a correct solution to generate a distinct list of rows per field. I would add that using a Join Multiple by row position will generate the list regardless of whether the number of values per field is equivalent. See attached wf1. Again, credit to @estherb47 

 

one downside of doing it in that fashion is the need for a new Summarize tool per incoming field. Check out wf2 for a dynamic way of getting the same list. This is the second workflow. It can handle any amount of incoming fields. Please note that getting these lists in any fashion can be computationally expensive as the data grow in size.

 

seven_0-1593093579732.png

 

 

If I read into your question a different way, it sounds like you want to decompose every field into a list of strings and numbers. That is trickier. I'm working on that and will post shortly.

seven
12 - Quasar

 

Hi @rachel9989 ,

 

Here is an answer to the second interpretation of your question. I am not sure you're looking for this and it might be largely academic. Still, fun workflow. Between this and the wf2, your question probably has been resolved.

 

seven_0-1593094278798.png

 

estherb47
15 - Aurora
15 - Aurora

Thanks, @seven 

 

Definitely a fun exercise!!

@rachel9989 , you can try this approach to use one summarize tool overall.

EstherB47_0-1593095003657.png

 

Transpose the data to get everything into 2 columns. I include the sort for my own sanity in viewing, and for the downstream Tile tool.

Unique on both columns will get you all unique header/value columns. The Tile tool assigns row numbers to rebuild the table. And then a crosstab tool to rebuild.

Let me know if this is easier. No summarize required!

Cheers,

Esther

seven
12 - Quasar

Hi @estherb47 

 

good tag team! I like your approach, one thing I've noticed is that a Multirow is drastically more performant than the Tile tool. Might be good to do a side by side performance profile.

rachel9989
5 - Atom

@EstherB47 @seven thank you both! Sorry, in my original post it made it seem like I wanted to separate strings and numbers - I just meant these as example columns, I should have done this differently... but both give the result I was looking for! thanks so much 🙂 

seven
12 - Quasar

@rachel9989 

 

gotcha, definitely use @estherb47 's solution or my wf2 solution for the best results.

estherb47
15 - Aurora
15 - Aurora

Great solving with you, @seven . That's my favorite part of community, working through challenges with great minds

 

Cheers,

Esther

Labels