Alteryx Designer Desktop Discussions

Rob48 · ‎05-14-2024

I have two sets of data I'm working with. Set #1 is as below, a "data" column with thousands of rows and multiple columns with one of 3 values, "yes," "no" or "-" as below:

Data	column 1	column 2	column 3	column 4	column 5	column 6	column 7	column 8	column 9	column 10
aaa1	yes	no	no	no	no	no	no	no	no	-
aaa2	yes	-	yes	yes	yes	no	yes	yes	no	no
aaa3	no	-	-	-	-	no	-	-	no	no
aaa4	-	no	no	no	no	yes	no	no	yes	no
aaa5	-	yes	yes	yes	no	-	yes	no	-	yes
aaa6	no	-	-	-	-	no	-	-	no	yes
aaa7	no	no	no	no	-	yes	no	-	yes	-
aaa8	yes	yes	no	no	no	-	no	no	-	no
aaa9	-	-	no	no	yes	no	no	yes	no	yes
aaa10	no	no	yes	yes	-	no	yes	-	no	no
aaa11	yes	no	-	yes	no	no	yes	no	no	-

Set #2 is a simple key which will have the same column names and similar values as set#1 but with an added suffix after the yes/no value that will vary:

column 1	column 2	column 3	column 4	column 5	column 6	column 7	column 8	column 9	column 10
yes ab	yes ba	yes aa	yes ab	yes ab	yes ab	yes ab	yes aa	yes ab	yes ab
no ab	no aa	no aa	no aa	no aa	no ab	no aa	no aa	no ab	no aa

So if set #1 shows "yes" in column 1 this should convert to "yes ab" from set #2, and a "yes" in column 2 converts to "yes ba," and so on.

I need a conversion so that columns 1-10 in set #1 will show the values from columns 1-10 in set #2. I need this to be a robust flow because there are way more than 10 columns, data #2 will periodically update with new suffixes, and at times the data in set #1 may increase from "yes, no, -" to something like "yes, no, maybe, sometimes, often, rarely, -". Set #2 would update accordingly so the values would mirror set #1 but again with variable suffexes. Appreciate any help, thanks.

AGilbert · ‎05-14-2024

Here is an idea centered around transposing both tables and extracting the yes/no value from set #2 in order to match with set #1.

Rob48 · ‎05-15-2024

Hi AGilbert, thanks for this. I'm weak on REGEX formulae, can you explain what's happening in the formula tool here:

REGEX_Replace([Value], "(\w+)\s(\w+)", "$1")

reason I ask is I oversimplified my sample data, the two data sets are better represented here:

Set #1:

Data	column 1	column 2	column 3
aaa1	yes	no	-
aaa2	yes	-	no

Set #2:

column 1	column 2	column 3
yes widget ab	yes widget ab	yes widget aa
no widget ab	no widget aa	no widget aa

Set #1 is the same but Set #2 has added characters after the "yes / no" and the REGEX formula is dropping the "widget" when it creates the key column, so it's returning "yes xx" and I need "yes widget xx."

The "_widget_" characters are consistant so the variable suffix is the "aa/ab/xx" characters. Can we alter the REGEX formula to get the desired return?

AGilbert · ‎05-15-2024

Sure thing. At their core, regex patterns are used to match string data. The basic elements can define what character to match and how many of those characters to match.

For example, 'a' will match an 'a' and 'b' will match 'b'. To match one of many letters character classes can be defined with square brackets (example: [ab] or [a-z]). Because these are so common that predefined classes such as \w, \d, and \s, were created as a shorthand to match any alpha, any digit, and any whitespace characters respectively.

As for the match quantity, the elements above can be followed by a + for "one or more", a * for "zero or more", or ? for "zero or one".

So, the expression I gave is looking for one or more letters, a space, and another set of one or more letters.

Wrapping certain elements in parenthesis is called grouping. These capture the text an element matches, to be returned with "$1" (indicating the first captured group).

You should try out https://regex101.com/ . It's a great tool which give immediate, visual feedback for an expression.

The regex rabbit hole goes deep, and you can make some very strict expressions. Alternatively, like the example I gave you, expressions can be somewhat forgiving. You just need to consider your use case and how variable your input text may be.

In this case, if the text and schema is standardized, you could try "(\w+)\s(\w+)\s(\w+)" where "$1" = yes/no, "$2" = widget, and "$3" = aa/bb/ab. But this is keeping things pretty simple.

Alteryx Designer Desktop Discussions

Converting like columns with different data

Re: Row creation

Re: How to select columns dynamically using number...

Re: Batch macro to read 1000+ .xlsx files with var...

Re: Issue when using Block Until Done and Power BI...

Example workflow for setting up a custom list to u...