Alteryx Designer Desktop Discussions

taylorbyers · ‎10-11-2022

I have textual data: sentences contained in a single column. I am looking to shrink this data down into a new column with a maximum of 7 words. Some cells contain fewer than 7 words and some contain more. I tried the regular expression below, but the RegEx tool returns NULL whenever the cell doesn't contain at least 7 words.

(\<\w+\> \<\w+\> \<\w+\> \<\w+\> \<\w+\> \<\w+\> \<\w+\>)
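For anyone reading along, here is a rough Python sketch of why that pattern comes back empty on short sentences: it demands seven consecutive words, each separated by exactly one space, so anything shorter never matches. This is an approximation rather than the Alteryx engine itself. Python's re module has no \< and \> anchors, so \b is used as a stand-in, and the sample sentences are made up.

```python
import re

# Python stand-in for the pattern above: \b replaces \< and \>,
# which Python's re module does not support.
SEVEN_WORDS = re.compile(
    r"(\b\w+\b \b\w+\b \b\w+\b \b\w+\b \b\w+\b \b\w+\b \b\w+\b)"
)

def first_seven_strict(text):
    """Return the first run of seven single-space-separated words, or None."""
    match = SEVEN_WORDS.search(text)
    return match.group(1) if match else None

# Made-up sample rows, not the poster's data.
long_row = "the quick brown fox jumps over the lazy dog"
short_row = "only four words here"

print(first_seven_strict(long_row))   # the quick brown fox jumps over the
print(first_seven_strict(short_row))  # None
```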

binuacs · ‎10-11-2022

@taylorbyers Can you provide the input file and expected output?

taylorbyers · ‎10-11-2022

I cannot due to confidentiality.

hellyars · ‎10-11-2022

@binuacs

I am not sure if I understood your question correctly, but I gave it a try...

Workflow attached. I used two examples from the image you provided (one that produced a null and one that did not).

Steps:

1. Assign a Record ID. This is important when re-combining the data later.

2. I parsed the original text with (\w+), setting the RegEx tool to Tokenize with Split to Columns (7 columns).

3. I flipped the data using the Transpose tool.

4. I then used a Summary tool to recombine the text, grouped by Record ID, using the Concat function.
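Outside Alteryx, the steps above can be sketched roughly as follows in Python. This is only an illustration of the logic, not the attached workflow: the function name, the sample rows, and the space separator used when re-joining are all assumptions.

```python
import re

def first_n_words(text, n=7):
    """Tokenize on \\w+ and re-join at most n tokens with single spaces."""
    tokens = re.findall(r"\w+", text)  # step 2: tokenize into words
    return " ".join(tokens[:n])        # steps 3-4: keep up to n, concatenate

# Step 1: a Record ID per row (made-up sample data).
rows = {
    1: "the quick brown fox jumps over the lazy dog",
    2: "only four words here",
}

result = {record_id: first_n_words(text) for record_id, text in rows.items()}
print(result[1])  # the quick brown fox jumps over the
print(result[2])  # only four words here
```

Note that tokenizing on \w+ drops punctuation and splits words such as "don't" into two tokens, so the recombined text may differ slightly from the original sentence.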

taylorbyers · ‎10-11-2022

This works well, thank you!

Christina_H · ‎10-11-2022

If you need it in just RegEx, try this

^((?:\<\w+\>\s?+){1,7})

It will parse up to 7 words from the start of the string.
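A quick way to try the same idea outside Alteryx is the Python sketch below. It is an approximation under a few assumptions: \b stands in for \< and \>, which Python's re module does not support, and the possessive \s?+ is written as plain \s? so the code also runs on Python versions before 3.11. The function name and sample sentences are made up.

```python
import re

# Up to seven words from the start of the string, each optionally
# followed by one whitespace character.
UP_TO_SEVEN = re.compile(r"^((?:\b\w+\b\s?){1,7})")

def up_to_seven_words(text):
    """Return up to the first seven words, or None if the text starts with a non-word."""
    match = UP_TO_SEVEN.match(text)
    # The capture can end with the whitespace after the last word, so trim it.
    return match.group(1).rstrip() if match else None

print(up_to_seven_words("the quick brown fox jumps over the lazy dog"))
# the quick brown fox jumps over the
print(up_to_seven_words("only four words here"))
# only four words here
```

Because each word may be followed by at most one whitespace character, the match stops early at punctuation or double spaces, so a sentence like "Hello, world" would yield only "Hello".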
