SOLVED

RegEx Tokenize Group of Words to Column

taylorbyers
Meteor

I have text data: sentences contained in a single column. I want to shrink this data into a new column holding a maximum of 7 words. Some values contain fewer than 7 words and some contain more. I tried the regular expression below, but RegEx returns NULL in the new column whenever the value doesn't contain at least 7 words.

 

(\<\w+\> \<\w+\> \<\w+\> \<\w+\> \<\w+\> \<\w+\> \<\w+\>)

 

 

6 REPLIES
binu_acs
Polaris

@taylorbyers Can you provide the input file and expected output?

taylorbyers
Meteor

I cannot due to confidentiality. 

hellyars
Pulsar

@binu_acs 

 

I am not sure if I understood your question correctly, but I gave it a try...

Workflow attached.  I used two examples from the image you provided (one that produced a null and one that did not).

 

Steps:

 

1.  Assign a Record ID.  This is important when re-combining the data later.

2. I parsed the original text using (\w+) and set the Regex Tool to Tokenize and Split by Columns (=7).

3. I flipped the data using the Transpose tool.

4. I then used a Summarize tool to recombine the text, grouped by Record ID, using the Concatenate action.
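
For anyone who wants to check the logic outside Designer, the four steps above can be roughly sketched in Python. This is only an illustration of the same idea, not the attached workflow: the function name `first_words`, the `max_words` parameter, and the single-space separator used when concatenating are all assumptions.

```python
import re
from collections import defaultdict

def first_words(rows, max_words=7):
    """Keep at most `max_words` words per input string (hypothetical helper)."""
    # Step 1: assign a Record ID so tokens can be regrouped later.
    records = list(enumerate(rows, start=1))

    # Steps 2 and 3: tokenize with (\w+), keep the first `max_words` tokens,
    # and store them in long form (one row per token), like Transpose output.
    long_rows = []
    for record_id, text in records:
        tokens = re.findall(r"(\w+)", text)[:max_words]
        for token in tokens:
            long_rows.append((record_id, token))

    # Step 4: group by Record ID and concatenate the tokens.
    # The space separator is an assumption.
    grouped = defaultdict(list)
    for record_id, token in long_rows:
        grouped[record_id].append(token)
    return {record_id: " ".join(grouped[record_id]) for record_id, _ in records}

result = first_words([
    "The quick brown fox jumps over the lazy dog",
    "Too short",
])
print(result)
```

Because the tokens are matched with `(\w+)`, punctuation is dropped when the words are joined back together, and strings with fewer than 7 words simply come back whole instead of producing a null.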

 

 


 

taylorbyers
Meteor

This works well, thank you!

Christina_H
Magnetar

If you need it in just RegEx, try this:

 

^((?:\<\w+\>\s?+){1,7})

 

It will parse up to 7 words from the start of the string.
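
A quick way to sanity-check the `{1,7}` idea outside Alteryx is a small Python sketch. Two things are changed here and are assumptions on my part: Python's `re` module does not support `\<` and `\>`, so `\b` stands in for both, and the possessive `\s?+` is written as a plain `\s?`, which should behave the same for this pattern. The function name `first_up_to_seven` is hypothetical.

```python
import re

# Python adaptation of the pattern above: \b replaces \< and \>,
# and the possessive \s?+ is written as a greedy \s?.
UP_TO_SEVEN = re.compile(r"^((?:\b\w+\b\s?){1,7})")

def first_up_to_seven(text):
    """Return up to the first 7 words of `text`, or None if it starts with no word."""
    match = UP_TO_SEVEN.match(text)
    # The group can end with a trailing space when more text follows, so strip it.
    return match.group(1).rstrip() if match else None

print(first_up_to_seven("one two three"))
print(first_up_to_seven("a b c d e f g h i"))
```

Note that the pattern stops at the first character that is neither a word character nor a single whitespace, so a string like "Hello, world" yields only "Hello" in this sketch.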
