Alteryx Designer Desktop Discussions

Extraction of substrings and sorting them accordingly in appropriate columns

SumitVerma
5 - Atom

Hello,

 

I have a long string of text in a single cell (1 row and 1 column). I want to parse the entire string, extract substrings based on delimiters, and place them in the appropriate columns. Below is an example of my string:

 

"Backfire – To produce an unexpected and unwanted result. The company’s new efficiency measures backfired when workers protested and staged a walkout, thus stopping production completely. Balance – The remaining part or leftover amount. The publishing division accounted for 25% of the profits, and the film division for the balance."

 

This is how I want it to look:

 

Word | Meaning | Example
Backfire | To produce an unexpected and unwanted result. | The company’s new efficiency measures backfired when workers protested and staged a walkout, thus stopping production completely.
Balance | The remaining part or leftover amount. | The publishing division accounted for 25% of the profits, and the film division for the balance.

 

I tried using the "Text to Columns" node and the "RegEx" node; however, nothing seems to work. I don't know what I am doing wrong, because my regex gives the correct output when I try it on Regex101.com.

 

Any help would be highly appreciated.

 

-Thanks

3 REPLIES
PangHC
12 - Quasar

Split the string into rows first, then split each row into columns.

This works provided every entry contains exactly two dots: one ending the meaning and one ending the example.

 

First regex: (?:\s|^)(.*?–.*?\..*?\.)


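(?:\s|^) = (?: ) is a non-capturing group, so whatever it matches is not included in the output. \s|^ matches either a whitespace character or the start of the string. It is there mainly to drop the leading space from the second row onward.

(.*?–.*?\..*?\.) = captures one row matching the pattern "word – any text. any text."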

 

.*? = non-greedy: matches as little text as possible.

\. = a literal dot (the \ tells the regex we want the symbol itself, not "any character").
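
To check the first pattern outside Designer, here is a minimal sketch in Python. It is not the Alteryx workflow itself: Python's re.findall stands in for the RegEx tool tokenizing into rows (an assumption about how the tool is configured), and the names TEXT, ROW_PATTERN and split_into_rows are made up for illustration.

```python
import re

# Sample string from the question (note the en dash "–", not a hyphen).
TEXT = (
    "Backfire – To produce an unexpected and unwanted result. "
    "The company’s new efficiency measures backfired when workers protested "
    "and staged a walkout, thus stopping production completely. "
    "Balance – The remaining part or leftover amount. "
    "The publishing division accounted for 25% of the profits, "
    "and the film division for the balance."
)

# First regex from the reply: a non-captured leading space (or start of
# string), then "word – text. text." captured lazily as one row.
ROW_PATTERN = re.compile(r"(?:\s|^)(.*?–.*?\..*?\.)")


def split_into_rows(text):
    """Return one string per word/meaning/example entry."""
    # findall returns only the capture group, so the leading space is dropped.
    return ROW_PATTERN.findall(text)


rows = split_into_rows(TEXT)
for row in rows:
    print(row)
```

Because the pattern counts dots, a decimal such as 2.5% or an abbreviation inside an entry would add a third dot and end the row too early.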

 

 Screenshot 2023-08-28 174712.png

 

Second regex: (.*?) – (.*?\.) (.*?\.)
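
The second pattern can be tried the same way. This is again only a sketch in Python, not the Designer configuration: re.match with three capture groups stands in for parsing each row into columns, and ROWS, COLUMN_PATTERN and parse_row are hypothetical names. The two rows are typed in by hand here so the block runs on its own.

```python
import re

# The two rows expected from the first step, written out by hand.
ROWS = [
    "Backfire – To produce an unexpected and unwanted result. "
    "The company’s new efficiency measures backfired when workers protested "
    "and staged a walkout, thus stopping production completely.",
    "Balance – The remaining part or leftover amount. "
    "The publishing division accounted for 25% of the profits, "
    "and the film division for the balance.",
]

# Second regex from the reply: word, " – ", meaning up to the first dot,
# a space, then the example up to the next dot.
COLUMN_PATTERN = re.compile(r"(.*?) – (.*?\.) (.*?\.)")


def parse_row(row):
    """Return (word, meaning, example), or None if the row does not fit."""
    match = COLUMN_PATTERN.match(row)
    return match.groups() if match else None


parsed = [parse_row(row) for row in ROWS]
for word, meaning, example in parsed:
    print(word, "|", meaning, "|", example)
```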

 

Screenshot 2023-08-28 174708.png

SumitVerma
5 - Atom

Hi Pang_Hee_Choy,

 

Thanks for the response. I tried your method and it seemed to work; however, it only runs for one iteration. I was not able to get the second row in the first step. I don't know what I am doing wrong.

 

-Thanks

PangHC
12 - Quasar

Here is the workflow. You can compare it with yours.
