
Alteryx Designer Desktop Discussions

SOLVED

Search and extract words from continuous text file, then paste text to columns

FraM
6 - Meteoroid

Dear Community, I am trying to build a workflow that extracts certain words found between delimiters in a continuous text file and writes them to columns.

 

Specifically: 

1. Import ~ 100 text files with running text; in each file:

2. Search for alt=

3. Write the text that follows, enclosed in double quotes, to column 1 (Company name)

4. Search for "company-name" title=

5. Write the text that follows, enclosed in double quotes, to column 2 (Company description)

 

Example File 1: 

xxxxxxxxxx alt="Company 1" yyyyyyy "company-name" title="Company 1: Company 1 Description"zzzzzzz alt="Company 2"yyyyyyy "company-name" title="Company 2: Company 2 Description"zzzzzzz

 

Note: Each file contains ~40 companies, and the total text length per file is very long.

 

Output

 

Company name | Company description
Company 1 | Company 1: Company 1 Description
Company 2 | Company 2: Company 2 Description

 

Looking forward to your help! 

10 REPLIES
binuacs
21 - Polaris

@FraM One way of doing this is with the RegEx tool's Tokenize output method:

binuacs_0-1653148891101.png
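
For readers who cannot see the screenshot, here is a minimal Python sketch of the same tokenize idea, not the Alteryx configuration itself. The pattern and the names `PAIR_PATTERN` and `extract_companies` are assumptions made for illustration. It also assumes every alt= is followed by a matching "company-name" title= before the next alt= appears; otherwise a name could be paired with a later company's description.

```python
import re

# Hypothetical pattern: capture the quoted text after alt= (company name)
# and the quoted text after "company-name" title= (company description).
PAIR_PATTERN = re.compile(
    r'alt="([^"]*)".*?"company-name"\s+title="([^"]*)"',
    re.DOTALL,  # let the lazy gap span line breaks, if the file has any
)


def extract_companies(text):
    """Return (name, description) tuples in order of appearance."""
    return PAIR_PATTERN.findall(text)


# Sample text taken from Example File 1 in the question.
sample = (
    'xxxxxxxxxx alt="Company 1" yyyyyyy "company-name" '
    'title="Company 1: Company 1 Description"zzzzzzz '
    'alt="Company 2"yyyyyyy "company-name" '
    'title="Company 2: Company 2 Description"zzzzzzz'
)

for name, description in extract_companies(sample):
    print(f"{name} | {description}")
```

Each tuple corresponds to one output row, with the name in column 1 and the description in column 2.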

 

FraM
6 - Meteoroid

Hi binuacs, thanks for your quick response! The problem here is that the actual text in my text file is too long to paste into Text Input. I tried the same flow, importing the text file via Input Data, but then my text gets truncated.

binuacs
21 - Polaris

@FraM you can use the Input Data tool instead of the Text Input tool

FraM
6 - Meteoroid

Yes, I tried that, but even then the text is too long and gets truncated automatically:

 

FraM_0-1653150186311.png

 

binuacs
21 - Polaris

@FraM try increasing the field length

 

binuacs_0-1653150500341.png

 

 

 

FraM
6 - Meteoroid

It still seems to lose some text on the way, as it does not find the right expression, but I double-checked in the input file that the expression is actually included in the text

FraM_0-1653152174765.png

 

binuacs
21 - Polaris

@FraM can you check whether the missing text actually contains the patterns alt=" and title="
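
One quick way to run that check outside Alteryx is to count how often each marker occurs in the raw text. This is only a rough sketch; `count_markers` is a hypothetical helper, and it assumes the markers use straight double quotes exactly as written in the question.

```python
def count_markers(text):
    """Count occurrences of each marker; unequal counts hint at entries
    whose markers differ (for example curly quotes or extra spaces)."""
    markers = {
        "alt": 'alt="',
        "title": '"company-name" title="',
    }
    return {label: text.count(marker) for label, marker in markers.items()}
```

If the two counts differ, or are lower than the ~40 companies expected per file, some entries probably do not follow the pattern exactly.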

FraM
6 - Meteoroid

Works now!! :) To import all ~100 text files at once, would you recommend adding a Dynamic Input tool or consolidating the text of all files into one single text file before importing?

binuacs
21 - Polaris

@FraM There are multiple options. If all the .txt files have the same schema, you can use a wildcard in the input file path, like below, or you can use a batch macro to combine all the files and then perform the parse

 

binuacs_0-1653154836428.png
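
The wildcard option can be sketched in Python as follows. This is an illustration of the idea rather than the Alteryx setup; `read_all_text_files`, the `*.txt` pattern, and UTF-8 encoding are all assumptions.

```python
import glob
import os


def read_all_text_files(folder, pattern="*.txt"):
    """Return {file name: full text} for every file matching the wildcard."""
    texts = {}
    # sorted() keeps the file order stable between runs
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        with open(path, encoding="utf-8") as handle:
            # read() loads the whole file, so long text is not truncated
            texts[os.path.basename(path)] = handle.read()
    return texts
```

Keeping the file name as the key makes it possible to trace each extracted company back to its source file, much like outputting the file name as a field on input.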

 
