
SOLVED

RegEx / Text to Columns / Downloading Multiple Files From FTP

djodts
8 - Asteroid

Hello,

 

I am trying to pull in a list of files from an FTP site, since there doesn't seem to be a way to just download all files that are in a particular folder (which will change each month, so I cannot create a static list). I have been able to pull down a list of files, but the data needs to be split.

 

After using a Download tool to download the output to a string and then using Text to Columns to create a row for each file, I get the following. I do not care about most of the data in the DownloadData column except the filenames, which are listed at the far right (the first example is highlighted in blue). In reality this text is of course different, but all files follow this naming convention and have the same number of characters. I'm not great at RegEx at all or I would use it. Happy to use it, I just don't know how to come up with the expression. How can I get a column that lists the filenames only?

 

[Screenshot: rows of the DownloadData column, with the filename at the far right of each row]

 

On a side note, if anyone can point me to a good straightforward reference that can help me learn how to write Regular Expressions so I can learn to parse data, I'd be grateful.

 

 

Thank you.

 

2 REPLIES
JordyMicheal
11 - Bolide

I'll say this is one of the BEST places to test RegEx code, and it's how I learned!

https://regexr.com/

 

You can type in the text you want to match and build the expression from the site's list of commands.
Hope that helps with that part at least.

CharlieS
17 - Castor

If the filename is always the last string in DownloadData and is preceded by a space, then you can use the following non-RegEx formula:

 

right([DownloadData],FindString(ReverseString([DownloadData])," "))
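
For anyone who wants to see the logic of that formula outside Alteryx, here is a rough Python sketch of the same idea: reverse the string, find the first space, and keep that many characters from the right. The sample line and the function name `last_token` are made up for illustration (the real listing is only in the screenshot), and the no-space fallback is my own assumption rather than something the formula handles.

```python
def last_token(line: str) -> str:
    """Return the text after the last space, mirroring the Right/FindString/ReverseString formula."""
    reversed_line = line[::-1]
    # Position of the first space in the reversed string equals the
    # length of the last space-delimited token in the original string.
    pos = reversed_line.find(" ")
    if pos == -1:
        # Assumption: with no space at all, treat the whole line as the filename.
        return line
    return line[len(line) - pos:]


# Hypothetical FTP listing line, not taken from the original post.
sample = "-rw-r--r--   1 ftpuser  ftpgroup   10240 Jan 05 09:30 REPORT_202401.csv"
print(last_token(sample))
```

Note that this only works when the filename itself contains no spaces, which is the same limitation the formula has.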

 

Another way would be to use a Text to Columns tool to split to rows on space characters, then filter to the string that ends with a file extension (the last 4 characters are a period followed by three letters, digits, or underscores):

 

REGEX_Match(Right([DownloadData],4),"\.\w{3}")
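
The split-then-filter approach can be sketched in Python as well. This is only an approximation of the two tools: the sample line and the name `filename_tokens` are hypothetical, and I use `fullmatch` on the assumption that the whole 4-character string has to match the pattern, as it does in the filter above.

```python
import re

# A period followed by exactly three word characters (letters, digits, underscore).
EXT = re.compile(r"\.\w{3}")


def filename_tokens(line: str) -> list[str]:
    """Split on spaces and keep tokens whose last 4 characters look like a file extension."""
    return [tok for tok in line.split(" ") if EXT.fullmatch(tok[-4:])]


# Hypothetical FTP listing line, not taken from the original post.
sample = "-rw-r--r--   1 ftpuser  ftpgroup   10240 Jan 05 09:30 REPORT_202401.csv"
print(filename_tokens(sample))
```

Extensions that are not exactly three characters (for example `.xlsx`) would be missed, so the pattern would need adjusting for those.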

 

Examples of these are in the attached workflow.

 

 

As far as learning RegEx: the RegEx tool has a handy reference built into the dropdown from the "Regular Expression" field in the tool configuration. Otherwise, RegEx is not unique to Alteryx, so there are a lot of posts on StackExchange and other websites about parsing various scenarios.
