Alteryx Designer Desktop Discussions

CsmithData31 · ‎03-30-2023

Hello!

I have a text file that I am having trouble parsing out a file using the RegEx tool. I am able to parse out date and reference using the following regex but having trouble with Ful name and remaining columns.

(\d{2}/\d{2}/\d{4})\s(\<\w+\>)\s+

Sample text file attached.

CsmithData31 · ‎03-30-2023

Also having trouble with the data shifting if the code column is blank.

BS_THE_ANALYST · ‎03-30-2023

@CsmithData31 I won't lie, that was horrible. The data was tricky. Hopefully this works for you, I have no desire to do it again 🤣:

All the best,
BS
LinkedIN

Matthew · ‎03-30-2023

i think a better idea it to talk to someone about their data quality...

BS_THE_ANALYST · ‎03-30-2023

@Matthew lmao. I was saying this as I was doing the RegEx. Bad data can be fine if it's consistently bad, but there's no consistency amongst this bad data 😂.

All the best,
BS
LinkedIN

Yoshiro_Fujimori · ‎03-30-2023

@CsmithData31 ,

Another way of parsing the data by column width, not by separators.

When we parse a data with fixed width on Excel, we specify the length of each field before importing the data.

In parallel to this, I propose to add a line to pass the length information to Alteryx as below:

Then input this data to Alteryx:

Using the "+" sign on the 2nd line, we can split the data rows into fields.

I think there are pros and cons in this solution.

Pro: You don't need to "count" the column position to edit a long RegEx formula.

Con: You need to edit the input data beforehand. So this may not work in some use cases.

I hope you like it.

flying008 · ‎03-30-2023

Hi, @CsmithData31

There is a specific way by regex method for you: (and it be easy, if you can give more sample data, maybe it can just for you want.)

录制_2023_03_31_10_45_28_485.gif

1- Select : change you field data size to adjust more string.
2- Formula: format your string by regex expression again.
3- Text To Columns: Split string by specific delimiter.
4- Dynamic Select: remove the input field. (no need)
5- Data Cleansing: remove null columns.
6- Dynamic Rename: Rename by first row.

Spoiler

REGEX_Replace(REGEX_Replace(REGEX_Replace(REGEX_Replace([Txt], '(\/[[:alpha:]]+)(\s+?)(?=[A-Z]+)', '$1 '),'^[\d\/]+','$&  '),'(COLUMN)(\s+?)(\d+)','$1$3  '),'\s{2,20}','|')

REGEX_Replace(REGEX_Replace(REGEX_Replace(REGEX_Replace([Txt], '(\/[[:alpha:]]+)(\s+?)(?=[A-Z]+)', '$1 '),'^[\d\/]+','$& '),'(COLUMN)(\s+?)(\d+)','$1$3 '),'\s{2,20}','|')

******

If it can help you , please mark it as a solution and give a like for more share.

BS_THE_ANALYST · ‎03-31-2023

@Yoshiro_Fujimori
Nice idea with the column width.

By the way:

I think one of the names crept into the next column!

All the best,
BS
LinkedIN

BS_THE_ANALYST · ‎03-31-2023

@flying008 same issue with yours:

I wonder if that happens because of the column width idea? Interesting.

Unless CABABAGE is actually a code and I made a wrong assumption 😂.

All the best,
BS
LinkedIN

Yoshiro_Fujimori · ‎03-31-2023

@BS_THE_ANALYST ,

Thanks!

Re your comment:

> I think one of the names crept into the next column!

Well, just looking at the column width, it seems to be working as intended...

In this particular case, we cannot use "\s+" as a separator, as "Full name" contains a space in it.

If we separate "Full name" to "First name" and "Last name", it might work. (assuming nobody has a "Middle name" 😀)

Input

Output

Alteryx Designer Desktop Discussions

RegEx Tool

Re: Row creation

Re: How to select columns dynamically using number...

Re: Batch macro to read 1000+ .xlsx files with var...

Re: Issue when using Block Until Done and Power BI...

Example workflow for setting up a custom list to u...