Alteryx Designer Desktop Discussions

182129 · ‎06-06-2023

I get the below data from a system which includes 3 fields I need to parse out, but they don't always appear in the same place. Two of the fields are the exact same size but one of the two fields is not always present. I've uploaded the sample data and the desired parsed results, color coded for where they appear in the string. I can pretty easily get cost center from first 9 digits or No Cost Center just from regular formula, but I'm struggling with getting the other 2 fields because of the same size.

Payment

0650/1596_SN00-5301311

0650/1899_SN00-5309104--No_ARIO_Number

No_Cost_Center-Need_ARIO_Number--1034664-10475100_0_1987001_SN00

0650/1574_SN00-7301181--No_ARIO_Number

0650/1645_SN00-1987001

0650/1645_SN00-5309104

0650/1575_SN00-5301311

No_Cost_Center-Need_ARIO_Number--1027660-10369066_0_1987001_SN00

0650/1748_SN00-5301311

0650/1748_SN00-5309104--No_ARIO_Number

0650/1631_SN00-5301809--No_ARIO_Number

No_Cost_Center-Need_ARIO_Number--1028736-10123404_0_1987001_SN00

0650/1688_SN00-5301809--No_ARIO_Number

0650/1591_SN00-6204301--No_ARIO_Number

No_Cost_Center-Need_ARIO_Number-HDQ-1036659-10475065_0_1987001_SN00

0650/1871_SN00-5302300

Yoshiro_Fujimori · ‎06-10-2023

Hi @182129 ,

I tried to follow the format as much as possible, but still not sure if it works for larger volume.

Please adjust the formula tool if needed.

Workflow

RegEx tool configuration

Formula tool expression

[HasAR#] = REGEX_Match([GL], ".*_\d{7}_.*")

[GL] = IF ![HasAR#] THEN [AR#] ELSE REGEX_Replace([GL], ".*_(\d{7})_.*", "$1") ENDIF

[AR#] = IF [HasAR#] THEN [AR#] ELSE Null() ENDIF

[Cost Center] = IF StartsWith([Cost Center], "No") THEN "No_Cost_Center" ELSE Replace([Cost Center], "_SN00", "") ENDIF

Output

hellyars · ‎06-10-2023

@182129 @Yoshiro_Fujimori

I took a slightly different approach.

I thought it might be easier to split the data into the three (3) data configurations present and tackle each case individually.

This might offer a little more flexibility should other patterns appear.

As you know, the Record ID field will let you split, process in groups, and re-join each record.

I hope this helps.

182129 · ‎06-12-2023

@hellyars I like this approach but because we're on version 2020.3 of Alteryx I can't import your solution. Would you mind posting the configurations?

hellyars · ‎06-13-2023

@182129will do. back in a few.

hellyars · ‎06-13-2023

@182129

Cant downgrade...but this should help.

Reference my image above. The steps below will reference the name of the discussion boxes that contain the RegEx tools.

0. Add a Record ID tool.

1. Set the first Filter Tool to filter Payments, Contains, No_ARIO.

2. Connect the True output to a RegEx tool in the No_ARIO box and configure it to parse Payment using the following configuration.

^(\d{4}\/\d{4}).*?SN00-(\d{7})

3. Connect the False output from the first Filter tool to a second Filter tool.

Configure this second Filter tool to filter Payment, Contains, No_Cost.

4. From the second Filter tool, connect the True output to a RegEx tool in the No_Cost_Need Present box and configure it to parse Payment as follows.

-(\d{7})-(\d{8})_0_(\d{7})

5. From the second Filter tool, connect the False output to a third (and last) RegEx tool in the Cost Center Present box and configure it to parse Payment using the following configuration:

(\d{4}\/\d{4})_.*-(\d{7})

6. Add a Join tool to bring all the streams back together and then Sort using Record ID.

182129 · ‎06-13-2023

@hellyars Do I need Record ID if the final sort order doesn't matter?

hellyars · ‎06-13-2023

Technically you do not need it here, but it is a good practice to include it.

You never know when it might come in handy.

Alteryx Designer Desktop Discussions

RegEx help needed to parse for not always standard input