Alteryx Designer Desktop Discussions

Gsiva3 · ‎07-24-2023

Hi Team,

I have a column named "Title" which has some text content where each row ends with some 5 character or 10 character or 15 character or 20 character code. I want to segregate the text alone from this code which comes at the end of the text.

Have attached the Sample input and required output.

INPUT	OUTPUT
*Title*	*Title 1*	*Title 2*
Soap packet ADFGH	Soap packet	ADFGH
Freshener AP3PP/PIYTR	Freshener	AP3PP/PIYTR
Disinfectant AAQQ3/MNJKK/TT0UU	Disinfectant	AAQQ3/MNJKK/TT0UU
Floor Cleaner KJYTR/WWWQA2/MNBBV/KK9KK	Floor Cleaner	KJYTR/WWWQA2/MNBBV/KK9KK

binuacs · ‎07-24-2023

@Gsiva3 One way of doing this

Kenda · ‎07-24-2023

If you're interested in trying RegEx, you could use an expression like the following in a regular Formula tool:

REGEX_Replace([Title], "(.*)\s(.*)", "$1")

Basically, it's looking for two sections of any length of any characters (.*), separated by a space \s. In this case, since we aren't specifying otherwise, it will use the last space found in the text. Then, it will keep the first group $1. This will get you Title 1.

Try this out and give it a go to find the Title 2 you're wanting.

Alternatively, if you're looking for a non-RegEx answer, you could use add a RecordID tool first then use the Text to Columns tool with \s as the delimiter, split to rows and keep only the last record.

Gsiva3 · ‎07-24-2023

Hi @binuacs @Kenda

The regex is working fine. But the think is in my input in some of the cases there is no space at the end between the title and the code.

Please refer the below sample input. How to proceed in this case

INPUT

Title

Soap packet ADFGH

Freshener AP3PP/PIYTR

Disinfectant AAQQ3/MNJKK/TT0UU

Floor Cleaner KJYTR/WWWQA2/MNBBV/KK9KK

WasherMMNHJ

OUTPUT
Title 1	Title 2
Soap packet	ADFGH
Freshener	AP3PP/PIYTR
Disinfectant	AAQQ3/MNJKK/TT0UU
Floor Cleaner	KJYTR/WWWQA2/MNBBV/KK9KK
Washer	MMNHJ

Kenda · ‎07-24-2023

@Gsiva3 Since we cannot use a space, what logic would you like to use to identify the separation that would work for all of your records? 3+ capital letters in a row, perhaps?

Gsiva3 · ‎07-24-2023

We cab use 4+ Capital letters in a row

Kenda · ‎07-24-2023

Hi @Gsiva3 just checking that this will work as in the Freshener example above, the code starts with AP3, only 2 capital letters in a row

Gsiva3 · ‎07-24-2023

So we might need 2 conditions

1. 2Capital letters + 3rd character as "Number"

2. 1st 4 characters as capital letters

The reason is I do have few title which has 3Capital letters in it which I dont want to seggregate

Kenda · ‎07-24-2023

@Gsiva3

Here is an updated expression that should get you the case for 4 upper case letters as the condition:

trim(REGEX_Replace([Title], "(.*?)([A-Z]{4}.*)", "$1",0))

You'll see this is pretty similar to the prior expression, but rather than the space \s, we have the second group as [A-Z]{4}.*

This where you're telling Alteryx to look for 4 capital letters in a row [A-Z]{4} plus anything after that .*

We must put the question mark in the first group to ensure that we get the first instance of this match for the second group.

I also added the ,0 at the end to let Alteryx know to be case sensitive.

Use this as a starting point and modify it slightly to create your other rule of 2 capital letters then a number. Note digits can be identified using \d.

Alteryx Designer Desktop Discussions

Cell value segregation