Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Cell value segregation

Gsiva3
8 - Asteroid

Hi Team,

I have a column named "Title" which has some text content where each row ends with some 5 character or 10 character or 15 character or 20 character code. I want to segregate the text alone from this code which comes at the end of the text.

Have attached the Sample input and required output.

 

INPUTOUTPUT
TitleTitle 1Title 2
Soap packet ADFGHSoap packetADFGH
Freshener AP3PP/PIYTRFreshenerAP3PP/PIYTR
Disinfectant AAQQ3/MNJKK/TT0UUDisinfectantAAQQ3/MNJKK/TT0UU
Floor Cleaner KJYTR/WWWQA2/MNBBV/KK9KKFloor CleanerKJYTR/WWWQA2/MNBBV/KK9KK

 

 

8 REPLIES 8
binuacs
20 - Arcturus

@Gsiva3 One way of doing this

image.png

Kenda
16 - Nebula
16 - Nebula

If you're interested in trying RegEx, you could use an expression like the following in a regular Formula tool:

REGEX_Replace([Title], "(.*)\s(.*)", "$1")

 

Basically, it's looking for two sections of any length of any characters (.*), separated by a space \s. In this case, since we aren't specifying otherwise, it will use the last space found in the text. Then, it will keep the first group $1. This will get you Title 1. 

 

Try this out and give it a go to find the Title 2 you're wanting.

 

 

Alternatively, if you're looking for a non-RegEx answer, you could use add a RecordID tool first then use the Text to Columns tool with \s as the delimiter, split to rows and keep only the last record.

Gsiva3
8 - Asteroid

Hi @binuacs @Kenda 

The regex is working fine. But the think is in my input in some of the cases there is no space at the end between the title and the code.

Please refer the below sample input. How to proceed in this case

 

INPUT
Title
Soap packet ADFGH
Freshener AP3PP/PIYTR
Disinfectant AAQQ3/MNJKK/TT0UU
Floor Cleaner KJYTR/WWWQA2/MNBBV/KK9KK

WasherMMNHJ

 

OUTPUT
Title 1Title 2
Soap packetADFGH
FreshenerAP3PP/PIYTR
DisinfectantAAQQ3/MNJKK/TT0UU
Floor CleanerKJYTR/WWWQA2/MNBBV/KK9KK
WasherMMNHJ
Kenda
16 - Nebula
16 - Nebula

@Gsiva3 Since we cannot use a space, what logic would you like to use to identify the separation that would work for all of your records? 3+ capital letters in a row, perhaps? 

Gsiva3
8 - Asteroid

We cab use 4+ Capital letters in a row

Kenda
16 - Nebula
16 - Nebula

Hi @Gsiva3 just checking that this will work as in the Freshener example above, the code starts with AP3, only 2 capital letters in a row

Gsiva3
8 - Asteroid

So we might need 2 conditions 

1. 2Capital letters + 3rd character as "Number"

2. 1st 4 characters as capital letters

The reason is I do have few title which has 3Capital letters in it which I dont want to seggregate

Kenda
16 - Nebula
16 - Nebula

@Gsiva3 

 

Here is an updated expression that should get you the case for 4 upper case letters as the condition:

trim(REGEX_Replace([Title], "(.*?)([A-Z]{4}.*)", "$1",0))

 

You'll see this is pretty similar to the prior expression, but rather than the space \s, we have the second group  as [A-Z]{4}.*

This where you're telling Alteryx to look for 4 capital letters in a row [A-Z]{4} plus anything after that .*

We must put the question mark in the first group to ensure that we get the first instance of this match for the second group.

I also added the ,0 at the end to let Alteryx know to be case sensitive.

 

Use this as a starting point and modify it slightly to create your other rule of 2 capital letters then a number. Note digits can be identified using \d.

Labels