Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Insert delimiter between uppercase and lowercase characters in string

shivraj_r
7 - Meteor

Hi All,

 

I'm trying to split this string in a column into two columns based on the Case - I've been looking for a while to figure out how to do this? I figured add a delimiter checking for case and use a text to columns function - but now I'm not sure how to add the delimiter based on Case. Here's a sample of what I want

Col 1 Col 2col 3
ABCaaaABCaaa
AHCNaaartyAHCNaaarty
BGhnfBGhnf

 

 

Thanks - appreciate your help in advance!

9 REPLIES 9
MarqueeCrew
20 - Arcturus
20 - Arcturus
Hint

\u is uppercase
\l is lowercase.

\u+ is 1 or more uppercase letters
\l+ is 1 or more lowercase letters.

(\u+)(\l+) is 2 groups



Cheers,
Mark
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
shivraj_r
7 - Meteor

Thanks @MarqueeCrew 

 

Not sure I follow though - (\u+)(\l+) as a REGEX input? I'm trying to insert a delimiter to split the columns -  What do I need to add to check for case and then split the columns?

 

Thanks!

 

 

MarqueeCrew
20 - Arcturus
20 - Arcturus
There is a regex tool that you can use to parse. But as a step,

Regex_replace([string], "(\u+)(\l+)",'$1|$2',0)

Should work. Else switch to ,1. I forget which way the case sensitivity works.

Cheers,

Mark
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
shivraj_r
7 - Meteor

@MarqueeCrew Works perfect - thanks so much!!

MarqueeCrew
20 - Arcturus
20 - Arcturus
Note: this post was answered from a queue at mall of America while waiting for the log chute.
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
MarqueeCrew
20 - Arcturus
20 - Arcturus

@shivraj_r,

 

Now that I am in front of my computer I can more easily reply.  When you requested the delimiter, I wanted to explain that you can use the RegEx tool to complete the parsing for you.  If you configure the tool as shown below, all parsing is completed in one tool.  You can identify the groups of letters and parse them into their two component fields as shown:

 

capture.png

 

The regular expression used was:

(\u+)(\l+)

This puts the uppercase (1+) characters into the first group and the lowercase (1+) characters into the second group.  You then De-Select the "case insensitive" option and then rename the groups inside of the output fields configuration.

 

This results in 2 output fields being created from you original input field.

 

Cheers,
Mark

 

P.S.  Even when I'm in front of my computer, speling errors can happen.

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
shivraj_r
7 - Meteor

@MarqueeCrewHaha! Thanks for your nifty reply from your phone standing in line at a mall!! That was awesome! :)

 

I ran into a interesting situation that I didn't realize previously with my data - 

 

Expected Input - ABCabc, ADJChasggh

 

Actual Input - ABCAbce, ADJCHasGgh

 

To give you some context I basically want to split this into two strings but the problem is my first string I know is all CAPS - then 2nd half of the string essentially sometimes has two words/Caps but I want to split that as one string into a column.

 

Let me give you an example with actual data 

Input                                                                Output

HOMERoot -                                           HOME + Root

SETTINGSSettingsRootView -     SETTINGS +SettingsRootView

 

 

Thanks in advance! 

 

I knew I should've looked at the data before but I make mistakes at my computer all the time ;)

MarqueeCrew
20 - Arcturus
20 - Arcturus

@shivraj_r,

 

That is quite a challenge.  The string  "SETTINGSSettingsRootView" looks like CAPS followed by anything where one would not immediately know if the pair of "S" characters are part of the same word/values.  I was afraid that I could not solve the problem for you with Alteryx.  I have however conquered that fear by breaking down the problem.

 

Pattern #1 we have already identified:  A set of CAPS followed by lowercase.  Now Pattern #2 is CAPS followed by MixedCase.

 

We can use a REGEX_MATCH() function to see which pattern we are dealing with and write different expressions for each case.

 

Here is a solution formula:

IF 
     REGEX_Match([Field1], "\u+\l+\u+.*",0) THEN   REGEX_Replace([Field1], "(\u+)(\u\l+.*)", '$1+$2',0)
ELSE
     REGEX_Replace([Field1], "(\u+)(\l+)", '$1+$2',0)
ENDIF

We check to see if the pattern is:  1+ Uppercase followed by 1+ Lowercase followed by 1+ Uppercase followed by anything (with case sensitivity set).  Then we apply a replacement formula where we take the first Uppercase group of letters (prior to the first uppercase followed by a lowercase) and assign that to group #1.  We add a + sign (as you used it for a delimiter in your example) and place the balance of the expression into group #2.

 

Otherwise, we use our proven expression.

 

I love Alteryx and am very fond of RegEx.

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
shivraj_r
7 - Meteor

This is awesome @MarqueeCrew!!! Thank you so much - This is some next level stuff!!

Labels