Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Take only numeric characters or Only lines with N characters

max_hfr
8 - Asteroid

Hi,

 

I am having an issue separating different types of data in the same column.

 

The column looks like this:

 

CMA 201802

CMA 201803

CF 201802 

CF 201804

201804

201806

201807

 

 

I want to put CMA, CF, and month (201804) in different columns.

 

I was able with if contains "cma" or "cf"  but for the month i cannot use contain since the other ones contain the same data.

 

Is there a formula to either only take lines with numeric value or only take the lines which are 6 characters long.

 

PS: I don't want to trim any of the cma / cf. 

 

Thank you in advance,

 

Max

9 REPLIES 9
MarqueeCrew
20 - Arcturus
20 - Arcturus

@max_hfr ,

 

I can read your requirements different ways.  Can you please post your desired output too?

 

Cheers,

 

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Thableaus
17 - Castor
17 - Castor

Hi @max_hfr 

 

Parsing with Regex would work fine.

 

 

 

Parse.PNG

 

(\D*)(\d+)

 

Not a digit (0 or more times) / digit 1 or more times.

 

If your data may look different in the future, let me know so we can adjust RegEX.

 

Cheers,

max_hfr
8 - Asteroid

I wasn't orginally thinking of it this way but it could be even better.

 

I actually have more diffrences here are all the possible formats:

 

CMA 201802

Cuf 201802

Bimestre 201802

Cma 201802

Semestre 201802

Trim 201802

 

I assume it works with this regex

 

thank you in advance

Thableaus
17 - Castor
17 - Castor

@max_hfr 

 

It works fine, as long as the "word" part doesn't have any numbers on it.

 

If there's always a space between the word and the number (month), you could be more specific

 

(\D*?)\s*(\d{6})

 

So you don't need to trim the word after that. But that's up to you.

 

Cheers,

mbarone
16 - Nebula
16 - Nebula

Probably lots of ways to do this, some might even be more efficient than mine, but here's what I did:

I recreated your sample using a text tool, and then dropped a regex tool after it:

2019-03-08 09_23_24-Alteryx Designer x64 - New Workflow1_.jpg2019-03-08 09_23_15-Alteryx Designer x64 - New Workflow1_.jpg

 

I parsed out the column like this:

2019-03-08 09_23_07-Alteryx Designer x64 - New Workflow1_.jpg

 

Which splits your data like this:

2019-03-08 09_22_52-Alteryx Designer x64 - New Workflow1_.jpg

 

Next I would trim the RegExOut1 filed to remove the extra space at the end.
Then, assuming you want the numeric only values in Column1 to to into the RegExOut2 column, I would simply update that field with the formula IIF(IsNull([RegExOut3),[Column1],[RegExOut3]).

At that point you can create two fields where one is Left([RegExOut3],4) and one is Right([RegExOut3],2).

max_hfr
8 - Asteroid

Hi Thanks a lot, it's even better.

 

I have one last problem 

 

I have 

CMA and Cma.  I would like to transform Cma into CMA so all of them are written the same.

 

I have tried a formula if "Cma" then "CMA" but that doesn t seem to work.

 

Any idea?

mbarone
16 - Nebula
16 - Nebula

Yes, in your formula where you trim the white space, just wrap it in the function "Uppercase()".

Thableaus
17 - Castor
17 - Castor

@max_hfr 

 

Is it just CMA you want to uppercase?

 

If not, you could use function Uppercase([Field]) and you would uppercase all of your strings.

 

Or your IF condition (specific to CMA) could be this way:

 

IF Contains([Field], "Cma") THEN "CMA" ELSE [Field] ENDIF

 

I'd use Contains function in case you have any extra whitespaces, dashes or something like that.

 

Cheers,

max_hfr
8 - Asteroid

Hi Thableaus,

 

you're right it was because of additional spaces. Thanks a  lot for all your help

Labels