Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Cleaning and Parsing

vishwakar
8 - Asteroid

Hi 

 

I have currently have input in this format , Have more than 10K + rows , Just showing an example 

Example 1 Input : 

 

 

([PS Customer Lifecycle].[none:project_number:nk] / ([PS Customer Lifecycle].[none:project_name:nk] / ([PS Customer Lifecycle].[none:wbs_user_status:nk] / ([PS Customer Lifecycle].[none:proj_program_delivery_manager_name:nk] 

 

Expected Result :

 

 

[PS Customer Lifecycle].[project],[PS Customer Lifecycle].[project_name],[PS Customer Lifecycle].[wbs_user_status],[PS Customer Lifecycle].[proj_program_delivery_manager_name]

 

 

 

 

 

Example 2 Input  

 

 

([greenplum.41719.542383587963].[sum:lead_count:qk] + ([greenplum.41719.542383587963].[sum:opp_count:qk] + ([greenplum.41719.542383587963].[sum:book_num:qk] + [greenplum.41719.542383587963].[sum:total_booked:qk])))

Expected Output :

 

 

[greenplum.41719.542383587963].[Sum(lead_count)],[greenplum.41719.542383587963].[Sum(opp_count)],[greenplum.41719.542383587963].[Sum(book_num)],[greenplum.41719.542383587963].[sum(total_booked)]

 

 

Let's take an example ,I basically need to clean the stuff after period which is

 [none:project:nk]

,

 

Input: 

[PS Customer Lifecycle].[none:project_number:nk]

 

Output:

if we have none , we just take the value after the first colon , However if we have anything other than none we take that into consideration.

 

[PS Customer Lifecycle].[project]

 

I have achieved the output for one variable using Split , But I have multiple variables separated by ( /, + )

 

Any Help is appreciated on How to dynamically have one formula to take both the scenario

 

Thanks

Karthik

7 REPLIES 7
MarqueeCrew
20 - Arcturus
20 - Arcturus

@vishwakar,

 

Here might be some help:

 

 

([\x5b].*[\x5d][.])([\x5b]none:)(.+):(.*)

used within a regex_replace, it can provide you with 4 groupings:

 

  1. Everything in BRACKETS prior to seeing [none:]
  2. [none:
  3. Contents after [none: and before colon (:)
  4. Everything after that colon

To construct your desired (adjusted) output:

 

IF Regex_Match([FIELD],"([\x5b].*[\x5d])[.]([\x5b]none:)(.+):(.*)")
THEN Regex_Replace([FIELD],"([\x5b].*[\x5d])[.]([\x5b]none:)(.+):(.*)","$1.[$3]")
ELSE [FIELD]
ENDIF

This will create:

  • INPUT:  [PS Customer Lifecycle].[none:project_number:nk]
  • OUTPUT:  [PS Customer Lifecycle].[project_number]

Does this help you?

 

Thanks,

Mark

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
vishwakar
8 - Asteroid

@MarqueeCrew

 

Thanks for your input. But it does not solve when we have more than one values. As shown below 

 

([PS Customer Lifecycle].[none:project_number:nk] / ([PS Customer Lifecycle].[none:project_name:nk] / ([PS Customer Lifecycle].[none:wbs_user_status:nk] / ([PS Customer Lifecycle].[none:proj_program_delivery_manager_name:nk] 

Expected Result :

 

[PS Customer Lifecycle].[project],[PS Customer Lifecycle].[project_name],[PS Customer Lifecycle].[wbs_user_status],[PS Customer Lifecycle].[proj_program_delivery_manager_name]

 

However if we have anything other than none , we need to take that in to consideration :

 

([greenplum.41719.542383587963].[sum:lead_count:qk] + ([greenplum.41719.542383587963].[sum:opp_count:qk] + ([greenplum.41719.542383587963].[sum:book_num:qk] + [greenplum.41719.542383587963].[sum:total_booked:qk])))

Expected Output :

 

[greenplum.41719.542383587963].[Sum(lead_count)],[greenplum.41719.542383587963].[Sum(opp_count)],[greenplum.41719.542383587963].[Sum(book_num)],[greenplum.41719.542383587963].[sum(total_booked)]

 

 

All the above can come in a single scenario , so I am looking for a formula to be dynamic enough to handle all the scenarios,

 

The separation can be a (/,+) as shown in example 1 ( the values are separeted by / and ine example 2 where as the values are seprated by + value (i.e., value is other than none).

 

Thanks for your help , let me know if i am not clear with my ask.

 

Thanks

Karthik

MarqueeCrew
20 - Arcturus
20 - Arcturus
It was meant to be directional :)

Sent from my iPhone
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
vishwakar
8 - Asteroid

Thanks @MarqueeCrew. I definetly understand how to deal with different values (like none and sum) , However still i did not understand how can i get it to solve when we have more than 1 values as shown in the above example separated by (/,+) . Any thoughts on it would be really helpful.

 

Thanks for your time.

 

Thanks

Karthik

MarqueeCrew
20 - Arcturus
20 - Arcturus

([PS Customer Lifecycle].[none:project_number:nk] / ([PS Customer Lifecycle].[none:project_name:nk] / ([PS Customer Lifecycle].[none:wbs_user_status:nk] / ([PS Customer Lifecycle].[none:proj_program_delivery_manager_name:nk]

 

looks like you can check for the / in the field and then parse it to include many groups. 

 

The replacement would be like:

 

$1.$2 , $3.$4

 

I am not near a computer to help further today. 

 

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
MarqueeCrew
20 - Arcturus
20 - Arcturus

@vishwakar,

 

here is some more help.

 

Thanks,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
vishwakar
8 - Asteroid

Thanks @MarqueeCrew

 

It works like charm.

 

Found below website useful for trying regex functions

https://regex101.com/r/xF4jF3/10

 

Thanks

Karthik

Labels