Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

if it contains

mmustkee
10 - Fireball

Hi Team, 

 

I have below data, I have two condition here.

1. I want to keep only those data in which we have 5 times either 2 or 9  as 22222 or 99999 and 

2. I also want to keep those data in which we have 2 starting letter as alpha and 3 numeric after that for exp ab345, wr765 etc.

apart from this just delete all data. 

 

COD 8
ab345
ab765
wr345
asd234
345az
22222
99999
222222
999999
df345
fv567
tz678
22234
78867
222226
 
 
za567
789az
qw211
qw000
5 REPLIES 5
Qiu
21 - Polaris
21 - Polaris

@mmustkee 

I think I can help you with this one.😁

0210-mmustkee.PNG

mmustkee
10 - Fireball

Thanks Qiu,

 

It worked but can we delete the value in the same column "Cod" which does not fall in the criteria we want.

 

Thanks,

Mohammad Mustkeem

Qiu
21 - Polaris
21 - Polaris

@mmustkee 
A filter tool would do it since I already flag the entries.

0210-mmustkee-1.PNG

MarqueeCrew
20 - Arcturus
20 - Arcturus

@mmustkee ,

 

 the regular expression that @Qiu provided you can be used in the filter tool as:  regex_match() without the IF expression. That eliminates the formula and select tools. But, I did notice that his formula only works on 5 character codes.  In your examples 222226 or if ab1234 existed it won't match.  I assume that the 5 character codes is also an implied requirement.

I was wondering if this would perform faster as:

 

 

[cod 8] in ("222222","999999") OR

( !IsInteger(left([cod 8],2)) AND IsInteger(Right([cod 8],3)) )

 

 

So I created 1,000,000 rows of data and tested this FILTER expression against @Qiu 's FILTER expression (modified to just be a filter)

 

 

REGEX_Match([cod 8], "[2,9]{5}|\w{2}\d{3}")

 

 

RegEx, something that I advocate, takes more resources to execute.  It provides great ease in working with unstructured data.  In some cases, it is easier to craft than string functions.  In this case it is more concise if you understand what it says.  But what about performance?

 

With one million records tested the difference in timing is under 1 second.  but the profile in milliseconds is substantial.  The RegEx solution takes 1,451 ms and my alternate solution takes 670 ms.  

 

Using a regular expression, you should expect 2-3 times the processing time for each function.  In this case I did use an evaluation of 3 expressions to equal the single RegEx statement.

 

I just wanted to offer an alternate solution for the community.

 

Cheers,

 

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
mmustkee
10 - Fireball

Thanks a ton for explanation.

 

Let me check and update you soon.

 

Much appreciated.

 

 

Thanks,

Mohammad Mustkeem

Labels
Top Solution Authors