Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Removing Unwanted Data from a Field

Andy_Katona
8 - Asteroid

Team,

 

I have a long list of Lab Result values that need to be set to blank or Null(). What is the best method?  I thought of the RegEx, Formula, & Filter. Yet, there are so many I am not sure what is the best method. 

 

I have to remove the following:

 

Unremarkable
Stable
see provider notes
see provider note
see docs
see attached report
Normal, Alb 10, Cre 300,
Normal Alb 30mg/L, Cre 30
Normal A:C <30mg/g, Alb 3
normal
nl
High
Duplicate
Abnormal
Abn 30-300 150 cre 200,
30-300mg/g
30-300mg
30-300mg
30-300 Abnormal
30-300
30-300
30 mg
150mg/L
150 mg
>300mg
>300 mg/g
>300 mg
>300
>291.00
<5.0
<4
<30mg/g
<30mg normal
<30mg
<300
<30 normal
<30 mg/g
<30 mg normal
<30 mg
<30
<20
<0.7
<0.500
<0.5
< 5.0
,30
16-Sep
4/26/2016
Mar-16
2/18/2016
1/20/2016

11 REPLIES 11
MarqueeCrew
20 - Arcturus
20 - Arcturus

I'd use a FIND REPLACE tool and where there is a match, update the match with the next (NULL) column.

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Andy_Katona
8 - Asteroid

Thus, I should take my list & create a "FIND" document, then attach that to the F of the FIND tool, then on the Replace I could blank it out?

 

Would you have an example?

MSalvage
11 - Bolide

@Andy_Katona, 

 

Here is an example. I used a formula to create a replacement field that was all nulls.

 

-MSalvage

jlefeaux
8 - Asteroid

Is the list of values that you want to replace static or could it possibly grow?  Also, what kind of values are you trying to keep

 

If you might get more of these kinds of "results" and if you only want to keep actual numbers, what about using REGEX_CountMatches to identify values that contain non-numeric, non-decimal point characters? The use an If function in a Formula tool -- If the count is non-zero, set the value to null, otherwise use ToNumber to convert to a numeric data type.

 

I'll try to work up a sample this afternoon.

 

Example workflow attached.

Andy_Katona
8 - Asteroid

Yes, the invalid list could grow.

I am looking for numeric values (Lab Result numeric values) & exclude anything with  <, >, / or alpha character. Values should be numeric only like 4, 4.9, or 100.

 

Thank you for your help

 

 

jlefeaux
8 - Asteroid

@Andy_Katona, I attached a sample workflow to my previous post.  Hope it helps.

Andy_Katona
8 - Asteroid

I can not thank you enough. I have learned so much in the 3 weeks I have had this software. 

 

I don't understand, based on the expression definitions, how REGEX_CountMatches([Results], "[^\d\.]") Looks for non-standard characters. 

 

Still what a great simple solution. This is one of those that needs to be built in for Healthcare. 

 

jlefeaux
8 - Asteroid

Glad it helped! 

 

The expression definition... Fun with RegEx :)

 

The brackets [ ] enclose the pattern.  The caret ^ inside the bracket indicates negation, that you're looking for things that don't meet the enclosed pattern.

 

\d represents the set of digits 0,1,2,..9

 

\. represents a dot/period/decimal place.  The dot itself is a reserved character so you have to use the escape character, the backslash, to indicate that you really just want a dot. 

 

So wrapped up, [^\d\.] means a character that isn't a digit or a dot.  Then the function, REGEX_CountMatches, counts all of them in your target string, which is the value in the field [Results].

 

Here is a good tutorial on regular expressions.  They're not pretty but they're incredibly useful.

 

http://www.zytrax.com/tech/web/regex.htm

 

 

jlefeaux
8 - Asteroid

One more thought...  You might want to use the Trim function to trim leading & trailing whitespace from your results before applying the regex test/null replacement.

 

Otherwise, something like " 3.07" would be flagged with a count > 0 because of the leading space.

Labels
Top Solution Authors