Hello,
I need to use a formula to filter out several items and then for those that are false to come out in an output
thanks
Solved! Go to Solution.
so I have all the columns below that have to be audited each month and they should only be a word or a number. Since they are Vstrings and Double-how would I do the formula for each separate column? I was told to use 'error in a' then to output those from there that are FALSE. I was first creating a new column for each column but calling it the same with audit attached (ex. Program audit as a new column with the error in a statement true/false) THEN do a filter for all the 'error in a' as the false output. Does this make sense? Have you used the 'error in a' before?
two ways - either you check each value against the proper value which is type specific (ie for your number row it's is number=x, and for your string row it's is string="y"...
how i'd do it:
or you lose all the field types and you transpose with your record id as a group identifier. Then you check each column (now in value) as if it was string.
use a filter tool to filter out record ids where you have "false"
now you have the record ids - but not the full data set. You get the datasset back and the proper formatting by using a join with the pre-transform dataset with record id as the identifier.
quick question--can you do a filter to check for duplicates on a specific column?
and, for a specific column to have certain # of digits or spaces only?
thanks!
YES..
For 1) if you use summarize you can see what values appear multiple times by grouping by values and then using a count.
2) yes. but... to check for length of digits you'd either need to look to see if it's larger than a certain number of check the length via the string function len() - but that requires a string so len(tostring()) would be your nested function.
whitespaces would mean " " as opposed to null() or empty(). empty/null can be checked by isempty(). For whitespaces of an unknown length you can use a regex_match([field],"\s+") but it's a bit more complicated. Basically that formula would match fields where only " " appear.
So I placed summarize tool but do I add columns at the bottom inside of Summarize?
Also, for certain length to use len(tostring()) for that specific column but what do you mean by nested function?
THANKS!!!
summarize - group by record id. count distinct whatever column you are looking for duplicates in... duplicates will have two values or more.
nested in this case refers to two functions on the same field. so first convert it to string tostring() then get length on the string with len. so len(tostring())...
Thank you so much so far!!!
I got the summarize but for the nested functions --I have 3 columns: one is Sequence that needs to be # and 6 digits, one is Track ID needs to be 15 spaces, and One Mem ID that needs to be 9 spaces and not begin with 901 or 902.
Do I do this in the formula section for each one toString(Sequence) and Len(tostring(Sequence)) ???
here's how I would do it
logic test for sequence:
regex_match([sequence],"\d{6}")
len([trackid])=15 (note you could also use regex_match([trackid],".{15}")
with the last one the regex synatx is a bit more complex. you can do something like if len([memid]=9 and left([memid],3) in ("901","902") but i'll work on it.
if you are testing numbers for lengths - tostring - if the fields are already strings you can skip this step.