Hi,
I found in this forum how to use a REGEX along with the "false" path of the Filter tool to pull only those rows where a column does not contain numeric data, REGEX_Match([column name], "\w+"). I placed this in a variable expression and utilized the 'false' of the Filter path to obtain only those rows where that column contains only alpha data. But, some of the cells in the column have dates formatted, YYYY-MM-DD.
Can I modify the REGEX expression to include data formatted like that? Or would I need to use an AND clause in the expression? If the latter, how would I formulate the expression to target date formatted information?
Thank you,
Chris
Solved! Go to Solution.
The following expression will find 4 numbers followed by a dash followed by 4 numbers followed by a dash followed by 2 numbers.
Regex_Match([Column Name],"\d{4}-\d{2-\d2")
Cheers,
Mark
The following expression will find 4 numbers followed by a dash followed by 4 numbers followed by a dash followed by 2 numbers.
Regex_Match([Column Name],"\d{4}-\d{2}-\d{2}")
Please mark this as the accepted solution. I mistakenly signed in with a wrong account.
Cheers,
Mark
I kinda sorta still get dates. This is the expression I'm using. I'm not sure how to attach a screen shot.
Would the below be the correct way to code the expression?
REGEX_Match([NEW VALUE],"\w+") OR
REGEX_Match([NEW VALUE],"\d{4}-\d{2-\d2")
REGEX_Match([NEW VALUE],"\w+") OR REGEX_Match([NEW VALUE],"\d{4}-\d{2}-\d{2}")
That should work to test for either condition as TRUE
I captured the rejected data in the True path of the Filter, and while the REGEX expressions are working 99.999%, there are two rows that should not have been rejected. The [New Value] column in those two records contain "Withdrawal" and the other "Rejection". I do a data cleanse before we reach this filter to remove trailing and leading spaces. I ran the workflow a couple of times to just see if it was a fluke. But this condition remains. I inspected the data in the desired 'false' path. Those two records do not appear there.
do you mean that they contain
"Withdrawl" or "Rejection" ?
Including the quote symbols?
Among the data I do not want ( 'true' path of filter) there is data that I do want ('false' path of filter).
Two records went to the 'true' path of not wanted data that should have gone to the 'false' path of wanted data.
The value of the column [New Value] in one of those records is Rejection. The value of the column [New Value] in the other record is Withdrawal.
They went to the True path because they match the pattern.
REGEX_Match([NEW VALUE],"\w+") OR REGEX_Match([NEW VALUE],"\d{4}-\d{2}-\d{2}")
The first statement looks for anything that is a single string of characters (e.g. WhyNotMatchMe)
Maybe you need to tweak that statement a little bit. Can you describe what you DO want to match a little more precisely?
Cheers,
Mark
Well, those two records represent about .0001% of the data that was properly rejected.
Here is what I understand the REGEX to be doing:
REGEX_Match([NEW VALUE],"\w+") will give us only numeric data (?)
REGEX_Match([NEW VALUE],"\d{4}-\d{2}-\d{2}") will give us only date formatted data (?)
The whole expression::
REGEX_Match([NEW VALUE],"\w+") OR
REGEX_Match([NEW VALUE],"\d{4}-\d{2}-\d{2}")
I take the above expression to mean that rows where the [New Value] column is numeric or where the [New Value] column has date formatted data will be selected. (?) This represents data that I do not want. The plot thickens.....
I use the false path of the filter tool because I want records where there [New Value] data is alpha only.
The true path of the filter would therefore contain the numeric and date information I do not want, the throw away information. Herein lies the problem. This path contains two measly rows that have defied the REGEX.