I'm trying to extract all numeric data from a cell that has only 9 digits however it also is extracting data that is 9+ digits.
Right now I am using this regular expression: (\d{9})
What else can I add to this expression to stop it from collecting data longer than 9 digits?
Is this one string of data? are you tokenizing? Perhaps you can think about what should come after the 9th digit? like a space so (\d{9})\s or any non-character (\d{9})\U
If they are individual values - you can apply a regex_match([yourfield],"\d{10,}") to filter for values where there are more than 9 digits.
@arbencukaj maybe there are multiple matches in your data e.g. 2x or 3x 9 digits?
you may also want to check if these digits appear at the start of the cell - then you could add "^" at the beginning of regex pattern
@arbencukaj
Better you can show us some sample data? 😁
In order to only pick up the cells that contain 9 digits, you need to update your regular expression to the following:
^(\d{9})$
The updates I made to your regular expression are in red. The starting carrot (^) indicates the start of a string, and the ending dollar sign ($) indicates the end of a string. Without them, the regular expression you previously put (\d{9}) will match any cell that contains 9 digits, even if those 9 digits are not at the beginning or if the 9 digits are within a cell containing other data.
By including the ^ and the $ at the beginning and end, respectively, we are crafting a regular expression to match a string that begins, has 9 digits, and ends.
I've attached the solution (created in Alteryx Designer, version 2020.4).
Hope this helps!
@jbichachi003- that would work if the 9 digits are always the only element in the entry - but it wouldn't work if say b111111111 was a potential match. At this point we need more clarity from the OP how the data looks
Agreed. I based my response on OP's initial comment (below), assuming OP meant they were looking for cells with only 9 digits. But I may be wrong, and we'll have to wait for OP to clarify.
@arbencukaj wrote:I'm trying to extract all numeric data from a cell that has only 9 digits