I have some email data that I have split into rows, so that each line of the email is in its own record.
I am trying to remove lines like "========" or "---------", or similar, that are used as spacers.
I have tried using both the Regex_Match() function and the Regex tool to find strings that do not contain any letters or numbers. I am using "\w+", which the internet tells me matches one or more word characters. However, the expression only seems to return true for lines that consist of a single word or a single number.
In my investigations, I have also used "[a-z]+" to try to match strings containing any lowercase characters, but even this does not work. I must be doing something very stupid, but I have worked as a coder and have used regex a lot, both in code and in Alteryx, so I cannot figure out why this is not working.
Thanks in advance for any help.
I would use Regex_Match([Email], '.*\w.*') to find strings containing at least one word character (a letter, digit, or underscore).
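A note on why a bare "\w+" behaves the way the question describes: the symptoms are consistent with Regex_Match returning true only when the pattern accounts for the whole string, which is why wrapping \w in .* on both sides helps. The sketch below illustrates that behaviour in Python rather than Alteryx. It uses re.fullmatch as a stand-in for Regex_Match (an assumption for illustration, not Alteryx's actual implementation), and the sample lines and the helper name full_match are made up.

```python
import re

# Hypothetical sample lines standing in for the [Email] field.
lines = ["Hello world", "========", "---------", "Total: 42", "word"]


def full_match(pattern: str, text: str) -> bool:
    """Return True only if the pattern matches the entire string.

    Assumption: this mimics a whole-string match, which is how
    Regex_Match appears to behave in the thread above.
    """
    return re.fullmatch(pattern, text) is not None


# A bare \w+ only succeeds when the whole line is one run of word characters.
bare = [full_match(r"\w+", s) for s in lines]

# Allowing anything before and after finds lines with at least one word character.
wrapped = [full_match(r".*\w.*", s) for s in lines]

for s, b, w in zip(lines, bare, wrapped):
    print(f"{s!r:15} bare={b!s:5} wrapped={w}")
```

With these samples, only "word" passes the bare pattern, while the wrapped pattern is true for every line except the two spacer lines, so the spacers are the rows where the wrapped expression is false.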
Hi @PhilipMannering, thanks for that. It did not seem to pick up lines that were all hyphens/dashes/minus signs. I think the hyphen is a special character used for defining a range, so I tried escaping it with a backslash, but I still could not get it to return true for "----" type lines. In the end I went with:
REGEX_Match([Email], "^(.)\1*$")
I found it on the internet. As I understand it, it puts the first character in a capture group and then looks for repeated instances of that same character through to the end of the line, so it is quite flexible if people use other characters as spacers.
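For anyone who wants to try the pattern outside Alteryx, here is a minimal Python sketch of the same expression. The helper name is_spacer and the sample strings are hypothetical, and Python's re module is only a stand-in for the Alteryx regex engine.

```python
import re

# ^(.)   capture the first character of the line
# \1*    then allow zero or more repeats of that same character
# $      and require the line to end there
SPACER = re.compile(r"^(.)\1*$")


def is_spacer(line: str) -> bool:
    """Return True if the line is one character repeated from start to end."""
    return SPACER.match(line) is not None


print(is_spacer("========"))   # True
print(is_spacer("---------"))  # True
print(is_spacer("-=-=-=-="))   # False: two alternating characters
print(is_spacer("Hello"))      # False
print(is_spacer("aaaa"))       # True: repeated letters also count
print(is_spacer(""))           # False: the group needs at least one character
```

Two things to watch for: a line made of one repeated letter or a single character also matches, and spacers that alternate characters (such as "-=-=") do not, so it may be worth checking a sample of the rows it flags.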
Thanks for the reply.
Thanks @Christina_H, that worked. I went with something slightly different (see above), but this is also good and will surely come in handy in future.
Much obliged!