Hi
I am new to Alteryx and am having trouble writing the right regular expression.
I have a list of addresses. I used this regular expression: \d+\s.|\d+ (tokenize)
to get the "Husnr1" column.
adresse1 | Husnr1 | vejnavn1 | vejnavn2 | vejnavn3 |
Egehegnet 2 C | 2 C | Egehegnet | | |
Mariendalsvej 50 F,4 tv | 50 F | Mariendalsvej | tv | |
Dronninggårds Alle 138 | 138 | Dronninggårds | Alle | |
For vejnavn1, vejnavn2 and vejnavn3 I used this expression:
\<\u\l+\> (tokenize).
What I would like to get as an output is this:
adresse1 | Husnr1 | vejnavn1 | vejnavn2 | vejnavn3 |
Egehegnet 2 C | 2 C | Egehegnet | | |
Mariendalsvej 50 F,4 tv | 50 F | Mariendalsvej | | |
Dronninggårds Alle 138 | 138 | Dronninggårds | Alle | |
I want everything after the comma (e.g. "4 tv") to be removed.
Thanks for your help in advance.
Regards, Trine
To remove everything after the comma, you could simply split to columns on a comma delimiter and keep only the first column.
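Outside Alteryx, the same idea can be sketched in a few lines of Python. This only illustrates the logic and is not the Alteryx configuration itself; the function name strip_after_comma is made up for the example, and the addresses are the three from the question.

```python
def strip_after_comma(address: str) -> str:
    """Keep only the part of the address before the first comma."""
    # split(",", 1) cuts at the first comma only; [0] is the part before it
    return address.split(",", 1)[0].strip()


addresses = [
    "Egehegnet 2 C",
    "Mariendalsvej 50 F,4 tv",
    "Dronninggårds Alle 138",
]

cleaned = [strip_after_comma(a) for a in addresses]
print(cleaned)
# ['Egehegnet 2 C', 'Mariendalsvej 50 F', 'Dronninggårds Alle 138']
```

Addresses without a comma pass through unchanged, so the step is safe to apply to the whole column before tokenizing.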
I think your regular expression is correct. Just untick the Case Insensitive box. With the box ticked, \u also matches lowercase letters, which is why "tv" is picked up; with it unticked, you only tokenize words that start with a capital letter.
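To see the effect of the case-sensitivity setting, here is a rough Python equivalent. Python's re module does not support \u, \l, \< or \>, so the pattern below spells out the character classes (including the Danish letters æ, ø and å) and uses \b for the word boundaries. That pattern and the tokenize helper are assumptions made for the illustration, not what Alteryx runs internally.

```python
import re

# Assumed Python stand-in for \<\u\l+\>: one capital letter followed by
# one or more lowercase letters, bounded by word boundaries.
CAPITALISED_WORD = r"\b[A-ZÆØÅ][a-zæøå]+\b"


def tokenize(address: str, case_insensitive: bool = False) -> list[str]:
    """Return every match of the pattern, like a tokenize step would."""
    flags = re.IGNORECASE if case_insensitive else 0
    return re.findall(CAPITALISED_WORD, address, flags)


address = "Mariendalsvej 50 F,4 tv"

# Case insensitive: "tv" also matches, as in the unwanted output
print(tokenize(address, case_insensitive=True))   # ['Mariendalsvej', 'tv']

# Case sensitive: only the capitalised word is returned
print(tokenize(address, case_insensitive=False))  # ['Mariendalsvej']
```

The single letter "F" is never matched in either mode, because the pattern requires at least one more letter after the first.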