Good morning folks,
I'm using address data in a project and I need to extract only the numbers from the given address. In other software packages I was able to use the INCLUDE or EXCLUDE function to allow me to arrive at a field with 'just numbers' or 'just letters'. I cannot seem to replicate this action in Alteryx and I have very little experience using REGEX. Any help is appreciated. The table below shows my address field and Address Key is my desired result.
Thanks,
George
Address Field | AddressKey |
830 New Jersey Avenue, Brooklyn NY 11207 | 83011207 |
2344 Lorillard Place, Apt 3, Bronx NY 10458 | 2344310458 |
3333 101st Street, Corona NY 11368 | 333310111368 |
1415 Ave O, Brooklyn NY 11230 | 141511230 |
Solved! Go to Solution.
In this case, I would recommend REGEX to look for address patterns. I'm not good enough to rattle off regex by memory, but i'm sure others will chime in.
You could also do some fancy formulas by finding the numeric location of the "first open space" in the string and then using LEFT function to grab the leading numbers and then similar to grab the ending (zip code) numbers. Once you've got those, simply concatenate them together.
Depending on your comfort level, will dictate your approach. -Jay
For what you are trying to do maybe you can just use the data cleansing tool as it can remove all the whitespace, letters and punctuation and you will be left with the result that you are looking for.
Just dublicate the column if you want to keep the original data also :)
Else LMK and i can help you with some Regex
Thanks for responding @jrlindem. I really need to get up to speed on REGEX as I can see its power. I was trying to avoid doing multiple steps as you indicate because I know there's much simpler solutions. I also put this task out there because I love the input from the community and the creativity can help in other areas! Thanks again!
Fun one! @jrlindem is right that RegEx is best - here's a possible solution:
I like @Mathias_Nielsen 's idea too!
@gfisch13 here is a regex formula
REGEX_Replace([Address Field], "[^0-9]", "")
You can also do it with a data cleanse tool, take a look at the attached workflow for more explanation
Three good options:
@Mathias_Nielsen with Cleanse Tool
@alexnajm / @aatalai with Regex
And here's a long-form formula sequence to get there (making assumptions about street number existing and zip code always being at the end):
I want to also join in with my Regex solution! :D
Duplicated the column and used Regex to replace all but the digits.
Thank you all for your input. I'm testing each suggestion as its a great learning opportunity. As suggested a few times, the Data Cleansing Tool worked perfectly.
I've been away from Alteryx for a while being unemployed, but my new company has allowed me to introduce this technology, I'm just a bit rusty and the tools changed a bit.
Appreciate this community, great minds!!
User | Count |
---|---|
107 | |
85 | |
76 | |
54 | |
40 |