Hi Team,
I need to parse the address lines below into 4 columns, based on the examples below:
# 60 2th Main G A B 2nd Stage ZZZZZZZZZZ Bangalore North Bengaluru Karnataka IN560011 |
Zinkgräfstraße 62 Weinheim DE 69469 |
Panoramastrasse 3 Weinheim DE 69469 |
The output should look like this:
Address Line 1 | Address Line 2 | UBO CITY | Postal Code |
# 60 2th Main G A B 2nd Stage ZZZZZZZZZZ | Bangalore North Bengaluru Karnataka | IN | 560011 |
Zinkgräfstraße 62 | Weinheim | DE | 69469 |
Panoramastrasse 3 | Weinheim | DE | 69469 |
Kindly help me work out this logic.
Thanks
@Myusrename001 you may need to standardize your data intake before parsing in Alteryx. Otherwise, you'll find yourself in never-ending Regex land! :D
Jokes aside, here is one approach. Since the records don't really follow a discernible pattern, I went ahead and split the data set based on the number of words in each record. At the top of the workflow, I am handling all records with five or fewer words and extracting their address elements. At the bottom, I am handling the Bangalore entry, which is considerably longer.
I'll caveat this by saying that the solution works for this specific use case. If the longer entries follow an irregular pattern as well, I would stress (again) standardizing these before bringing the data set in.
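For anyone who wants to see the word-count split outside of Alteryx, here is a rough Python sketch of the same idea. It is not the workflow itself. The function name `parse_address`, the `long_tail_words` parameter, and the rule that the last four words before the country code form Address Line 2 are my own assumptions, chosen only because they fit the three sample records. It also assumes every record ends in a two-letter country code followed by a 5 or 6 digit postal code, with or without a space in between.

```python
import re

# Assumed tail: two-letter country code, optional space, 5-6 digit postal code,
# e.g. "DE 69469" or the fused "IN560011".
TAIL = re.compile(r"\s([A-Z]{2})\s?(\d{5,6})$")

def parse_address(raw, long_tail_words=4):
    """Split one raw record into (line 1, line 2, country, postal code)."""
    # Drop the trailing "|" delimiter and surrounding whitespace.
    text = raw.strip().rstrip("|").strip()
    match = TAIL.search(text)
    if match is None:
        return None  # record does not end the way the samples do
    country, postal = match.groups()
    words = text[:match.start()].split()
    if len(words) <= 3:
        # Short record (five or fewer words in total): the last word is the city.
        line1, line2 = words[:-1], words[-1:]
    else:
        # Long record: ASSUMPTION that the last `long_tail_words` words are
        # the locality. This holds for the Bangalore sample only.
        line1, line2 = words[:-long_tail_words], words[-long_tail_words:]
    return (" ".join(line1), " ".join(line2), country, postal)

for record in [
    "# 60 2th Main G A B 2nd Stage ZZZZZZZZZZ Bangalore North Bengaluru Karnataka IN560011 |",
    "Zinkgräfstraße 62 Weinheim DE 69469 |",
    "Panoramastrasse 3 Weinheim DE 69469 |",
]:
    print(" | ".join(parse_address(record)))
```

The fixed four-word tail is exactly the kind of rule that breaks as soon as a longer entry has a different shape, which is why standardizing the data first is the safer route.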
Hope this helps!
@rzdodson - Please see the logic below:
Address 1 | CITY | Country | Postal code |
Morc Cantin Royal Road Eau Coulee | Curepipe | MU | 74208 |
SV - 2, Block 7, Second Floor, Eldeco Utopia, Sector 93-A, Noida, | Gautam Budh Nagar Uttar Pradesh | IN | 201304 |
C-1,Pocket-6,Kendriya Vihar - 2 Sector-82, Noida, Gautam Budh Nagar Nodia | Nodia UTTAR PRADESH | IN | 201304 |
Fruehlingstrasse 19B | Heigenbruecken | DE | 63869 |
B-12/101-102 kailash manas CHS Mansarovar Complex Dhamankar naka Near varaldevi lake Bhiwandi | Thane Maharashtra | IN | 421302 |
G2-1501 THE MEADOWS ADANI SHANTIGRAM SG HIGHWAY, NEAR VAISHNO DEVI CIRCLE, | Daskroi Gujarat | IN | 382421 |
@Myusrename001 I am afraid that the data above needs to be seriously cleansed/standardized before continuing here. From the looks of it, apart from Fruehlingstrasse 19B, Zinkgräfstraße 62, and Panoramastrasse 3, which can be parsed out with (\w+\s\w+), no entry comes close to matching the others. There is a lot of superfluous "noise" in the data entries that does not seem to be required (e.g. floors of buildings, highways, vicinity to other landmarks, etc.).
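To get a feel for how much of the data would need cleansing, one option is to flag every Address 1 value that does not fit the simple two-token pattern. Below is a small Python sketch of that check, assuming the Address 1 column has already been separated out. The helper name `needs_cleansing` is hypothetical, and the sample values are copied from this thread.

```python
import re

# The simple "street number" shape mentioned above: exactly two word tokens.
# In Python 3, \w is Unicode-aware, so "Zinkgräfstraße" counts as one token.
SIMPLE = re.compile(r"\w+\s\w+")

def needs_cleansing(address_line_1):
    """Return True when the value is NOT a plain two-token address."""
    return SIMPLE.fullmatch(address_line_1.strip()) is None

samples = [
    "Morc Cantin Royal Road Eau Coulee",
    "SV - 2, Block 7, Second Floor, Eldeco Utopia, Sector 93-A, Noida,",
    "Fruehlingstrasse 19B",
    "Zinkgräfstraße 62",
    "Panoramastrasse 3",
]

# Keep only the entries that would have to be standardized first.
flagged = [s for s in samples if needs_cleansing(s)]
for entry in flagged:
    print("needs cleansing:", entry)
```

This only tells you which records fall outside the easy case. It does not attempt to parse them.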