Hi community,
I have a regex query,and am unable to get my head around it. Might be simple for you guys.
I have a column which looks like so:
column
ABC Oy Apple Road
DE AF Ltd. Band Road
A DE Oy Col Street
GRE Plc Main Street
I would like to remove all characters before Oy/Ltd./Plc including these so that all that is left is the address (Apple Road, Band Road....).
Thanks a lot!
Rohit
Solved! Go to Solution.
Hi @adrrs
REGEX needs a pattern.
Is your pattern 3 characters, space, and then your address?
If yes, you don't even need REGEX for that.
A String function would do it.
Right(Field, Length(Field) - 4)
Cheers,
Hi,
Not exactly. It was just an example but a real world example would be more like so:
Apple Inc. Some Road
General Motors Plc. Detroit St.
The point is that I have identified these Inc/Plc/Oy as the common characters in all observations that can enable me to split the variable so that I aonly retain what comes after these characters.
Thanks!
You could try something like this, but be careful
REGEX_Replace([Field1], ".*(?:\bLtd\b|\bOy\b|\bInc\b|\bPlc\b)\W*(.*)", "$1")
This probably might do the trick, if you see other words as patterns, you could include in the alternate expression.
Cheers,
Solution from @Thableaus is a good one
Another solution would be to use formula tool with
IF FindString([Field1],"Plc") >0 THEN
Trim(Right([Field1], Length([Field1]) - FindString([Field1],"Plc")-4))
ELSEIF FindString([Field1],"Oy") >0 THEN
Trim(Right([Field1], Length([Field1]) - FindString([Field1],"Oy")-3))
ELSEIF FindString([Field1],"Ltd") >0 THEN
Trim(Right([Field1], Length([Field1]) - FindString([Field1],"Ltd")-4))
ELSEIF FindString([Field1],"Inc") >0 THEN
Trim(Right([Field1], Length([Field1]) - FindString([Field1],"Inc")-4))
ELSE [Field1] ENDIF
You could use the Replace() function to substitute a common value for your list of values you would split on (e.g. a pipe '|'), and then you could use TextToColumns to do the split...
Something like REPLACE(REPLACE(REPLACE([Field],'Inc.','|'),'Plc','|'),'Oy','|')
You could also use the FindReplace tool to do the equivalent...which might be a little visually simpler to manage over the nested replaces. You'd basically have a Text Input tool with a "FIND" column populated with Inc/Plc/Oy and a "REPLACE" column that's your pipe '|' character. This feeds into the R connector on the Find Replace tool and your data connects to the F connector.
Thanks! This worked, although it gets ugly the more conditions you have.
Using the Find Replace tool would make the workflow tidier because the number of 'conditions' is handled in the input to the tool, not in a nested formula.