I have to clean a dataset but the data are messed up. For example, I want to replace the last word in a string only if it is "KG." into "KILOGRAM". I do not know how to write the regex for that.
I also wonder how to extract certain conditional data to a new column. For example, I want to extract the whole string that is a FRUIT. I want. my data to look like the following:
DATA DATA NEW DATA_FRUIT ID
255 KG. 255 KILOGRAM
FRUIT ID 978 FRUIT ID 978
119 KG 119 KG
Should I use find and replace or REGEX?
Solved! Go to Solution.
Hi @basketoforange ,
A simple replace function I believe would be enough for this use case. You can use a conditional if statement, (for example If endswith[name of field],"KG" then Replace(...) ) to make it more precise, if you are sure that the unit will always appear after the fruit.
Hope that helps,
Angelos
Just for a regex answer....
because you could have:
5 kg is smaller than 10kg
Regex_Replace([text],".*(kg\.)",'kilogram')
everything until the last kg followed by a period and replace the kg. With kilogram.
you might want (kg\.*) in case the period is not there.
cheers,
mark
Add to @MarqueeCrew and @AngelosPachis , I consider the case sensitivity here and your 2nd part.
Just want to confrim
KG. -->KILOGRAM
KG -->KG
@basketoforange
If you think our answers help, appreciate you would mark them as accept.😁