Hello REGEX experts,
I have a fairly simple issue that I could definitely use some help with. I am joining two datasets using municipalities as one of the primary key fields. However, one of the datasets didn't come cleanly. For example, Boulder, Colorado is simply listed as "Boulder" in one table, but it is listed as "BoulderCounty" or "Boulder county" in the other, causing incorrect joins (if any at all).
How can I use regular expressions to either complete delete "county" or replace the "county" text with a space? I can work through it afterwards. I tried referencing the Regex Documentation to get started but it didn't click for me right away.
Really, really, really appreciate it!
Attached file is just a text input excel example of what I'm working with.
Solved! Go to Solution.
Hi @GSWill
you don’t have to use Regex for this. One way is to first use formula Replace(<yourfield>, “County”, ‘’) followed by a Datacleanse tool to trim out the extra/ trailing/leading spaces.
if you must use Regex for some reason, you can use the Regex Tool set to Replace mode, and the string to replace is
or you can do Regex under Parse mode to extract anything but the county by something similar to this:
(.*)\s*County
dawn
@GSWill
I assume you want the similar strings to all be replaced as "Boulder".
You can use if or swith if you have several replacement.
Please see below:
Attached the workflow,
Hope this helps!
Regards,
Hey @GSWill I think @DawnDuong's solution is best here, but if I could amend it slightly, the TRIM() function will remove leading and trailing whitespace (and is a lot more performant than the data cleanse tool). So
TRIM(
REPLACE([Field],'county','')
)
Will remove all "county"s and then also get rid of any trailing whitespace.
Ollie
Hi Ollie, thank you for your reply! I was just wondering if replace is case sensitive so that if data returns as "COUNTY", "County", or "county" if it will still cleanse. Thanks!
Thank you so much for your reply Messi! My dataset contains all counties in the United States so I have to focus on removing the "county" section of the string instead of specifying the name, otherwise I would exactly implement your solution. Much appreciate your help!
Thank you so much for your reply Qiu! My dataset contains all counties in the United States so I have to focus on removing the "county" section of the string instead of specifying the name, otherwise I would exactly implement your solution. Much appreciate your help!
Thank you so much for your help Dawn! Is the replace function case-sensitive? Thanks!
Hi @GSWill
the replace function (to be used in Formula tool) is case sensitive
if you use the Regex Parse or Replace, you can select the case insensitive option
dawn