I want to identify and replace the third occurrence of the "T" character in each data row:
AGCTTAGGCGAGTGCGAGTGCGATA
AGCTAGGCCGTAAAGCGAGGAGCCC
CTAGCATGCATGGGACCTAGGACCA
TAGAGATCGACGATTTACGAGGTTC
to
AGCTTAGGCGAGUGCGAGTGCGATA replaces it at character 12 (base 0)
AGCTAGGCCGTAAAGCGAGGAGCCC only 2 occurrences, replaces nothing
CTAGCATGCAUGGGACCTAGGACCA replaces it at character 10
TAGAGATCGACGAUTTACGAGGTTC replaces it at character 13
This is a simple case. I'm looking for something like this that will work for any occurrence and any character or string more generally. I expect it will require REGEX, but I'm not yet proficient with it.
Use the RegEx Tool
Configuration
Everything else is default
This replaced the 3rd occurrence of T in any string with U.
This is what I did:
But it replaced all the Ts, not just the third instance:
Am I missing something?
Whoops, sorry about that! Thought I had it there... let me keep trying.
While @bgraves is going down a regex path, here is a brute force way to accomplish the task:
1. Tokenize each value
2. Find 3rd T
3. Replace with U
4. Restructure Data
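For readers outside Alteryx, the four steps above can be sketched in Python. This is only an illustration of the same brute-force idea, not the workflow itself; the function name replace_nth_char and its parameters are made up for this example.

```python
def replace_nth_char(text, target, replacement, n):
    """Replace the n-th (1-based) occurrence of a single character.

    Hypothetical helper that mirrors the brute-force steps; returns the
    text unchanged when there are fewer than n occurrences.
    """
    chars = list(text)                  # 1. tokenize into characters
    seen = 0
    for i, ch in enumerate(chars):
        if ch == target:
            seen += 1
            if seen == n:               # 2. found the n-th occurrence
                chars[i] = replacement  # 3. replace it
                break
    return "".join(chars)               # 4. restructure into one string


rows = [
    "AGCTTAGGCGAGTGCGAGTGCGATA",
    "AGCTAGGCCGTAAAGCGAGGAGCCC",
    "CTAGCATGCATGGGACCTAGGACCA",
    "TAGAGATCGACGATTTACGAGGTTC",
]
for row in rows:
    print(replace_nth_char(row, "T", "U", 3))
```

The second row has only two Ts, so it comes back unchanged, matching the expected output in the original question.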
Nice one, @MarqueeCrew
Here's an updated Regex ... it's still not quite right!
(?:.*?(T)+){3}.*?((T)+)
Couple issues:
I've racked my brains on RegEx too... surprisingly difficult! Anyway, here's another brute force approach:
@pcatterson, as an FYI...if you get a workflow sent that is "beyond" your version, you can usually just open the YXMD in a text tool (like Notepad) and change the version to your earlier one. It's in the second line of code.
Of course the caveat is if the workflow uses tools that are in the newer version only, it won't work.
I think the following formula should work:
REGEX_Replace([Input],"(.*?T.*?T.*?)T(.*)","$1U$2")
or a slightly tweaked version that makes it easier to change the instance number (just change the 2, which is the target occurrence minus one):
REGEX_Replace([Input],"((.*?T){2}.*?)T(.*)","$1U$3")
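To show how the same pattern generalizes to any occurrence and any search string, here is a rough Python equivalent built on the standard re module. The function name replace_nth is an assumption for illustration; the pattern is the one from the formula above, with the search text escaped and the repeat count computed as n - 1.

```python
import re


def replace_nth(text, target, replacement, n):
    """Replace the n-th (1-based) non-overlapping occurrence of target.

    Hypothetical helper: builds ((?:.*?T){n-1}.*?)T, anchored at the start
    of the text, and keeps everything captured before the n-th match.
    """
    if n < 1:
        raise ValueError("n must be 1 or greater")
    t = re.escape(target)  # treat the search text literally, not as regex
    pattern = re.compile(rf"\A((?:.*?{t}){{{n - 1}}}.*?){t}", re.DOTALL)
    # A function replacement avoids any backslash handling in `replacement`.
    return pattern.sub(lambda m: m.group(1) + replacement, text, count=1)


print(replace_nth("AGCTTAGGCGAGTGCGAGTGCGATA", "T", "U", 3))
print(replace_nth("AGCTAGGCCGTAAAGCGAGGAGCCC", "T", "U", 3))  # only 2 Ts
print(replace_nth("abcabcabc", "bc", "X", 2))
```

If the text contains fewer than n occurrences, the pattern does not match and the text is returned as-is, which is the same behavior as the REGEX_Replace formulas.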