I'm trying to extract Twitter handles from tweets because my data provider unfortunately doesn't separate out retweets and @mentions. I should be able to do this using the tweets themselves since all the information is contained in the text, but I'm having difficulties getting the formula correctly.
My table is set up with the following columns:
Link to Twitter account | Tweet text |
I want to do the following:
So for the following table:
Link to Twitter account | Tweet text |
https://twitter.com/suestanton | I like burgers |
https://twitter.com/jimsmith | Check out @bobsburgers |
http://twitter.com/jillknox | RT @jimsmith Check out @bobsburgers |
I'd get:
Link to Twitter account | Tweet text | Source | Retweet | Mention |
https://twitter.com/suestanton | I like burgers | suestanton | ||
https://twitter.com/jimsmith | Check out @bobsburgers | jimsmith | bobsburgers | |
http://twitter.com/jillknox | RT @jimsmith Check out @bobsburgers | jillknox | jimsmith | bobsburgers |
I've tried numerous variations of IF/ELSEIF combinations but can't seem to get anything that works. Any ideas?
Solved! Go to Solution.
Here's something that I baked up just for you...
Source:
REGEX_Replace([Link to Twitter account], ".*\/(.*)", '$1')
Source takes whatever is after last '/' symbol.
Retweet:
IIF(left([Tweet text],3) == 'RT ', REGEX_Replace([Tweet text], ".*@(.*?)\s.*", '$1'),'')
Retweet takes the first group (after a '@' symbol') when the tweet starts with "RT(space)" otherwise it is blank.
Mention:
IIF(contains([Tweet text],"@"), REGEX_Replace([Tweet text], ".*@(.*)", '$1'),'')
Mention takes the group of letters following the last '@' otherwise it is blank.
I hope that this helps you.
Cheers,
Mark
Unfortunately, this is not working for [Mention] because it pulls in everything after the "@" instead of just the word tied to it.
So:
Becomes:
It should be:
Try this update to the mention logic:
IIF(contains([Tweet text],"@"), REGEX_Replace([Tweet text], ".*@(.*?)$|.*@(.*?)\s.*", '$1'),'')
Cheers,
Mark
Another crack at it:
IIF(contains([Tweet text],"@"), REGEX_Replace([Tweet text], ".*?@(\w+).*", '$1'),'')
Cheers,
Mark