Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Extracting multiple words from a string into different column

CowCookie
6 - Meteoroid

I'm trying to extract Twitter handles from tweets because my data provider unfortunately doesn't separate out retweets and @mentions. I should be able to do this using the tweets themselves since all the information is contained in the text, but I'm having difficulties getting the formula correctly.

 

My table is set up with the following columns:

 

Link to Twitter accountTweet text

 

I want to do the following:

  1. Move the unique part of [link to Twitter account] to a new column called [source]
  2. If [tweet text] contains a word starting with "RT @", move that word into a new column called [retweet].
  3. If [tweet text] contains a word starting with just "@", even if it also contains a word starting with "RT @", move that word into a new column called [mentions].

So for the following table:

 

Link to Twitter accountTweet text
https://twitter.com/suestantonI like burgers
https://twitter.com/jimsmithCheck out @bobsburgers
http://twitter.com/jillknoxRT @jimsmith Check out @bobsburgers

 

I'd get:

 

Link to Twitter accountTweet textSourceRetweetMention
https://twitter.com/suestantonI like burgerssuestanton  
https://twitter.com/jimsmithCheck out @bobsburgersjimsmith bobsburgers
http://twitter.com/jillknoxRT @jimsmith Check out @bobsburgersjillknoxjimsmithbobsburgers

 

I've tried numerous variations of IF/ELSEIF combinations but can't seem to get anything that works. Any ideas?

4 REPLIES 4
MarqueeCrew
20 - Arcturus
20 - Arcturus

@CowCookie,

 

Here's something that I baked up just for you...

 

Capture.png

 

Source:

REGEX_Replace([Link to Twitter account], ".*\/(.*)", '$1')

Source takes whatever is after last '/' symbol.

 

Retweet:

IIF(left([Tweet text],3) == 'RT ',
REGEX_Replace([Tweet text], ".*@(.*?)\s.*", '$1'),'')

Retweet takes the first group (after a '@' symbol') when the tweet starts with "RT(space)" otherwise it is blank.

 

Mention:

IIF(contains([Tweet text],"@"),
REGEX_Replace([Tweet text], ".*@(.*)", '$1'),'')

Mention takes the group of letters following the last '@' otherwise it is blank.

 

I hope that this helps you.

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
CowCookie
6 - Meteoroid

Unfortunately, this is not working for [Mention] because it pulls in everything after the "@" instead of just the word tied to it. 

 

So:

  • Hey @johnsmith check out bob's burgers

 

Becomes:

  • johnsmith check out bob's burgers

 

It should be:

  • johnsmith

 

 

MarqueeCrew
20 - Arcturus
20 - Arcturus

Try this update to the mention logic:

 

IIF(contains([Tweet text],"@"),
REGEX_Replace([Tweet text], ".*@(.*?)$|.*@(.*?)\s.*", '$1'),'')

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
MarqueeCrew
20 - Arcturus
20 - Arcturus

Another crack at it:

 

IIF(contains([Tweet text],"@"),
REGEX_Replace([Tweet text], ".*?@(\w+).*", '$1'),'')

Capture.png

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Labels