Between @DavidM and me, we have shown numerous weird and wonderful ways to extract text from PDF and Word docs using a mix of Python and R.
Once you have the raw text, what else could you do with it?
Recently I was speaking to a few banking customers. They love the idea of carrying out sentiment scoring on their data, but for security reasons they cannot easily do this without pushing their sensitive data out via an API, which is obviously not a viable option.
I thought this was a great challenge and set out to build an offline sentiment scoring macro. It also ties in with the previous parsing examples built to extract text from PDF/Word docs.
We definitely needed to catch up. I was doing a very similar thing using TextBlob.
I have kept the Python minimal, just appending the polarity and subjectivity. You can then work out the sentiment in Alteryx rather than in Python.
My original use case was Twitter data, hence the reference in the code, but it should be very straightforward to change the column names.
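# Install TextBlob into the Alteryx Python environment
from ayx import Alteryx, Package
Package.installPackages(['textblob'])
from textblob import TextBlob
# Read the incoming data; the text to score is in the 'Tweet' column
tweetdata = Alteryx.read("#1")
# Polarity of each row, from -1.0 (negative) to 1.0 (positive)
polarityval = []
i = 0
while i < len(tweetdata):
    p = TextBlob(tweetdata.iloc[i]['Tweet'])
    polarityval.append(p.sentiment.polarity)
    i = i+1
tweetdata["polarity"] = polarityval
# Subjectivity of each row, from 0.0 (objective) to 1.0 (subjective)
subjectivityval = []
i = 0
while i < len(tweetdata):
    s = TextBlob(tweetdata.iloc[i]['Tweet'])
    subjectivityval.append(s.sentiment.subjectivity)
    i = i+1
tweetdata["subjectivity"] = subjectivityval
# Send the scored rows to output anchor 1 (adjust the anchor number to suit your workflow)
Alteryx.write(tweetdata, 1)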
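If you would rather label the sentiment in Python than in Alteryx, a minimal sketch could look like the one below. The function name `label_sentiment` and the 0.1 threshold are my own assumptions for illustration, not anything TextBlob defines, and the sample scores are made up to stand in for the "polarity" column.

```python
def label_sentiment(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse sentiment label.

    The 0.1 threshold is an illustrative assumption, not a TextBlob default.
    """
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"


# Hypothetical polarity scores standing in for the "polarity" column
sample_scores = [0.8, 0.05, -0.4, 0.0]
labels = [label_sentiment(score) for score in sample_scores]
print(labels)  # ['Positive', 'Neutral', 'Negative', 'Neutral']
```

Inside the Python tool, the same function could probably be applied with something like tweetdata["polarity"].apply(label_sentiment). You will likely want to tune the threshold against your own data.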
Also, everyone remember: to install new packages, you need to make sure Alteryx is running with admin privileges.