

Anyone interested in a Deep Learning NLP module for Alteryx? Stanford GloVe.

KaiLarsen
9 - Comet

Hi everyone,

 

I've developed a workflow that combines Latent Semantic Analysis (LSA) and Stanford GloVe (https://nlp.stanford.edu/projects/glove/) to turn a column of texts into high-dimensional vectors, one per text. I've compared it to LSA semantic spaces, and it generally outperforms them by quite a bit. The resulting vectors can be used either to calculate cosines between your texts or as features in supervised machine learning (assuming you have a target you are trying to predict). My university took out a patent on a much more involved version of this algorithm, but it seems unlikely that the rest of you would want to use it in that context, so I thought I'd gauge interest.
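To make the "vectors, then cosines" idea concrete, here is a minimal Python sketch. It is not the Alteryx workflow itself and leaves out the LSA part entirely: it simply averages word vectors per text and compares the results. The tiny 3-dimensional `TOY_VECTORS` table and the helper names `text_vector` and `cosine` are made up for illustration; real GloVe vectors have hundreds of dimensions.

```python
import math

# Toy 3-dimensional "embeddings" with made-up values, standing in for
# real GloVe vectors (which would be loaded from the GloVe file).
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
    "truck": [0.1, 0.8, 0.5],
}


def text_vector(text, vectors):
    """Average the vectors of the known words in a text.

    Words missing from the vocabulary are skipped; returns None if
    no word in the text is known.
    """
    found = [vectors[w] for w in text.lower().split() if w in vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


pets = text_vector("cat dog", TOY_VECTORS)
vehicles = text_vector("car truck", TOY_VECTORS)
print(cosine(pets, text_vector("dog", TOY_VECTORS)))  # close to 1
print(cosine(pets, vehicles))                         # much lower
```

The same per-text vectors could be fed to a supervised model as feature columns instead of being compared pairwise.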

 

If you're interested, give this post a like or let me know some other way. I'm on a slow connection up in the mountains, and this workflow requires the 6GB GloVe embeddings file, so I'd rather not spend the effort unless people see value in this. I know NLP has not been a huge focus for Alteryx.

 

Kai :-)

 

jamielaird
14 - Magnetar

I'd be interested to read a blog about this - perhaps you could reach out to the Community team, I'm sure they would love your contribution?

KaiLarsen
9 - Comet

The workflow is attached. Feedback and refinements are welcome. If enough people like it, we might create more of a tutorial around it.

 

Note that the workflow requires the following:

1. The SnowballC stemming package. Installation instructions are in the R tool.

2. The Stanford GloVe vectors. A comment in the workflow explains where to download the file and how to link it.

 

Hope it is of some use to you all.

 

Kai :-) 

AnkitVarshney
5 - Atom

Hi Kai,

 

Can you please confirm which file format should be used to load the GloVe vector file? It is a .txt file, and loading it as a .CSV file does not yield any result.

Also, the file is too large to open in Notepad, so I cannot inspect its structure.

 

Regards,

Ankit

KaiLarsen
9 - Comet

Hi Ankit,

 

I'm traveling right now, so I can't confirm the answer to your question, but for now, use this file to work around the problem: https://www.dropbox.com/s/ly142hyd4ysicql/glove.840B.300d.yxdb?dl=0
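In the meantime, if you want to look inside the raw .txt file without opening all of it: as far as I know, the GloVe text downloads have no header row, and each line is a token followed by its values separated by single spaces, which may be why a comma-delimited CSV import comes back empty. Below is a rough Python sketch that reads only the first few lines. The function names `parse_glove_line` and `peek` are made up for illustration, and the default of 300 values per line is an assumption taken from the "300d" in the file name.

```python
from itertools import islice


def parse_glove_line(line, dim=300):
    """Split one line into (token, [floats]).

    Splitting from the right keeps the token intact even if it
    happens to contain a space itself.
    """
    parts = line.rstrip("\n").rsplit(" ", dim)
    return parts[0], [float(x) for x in parts[1:]]


def peek(path, n=3, dim=300):
    """Parse only the first n lines, so the large file is never loaded whole."""
    with open(path, encoding="utf-8") as fh:
        return [parse_glove_line(line, dim) for line in islice(fh, n)]


# Example usage (the path is hypothetical; point it at your own download):
# for token, values in peek("glove.840B.300d.txt"):
#     print(token, len(values), values[:5])
```

If each parsed line shows a token and 300 values, the file is intact and the import settings (delimiter and header row) are the thing to check.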

 

Kai 🙂
