I've developed a workflow that combines Latent Semantic Analysis (LSA) and Stanford GloVe (https://nlp.stanford.edu/projects/glove/) to turn a column of texts into high-dimensional vectors, one per text. I've compared it to LSA semantic spaces, and it generally outperforms them by quite a bit. The resulting vectors can be used either to calculate cosines between your texts or as features in supervised machine learning (assuming you have a target you are trying to predict). My university took out a patent on a much more involved version of this algorithm, but it seems unlikely the rest of you will want to use it in that context, so I thought I'd gauge interest.
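To give a feel for the "cosines between your texts" use, here is a minimal Python sketch. It is not the workflow described above and not the patented algorithm: it simply averages word vectors per text, a common baseline, and skips the LSA step entirely. The tiny `TOY_EMBEDDINGS` table, its values, and the function names are all made up for illustration; real GloVe vectors have far more dimensions and come from the downloaded embeddings file.

```python
import math

# Toy 3-dimensional vectors standing in for real GloVe embeddings.
# The words and values are invented for illustration only.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "pet": [0.7, 0.3, 0.1],
    "stock": [0.0, 0.9, 0.4],
    "market": [0.1, 0.8, 0.5],
}


def text_vector(text, embeddings):
    """Average the vectors of the known words in a text.

    Returns None when no word in the text has an embedding.
    """
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


pets = text_vector("cat dog pet", TOY_EMBEDDINGS)
animals = text_vector("dog cat", TOY_EMBEDDINGS)
finance = text_vector("stock market", TOY_EMBEDDINGS)

# Texts about similar topics should score a higher cosine than unrelated ones.
print(cosine(pets, animals) > cosine(pets, finance))
```

The same per-text vectors could be passed as feature columns to a supervised model instead of being compared pairwise.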
If you're interested, give this a like or let me know some other way. I'm on a slow connection up in the mountains, and this workflow requires the 6GB GloVe embeddings file, so I'd rather not spend the effort unless people see value in this. I know NLP has not been a huge focus for Alteryx.