Alteryx Machine Learning Discussions

Sentence similarity Large dataset

8 - Asteroid

Hello guys, I have a large dataset that contains expense (cost) codes and a description for each expense. The description is a free-text entry that contains more than what's needed. Since the codes are not accurate enough to rely on, I want to relate the descriptions to each other by sentence similarity.

4455 travel expenses
4466 flight expenses
7788 medical expenses
7788 injury costs
9900 medicine bought from a pharmacy
4455 train tickets to London
8822 bonus for an employee getting ....
8822 employee target reached expenses


The desired outcome is that every related cost has the same code. The actual data is huge and has full sentences as descriptions, but related descriptions always share similar keywords.

1100 travel expenses
1100 flight expenses
1100 train tickets to London
2200 medical expenses
2200 injury costs
2200 medicine bought from a pharmacy
9999 bonus for an employee getting ....
9999 employee target reached expenses
11 - Bolide

It sounds like you just need some pattern matching, plus some data validation to control user input.
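As a rough illustration of the pattern-matching idea, here is a minimal Python sketch (outside Alteryx) that maps each description to a code by keyword. The keyword lists, the `assign_code` helper, and the `REVIEW` marker are assumptions made up from the sample rows in the question, not a definitive rule set; real data would need a longer, reviewed keyword list.

```python
import re

# Hypothetical keyword lists, guessed from the sample rows in the question.
KEYWORDS_BY_CODE = {
    "1100": ["travel", "flight", "train"],
    "2200": ["medical", "injury", "medicine", "pharmacy"],
    "9999": ["bonus", "employee", "target"],
}

# One case-insensitive pattern per code. The \b word boundaries mean
# "train" matches "train tickets" but not "training".
PATTERNS = {
    code: re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    for code, words in KEYWORDS_BY_CODE.items()
}

# Hypothetical marker for rows that match no keyword and need manual review.
UNMATCHED = "REVIEW"


def assign_code(description):
    """Return the first code whose keyword pattern occurs in the description."""
    for code, pattern in PATTERNS.items():
        if pattern.search(description):
            return code
    return UNMATCHED


descriptions = [
    "travel expenses",
    "flight expenses",
    "medical expenses",
    "injury costs",
    "medicine bought from a pharmacy",
    "train tickets to London",
    "employee target reached expenses",
]

for text in descriptions:
    print(assign_code(text), text)
```

Rows that come back as `REVIEW` are the ones where the free text contains none of the known keywords. That is where the data validation part comes in: either extend the keyword list or restrict what users can enter.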






