Start your journey with Alteryx Machine Learning - Take our Interactive Lesson today!
Start your learning journey with Alteryx Machine Learning Interactive Lessons
Go to LessonsHello guys I have a large dataset set that contains codes for (costs) expenses and a description but the description for the expenses is a free text entry that contains more than what's needed. I want to relate the description to each other by the similarity between sentences since the codes are not accurate to rely on.
4455 | travel expenses |
4466 | flight expenses |
7788 | medical expenses |
7788 | injury costs |
9900 | medicine bought from a pharmacy |
4455 | train tickets to London |
8822 | bonus for an employee getting .... |
8822 | employee target reached expenses |
The desired outcome is every related cost has the same code. The actual data is huge and have full sentences as a description but always has similar keywords.
1100 | travel expenses |
1100 | flight expenses |
1100 | train tickets to London |
2200 | medical expenses |
2200 | injury costs |
2200 | medicine bought from a pharmacy |
9999 | bonus for an employee getting .... |
9999 | employee target reached expenses |
it sounds like you just need some pattern matching.. also some data validation to control user input..
A very awesome blog post. We are really grateful for your blog post. You will find a lot of approaches after visiting your post.
Thanks for great share.
exploring more content on this forum... techzpod download mobdro