We are looking to scan a large number of record and look in four of the columns of freehand input (text boxes). If any of the words in those inputs match against a list of words/phrases, then indicate a match and where. We'll build out some tableau off of these results. I'm curious how others would approach this.
We're looking for exact matches, so 'cannot' would not cause a flag if the keyword list contained 'not'. Similarly, 'not a defect' would not flag if our keyword list was 'zero defect'. I've attached a dummy workbook the loosely mirrors our data set up.
Thanks for any help!
Solved! Go to Solution.
Try this out! The Find & Replace tool should do the trick for determining IF there is a match between your lookup list and the actual data (I had to make up a list because the Excel example didn't come through). Then, I wasn't quite sure what you meant by "where" the match occurred, but using the assumption that you meant which character was the start of the word match, I included a Formula tool and a pair of formulas that should give you the character position of the start of the matching word. That was totally a guess, so if it wasn't quite what you were looking for, let me know! 🙂
Cheers!
NJ
Hi @BigDataGeek
Extending @NicoleJohnson's excellent solution to handle multiple columns.
Basically transpose the text columns and then come up with a format for the match information since each row can now have matches from multiple columns. I chose to have a pipe delimited list.
Dan
Thank you both. Its on the right track, but the where part is different than the structure I'm trying to get to. That said, the location function is a great way to know where in the freehand text it could be found.
In the sample excel that didn't populate right, I had multiple columns of data of 'where' the match could come from. I'm looking for a way to scan all four columns of data, pick up matches and output in a format I can continue on in Tableau with further reporting. I think having each match on a separate row would be best.
ID | Where | Word |
1 | Text2 | zero defect |
1 | Text1 | homogeneous |
3 | Text3 | homogeneous |
Here is how that missing excel was structured.
Thanks
Hi Dan,
This has worked well to search multiple columns, but I'm only picking up one word or 'Reason' right now. One row may have multiple columns of data and each column may have multiple 'Reason's found. How would you approach this to continue searching after a word is found and report all words found?