Multiple Find Replace in one column?

I'm looking to analyze a free-response survey question with Find Replace.

Right now I have created a legend to categorize responses based on keywords in the response.

For example, if a response is "I like to make chocolate" or "choco is my favorite", it is categorized as "Chocolate". If a response says "I like to bake pudding" or "vanilla pudding is my favorite", it is categorized as "Pudding".

Based on the key of:

Keyword	Category
Chocolate	Chocolate
Choco	Chocolate
Pudding	Pudding

I am using this key with the Find Replace tool. The category is appended into a new column. Currently, the output would look like this:

Response	Category
I like to make chocolate	Chocolate
choco is my favorite	Chocolate
i like to bake pudding	Pudding
vanilla pudding is my favorite	Pudding

My issue is that if a response has keywords for both the Chocolate and Pudding categories (for example, "I like to bake both chocolate and pudding"), I want it to be categorized into both the Chocolate and the Pudding categories. With the Find Replace tool, it is currently categorized only into the category that comes first, in this case "Chocolate".

My current workaround is to run the Find Replace tool multiple times, once for each category, and have each category append a new column. However, this is quite infeasible/unrealistic, as I'll likely have upwards of 100 categories and don't want to manage 100 Find Replace tools.

Is there a way to do this within the Find Replace tool? Is there another tool that can help?

Thank you so much!


Accepted answers

pedrodrfaria

Hi @yuvalshmul

I updated the workflow, and it now checks which responses did not match any keyword.

Please do not forget to accept it as a solution.

Pedro.

Multiple Find Replace in one column.yxmd
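
For readers who cannot open the attached workflow, the "which ones did not match" check can be sketched in Python. This is only an illustration of the idea, not the contents of the attachment: the function name `split_matched`, the sample response about lemon tart, and the choice of case-insensitive substring matching are all assumptions.

```python
def split_matched(responses, keywords):
    """Split responses into (matched, unmatched) lists.

    Assumption: a response matches when any keyword occurs in it as a
    case-insensitive substring.
    """
    lowered = [keyword.lower() for keyword in keywords]
    matched, unmatched = [], []
    for response in responses:
        text = response.lower()
        if any(keyword in text for keyword in lowered):
            matched.append(response)
        else:
            unmatched.append(response)
    return matched, unmatched


# Keywords from the key in the question; the second response is made up.
keywords = ["Chocolate", "Choco", "Pudding"]
responses = ["choco is my favorite", "I prefer lemon tart"]

matched, unmatched = split_matched(responses, keywords)
print("Needs manual review:", unmatched)
```

The unmatched list is what you would review by hand before adding new keywords to the key and running the categorization again.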

All comments

pedrodrfaria

Hi @yuvalshmul

I attached a workflow to show you how I would approach this. Use the Append Fields tool to pair every response with every keyword, match them, and after that remove any duplicate values.

Multiple Find Replace in one column.yxmd
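
The same approach (pair each response with every keyword, keep the pairs that match, then drop duplicate categories) can be sketched outside Alteryx in Python. This is a minimal sketch under stated assumptions rather than a copy of the attached workflow: the names `KEY` and `categorize` are hypothetical, and matching is assumed to be a case-insensitive substring search.

```python
# Keyword -> category pairs, copied from the key in the question.
KEY = [
    ("Chocolate", "Chocolate"),
    ("Choco", "Chocolate"),
    ("Pudding", "Pudding"),
]


def categorize(responses, key):
    """Return {response: [categories]} with every matching category listed once."""
    result = {}
    for response in responses:
        text = response.lower()
        categories = []
        # Pair this response with every keyword in the key.
        for keyword, category in key:
            # Keep the pair when the keyword occurs in the response, and
            # skip categories already recorded (the de-duplication step).
            if keyword.lower() in text and category not in categories:
                categories.append(category)
        result[response] = categories
    return result


responses = [
    "I like to make chocolate",
    "choco is my favorite",
    "I like to bake both chocolate and pudding",
]
for response, categories in categorize(responses, KEY).items():
    print(response, "->", ", ".join(categories))
```

The de-duplication matters because "chocolate" contains both the "Chocolate" and "Choco" keywords, which map to the same category.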

yuvalshmul

Thank you @pedrodrfaria!

This is perfect for what I was looking for. Hoping that the 1000 responses x ~500 keywords don't take too long to run, but definitely less time than building and running multiple Find Replace!

This made me realize I have a follow-up question: I also need to see which responses don't get matched with any keyword, since I would need to review those responses manually, add the relevant keywords/categories to the key, and run the model again, repeating until 100% of responses are categorized.

How would I best approach this?
