Alteryx Designer Desktop Discussions

buddhiDB

Hi Alteryx Community,

I hope you're all doing well!

We are currently exploring ways to standardize unformatted asset names using AI-based methods, particularly leveraging the ChatGPT/AI Connector in Alteryx. The goal is to take various unstructured asset descriptions and generate a standardized common asset name.

For example, here’s a small sample of our dataset:

Unformatted Asset Name	Expected Standardized Common Asset Name
Scaffolding Air Nelson Hanger	Scaffolding
Shadow Vac backpack vacuum cleaner s/n 18916019	Vacuum Cleaner
Toyota Hiace - Super custom van JPA963	Toyota Hiace
Epson ET4750 printer	Printer
2020 Safari Caged Trailer, Single Axle 80D16	Safari Caged Trailer
Fit alarm system in office	Security Systems
EHP 9KG heat pump condensor dryer EDH903BEWA	Heat Pump

This is just a sample—we have a much larger dataset that needs standardization.

We’d love to hear from the community:

Has anyone successfully implemented a similar standardization process using the AI Connector?

Would you recommend any specific prompt engineering or preprocessing techniques for better AI-driven name standardization?

Are there alternative approaches within Alteryx (or external tools that integrate well with Alteryx) that might work better for this use case?

If possible, we'd really appreciate a demo or guidance on how to set this up effectively.

Looking forward to your insights and suggestions—thanks in advance for your help!

KGT

Seeing as it's been 8 hours without response, I'll provide some input, although not the "answer". I am aware that I can be a bit of a nay-sayer on this topic, so bear in mind that a lot of people are more positive on using GenAI for this.

My belief is that traditional methods still haven't been fully utilised, and that using AI for classification is harder and less accurate than just designing the process using ML and Transforms. Also, if you can already classify a record (for example, from a previous run), do that first and don't cloud the algorithm with it. This is similar to how you would approach fuzzy matches.
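As a rough illustration of keeping previously classified records away from the algorithm, here is a minimal Python sketch (for example, inside a Python tool). The function name `split_known` and the `previous_results` lookup are made up for illustration; they are not part of Alteryx or any connector.

```python
def split_known(asset_names, previous_results):
    """Separate names classified in an earlier run from those still needing work.

    previous_results is a hypothetical lookup: normalised name -> standard name.
    """
    known, pending = {}, []
    for name in asset_names:
        # Normalise case and whitespace so trivial differences still match.
        key = " ".join(name.lower().split())
        if key in previous_results:
            known[name] = previous_results[key]
        else:
            pending.append(name)
    return known, pending


previous_results = {"epson et4750 printer": "Printer"}
known, pending = split_known(
    ["Epson ET4750  Printer", "Toyota Hiace - Super custom van JPA963"],
    previous_results,
)
```

Only the names in `pending` would then go on to ML, fuzzy matching, or an AI prompt.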

Classification algorithms can be super bespoke, as the terminology can be very unique to an industry or process. Hence, most AI implementations I've seen for this involve RAG. If you are using ChatGPT or similar, you want your prompts to restrict or define the set of terms the model draws from. So, prompts like "give weighting to terms used in the HVAC industry", or having your own list of terms and a prompt like "Use this list as the starting point and don't use more than 100 new terms". I haven't tested either of those prompts, but hopefully you get the idea.
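To make the "own list of terms" idea concrete, here is an untested sketch of how such a prompt could be assembled in Python before it is passed to whatever connector you use. The function `build_prompt`, its parameters, and the output format it asks for are all assumptions for illustration; it only builds a string and calls no API.

```python
def build_prompt(asset_names, known_terms, max_new_terms=100):
    """Build a prompt that anchors the model to an existing list of terms."""
    term_lines = "\n".join(f"- {term}" for term in known_terms)
    asset_lines = "\n".join(
        f"{i}. {name}" for i, name in enumerate(asset_names, start=1)
    )
    return (
        "You standardize asset descriptions into a common asset name.\n"
        "Use this list as the starting point:\n"
        f"{term_lines}\n"
        f"Do not introduce more than {max_new_terms} new terms.\n"
        "Return one line per asset in the form '<number>. <common asset name>'.\n"
        "Assets:\n"
        f"{asset_lines}"
    )


prompt = build_prompt(
    ["Epson ET4750 printer", "Fit alarm system in office"],
    ["Printer", "Security Systems", "Vacuum Cleaner"],
)
```

Asking for numbered output is just one way to make the response easy to join back to the input rows; whether the model respects the new-term limit would still need checking.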

I personally still think ML methods are better for classification, but it mostly depends on what your expectation is. If you are expecting to get a finished result from the prompt, then that gets a lot harder, and you have to be prepared to have incorrect classifications. Using NLP, we would normally get back a list of terms that the input closely aligns with, each with a score. That info can then be used alongside other standard formatting to pick the best term. So, a mix of traditional methods and more advanced methods (NLP), essentially using the NLP as another source for the decision.
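The "list of terms and a value" output can be pictured with a small Python sketch. Note that this uses plain token overlap as a stand-in for a real NLP model, and the names `tokens` and `rank_terms` are hypothetical; the point is only the shape of the result (candidate term plus score) that downstream logic can combine with other rules.

```python
import re


def tokens(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_terms(description, standard_terms, top_k=3):
    """Score each standard term by the share of its tokens found in the description."""
    desc_tokens = tokens(description)
    scored = []
    for term in standard_terms:
        term_tokens = tokens(term)
        score = len(term_tokens & desc_tokens) / len(term_tokens) if term_tokens else 0.0
        scored.append((term, score))
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


candidates = rank_terms(
    "Shadow Vac backpack vacuum cleaner s/n 18916019",
    ["Printer", "Vacuum Cleaner", "Heat Pump"],
)
```

A threshold on the score (or a tie-break using other fields) could then decide whether to accept the top candidate or send the row for review.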

You could always use the ChatGPT connector as another source, similar to the NLP above. The main thing to design is what your finished list of classifications looks like. What do you want to achieve? Do you want 3% new terms every time, or do you want to classify against a list...
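Whichever way you go, it helps to check the returned labels against your target list. A minimal Python sketch, assuming "3% new terms" means the share of rows whose label is not on the list (that reading, and the name `review_labels`, are my assumptions):

```python
def review_labels(labels, allowed_terms, max_new_share=0.03):
    """Report which labels fall outside the allowed list and how common they are."""
    allowed = {term.lower() for term in allowed_terms}
    outside = [label for label in labels if label.lower() not in allowed]
    share = len(outside) / len(labels) if labels else 0.0
    return {
        "new_terms": sorted(set(outside)),
        "new_share": share,
        "within_budget": share <= max_new_share,
    }


report = review_labels(
    ["Printer", "Printer", "Scaffolding", "Heat Pump"],
    ["Printer", "Heat Pump"],
)
```

If you want a strictly closed list, set `max_new_share` to 0 and treat anything in `new_terms` as a row to re-run or review.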
