Garabujo7
Alteryx

Export a Trained Topic Classification Model to Categorize New Items

 

Taken from giphy.com

 

Once we have created and trained a model to identify topics in our data, the next step is to use that trained model to assign topics to new data as it arrives. This avoids rerunning the entire preparation and training process, which would consume more time and resources.

 

Garabujo7_1-1628699329986.png

 

The image above shows an end-to-end topic identification process. How can we make it more efficient?

 

Export a Trained Model

Version 2021.2 of Alteryx Intelligence Suite added the ability to export a trained topic classification model. Once we have a model that meets our needs, we can export it and use it to assign topics to new information.

 

The first step is to place an Output Data tool after the M anchor of the Topic Modeling tool.

 

Garabujo7_2-1628699382703.png

 

Select Alteryx Database (.yxdb) as the output format, because this format stores the model object. We can later use that object to assign topics based on the data we trained it on. With the trained model saved, we can use it in a new workflow.
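For readers who think in code, the "train once, export, reuse" idea can be sketched in Python. This is only an analogy: it does not read or write .yxdb files and is not how the Topic Modeling tool works internally. The functions `train_toy_model`, `export_model`, and `load_model`, the sample data, and the file name are all hypothetical, and `pickle` stands in for the export step.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def train_toy_model(docs_by_topic):
    """Placeholder 'training': word counts per topic (hypothetical, for illustration)."""
    return {
        topic: Counter(" ".join(docs).lower().split())
        for topic, docs in docs_by_topic.items()
    }


def export_model(model, path):
    """Write the trained model object to disk so it can be reused later."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Read a previously exported model back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Train once (the expensive step), then export the result.
training_data = {
    "delivery": ["package arrived late", "shipping took two weeks"],
    "billing": ["charged twice on my card", "refund not received"],
}
model = train_toy_model(training_data)
model_path = Path(tempfile.gettempdir()) / "trained_topic_model.pkl"
export_model(model, model_path)

# Later, in a separate run: load the saved model instead of retraining.
reloaded = load_model(model_path)
```

The reloaded object is equal to the one we trained, so a second workflow can start from the saved file and skip the preparation and training steps entirely.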

 

Garabujo7_3-1628699396069.png

 

As shown in the image above, the new workflow contains the new dataset that we want to classify (a Text Input tool connected to a Text Pre-processing tool) and the trained model that we exported (an Input Data tool reading TrainedTopicModeling.yxdb). To score the new values, we use the Predict Values tool, found in the Machine Learning tab of the tool palette.

 

Garabujo7_4-1628699426687.png

 

Connect the trained model to input anchor M (model) and the data to input anchor D (data). It is important to run the new data through text pre-processing first, so that it is prepared the same way as the data the model was trained on.
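Why the pre-processing step matters can be illustrated with a small Python sketch. It is a toy stand-in under stated assumptions, not the logic of the Predict Values tool: the model is a hypothetical dictionary of word counts per topic, and `preprocess`, `predict_topic`, and `STOP_WORDS` are names invented for this example.

```python
import string
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "my", "was", "is", "on", "i", "to"}


def preprocess(text):
    """Lowercase, strip punctuation, and drop stop words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [word for word in cleaned.split() if word not in STOP_WORDS]


def predict_topic(model, tokens):
    """Return the topic whose word counts overlap most with the tokens."""
    scores = {
        topic: sum(counts[token] for token in tokens)
        for topic, counts in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical trained model: word counts per topic, built from preprocessed text.
model = {
    "delivery": Counter({"package": 2, "late": 1, "shipping": 2}),
    "billing": Counter({"charged": 2, "refund": 1, "card": 1}),
}

# "D" input: new items, cleaned the same way as the training data.
new_items = ["The SHIPPING was late!", "I was charged twice on my card."]
predictions = [predict_topic(model, preprocess(item)) for item in new_items]

# Without pre-processing, the raw tokens do not match the model's vocabulary.
raw_tokens = new_items[0].split()
raw_overlap = sum(model["delivery"][token] for token in raw_tokens)
```

In this sketch the raw tokens "SHIPPING" and "late!" match nothing the toy model has seen, so `raw_overlap` is zero, while the cleaned tokens "shipping" and "late" do match.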

 

Garabujo7_5-1628699474655.png

 

And that's it! When you run the workflow, the model assigns a topic to each data point. This reduces execution time and makes it easy to categorize new data.

 

Now, let's dance a little.

 

Taken from giphy.com