Garabujo7
Alteryx

This post is part two of the image classification (without code) series. To read part one, click here.

 

Classification model creation

 

To continue where we left off in part one, the next step is setting the model's options.

 

Garabujo7_1-1654802039932.png

 

We now select the training and validation images. 

 

Training and validation image selection

 

A predictive model, in this case, the classification model that we are going to use, requires training images to start learning.

 

Then, to ensure that the model is not simply memorizing the images we present to it (and so is only good at classifying the ones it already knows), we use a second set of images to validate that it can make adequate predictions on images it has never seen before.

 

In this way, we ensure our model has better performance in the real world.
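
Outside the platform, the same idea can be sketched in a few lines of Python. This is only an illustration of a training/validation split, not what the tool does internally; the function name, the 20% validation fraction, and the file names are assumptions made for the example.

```python
import random


def split_train_validation(image_paths, validation_fraction=0.2, seed=42):
    """Shuffle image paths and split them into training and validation lists.

    The 20% default and the fixed seed are illustrative assumptions.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(image_paths)
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    # The first slice is held back for validation; the rest is used for training.
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical file names, for illustration only.
paths = [f"concrete_{i:04d}.jpg" for i in range(100)]
train, validation = split_train_validation(paths)
print(len(train), len(validation))  # 80 20
```

Because no image appears in both lists, a good score on the validation list reflects images the model did not see during training.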

 

Garabujo7_2-1654802088571.png

 

Classification model options

 

The platform includes four pre-trained models, which I describe below. Because they are already trained, we can take advantage of them to make our classification process easier and faster.

 

In the model options, we have:

  • Epochs
    • An epoch is one complete pass of the training data through the model, both forward and backward
    • More epochs generally improve the model's results up to a point, although the processing time increases
    • The recommended option is 10

 

  • Pre-trained model
    • The platform contains four ready-to-use models that trade off accuracy against processing time. In general, the more accurate the model, the longer it takes to run
    • VGG16: Most accurate but slowest
    • InceptionResNetV2: Faster to train than VGG16 and somewhat less accurate
    • Resnet50V2: The fastest, and a little less accurate
    • InceptionV3: The best balance between accuracy and speed
    • The model recommended by the platform is InceptionV3

 

  • Batch size of images to be processed
    • The batch size is the number of images that pass through the model at one time. Processing the data in batches lets us train the model without using so much memory
    • The recommended option is 32 
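
To see how epochs and batch size interact, here is a small arithmetic sketch in Python. It assumes a hypothetical training set of 1,000 images together with the recommended values of 32 and 10; the function name is made up for the example and is not part of the platform.

```python
import math


def training_steps(n_images, batch_size=32, epochs=10):
    """Return (batches per epoch, total batches over all epochs)."""
    if n_images <= 0 or batch_size <= 0 or epochs <= 0:
        raise ValueError("all arguments must be positive")
    # The last batch of an epoch may be smaller, so round up.
    batches_per_epoch = math.ceil(n_images / batch_size)
    return batches_per_epoch, batches_per_epoch * epochs


# 1,000 training images is an assumed figure, for illustration only.
per_epoch, total = training_steps(n_images=1000)
print(per_epoch, total)  # 32 320
```

A smaller batch size lowers the memory needed at any one time but increases the number of batches per epoch, while more epochs multiply the total amount of work.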

 

Garabujo7_3-1654802197579.png

 

Additionally, within the platform, we can consult the configuration details quickly. Clicking the question mark opens the help information.

 

Garabujo7_0-1655130501655.png

 

Export the trained model

 

Once the model has been trained to classify the images, we export it in an Alteryx database format (.yxdb) so that it can be reused to classify new images.

 

Garabujo7_5-1654802218520.png

 

Classification of new images

 

Once we have the trained model, we can use it to classify new images that the model has never seen.

 

For that, we first use the Holdout images with the trained model. We use the Prediction tool found in the machine learning tab to do the classification.

 

Garabujo7_1-1655130533565.png

 

The result of the model will be the predicted label. To verify that the model gives us adequate results for the objective, we compare the original label of the image with the one that the model predicted.

 

Garabujo7_2-1655130664794.png

 

To do that, I first use a Formula tool to compare the original label with the one the model predicted, and then a Filter tool to separate the correct classifications from the incorrect ones. Finally, I create a report of the correct and incorrect classifications using the image block and the basic table.
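
For readers who think in code, the compare-then-filter logic can be sketched in Python as follows. This is a rough equivalent, not how the workflow is implemented; the field names, label values, and records are all hypothetical.

```python
def split_by_correctness(records):
    """Separate records whose predicted label matches the original label."""
    correct = [r for r in records if r["label"] == r["predicted"]]
    incorrect = [r for r in records if r["label"] != r["predicted"]]
    return correct, incorrect


# Hypothetical holdout results, for illustration only.
records = [
    {"image": "img_001.jpg", "label": "defect", "predicted": "defect"},
    {"image": "img_002.jpg", "label": "no_defect", "predicted": "no_defect"},
    {"image": "img_003.jpg", "label": "defect", "predicted": "no_defect"},
    {"image": "img_004.jpg", "label": "no_defect", "predicted": "no_defect"},
]

correct, incorrect = split_by_correctness(records)
print(len(correct), len(incorrect))  # 3 1
```

The comparison plays the role of the formula, and the two lists correspond to the two outputs of the filter.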

 

Garabujo7_3-1655130705645.png

 

To validate the effectiveness of the model, I created a contingency table that counts the number of correct and incorrect predictions.
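
A contingency table of this kind can also be sketched in Python by counting (original label, predicted label) pairs. The labels and counts below are invented for the example and are not the results of the model described in this post.

```python
from collections import Counter


def contingency_table(pairs):
    """Count how often each (original, predicted) label pair occurs."""
    return Counter(pairs)


def accuracy(table):
    """Share of predictions where the predicted label equals the original."""
    total = sum(table.values())
    hits = sum(n for (original, predicted), n in table.items() if original == predicted)
    return hits / total if total else 0.0


# Hypothetical (original, predicted) pairs, for illustration only.
pairs = (
    [("defect", "defect")] * 3
    + [("defect", "no_defect")] * 1
    + [("no_defect", "no_defect")] * 4
)

table = contingency_table(pairs)
for (original, predicted), count in sorted(table.items()):
    print(f"original={original:<10} predicted={predicted:<10} count={count}")
print(accuracy(table))  # 0.875
```

The pairs where both labels match are the correct predictions; dividing their count by the total gives the accuracy.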

 

Garabujo7_4-1655130758120.png

 

Garabujo7_12-1655131670657.png

 

Now that we have a model that gives us acceptable results, we can either use new images to classify them or deploy the model to production quickly.

 

Putting the model into production

 

We can create an analytic application for the model to be consumed through Alteryx Server by other business users directly in a web browser without the need for an Alteryx Designer license.

 

Garabujo7_5-1655130822273.png

 

With a few interface elements, we can turn the workflow into an analytic application that users run through the Server, without any programming. You only need to configure two elements.

 

The first is the File Browse tool, which allows users to supply their own images for the model to classify.

 

Garabujo7_6-1655130848858.png

 

Garabujo7_7-1655130880132.png

 

The second is the Action tool, which dynamically updates values in the workflow. In this case, we must select the field to update and, below it, the source of the new value.

 

Garabujo7_10-1655131035317.png

 

If you want to learn more about building analytic apps, you can go to the Alteryx community.

 

Analytic applications on Alteryx Server

 

What is the magic of creating an analytic app? Once we publish it to Alteryx Server, users can run it as often as they need without requiring a Designer license. The analytic app will look like this in the Server Gallery.

 

image009.png

 

With this application, users can select their images through a web browser and take advantage of the classification model that someone else in their organization created. The result is displayed both in the browser and in whatever format has been configured for the output.

 

Garabujo7_11-1655131332691.png

 

Conclusion

 

The process of training the predictive model to classify images is very simple: you only need to drag some analytical blocks, select some options, and run a couple of tests to choose the model that best suits the problem you want to solve. I used 1,200 images of concrete with and without defects. The training took approximately 5 minutes.

 

In my test, the accuracy of the model was 100%. Your results will depend a lot on the quantity and variety of images used for training.

 

The applications of this type of image classification model are numerous. Recognizing signatures on documents, verifying whether a car has been in an accident, and identifying whether a cell phone screen is broken, among many others, make this functionality a significant added value for any business.