Data Science

Garabujo7 · 06-17-2022

This post is part two of the image classification (without code) series. To read part one, click here.

Classification model creation

To continue where we left off in part one, the next step is setting the model's options.

We now select the training and validation images.

Training and validation image selection

A predictive model, in this case, the classification model that we are going to use, requires training images to start learning.

Then, to ensure that the model is not simply memorizing the images we present to it (and so classifying well only the ones it already knows), we use a separate set of images to validate that it can make adequate predictions on images it has never seen before.

In this way, we ensure our model has better performance in the real world.
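For readers curious about what this selection amounts to outside the platform, the idea can be sketched in a few lines of Python. This is only an illustration, not what the Alteryx tools do internally: the function name `split_images`, the 70/20/10 proportions, and the file names are all assumptions made for the example.

```python
import random


def split_images(paths, train_frac=0.7, val_frac=0.2, seed=42):
    """Shuffle image paths and split them into training, validation, and holdout lists.

    The fractions are illustrative assumptions, not values taken from the platform.
    """
    if not (0 < train_frac < 1 and 0 <= val_frac < 1 and train_frac + val_frac <= 1):
        raise ValueError("fractions must be between 0 and 1 and sum to at most 1")

    shuffled = list(paths)
    # A fixed seed makes the shuffle reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    holdout = shuffled[n_train + n_val:]  # never shown to the model during training
    return train, validation, holdout


# Hypothetical file names, used only to exercise the function.
paths = [f"img_{i:04d}.jpg" for i in range(1200)]
train, validation, holdout = split_images(paths)
print(len(train), len(validation), len(holdout))
```

The important property is that the three lists never share an image, so the validation and holdout results reflect images the model has not seen.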

Classification model options

It is important to note that the platform has four types of models, which I describe below. They are pre-trained, which we can take advantage of to make our classification process easier and faster.

In the model options, we have:

Epochs
- One epoch is a full pass of the training data through the model, both forward and backward
- More epochs generally give better model results, up to a point, although the processing time increases
- The recommended option is 10

Pre-trained model
- The platform contains four ready-to-use models. They trade off accuracy against processing time: the more accurate the model, the longer it takes to execute
- VGG16: Most accurate but slowest
- InceptionResNetV2: It is fast to train and somewhat more precise than the previous one
- Resnet50V2: It is the fastest and a little less accurate
- InceptionV3: It has the best balance between accuracy and speed
- The model recommended by the platform is InceptionV3

Batch size of images to be processed
- The batch size is the number of images that pass through the model at one time. Processing the images in batches lets us train the models without occupying so much memory
- The recommended option is 32
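To make the relationship between epochs and batch size concrete, here is a small Python sketch of the arithmetic. It is not platform code: `training_steps` and `iter_batches` are hypothetical helper names, and the figure of 840 training images is an assumption chosen for the example. The defaults of 32 and 10 are the recommended values listed above.

```python
import math


def training_steps(num_images, batch_size=32, epochs=10):
    """Return (batches per epoch, total batches) for a training run."""
    if num_images <= 0 or batch_size <= 0 or epochs <= 0:
        raise ValueError("num_images, batch_size, and epochs must be positive")
    # The last batch of an epoch may be smaller, so round up.
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# 840 training images is a hypothetical figure for illustration.
steps, total = training_steps(num_images=840)
batches = list(iter_batches(list(range(840)), 32))
print(steps, total, len(batches[-1]))
```

A smaller batch size lowers the memory needed at any one moment but increases the number of batches per epoch, while each additional epoch repeats the whole set of batches once more.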

Additionally, within the platform, we can consult the configuration details quickly. Clicking the question mark opens the help information.

Export the trained model

Once the model is trained to classify the images, we export it in an Alteryx database format, .yxdb, so that it can be reused to classify new images.

Classification of new images

Once we have the trained model, we can use it to classify new images that the model has never seen.

For that, we first use the Holdout images with the trained model. We use the Prediction tool found in the machine learning tab to do the classification.

The result of the model will be the predicted label. To verify that the model gives us adequate results for the objective, we compare the original label of the image with the one that the model predicted.

For that, I first use a formula tool, which compares the original label and the one that the model predicted to separate the correct ones from the incorrect ones with a filter. Then I create a report with the correct and incorrect classifications using the image block and the basic table.

To validate the effectiveness of the model, I create a contingency table that counts the number of correct and incorrect predictions.
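The comparison and the contingency table are built with tools in the workflow, but the underlying logic is simple enough to sketch in Python. The labels and predictions below are made-up sample data, and `contingency_table` and `accuracy` are hypothetical helper names used only for this illustration.

```python
from collections import Counter


def contingency_table(actual, predicted):
    """Count how often each (actual label, predicted label) pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of images whose predicted label matches the original label."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Made-up sample data for illustration only.
actual = ["defect", "defect", "no_defect", "no_defect", "defect"]
predicted = ["defect", "no_defect", "no_defect", "no_defect", "defect"]

table = contingency_table(actual, predicted)
for (actual_label, predicted_label), count in sorted(table.items()):
    print(f"actual={actual_label:<10} predicted={predicted_label:<10} count={count}")
print("accuracy:", accuracy(actual, predicted))
```

Pairs where the two labels match are the correct classifications. Every other pair is an error, and the table shows which kind of error the model makes most often.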

Now that we have a model that gives us acceptable results, we can either use new images to classify them or deploy the model to production quickly.

Putting the model into production

We can create an analytic application for the model to be consumed through Alteryx Server by other business users directly in a web browser without the need for an Alteryx Designer license.

With a few interface elements, we can build the analytic app for users to run through the server without any programming. You only need to configure two elements.

The first is the File Browse tool, which allows users to supply their own images for the model to classify.

The second is the Action tool, which dynamically updates the values in the stream. In this case, we select the field to be updated and, below it, the source of the value that will replace it.

If you want to learn more about building analytic apps, you can go to the Alteryx community.

Analytic applications on Alteryx Server

What is the magic of creating an analytic app? When we publish it to the Alteryx Server, users can run it as often as they need without requiring a Designer license. The analytic app will look like this in the Server Gallery.

With this application, users can select their images through an internet browser and take advantage of the classification model that someone else in their organization created. The result will be displayed both in the browser and in whatever format has been configured for the output.

Conclusion

The process of training the predictive model to classify images is very simple: you only need to drag in some analytical blocks, select some options, and run a couple of tests to choose the model that best suits the problem you want to solve. I used 1,200 images of concrete with and without defects. The training took approximately 5 minutes.

In my test, the accuracy of the model was 100%. This result depends heavily on the quantity and variety of images used for training.

The applications of this type of image classification model are numerous. Recognizing signatures on documents, verifying whether a car has been in an accident, and identifying whether a cell phone screen is broken, among many others, make this functionality an enormous added value to any business.