BenMoss · ‎05-04-2020

Legolytics Part 1: Extracting Colours From an Image With Alteryx and R

Legolytics Part 2: Assigning the Lego Colour Palette

Legolytics Part 3: All Possible Bricks

Legolytics Part 4: Optimizing Cost (you are here)

Legolytics Part 5: The Product

Money, Money, Money

The whole reason I embarked on this project was to try and beat the price Lego charges for this product. In this section, I’ll discuss how we can use the Optimization tool in this very specific, bespoke use-case (you can use the linked posts below to find out more detail about other use-cases for the Optimization tool).

This post is about moving from this…

...to this (which is significantly cheaper to build)...

An Important Note

I am definitely not a subject matter expert when it comes to optimization, nor using the Optimization tool. Huge credit must go to the resources provided by @SydneyF and Philip Mannering’s amazing optimization talk at Inspire Europe 2019. I’ve linked to their resources below:

Tool Mastery | Optimization

Prescriptive Analytics: Unleash the Optimization Tool

What is Our Objective?

To start, we need to provide the Optimization tool with our options as well as our objective, which is the metric that we want to maximise or minimise. In this case, our objective is to minimise cost. Our options are each of the possible bricks that will fit into our image.

To approach this first step, I created a data stream consisting of each possible brick that can fit into our image, along with its price. I defined the variable type as binary, or B, which means a brick is either used in the Lego image or it is not. This data stream gets fed into the O input anchor of the Optimization tool.

Note: in the Optimization tool there is an interesting quirk, where purely numeric identifiers for your variables cause problems, which is why I have created a new ID field using the UUID() function available in the Formula tool.
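To make the shape of this input concrete, here is a minimal Python sketch of the same idea outside Alteryx. The field names (variable, coefficient, type), the toy bricks and the prices in pence are all assumptions for illustration, and Python's uuid module simply stands in for the Formula tool's UUID() function.

```python
import uuid

# Hypothetical candidate bricks for a toy 2x2 image: (description, price in pence).
# These are made-up prices, not real Lego prices.
candidate_bricks = [
    ("1x1 at pixel 1", 6),
    ("1x1 at pixel 2", 6),
    ("1x2 over pixels 1-2", 8),
    ("2x2 over pixels 1-4", 20),
]


def build_objective_rows(bricks):
    """Return one row per candidate brick: identifier, cost coefficient, binary type."""
    rows = []
    for description, price in bricks:
        rows.append({
            # Prefix with a letter so the identifier is never purely numeric.
            "variable": "v_" + uuid.uuid4().hex,
            "description": description,
            "coefficient": price,  # the cost we want to minimise
            "type": "B",           # binary: the brick is used, or it is not
        })
    return rows


objective_rows = build_objective_rows(candidate_bricks)
for row in objective_rows:
    print(row["variable"], row["coefficient"], row["type"])
```

The letter prefix is only there to guarantee the identifier can never be read as a number, which mirrors the quirk described in the note above.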

Our Constraints

In order to fill our image, each pixel must be covered by one brick (this rule is specified in the next section). To account for this, we must create a data source which details which bricks cover which pixels. This can be easily achieved with the Cross Tab tool. Each row in this data represents a different Lego brick, and each column represents a pixel.

Of course, this gets mildly unwieldy because we will end up with 2305 columns (one for each pixel in our image, and one containing the brick identifier).

Here, a value of 1 in a column indicates that the brick covers that pixel, while a value of [Null] indicates that it does not. This data stream is the input to the A anchor.
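As a rough illustration of what the cross-tab step produces, the sketch below pivots a long list of (brick, pixel) pairs into one row per brick and one column per pixel. The brick and pixel names are hypothetical toy data, and None plays the role of [Null].

```python
# Toy long-format data: one (brick, pixel) pair for every pixel a brick covers.
# Names are hypothetical; a real image would have thousands of pixels.
coverage_pairs = [
    ("brick_a", "p1"),
    ("brick_b", "p2"),
    ("brick_e", "p1"),
    ("brick_e", "p2"),
]


def cross_tab(pairs):
    """Pivot (brick, pixel) pairs into rows of bricks with one column per pixel."""
    pixels = sorted({pixel for _, pixel in pairs})
    bricks = sorted({brick for brick, _ in pairs})
    covered = set(pairs)
    table = []
    for brick in bricks:
        row = {"brick": brick}
        for pixel in pixels:
            # 1 means the brick covers the pixel, None stands in for [Null].
            row[pixel] = 1 if (brick, pixel) in covered else None
        table.append(row)
    return table


coverage_table = cross_tab(coverage_pairs)
for row in coverage_table:
    print(row)
```

Each row ends up with one column per pixel plus the brick identifier, which is the same counting that gives 2305 columns for the full image.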

Now Let’s Apply Some Direction

The direction element is important. It tells the tool the number of times each pixel should be covered in our image, which of course should be exactly 1. We express this by setting the dir (or direction) column to ‘==’ and the rhs (or right-hand side) column to '1'. Each row in the dataset represents a pixel in the image, which needs to be filled by a Lego brick exactly once. This input feeds into the B anchor.
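The rule can be sketched in Python as one row per pixel, plus a small check that a chosen set of bricks respects every row. The dir and rhs names come from the description above; the constraint field name, the toy bricks and the helper function are assumptions, and the check only understands the ‘==’ direction.

```python
# Toy 2x2 image with hypothetical brick and pixel names.
pixels = ["p1", "p2", "p3", "p4"]
coverage = {
    "brick_a": {"p1"},
    "brick_e": {"p1", "p2"},
    "brick_f": {"p3", "p4"},
}

# One row per pixel: it must be covered exactly once.
constraint_rows = [{"constraint": p, "dir": "==", "rhs": 1} for p in pixels]


def satisfies(selection, coverage, constraint_rows):
    """Return True if every pixel is covered exactly rhs times by the selected bricks."""
    for row in constraint_rows:
        if row["dir"] != "==":
            raise ValueError("this sketch only handles '==' constraints")
        count = sum(1 for brick in selection if row["constraint"] in coverage[brick])
        if count != row["rhs"]:
            return False
    return True


print(satisfies(["brick_e", "brick_f"], coverage, constraint_rows))
print(satisfies(["brick_a", "brick_e"], coverage, constraint_rows))
```

The first selection tiles the image cleanly. The second fails because pixel p1 is covered twice while p3 and p4 are left empty.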

The Results

Well, they are fantastic.

The S output anchor (which I assume stands for solution) gives the objective, which is the value that the optimisation model has managed to achieve. It then details all of the individual options that it has used to achieve this score.
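To show what kind of answer this is, here is a toy version of the whole problem solved by brute force in Python. This is not how the Optimization tool works internally (it hands the model to a proper solver); trying every combination is only practical for a handful of bricks. The brick names and prices in pence are made up for illustration.

```python
from itertools import combinations

# Hypothetical bricks for a toy 2x2 image, with made-up prices in pence.
prices = {
    "brick_a": 6, "brick_b": 6, "brick_c": 6, "brick_d": 6,  # four 1x1 bricks
    "brick_e": 8, "brick_f": 8,                              # two 1x2 bricks
    "brick_g": 20,                                           # one 2x2 brick
}
coverage = {
    "brick_a": ["p1"], "brick_b": ["p2"], "brick_c": ["p3"], "brick_d": ["p4"],
    "brick_e": ["p1", "p2"], "brick_f": ["p3", "p4"],
    "brick_g": ["p1", "p2", "p3", "p4"],
}
pixels = ["p1", "p2", "p3", "p4"]


def cheapest_exact_cover(prices, coverage, pixels):
    """Try every combination of bricks; keep the cheapest one covering each pixel once."""
    best_cost, best_selection = None, None
    bricks = sorted(prices)
    for size in range(1, len(bricks) + 1):
        for selection in combinations(bricks, size):
            covered = [p for brick in selection for p in coverage[brick]]
            # Same length and same set means no gaps and no overlaps.
            if len(covered) == len(pixels) and set(covered) == set(pixels):
                cost = sum(prices[brick] for brick in selection)
                if best_cost is None or cost < best_cost:
                    best_cost, best_selection = cost, selection
    return best_cost, best_selection


objective, chosen = cheapest_exact_cover(prices, coverage, pixels)
print(objective, chosen)  # 16 ('brick_e', 'brick_f')
```

The two returned values correspond to the two parts of the S output: the objective value achieved, and the list of options selected to achieve it.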

In the case of Phil’s head, you will see our model has managed to minimise the cost to £43.40, much better than the £100 Lego want!

This data can then be merged with our original catalogue of parts to allow us to go and find our parts in the Lego store and build our image!
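As a small sketch of that merge, assuming the solution is a list of selected brick identifiers and the catalogue is keyed by the same identifier (the part names and colours here are hypothetical):

```python
# Hypothetical parts catalogue keyed by the brick identifier used in the model.
catalogue = {
    "brick_e": {"part_name": "Plate 1x2", "colour": "Black"},
    "brick_f": {"part_name": "Plate 1x2", "colour": "White"},
    "brick_g": {"part_name": "Plate 2x2", "colour": "Black"},
}

# Hypothetical solver output: the identifiers of the bricks that were selected.
solution = ["brick_e", "brick_f"]


def build_shopping_list(solution, catalogue):
    """Join each selected brick to its catalogue details."""
    return [{"variable": brick, **catalogue[brick]} for brick in solution]


shopping_list = build_shopping_list(solution, catalogue)
for item in shopping_list:
    print(item["part_name"], item["colour"])
```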

Next Steps

Well, that’s it, in terms of building a solution. However, there is one more part to this blog series, and that’s how you can use the application that I have built (which I will share), to make all your family and friends some truly personalised Christmas gifts.