- RSS フィードを購読する
- 新着としてマーク
- 既読としてマーク
- ブックマーク
- 購読
- 印刷用ページ
- モデレーターに通知する
The workflow developed in this blog post can be found here. The workflow uses the Optimization tool, which forms part of the prescriptive tool kit, so you must also have these tools installed.
Setting the scene
Last Christmas, I and a few colleagues looked to spread some festive cheer within The Information Lab by hosting a pub quiz. Not just your typical pub quiz though, no, that would be boring for us hosts, instead we spent countless hours building out a solution that used the technology platforms that exist within our business.
In the end, we created a solution that automatically posts a question, each hour, to our messaging platform (Convo), via Alteryx and Zapier.
Users could then submit their answers to the question via an Alteryx Application, and then a secondary ‘bot’ would post out the results of each round to our messaging platform.
This was a fun exercise - massively over-engineered, but fun - and this year we wanted to do something similar, so we went with Bingo.
Clarification: what is bingo?
It seems that Bingo is a very different game depending on where you live in our wonderful, Covid-ridden world. Bingo in the UK (the game we would be playing) is a game where each participant receives a ticket (or more often than not, 6 tickets) with 15 numbers, each unique, between 1 and 90.
When tickets are sold in strips of 6, each strip will have each number on exactly once, meaning that every time the ‘caller’ says a number, the audience will always have something to cross off.
The Bingo caller (the person who referees/runs the game), has a random number generator and calls out the ‘balls’ one by one until someone claims one of the many prizes available (with the final prize being for a full house which is all of the numbers on a ticket).
So that’s the type of Bingo we are referring to here, none of this American 5x5 grid nonsense.
The problem
In the UK it’s very easy to get hold of Bingo tickets, either by walking to a stationery store or by web-searching UK Bingo Tickets. However, similar to last year's Christmas quiz, we wanted to use our technologies in our quest for festive fun.
In this case, it turns out that generating Bingo tickets is an optimization problem, one that is difficult to solve without an advanced analytics platform (cough Alteryx cough) at your disposal.
Below I’ve outlined the logical rules around a strip of bingo tickets.
- 6 tickets creates a ‘strip’ of tickets
- Each strip of 6 tickets must contain each number from 1-90 only once
- Each row must have a exactly 5 numbers
- Each column must contain exactly 10 numbers
- Each column within a ticket must contain at least 1 number
- Those 10 numbers must be from a specific set, so column 1 must only contain numbers 1-10, column 2 11-20…, column 9 81-90
- Each strip of tickets should be randomly generated
Where do we start?
We’re going to start by generating a list of the different spaces that are contained on a strip of 6 Bingo tickets.
Each ticket contains 9 columns and 3 rows, therefore we need to generate a list of 162 objects. In Alteryx, generating data from scratch is easy using the Generate Rows tool. In this case, I’ve used three, the first tool generates 6 records that indicate the ticket. The second tool generates 9 records which indicate the column, and finally, the third tool generates 3 records indicating the row.
By creating these, one after another, we have the same effect as mashing all these different items together.
The final step of this initial process is to create a unique identifier for each square in our grid.
Now that we have a list of all squares in our grid, the next step is to join in the valid ball options for each square. The only constraint at this point that we can consider is that balls may only appear in specific columns, 1-10 in column 1, 11-20 in column 2, … , 81-90 in column 9.
Therefore after generating our list of balls (using the generate rows tool again), we can implement a simple formula which will return the relevant column number for that ball.
CEIL([Ball]/10)
This function takes the ball number, divides the value by 10, before the CEIL function then takes the result and rounds it to the next smallest whole number, greater than the value.
The reason we choose 10 is because that's how many numbers are allowed in each column.
We can simply now join this list with our grid list to generate a data source which contains, for each square, a list of valid balls.
Given that each square is linked to a column which can contain any one of 10 balls, we’re now up to 1,620 records.
Some additional prep
In our data we have labelled a square as belonging to a column value between 1 and 9 and a row value between 1 and 3, as shown in the grid below (sample of two tickets).
However, our row-based and column-based constraints are at a ticket level, and as a result we must generate unique identifiers for these. A Formula tool allows me to quickly create unique identifier for my different columns and rows.
Row = [Row]+(([Ticket]-1)*3)
Column = [Column]+(([Ticket]-1)*9)
The final step of preparation is to adjust the unique identifiers that we have created. At present they are all integers, but our Optimization tool needs to be able to decipher between ball value 9 and column 9. We can use the Multi-Field Formula tool to add the field name as a prefix to the value.
List of decision variables
When creating our optimization model, one of the required inputs in this scenario is a list of decision variables. In this case, we have 1,620 decision variables, each of which can either be turned on or off, and only be used once (if used at all); therefore the type field (which is required for our model) can be set as binary, or B.
One further element that is required by our optimization model is the coefficient. The purpose of our model will be to minimize or maximise our coefficient. In this specific example we can use the RAND() function in Alteryx to create a coefficient, which means that each time our model creates a ticket, our output will be different as each “variable” is assigned a different weighting.
Constraints
The purpose of this input is to inform our model of the constraints which should be in place when building our solution. In our table I must first define the constraint, then the direction (equals ==, less than or equal <=, greater than or equal >=), and finally the right hand side of the equation, i.e. what the comparative value is.
- Each ball must appear exactly once
- Each ticket must have exactly 15 values
- Each Square must have at most 1 value
- Each Row must have exactly 5 values
- Each Column must have between 1 and 2 values
At this point you may be wondering what ColumnMin and ColumnMax are. These are simply copies of our Column field. When building this workflow I identified that you cannot have two constraints against the same value, and therefore by duplicating the field we give ourselves the ability to do this.
Our "dense matrix"
The purpose of the dense matrix is to inform our model of the contribution of each decision variable towards our constraints.
The dense matrix can be provided in a couple of forms but I have chosen to use the mode ‘dense matrix, variables in rows.’ The dataset looks something like this:
As an example, if you chose to select the decision variable SquareBall1 and SquareBall2 then the tally for constraint Ball1 would be 2, but this would violate our constraints which state each ball must appear exactly once.
Interpreting and formatting the output
The S anchor coming out of our Optimization tool provides us with a list of variables selected by our model; this information can then be used to return the chosen position of the different balls in our strip of tickets.
The penultimate step in our fantastically over-engineered process was to use the reporting tools within Alteryx to create a visually pleasing output, which we did simply by using the Table tool.
The final step was to publish this as an analytic application on our server so that people could get their bingo tickets in preparation for our holiday game!
Now, whilst this clearly isn’t a business use case, I hope this provides another example of how we can use fun data challenges to improve our knowledge of the product, which we can then apply in the real world.
ここにコメントを追加するには、ご登録いただく必要があります。 ご登録済みの場合は、ログインしてください。 ご登録がまだの場合は、ご登録後にログインしてください。