Start Free Trial

General Discussions

Discuss any topics that are not product-specific here.

Euleryx Project 14 – Longest Collatz Sequence

Pilsner
13 - Pulsar

Euleryx Problem 14 – Longest Collatz Sequence

Pilsner_9-1761851471241.png

 

 

 

My workflow and macros:

 

Spoiler

Workflow:

Pilsner_11-1761851502755.png

 

 

 

Macro:

Pilsner_12-1761851510227.png

 



Answer: 837799



Last Week's Favourite Solution:

Last weeks challenge had many different solutions, some admittedly requiring more tools than others. Forward thinking is always bonus so it was great to see all the large number (big int) solutions. Last weeks favourite solution goes to @AkimasaKajitani  for being the first person to post a solution which accurately computes the entire value, using string addition. Please find this solution on page one of last week's post or click here.

 

Definitions:

  • Conjecture: A conjecture is a mathematical rule/principal that is assumed to be true. They will typically have been tested on a large range of numbers but because there is no rigours mathematical proof, it cannot be stated as a fact.

  • Collatz Conjecture: The Collatz conjecture states that you can choose any starting number, and by following the rules listed below, you will always, eventually, get back down to the value “1” (the end of the sequence).


The rules for the Collatz conjecture are simple:
-  If the number is odd, then multiply it by 3 and then add one.
- If the number is even, then divide it by 2.

 

Mathematical Theory:

This problem can be solved via a bruit force method using two generate rows, to calculate the Collatz sequence for all numbers up to 1,000,000 but it takes  18 seconds to run, and repeats a lot of calculations. If we could reduce the number of calculations we could subsequently reduce the run time.

 

For example, lets start by calculating the Collatz sequence for all numbers 1-5:

1: 1 (length 1)

2: 2, 1 (length 2)

3: 3, 10, 5, 16, 8, 4, 2, 1 (length 8)

4: 4, 2, 1 (length 3)

5: 5, 16, 8, 4, 2, 1 (length 6)

 

Now when we come to calculate the sequence for 6, we could just start from scratch, or we could do the following:

Calculate the terms of the sequence, until we dip below the starting value. The reason this is significant is because we have already calculated all the sequences (chains) for these smaller numbers. In this case, that is simply the second term as the sequence goes 6, 3

 

But we already know the full sequence for 3. So instead of recalculating, lets just add their lengths together to find the total length of the sequence for 6. E.g

 

We had a chain of length 2, for the start value of 6 and a chain of 8 for the start value of 3. As the number 3 appears in both sequences we do have to subtract one from the total chain length as we don’t want to count the same number twice. So the total length for the 6the collatz sequence is:

2 + 8 – 1 = 9

 

Same can be applied for a start value of 7:

7: 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5 (length 12)

 

5 was the first number in the sequence less than 7, se we have stopped there. Now we will join the two chains together to get a combined length of 12 + 6 – 1 = 17

So this basic principal of joining the chains together clearly works when the chain is only split in two. But what about if its got more than two splits, well that’s where the iterative element comes in.

MrBeanStressedGIF.gif

Lets consider the number 12. Its sequence begins as follows:

12: 12, 6 (length 2)

 

Now we can join to the sequence of 6

6: 6, 3 (length 2)

So we have a combined length of 3, but the sequence is not yet complete, it only goes 12, 6, 3.

Well by converting the above chain joining principal into an iterative process, we can then join this new chain, back to the chain for 3.

3: 3: 3, 10, 5, 16, 8, 4, 2, 1 (length 8)

To get a total chain length of 10.

 

By joining shorter chains together instead of recalculating each time, the run time goes from 18 down to just 5 seconds. (I have included both solutions in the file below so feel free to test them yourself).

 

Method:

1) Generate records from 2 – 1,000,000 (no need to include 1 as it's already a complete chain).

Pilsner_8-1761851428155.png

 



2) Use another generate rows to implement the sequence rules; however, when the value dips below the “Input” value, stop computing. Instead, we will use the joining logic discussed earlier.


Pilsner_7-1761851392076.png

 


3) For each starting record, find the number of terms in the chain (count) and the last value calculated in the sequence so far. This should be the first value that dips below the “input”

Pilsner_6-1761851373132.png

 




4) Now we need to join the sequences together until we have the complete chains. To do this, I have used an iterative macro.

a) The first step removed any records which have a “last chain” value of 2. These records have effectively completed the cain (the final steps are 2 and then 1) so we can successfully output them.

Pilsner_5-1761851338069.png

 

 

b) For all remaining records, we can join back to our original macro input. This will now connect an incomplete sequence with the next section of said sequence. You will notice I have also updated the “Last_chain” value. As we have just joined two chains together, we need to take the “Last_chain” value from the new part of the chain.

Pilsner_3-1761851273648.png

 



c) Add the counts together. Again, we have just joined two chains together so their lengths (counts) must be combined.

Pilsner_13-1761852216297.png

 

 

 

d) Simply loop round until all the records have a “Last_Chain” value of 2.

Pilsner_1-1761851200653.png

 

 

 

5) Final steps are to sort the chain lengths in descending order, then take the top record.
 

Pilsner_0-1761851170658.png

 

 

 

6) Submit your answer to the Project Euler Website!

Voilà!GIF.gif

 

Summary:

Even though it may seem like extra work, by creating an iterative macro, we have removed the need to calculate the same section of a chain over and over again, reducing the run time by over 60%.

 

Want to find out more, follow this link to our introduction post - Euleryx: Let The Games Begin.

7 REPLIES 7
Hub119
11 - Bolide
11 - Bolide

Iterative macros for the win!

Spoiler
WorkflowWorkflow
Spoiler
Collatz MacroCollatz Macro
NicoleJ
Alteryx
Alteryx

I see your iterative macros @Hub119 ... and I raise you a DCloud with no macros for the win! 

Spoiler
I did have to split my data stream in two to fit within the sampling row limit without having to generate an output file... but I'm still calling that only a little bit brute force, and a macro-less win. 😋
NicoleJ_0-1761866524731.png

The second Generate Rows formula was where I applied the iteration logic - I forget sometimes that Generate Rows isn't only for adding 1 or generating dates in between a start and end date. Fun! 

NicoleJ_0-1761866677669.png

 

 

 

Cheers!
NJ
Director, Product Management
Alteryx
DaisukeTsuchiya
14 - Magnetar
14 - Magnetar

Following the problem statement, I built a simple iterative macro that took 14 seconds to run.

Spoiler
スクリーンショット 2025-10-31 085058.jpgスクリーンショット 2025-10-31 085122.jpg
Qiu
21 - Polaris
21 - Polaris

I take the easy way, the Brutal Force way.

Spoiler
Euleryx Project 14A.png
AkimasaKajitani
17 - Castor
17 - Castor

My solutions are Brute force(31.8sec) with no macro and Iterative Macro solution(23.9sec).

 

Spoiler
My macro solution is that last remains is my answer. This algorithm is very simple.
AkimasaKajitani_0-1761961766662.png
AkimasaKajitani_1-1761961786704.png

 

 

gawa
16 - Nebula
16 - Nebula

Mine is 'a bit' wise brute-force that drops the hopeless numbers in advance. This approach enabled me to just examine a half of 1million.

I'm 90% certain about my logic but if you find some flaw, please let me know😂

Spoiler
image.png
In all possible sequences, there re some that follows:

N_0(odd) => N_1=3*N_0+1(even) => N_2=(3*N_0+1)/2

If such a N_0 exists, it means N_0 has a longer chain than N_1(and N_1 does than N_2 as well). Hence as far as such a N_0 is examined, N_1 and N_2 are not needed to be examined.
Fr example, starting from 3, it follows sequence
3 => 3*3+1=10 => 10/2=5 =>...

So the sequence of 3 contains that of 10 and 5 hence chain length of 10 and 5 are shorter than that of 3. 10 and 5 can be dropped in advance.
PangHC
13 - Pulsar

i try to records the old chains join with the current chains (as you mentions) but it work well in python. but not in Alteryx. the Join take so much time in compare the current due to 1million join X million. even after cancel some data. 

and interesting that when go through the non-macro ways. it was slower than the macro ways. I thought is always faster than macro method.

Spoiler
the number can be reduce for
3n+1 < 1,000,000 for odd
n/2    <  1,000,000 for even

so it only start from 333,333 and +2 after 

PangHC_0-1762141650855.png
macro: 
PangHC_5-1762142539442.png

 

 

i also test non-macro, macro, multiple-macro

Spoiler
runtime:
non-macro : ~50 seconds 
macro : ~23 seconds
2 macro : ~9 seconds

Non-macro
PangHC_3-1762142291710.png

 


1 Macro
PangHC_2-1762142195363.png

 

2 macro
PangHC_1-1762142137363.png

 


I know multiple macro will save some time. but more than half time is surprise me. 

since it divide 2, so I look for binary also. where it interesting to where it have 0 between the gap.

PangHC_4-1762142495322.png

 



Labels