General Discussions

Pilsner · ‎10-30-2025

Euleryx Problem 14 – Longest Collatz Sequence

My workflow and macros:

Spoiler

Workflow:

Macro:

Answer: 837799

Last Week's Favourite Solution:

Last weeks challenge had many different solutions, some admittedly requiring more tools than others. Forward thinking is always bonus so it was great to see all the large number (big int) solutions. Last weeks favourite solution goes to @AkimasaKajitani for being the first person to post a solution which accurately computes the entire value, using string addition. Please find this solution on page one of last week's post or click here.

Definitions:

Conjecture: A conjecture is a mathematical rule/principal that is assumed to be true. They will typically have been tested on a large range of numbers but because there is no rigours mathematical proof, it cannot be stated as a fact.
Collatz Conjecture: The Collatz conjecture states that you can choose any starting number, and by following the rules listed below, you will always, eventually, get back down to the value “1” (the end of the sequence).

The rules for the Collatz conjecture are simple:
- If the number is odd, then multiply it by 3 and then add one.
- If the number is even, then divide it by 2.

Mathematical Theory:

This problem can be solved via a bruit force method using two generate rows, to calculate the Collatz sequence for all numbers up to 1,000,000 but it takes 18 seconds to run, and repeats a lot of calculations. If we could reduce the number of calculations we could subsequently reduce the run time.

For example, lets start by calculating the Collatz sequence for all numbers 1-5:

1: 1 (length 1)

2: 2, 1 (length 2)

3: 3, 10, 5, 16, 8, 4, 2, 1 (length 8)

4: 4, 2, 1 (length 3)

5: 5, 16, 8, 4, 2, 1 (length 6)

Now when we come to calculate the sequence for 6, we could just start from scratch, or we could do the following:

Calculate the terms of the sequence, until we dip below the starting value. The reason this is significant is because we have already calculated all the sequences (chains) for these smaller numbers. In this case, that is simply the second term as the sequence goes 6, 3

But we already know the full sequence for 3. So instead of recalculating, lets just add their lengths together to find the total length of the sequence for 6. E.g

We had a chain of length 2, for the start value of 6 and a chain of 8 for the start value of 3. As the number 3 appears in both sequences we do have to subtract one from the total chain length as we don’t want to count the same number twice. So the total length for the 6^thecollatz sequence is:

2 + 8 – 1 = 9

Same can be applied for a start value of 7:

7: 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5 (length 12)

5 was the first number in the sequence less than 7, se we have stopped there. Now we will join the two chains together to get a combined length of 12 + 6 – 1 = 17

So this basic principal of joining the chains together clearly works when the chain is only split in two. But what about if its got more than two splits, well that’s where the iterative element comes in.

Lets consider the number 12. Its sequence begins as follows:

12: 12, 6 (length 2)

Now we can join to the sequence of 6

6: 6, 3 (length 2)

So we have a combined length of 3, but the sequence is not yet complete, it only goes 12, 6, 3.

Well by converting the above chain joining principal into an iterative process, we can then join this new chain, back to the chain for 3.

3: 3: 3, 10, 5, 16, 8, 4, 2, 1 (length 8)

To get a total chain length of 10.

By joining shorter chains together instead of recalculating each time, the run time goes from 18 down to just 5 seconds. (I have included both solutions in the file below so feel free to test them yourself).

Method:

1) Generate records from 2 – 1,000,000 (no need to include 1 as it's already a complete chain).

2) Use another generate rows to implement the sequence rules; however, when the value dips below the “Input” value, stop computing. Instead, we will use the joining logic discussed earlier.

3) For each starting record, find the number of terms in the chain (count) and the last value calculated in the sequence so far. This should be the first value that dips below the “input”

4) Now we need to join the sequences together until we have the complete chains. To do this, I have used an iterative macro.

a) The first step removed any records which have a “last chain” value of 2. These records have effectively completed the cain (the final steps are 2 and then 1) so we can successfully output them.

b) For all remaining records, we can join back to our original macro input. This will now connect an incomplete sequence with the next section of said sequence. You will notice I have also updated the “Last_chain” value. As we have just joined two chains together, we need to take the “Last_chain” value from the new part of the chain.

c) Add the counts together. Again, we have just joined two chains together so their lengths (counts) must be combined.

d) Simply loop round until all the records have a “Last_Chain” value of 2.

5) Final steps are to sort the chain lengths in descending order, then take the top record.

6) Submit your answer to the Project Euler Website!

Summary:

Even though it may seem like extra work, by creating an iterative macro, we have removed the need to calculate the same section of a chain over and over again, reducing the run time by over 60%.

Want to find out more, follow this link to our introduction post - Euleryx: Let The Games Begin.

Hub119 · ‎10-30-2025

Iterative macros for the win!

Spoiler

Workflow

Spoiler

Collatz Macro

NicoleJ · ‎10-30-2025

I see your iterative macros @Hub119 ... and I raise you a DCloud with no macros for the win!

Spoiler

I did have to split my data stream in two to fit within the sampling row limit without having to generate an output file... but I'm still calling that only a little bit brute force, and a macro-less win. 😋

The second Generate Rows formula was where I applied the iteration logic - I forget sometimes that Generate Rows isn't only for adding 1 or generating dates in between a start and end date. Fun!

I did have to split my data stream in two to fit within the sampling row limit without having to generate an output file... but I'm still calling that only a little bit brute force, and a macro-less win. The second Generate Rows formula was where I applied the iteration logic - I forget sometimes that Generate Rows isn't only for adding 1 or generating dates in between a start and end date. Fun!

Cheers!
NJ
Director, Product Management
Alteryx

DaisukeTsuchiya · ‎10-30-2025

Following the problem statement, I built a simple iterative macro that took 14 seconds to run.

Spoiler

Qiu · ‎10-30-2025

I take the easy way, the Brutal Force way.

Spoiler

AkimasaKajitani · ‎10-31-2025

My solutions are Brute force(31.8sec) with no macro and Iterative Macro solution(23.9sec).

Spoiler

My macro solution is that last remains is my answer. This algorithm is very simple.

gawa · ‎11-02-2025

Mine is 'a bit' wise brute-force that drops the hopeless numbers in advance. This approach enabled me to just examine a half of 1million.

I'm 90% certain about my logic but if you find some flaw, please let me know😂

Spoiler

In all possible sequences, there re some that follows:

N_0(odd) => N_1=3*N_0+1(even) => N_2=(3*N_0+1)/2

If such a N_0 exists, it means N_0 has a longer chain than N_1(and N_1 does than N_2 as well). Hence as far as such a N_0 is examined, N_1 and N_2 are not needed to be examined.
Fr example, starting from 3, it follows sequence
3 => 3*3+1=10 => 10/2=5 =>...

So the sequence of 3 contains that of 10 and 5 hence chain length of 10 and 5 are shorter than that of 3. 10 and 5 can be dropped in advance.

In all possible sequences, there re some that follows:N_0(odd) => N_1=3*N_0+1(even) => N_2=(3*N_0+1)/2If such a N_0 exists, it means N_0 has a longer chain than N_1(and N_1 does than N_2 as well). Hence as far as such a N_0 is examined, N_1 and N_2 are not needed to be examined.Fr example, starting from 3, it follows sequence3 => 3*3+1=10 => 10/2=5 =>...So the sequence of 3 contains that of 10 and 5 hence chain length of 10 and 5 are shorter than that of 3. 10 and 5 can be dropped in advance.

AncientPandaman · ‎11-02-2025

i try to records the old chains join with the current chains (different is records the whole chains) but it work well in python. but not in Alteryx. the Join take so much time in compare the current due to 1million join X million. even after cancel some data.

and interesting that when go through the non-macro ways. it was slower than the macro ways. I thought is always faster than macro method.

Spoiler

the number can be reduce for
3n+1 < 1,000,000 for odd
n/2 < 1,000,000 for even

so it only start from 333,333 and +2 after

macro:

the number can be reduce for3n+1 < 1,000,000 for oddn/2 < 1,000,000 for evenso it only start from 333,333 and +2 after macro:

i also test non-macro, macro, multiple-macro

Spoiler

runtime:
non-macro : ~50 seconds
macro : ~23 seconds
2 macro : ~9 seconds

Non-macro

1 Macro

2 macro

I know multiple macro will save some time. but more than half time is surprise me.

runtime:non-macro : ~50 seconds macro : ~23 seconds2 macro : ~9 secondsNon-macro 1 Macro 2 macro I know multiple macro will save some time. but more than half time is surprise me.

since it divide 2, so I look for binary also. where it interesting to where it have 0 between the gap.

General Discussions

Euleryx Project 14 – Longest Collatz Sequence

Re: Advent of Code 2025 Day 12 (BaseA Style)

Re: Advent of Code 2025 Day 12 (BaseA Style)

Re: Advent of Code 2025 Day 12 (BaseA Style)

Re: Advent of Code 2025 Day 12 (BaseA Style)

Re: Advent of Code 2025 Day 12 (BaseA Style)