Start Free Trial

General Discussions

Discuss any topics that are not product-specific here.

Euleryx Project 3 - Largest Prime Number

Pilsner
13 - Pulsar

Welcome back to Project Euleryx, this time: Largest Prime Number.

 

This week's post is going to be a little longer than previous posts, as we'd like to cover some of the mathematical theory behind our solution. That being said, just because there is some theory behind our approach, doesn’t mean it is the only approach that works, so please feel free to share your thoughts below.  
 
 

Euleryx Problem 3 - Largest Prime Number

The challenge is to find the largest prime factor of the number 600851475143. 

 

question.png

 

Here's my full workflow and answer:

Spoiler
Workflow:
workflow.png

Macro:
macro.png



Answer: 6857

Last Weeks Favourite Solution:
Another tough decision for our favourite solution so we have gone with a tie. Both @CoG and @gawa provided excellent solutions. Calculating the neccesary terms only and storing variables within a single column in a generate rows was genius. Checkout there solutions on page one of last weeks challenge here.

Definitions: 
 

  • Factor: A number that divides another number exactly (i.e. no remainder or decimal) 
  • Prime Numbers: A number with exactly two unique factors: 1 and itself.  

 

Mathematical Theory: 
 

 

DougMacleanNhlGIF (2).gif

 

  

During this problem, we exploited what’s called the Fundamental Theorem of Arithmetic (FTA), which states: 
 

“Every positive integer (whole number) can be written as a product of prime numbers, or the number itself is prime.” 

e.g.  
If we take the number 30, we can write this as 2 x 3 x 5 

 

 

The Method to our Madness (Factor Trees): 

 

We decided that to tackle this problem, we would list all the prime factors of the number, then simply take the largest. The tricky part here was creating that list of prime numbers.  

 

 

ObviouslySnapeGIF.gif

  

Using the Fundamental Theorem of Arithmetic, we know that 600851475143, can be written as a product of its prime factors. But how do we find these prime factors? We decided to use what's called a factor tree. 

 

A factor tree works as follows: 
 
1) Identify any 2 factors of your number. 

2) If the number is prime, circle it. 

3) If the number is not prime, thanks to the FTA, we know that it too can be expressed as a product of its primes, so, repeat steps 1 and 2 for each uncircled number. 

4) Stop when there are no more uncircled numbers at the bottom of the tree.  

 

e.g. for the number 30, the tree would grow as follows: 

First step: 

 

1.png

 

Second step: 

 

2.png

 

Therefore for we can see, 30 can be written as 5 x 2 x 3 

 

In case you’re worried that you didn’t think of 5 and 6 to begin with, rest assured, it doesn’t matter which two numbers you choose, as long as they make up a factor pair. Here are some other examples of possible factor trees for the number 30: 

 

 

3.png

All of which lead to the same answer. With this in mind, the only thing left was to create a workflow in Alteryx, that can create these factor trees.  

 

Setting up the Workflow: 

 

 

UAQ7K4Gb1dXn.gif

Whilst all this theory is great, we still need a way to get the computer to do the hard part. If it wasn’t apparent from the above, we will be using an iterative macro to solve this problem. So, the first step is to decide what information we need to be passed between each iteration.  
 
For this question, we fed 3 columns into the macro, as shown in the image below. You will notice we have two starting numbers to help illustrate the workflows process, as we step through it: 

 

4.png

The Iterative Macro: 

ChandlerBingMatthewPerryGIF.gif

  

Yes, it's time for the tough part, but fear not, let's break the steps down together. On each iteration, our goal is: 
 

  • To find one prime factor 
  • Add this to the “List of Prime Factors” Column 
  • Update the number field to be the factor pair of the identified prime number.  

i.e. at the start of the first iteration: 

5.png 

vs the end of the first iteration (on the loop anchor). Notice [Number] * [List of Prime Factors] = [Starting Number] 

6.png


This is achieved as follows: 

Spoiler
  1.  Use a generate rows to create a list of numbers until it reaches an integer factor. (This only works for odd numbers > 1, we deal with the others soon.) The integer it stops on will be the Lowest prime factor of the value in the “Number column”. 

    Before generate rows:
     
    5.png

    After generate rows:
    7.png

     
  2. Using the Summarise tool, find the maximum value in the list for each “Number”, as this is the lowest prime factor.  (Note: We take the maximum value as the generate rows stopped once it found the first prime factor, hence all smaller values are not factors).
    8.png



  3. Now we deal with the remaining values (even numbers or the number 1)  
    Here we check if a number is even or not: 

    1. If it is even, then set the lowest Multiple = 2. 
    2. Otherwise, filter the value out of the loop, and towards the macro output. 

      9.png

  4. Union all the records with prime factors, back together and update the List of Prime Numbers column:
    10.png

  5. Filter out any rows where the Lowest Factor = the Number column, as this means all prime factors of the starting value have been found. Output these rows of data from the macro.

    For records still in the loop, we must update the three original columns ready for the second iteration. 
    11.png

  6. Update the number column by doing the [Number] / [Lowest Factor]. (This step effectively creates the other branch of your factor tree.)  

    12.png

    Without this step, each iteration would be passing the same number through, so it's vital that we remember this part 
    13.png




  7. Drop any columns created during that iteration and feed the results into the loop output.
    14.png


     
  8. Relax as the iterative macro is complete: 
    MinionGIF.gif



 

 

Final Steps: 

Running this macro should give you the following output.  

 

15.png

 

As you can see, all the prime factors are listed in the “List of Prime Factors” column, with a comma separating them. It's important to note that this list may include duplicates, for example, if you started with the number 4, the “List of all prime Factors” should be “2,2” as 2 x 2 = 4. 

 

ohhp.gif

 

To find the largest prime factor for each starting value, simply parse out the “List of Prime Factors” columns, sort the results descending, and select the top record per “Starting Number”: 

 

 

16.png

 

Now, all that's left to do is submit your answer to the Project Euler website

Summary: 

 

By using Alteryx to Iteratively create a factor tree, one branch at a time, we have been able to reduce any number down to its prime factors. Thanks to the Fundamental Theorem of Arithmetic, we know that the largest prime number from our list, is indeed the largest prime factor of 600851475143. 

Want to find out more, follow this link to our introduction post - Euleryx: Let The Games Begin.

 

15 REPLIES 15
BS_THE_ANALYST
15 - Aurora
15 - Aurora

@Pilsner here we go! Prime factors .. I already know this is ramping up 😂😏🔥!! Can't wait to see the solutions for this one. I'll be getting on this later 😈.

All the best,
BS

LinkedIN

Bulien
CoG
14 - Magnetar

Still committed to String Coding!

 

 

Spoiler

There are two optimizations that we can make:

1. For a given number, pull all copies of a factor out immediately.

2. If we remove prime factors from smallest to largest, then we know that the next factor has to be greater than the last (which reduces our search domain)

The other useful constraint is that we do not need to know all factors, only the largest prime factor. Our logic can thankfully be simplified then since we don't need to keep track of every factor we identify.


We can thus construct our string for the Generate Rows Tool as follows:
[Candidate #],[Target]

For this particular Project Euler problem we initialize to: 2,600851475143

We'll know we're done when the [Candidate #] >= the [Target], since this implies that no value below the target divides the target (aka the number is prime)

The iteration logic is two-fold:
1. If the current [Candidate #] divides the [Target], then keep the [Candidate #] the same for the next iteration, but update the [Target] to [Target]/[Candidate #]
2. Otherwise, increment [Candidate #] by 1, and keep [Target] the same for the next iteration

Once this process completes, we can get the last row from the Generate Rows Tool's output, and extract the [Target] value as our answer!

Screenshot 0003-1.png
Screenshot 0003-2.png

 

Happy Solving, Y'all!

 

Carolyn
12 - Quasar
12 - Quasar

This is definitely the case where I didn't get the fewest tools but I'm happy with mine.


I used a Generate Rows to find all possible factors of an input number from 1 to the square root of the number. The reason that I did the square root is because this will give me half of all the factors. I can then take the factors, divide them into the starting number, and get the matching factor.


For ex, if I'm trying to find all the factors of 99, I only need to find the factors from 1 to sqrt(99). Once I find 3, I can use it to find 33 by going 99/3. Once I find 9, I can do 99/9 = 11 to get its matching factor.


---


I built a macro that could identify if a number is prime or not. I have a hunch that I'll be happy to have that as the challenge continues.


The way the macro works is:

1. Generate Rows for all numbers from 1 to square root of the number
2. Find if those numbers are factors of the starting number by using the mod function
3. Count the number of factors
4. If the count is 1, then the number is prime. If it's greater than 1, then it's not prime


---


Then what I did was find all the factors of the giant number, and evaluate if they're prime or not. Then I found the highest.

 

image.png

gawa
16 - Nebula
16 - Nebula

Similar approach to @CoG .

 

Spoiler
(1) Prepare data in format of [Candidate Max Prime]_[Examined Factor]
- At the very first loop initiation,[Candidate Max Prime] is the given number '600851475143'.

(2) Check Mod([Candidate Max Prime], [Examined Factor]):
if mod=0, update [Candidate Max Prime] to [Candidate Max Prime]/[Examined Factor] and reset [Examined Factor] to 2
if mod!=0, increment [Examined Factor] by 1(if current value is 2) or 2(else)
*We can skip even numbers except for 2 (4, 6, 8....) since they never be prime number. Except for 2, prime number is always odd number.

(3)Repeat (2) until [Candidate Max Prime]=[Examined Factor]
image.png

For reference, RSA encryption relies on the mathematical properties of prime numbers to ensure secure data transmission over the internet. https://en.wikipedia.org/wiki/RSA_cryptosystem

While multiplying two prime numbers is easy and fast, factoring a large number back into its prime components is extremely difficult.

If you're feeling brave, try factoring this number: 10142789312725007.
(My computer gave up after running for 3 minutes.)😂

 

caltang
17 - Castor
17 - Castor

I did it!

 

Spoiler
image.png
Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
caltang
17 - Castor
17 - Castor

For anyone interested in my workflow, here is what I did. Took me awhile to consolidate my thinking process...

 

SPOILER tag!

 

Spoiler
Step 1
First step is to create a text input tool with the [Largest Prime Number] at 600851475143 as per the challenge. Making it a field is easier to reference later.

Step 2
Second step is to generate rows! The reason for doing so is for us to check for divisibility by having all values listed out. So... we can then check for each divisor, check if it divides the number evenly (i.e., remainder is 0) and update the number by dividing it if it’s a factor.

So, we create a new field in Generate Rows (5) called [Divider], with an initialization expression of 2 (first prime tested). If you look at the Condition Expression:
[Divider] <= CEIL(SQRT([Largest Prime Number]))

 

This condition ensures the divisor doesn’t exceed the current number being factored, and we loop it by:

[Divider] + 1


This loops until we get 775,146 records of data. We have these conditions to avoid generating unnecessary rows and improve efficiency, we only need to test divisors up to the square root of the number.


Step 3
After having all those rows, we can now check for divisibility. We do this with Formula (6)...

We create two new fields:

[IsFactor] to check if the divisor is a factor

IF MOD([Largest Prime Number],[Divider]) = 0 
THEN 1 
ELSE 0 
ENDIF


[New_Value] to update the number if a factor is found

IF [IsFactor] 
THEN [Largest Prime Number] / [Divider] 
ELSE [Largest Prime Number] 
ENDIF

 
Step 4
Now that we have the IsFactor ready and the New_Value calculated... we can simply just filter out the [IsFactor] = 1 (True rows). This keeps only the rows where Divisor divides the largest prime number evenly. We do this with Filter (7).

Step 5
Now we need to check for PRIME factors! So... we need to verify that each factor is not divisible by any number less than itself (other than 1 of course). So we add another generate rows!

We create a new field called [IsReallyFactor] again with an initialization expression of 2. Then we set the condition.. similar to before:

[IsReallyFactor] <= SQRT([Divider])


...and we loop it similar to before:

[IsReallyFactor] + 1

 

Once we have looped it, we will get 1,413 records for us to check against. We then use Formula (9) to perform the check with a new field called [IsReallyReallyFactor] with the expression:

IF MOD([Divider],[IsReallyFactor]) = 0
THEN 1
ELSE 0
ENDIF


So we then have all the real factors lined up with 1s and 0s.

Step 6
We now just need to group by the Dividers and sum up the [IsReallyReallyFactor] field! A prime factor will have Sum_IsReallyReallyFactor = 0 (since it’s only divisible by 1 and itself, and we start testing from 2).

 

image.png

Then from there, we just filter out Sum_IsReallyReallyFactor = 0 to get all our PRIME FACTORS!
image.png

Final notes
The naming convention was easier for me to follow since I was doing it quickly... but I hope the above helps. 

This was really fun @Pilsner !
 

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
Pilsner
13 - Pulsar

These solutions are brilliant, loved reading through the write-ups, and I still can’t quite believe the 3-tool solutions!

I never knew about RSA; that’s really helped my understanding of encrypted connections, the "handshake" certainly makes more sence now. I’ll be reading up on this a bit more, thanks @gawa 

@Carolyn  - you might be right, I'm pretty certain going to be useful to have a prime number macro. 😄


Carolyn
12 - Quasar
12 - Quasar

@caltang - Your field names 🤣 That looks like my AoC fields. "ActualRowCheckThisOneISwear" 🤣

Yoshiro_Fujimori
15 - Aurora
15 - Aurora

My solution.

 

Spoiler

Main workflow
Euleryx_3_main.png
Macro
Euleryx_3_iter.png

Output from the macro

a
1471
6857
839
71

1471 * 6857 * 839 * 71 = 600851475143. OK!


Explanation
The idea is to output the prime factors as soon as it is found, and pass the remaining factors to the next iteration.

 

For example, say the input is 24, set the input data as below.

a

b

c

isPrime

24

24

1

 

 

Generate rows for each row with [b] = sqrt([a]) ~ 2.

a

b

c

isPrime

24

5

1

 

24

4

1

 

24

3

1

 

24

2

1

 

 

Fill in [c] with the quotient of [a] divided by [b] only if it is divisible. Otherwise set 0.

If [c] in all rows are 0 in a group of [a], it is a prime factor and the number is sent out of the macro.

 

The rows successfully divided are sent to the next iteration.

At the end of 1st iteration, the iteration output looks like below;

a

b

c

isPrime

4

4

1

 

6

6

1

 


At the end of 2nd Iteration;

a

b

c

isPrime

2

2

1

 

2

2

1

 

2

2

1

 

3

3

1

 

They are sent out of the macro at the end of 3rd iteration, because they are all prime factors.

 

 

 

Labels
Top Solution Authors