Alteryx Designer Desktop Discussions

SOLVED

Questions about random seed

Gualigee
8 - Asteroid

If I choose 0 as the random seed, I expect the training and validation datasets to be different each time I run the workflow, but the attached workflow shows the same results on every run. If I choose a number other than 0, I expect the training and validation datasets to be the same on every run, and the attached workflow does show the same results. Can you please advise?

8 REPLIES
IraWatt
17 - Castor

Hey @Gualigee,

I've tried the attached workflow, and it generates the same validation and training sets each time for me. Are you saying that this workflow wasn't doing this for you?

Gualigee
8 - Asteroid

Hi @IraWatt, my question is: if I set the seed to 0, I should get different validation and training sets each time I run the workflow. But actually we don't, right? I wonder why this is. Thanks.

apathetichell
18 - Pollux

If you set the seed manually to a static number, your training and test datasets will be the same each time.

https://r-coder.com/set-seed-r/

Gualigee
8 - Asteroid

Hi @apathetichell, if I set the seed manually to 0, the training and test datasets will be different each time, right? But mine are the same.

apathetichell
18 - Pollux

No - if you set a seed, you will get the same dataset every time the workflow is run. See the link I sent, which explains:

  1. The state of the random number generator is stored in .Random.seed (in the global environment). It is a vector of integers whose length depends on the generator.
  2. If the seed is not specified, R uses the clock of the system to establish one.

Run again the previous example where we sampled five random numbers from a Normal distribution, but now specify a seed before:

# Specify any integer
set.seed(1) 

rnorm(5) # -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078
 

If you execute the previous code, you will obtain the same output. However, note that if you run rnorm(5) twice without setting the seed again, the second call gives different results.

In this context, "twice" would mean running the random draw twice within the same R tool session, which cannot be done out of the box in Alteryx.
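To make the quoted behaviour concrete, here is a minimal sketch in plain R, run outside Alteryx. It only illustrates how set.seed() behaves in R itself. Whether the Alteryx tool passes the configured value straight to set.seed() is an assumption, not something confirmed in this thread. The variable names are made up for the example.

```r
# Reseeding before each draw reproduces exactly the same numbers.
set.seed(1)
first <- rnorm(5)

set.seed(1)
second <- rnorm(5)

identical(first, second)    # TRUE

# Drawing again WITHOUT reseeding continues the same random stream,
# so these numbers differ from the first draw.
third <- rnorm(5)
identical(first, third)     # FALSE

# In plain R, 0 is a fixed seed like any other integer.
# (Assumption: the Alteryx tool treats 0 the same way.)
set.seed(0)
zero_a <- rnorm(5)
set.seed(0)
zero_b <- rnorm(5)
identical(zero_a, zero_b)   # TRUE
```

If the workflow sets the seed once at the start of every run, each run corresponds to the "reseed, then draw" case above, which would explain identical samples across runs.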

Gualigee
8 - Asteroid

Thank you for your clarification, @apathetichell.

Then what is the difference between setting the seed to 1 and setting it to 3? I did see that the results are different, but what is the purpose of choosing a different number? Thank you.

apathetichell
18 - Pollux

The difference that matters is setting a seed versus not setting a seed. If you are setting a seed, you can choose any number. But if you are troubleshooting, want a static dataset, or want to collaborate with your team on the same data, communicating which seed you are using is key.

set.seed(1) gives a different result than set.seed(2), but the differences shouldn't matter. Your model should be (basically) as effective with either seed. If not, you have a model problem, not a seed problem.
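As a rough illustration of that point, the sketch below splits 100 row indices into a training set in plain R. The helper split_rows() and its parameters are hypothetical, written only for this example. It is not how the Alteryx tool is implemented.

```r
# Hypothetical helper: returns the sorted row indices assigned to training.
split_rows <- function(n_rows, train_fraction, seed) {
  set.seed(seed)                                  # fix the random stream
  n_train <- round(n_rows * train_fraction)       # number of training rows
  sort(sample(n_rows, size = n_train))            # sample without replacement
}

train_a  <- split_rows(100, 0.7, seed = 1)
train_a2 <- split_rows(100, 0.7, seed = 1)
train_b  <- split_rows(100, 0.7, seed = 2)

identical(train_a, train_a2)          # TRUE: same seed, same split
identical(train_a, train_b)           # FALSE: different seed, different split
length(train_a) == length(train_b)    # TRUE: both splits are the same size
```

Each seed picks a different but equally valid random subset, so sharing the seed value is what lets a teammate reproduce exactly the same rows.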

Gualigee
8 - Asteroid

Thank you for your clarification, @apathetichell.
