I want to build a macro that would work on a variety of files.
The goal is to select a random sample from a dataset, let users choose what % of random rows to select in each random sample, choose what columns to use as response and group (also enter the control variable when needed) in the test of means tool, and then enter the number of times the process repeats. I want to count the number of times the test appears to be significant with a p-value threshold.
I started building the macro but it was not making sense. Would this be a batch macro or an iterative one?
The baseline of what I am trying to do looks like the image below:
(The select tool is only converting the "p-value" column into a double.)
How would I build this macro? If anyone could help, it'd be greatly appreciated. Thank you in advance.
From what you're describing, it sounds like you're looking to build an analytic app with a number of user inputs that change the default values in your workflow (example shown below).
I'm unfamiliar with the test of means tool, but you can actually see that it is a macro in itself.
My example creates user inputs for the first three criteria you mentioned (is the input file static, or do you want the user to pick that too?).
Can you explain a bit more what you mean by the number of times the process repeats? This could determine whether you also need to wrap it in a macro.
If you just need to run the same workflow a number of times and change one or more things each time, like parameters or input files, you can do it with a batch macro. If you need the output from each iteration to be used as the input for the next iteration until a certain criterion is met, then you need an iterative macro.
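To make the distinction concrete, here is a rough Python sketch of the two control-flow patterns. It is only an analogy, not how Alteryx implements macros; `run_batch`, `run_iterative`, `workflow`, and `is_done` are hypothetical names made up for illustration.

```python
def run_batch(workflow, control_values):
    """Batch style: one independent run per control value; outputs are collected."""
    return [workflow(value) for value in control_values]


def run_iterative(workflow, data, is_done, max_iterations=100):
    """Iterative style: each run's output becomes the next run's input."""
    for _ in range(max_iterations):
        if is_done(data):
            break
        data = workflow(data)
    return data


if __name__ == "__main__":
    # Batch: the same step applied to each value, with no dependency between runs.
    print(run_batch(lambda x: x * 2, [1, 2, 3]))
    # Iterative: keep halving until the value drops below 5.
    print(run_iterative(lambda x: x / 2, 64, lambda x: x < 5))
```

If every repeat is independent of the others, the batch pattern is usually the closer fit.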
The test of means tool is simple: the user selects a numeric column as the response and a categorical grouping variable for control vs. treatment. So the input file can be almost any data file that contains at least one string column and one numeric column. I am not concerned about the number of columns. The users will pick which columns from the data file to use.
I need a macro that will repeat the process of randomly selecting a sample from the data set (sized per the user's input) and then test the means. The report output of the test of means tool includes a column called "p-value". If the user chooses to sample 30 times, I want the process to tell me how many times the p-value was less than 0.05.
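For reference, the logic described above can be sketched outside Alteryx in plain Python. This is a minimal sketch under stated assumptions, not the test of means tool itself: it swaps in a two-sided permutation test for the difference in means (the tool's own test may differ), it assumes exactly two groups, and the names `count_significant`, `permutation_p_value`, `sample_pct`, and `repeats` are hypothetical.

```python
import random
from statistics import fmean


def permutation_p_value(control, treatment, n_permutations=500, rng=random):
    """Two-sided permutation test for a difference in group means."""
    observed = abs(fmean(treatment) - fmean(control))
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[n_control:]) - fmean(pooled[:n_control]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


def count_significant(rows, response, group, sample_pct, repeats,
                      alpha=0.05, seed=0):
    """Repeat: sample rows, test the means, count p-values below alpha."""
    rng = random.Random(seed)
    sample_size = max(2, round(len(rows) * sample_pct / 100))
    significant = 0
    for _ in range(repeats):
        sample = rng.sample(rows, sample_size)
        groups = {}
        for row in sample:
            groups.setdefault(row[group], []).append(row[response])
        if len(groups) != 2:
            continue  # assumption: skip samples that do not contain both groups
        control, treatment = groups.values()
        if permutation_p_value(control, treatment, rng=rng) < alpha:
            significant += 1
    return significant
```

A call such as `count_significant(rows, "val", "grp", sample_pct=50, repeats=30)` would return how many of the 30 samples gave a p-value below 0.05, where `rows` is a list of dictionaries and `"val"` and `"grp"` are placeholder column names.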
Hope that clarifies things. I got as far as making a regular macro, but I have no idea how to make the process repeat based on a number the user specifies.