
Alteryx Designer Desktop Ideas

Share your Designer Desktop product ideas - we're listening!

Featured Ideas

Hello!
I appreciate this is a very underused element of Alteryx functionality; however, I have noticed a few issues with field descriptions.

 

Firstly, if you set a description on a field within a Select tool:

[Image: TheOC_0-1681228654695.png]



And then attempt to clear the description later in the workflow (in another Select tool), you cannot. When you delete the description, it reverts to the original value (in this case, 'test'):

[Image: TheOC_1-1681228698380.png]


This is easy to recreate, and is even more relevant to yxdb outputs that retain field descriptions. In that scenario, you cannot go back to the previous Select tool and remove the description; the closest you can come to clearing it is replacing it with a space (' ').

 

As a secondary issue, the Score tool currently removes field descriptions and overrides the 'source' metadata. For example, if I open the Score tool example workflow and add a Select tool with a description:

[Image: TheOC_10-1681229323907.png]

 


You can see the metadata going into the Score tool:

[Image: TheOC_8-1681229240520.png]

 

But unfortunately the output of the tool looks like:

[Image: TheOC_9-1681229254843.png]

 

This shows that it has completely removed the descriptions, and has also replaced all of the 'source' information. My suggestion would be that the Score tool should not replace the source information or descriptions.

 

 

Thirdly - and quite a niche issue - an int64 field specifically will break when the description differs between the data and the model.

Again, this is easy to recreate within the Score tool example workflow. Apply a Select tool to both streams, setting 'First_Years' to an int64. Within the bottom stream (the model creation), set a description - in this case, 'test':

[Image: TheOC_11-1681229464488.png]

 

Make sure to leave the top stream's description blank.

Run the workflow, observe the error:
Error: Score (106): Score: The variable testFirst_Years is missing from the input data stream.
Interestingly, it seems to be using the description as part of the field name within the Score tool, which causes issues when the descriptions differ. My suggestion would be that the Score tool should not utilise descriptions at all.

 

Kind Regards,

Owen

Hi all!

 

Based on the title, here's some background information: Shapley Values

 

Currently, one way of doing so is to use the Python tool to write the script and install the required package. However, this requires running Alteryx as an administrator in order to successfully load, test, and run the script. The problem is that a substantial number of companies do not grant their Alteryx teams full administrator privileges, as that would require admin credentials just to open Alteryx again after closing it.

 

I am aware that there is a macro covering SHAP, but I've recently tested it and it did not work as intended; it also only supports non-categorical determinants, thereby requiring a conversion of categorical variables into numeric or binary categories.
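For context, here is a minimal sketch of what such a computation could look like in the R tool today, assuming the fastshap package is available - the model, field names, and prediction wrapper below are purely illustrative:

library(fastshap) # assumption: fastshap is installed alongside Predictive Tools
df  <- read.Alteryx("#1", mode = "data.frame") # incoming data
fit <- lm(target ~ ., data = df)               # any fitted model works here
X    <- subset(df, select = -target)           # predictors only
pfun <- function(object, newdata) predict(object, newdata = newdata)
shap <- explain(fit, X = X, pred_wrapper = pfun, nsim = 50) # Monte Carlo SHAP values
write.Alteryx(as.data.frame(shap), 1)          # one SHAP column per predictor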

 

It would be nice to have a built-in Alteryx ML tool that does this analysis and produces a graph akin to a heat map that showcases the values, like below:

[Image: caltang_0-1680442322684.png]

 

By doing so, it adds more value to the ML suite and actually helps convince companies to get it.

 

Otherwise, teams will just use Python and be done with it, leaving Alteryx as merely the clean-up ETL tool. That leaves much to be desired, and can leave some teams hanging.

 

I hope for some consideration on this - thank you.

 

Alteryx hosting CRAN

 

Installing R packages in Alteryx has been a tricky issue with many posts over the years, and it fundamentally boils down to the way the install.packages() function is used; I've made a detailed post on the subject. There is a way that Alteryx can help remedy the compatibility challenge between their updates of Predictive Tools and the ever-changing landscape that is open-source development. That way is for Alteryx to host their own CRAN!

 

The current version of Alteryx runs R 4.1.3, which is considered an 'old release', and there are over 18,000 packages on CRAN for this version of R. By the time you read this post, there is likely a newer version of one of these packages that its author has submitted to the R Foundation's CRAN. There is also a good chance that package isn't compatible with any Alteryx tool that uses R. What if you need that package for a macro you've downloaded? How do you get the old version, the one that is compatible? This is where an Alteryx-hosted CRAN comes to fruition.

 

Alteryx can host their own CRAN - one that is not continually updated by thousands of package authors - so the packages will remain unchanged and compatible with each release of Predictive Tools. All we need to do as Alteryx users is point install.packages() at the Alteryx CRAN to get our packages, like so:

 

 

install.packages(pkg_name, repos = "https://cran.alteryx.com")

 

 

 

There is an R package to create a CRAN directory, so Alteryx can get R to do the legwork for them. Here is a way of using the miniCRAN package:

 

 

library(miniCRAN)
library(tools)
path2CRAN <- "/local/path/to/CRAN"
ver <- paste(R.version$major, strsplit(R.version$minor, "\\.")[[1]][1], sep = ".") # ver = "4.1"
repo <- "https://cran.r-project.org" # the R Foundation's CRAN
m <- available.packages(repos = repo) # a matrix of all packages and their metadata from repo
pkgs4CRAN <- m[, "Package"] # character vector of all package names from repo
makeRepo(pkgs = pkgs4CRAN, path = path2CRAN, type = c("win.binary", "source"), repos = repo) # builds the local repo
write_PACKAGES(paste(path2CRAN, "bin/windows/contrib", ver, sep = "/"), type = "win.binary") # creates the PACKAGES index for package binaries
write_PACKAGES(paste(path2CRAN, "src/contrib", sep = "/"), type = "source") # creates the PACKAGES index for package sources

 

 

It will create a directory structure that replicates the R Foundation's CRAN, but just for the version of R that Alteryx uses, 4.1/.

 

Alteryx can create the CRAN, host it somewhere meaningful (like https://cran.alteryx.com), update Predictive Tools to use the packages downloaded with the script above, then release the new version of Predictive Tools and announce the CRAN. Users like you and me just need to tell the R Tool (for example) to install from the Alteryx repo rather than any other, which may have package dependency conflicts.

 

This is future-proof too. Let's say Alteryx decides to release a new version of Designer and Predictive Tools based on R 4.2.2. What do they do? Download R 4.2.2, run the above script (it'll create a new directory called 4.2/), update Predictive Tools to work with R 4.2.2 and the packages in their CRAN, host the 4.2/ directory on their CRAN, and then release the new versions of Designer and Predictive Tools.

 

Simple!

Hello!

I remember a while ago running into a peculiar error:
'The R.exe exit code (4294967295) indicated an error.' This was peculiar, as the data output was still seemingly correct; however, the error made me double-check the community for answers.

 

There are some very technical sources here:
https://community.alteryx.com/t5/Alteryx-Designer-Discussions/R-tool-Fake-Errors/td-p/25163
https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Boosted-Model-Error/td-p/5509

but in short, this seems to be caused by a return code from C++ libraries being interpreted by R as an error (4294967295 is 0xFFFFFFFF, i.e. -1 represented as an unsigned 32-bit integer). It's a very inconsistent error, typically caused by low memory. This creates what most call a 'fake error' - the code runs perfectly fine, but an error is produced that doesn't actually indicate anything wrong.

 

Within those threads, it's also stated that calling the garbage collection function (gc()) before R exits does tend to solve the problem. However, this requires a user to understand basic R and have access to the macro in order to change the code - thus making predictive analytics more intimidating than it already is for new Alteryx users.
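For illustration, the workaround amounts to something like this at the end of the macro's R script (the output data frame here is a hypothetical stand-in):

results_df <- data.frame(x = 1:3)  # hypothetical stand-in for the macro's real output
write.Alteryx(results_df, 1)       # write results as normal
gc()                               # force garbage collection so R.exe exits cleanly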

 

The first occurrence of this error seems to be way back in 2015; however, the error is still being reported by users (see posts from 2020 and 2021):
https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Password-protected-Excel-files-R-solut...
https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Error-The-R-exe-exit-code-n-indicat... 

An important issue with these 'fake errors' is not only that they cause confusion, but also that they can cause analytic apps and server workflows to not work as expected and stop running, depending on the configuration.

 

My suggestion would be to revisit this issue, as by my understanding it occurs inconsistently, and calling garbage collection does not always fix it. Even if the error message is still created, it may be worth Alteryx suppressing these errors in cases where they are not real errors.

 

 

Steps to reproduce:

(as mentioned, it's very inconsistent)

1. Open the Boosted Model example workflow

2. Multiply the maximum number of trees in the model by 10, in the Boosted Model configuration (Model customization)

3. Run the workflow, then inspect the results (which are seemingly correct) and the error message in the results window.

 

[Image: TheOC_0-1647261720754.png]

 

 

Hope this helps!
TheOC


Hi,

My boss and I use the MB Rules tool. He is having trouble because the settings do not offer what he wants: the tool returns a very large number of combinations.

Control Parameters
The allowable minimum number of items in a rule or itemset.


I would like you to improve this by adding a 'maximum' counterpart to this parameter. Modeler seems to be able to set a maximum.
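For what it's worth, a minimal sketch of how both bounds could be exposed, assuming the MB Rules tool wraps arules::apriori (an assumption on my part) - apriori already supports a maximum:

library(arules)
data(Groceries) # example transaction data shipped with arules
rules <- apriori(Groceries,
                 parameter = list(supp = 0.01, conf = 0.5,
                                  minlen = 2,  # the existing minimum-items control
                                  maxlen = 3)) # the requested maximum
inspect(head(rules, 3)) # show a few of the resulting rules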

Hi Alteryx,

 

Can we get the R tools/models to work in-database for Snowflake?

In-Database Overview | Alteryx Help

 

I understand that Snowflake currently doesn't support R through their UDFs yet; therefore, you might be waiting for them to add it.

I hear Python is coming soon, which is good, and Java is already available.

 

However, what about the dplyr package? https://db.rstudio.com/r-packages/dplyr/

My understanding is that this can translate R code into SQL, so it can run in-DB?

https://docs.snowflake.com/en/release-notes/2015-09.html#snowflake-extension-for-dplyr-pre-productio...
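For illustration, here is roughly how the dplyr/dbplyr translation works - the connection details and table names below are hypothetical:

library(DBI)
library(dplyr)
library(dbplyr)

con <- DBI::dbConnect(odbc::odbc(), dsn = "snowflake_dsn") # hypothetical DSN
sales <- tbl(con, "SALES") # lazy table reference - no data is pulled yet

summary_q <- sales %>%
  group_by(REGION) %>%
  summarise(total = sum(AMOUNT, na.rm = TRUE))

show_query(summary_q)        # prints the SQL that dbplyr generated
result <- collect(summary_q) # executes in-database and returns the result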

Could this R package be appended to the Alteryx R models? (Maybe this isn't possible, but I wanted to ask.)

 

Many Thanks,

 

Chris

 


Hi,

 

Can we please make the TS Model Factory customisable for both ARIMA and ETS? I understand that currently it uses auto.arima in R; it would be nice to add the option to customise p, q, P and Q.
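For reference, a minimal sketch of what this could map to, assuming the tool wraps the forecast package (the series below is just the built-in AirPassengers data):

library(forecast) # assumption: the tool wraps the forecast package
y <- AirPassengers # built-in monthly series, for illustration

auto_fit   <- auto.arima(y)                # current behaviour: orders chosen automatically
manual_fit <- Arima(y, order = c(1, 1, 1), # user-specified p, d, q
                    seasonal = c(0, 1, 1)) # user-specified seasonal P, D, Q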

 

Thank you.


I was able to add the following lines of R script to get the importance of the variables used in the cluster analysis. This will allow the user to see what variables are important in determining the clusters they have. 

 

The script I added is below. It is pretty basic and could use sprucing up by an Alteryx engineer as far as column naming, accounting for contingencies, and making it a reporting function. I think this would be a valuable feature for future versions of this tool.

library(FeatureImpCluster) # load libraries
library(data.table) # needed for as.data.table()
FeatureImp_res <- FeatureImpCluster(clus.sol, as.data.table(the.matrix)) # take the cluster model (clus.sol) and data (the.matrix) and compute variable importance
FeatureImp_df <- as.data.frame(FeatureImp_res$featureImp) # turn the importance scores from a list into a data frame
FeatureImp_df_rn <- tibble::rownames_to_column(FeatureImp_df, "Variable") # add the variable names to the importance scores
write.Alteryx(FeatureImp_df_rn, 3) # output the data frame on output anchor #3

 


Allow the direct entry of the test start date into the AB Trend tool rather than forcing the use of a calendar widget to select the date. Or at least make the calendar display the date that was set, rather than the current month.

 

[Image: treepruner_0-1607224843034.png]

 

 

It would also be useful to edit the sample workflow for this tool, 21_AB_Trend_Controls_Analysis_Sample.yxmd, to add annotations to each tool describing what the settings in the example are, as well as to add more specific detail to the workflow description.


It would be great if it was possible to output the top most influential features in producing the score for each individual entity/row when using the predictive and machine learning tools.

 

Similar to the way they work in DataRobot. Details here and here.

 

This would enable some simple interpretation of how a model came to an individual prediction and the most important features in that particular row/case.
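As a rough illustration, given a per-row matrix of feature contributions (e.g. SHAP values), the top features per row could be extracted like this - the matrix below is fabricated for the example:

set.seed(1) # hypothetical per-row contribution matrix (rows = records, columns = features)
shap_mat <- matrix(rnorm(12), nrow = 3,
                   dimnames = list(NULL, c("age", "income", "tenure", "region")))
top_k <- 3 # keep the 3 most influential features per row
top_features <- t(apply(abs(shap_mat), 1, function(row) # rank by absolute contribution
  names(sort(row, decreasing = TRUE))[seq_len(top_k)]))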

Sometimes, as a sanity check, I would like to be able to model only the mean of my data set, i.e. I would like to use a predictive tool with no predictors included. The result would be a model with only an intercept, and this value would be the mean of the target variable. This would not be an important feature for final models, of course, but when starting to look at a data set and build up a model, it can be useful to first ensure the model is producing the expected output in the simplest case. 

 

Note, this can be achieved when just one predictor is included, but it takes some math (see below), so it would be nice to be able to have this as a built-in option.

 

[Image: Kenda_0-1594148666258.png]
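For reference, in raw R the no-predictor case is just an intercept-only formula, which fits exactly the mean of the target (shown here with the built-in cars data):

fit <- lm(dist ~ 1, data = cars) # intercept-only model: no predictors
coef(fit)       # the fitted intercept...
mean(cars$dist) # ...equals the mean of the target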

 

I'm really liking the new assisted modelling capabilities released in 2020.2, but it should not error if the data contains spatial, blob, date, time, or datetime types.

 

This essentially tells the user to add an extra step: a Select tool before the Assisted Modelling tool and then a Join after the models. I think the tool should be able to read in and pass through these field types (especially dates) and simply not use them in any of the modelling.

 

An even better enhancement would be to transform dates as part of assisted modelling into something usable for the modelling (season, month, day of week, etc.).
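For example, a minimal sketch of that kind of date expansion in R (the field name is hypothetical):

df <- data.frame(order_date = as.Date(c("2020-01-15", "2020-06-30"))) # hypothetical date field
df$order_month   <- as.integer(format(df$order_date, "%m")) # month number
df$order_weekday <- weekdays(df$order_date)                 # day of week
df$order_quarter <- quarters(df$order_date)                 # e.g. "Q1" - a crude stand-in for season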

 

[Image: joe_lipski_0-1593515364178.png]

 

I would like to request that the Python tool metadata either be automatically populated after the code has run once, or a simple line of code added in the tool to output the metadata. Also, the metadata needs to be cached just like all of the other tools. 

 

As it sits now, the Python tool is nearly unusable in a larger workflow. This is because it does not save or pass metadata in a workflow. Most other tools cache temporary metadata and pass it on to the next tool in line. This allows for things like selecting columns and seeing previews before the workflow is run.

 

Each time an edit is made to the workflow, the workflow must be re-run to update everything downstream of the Python tool. As you can imagine, this can get tedious (unusable) in larger workflows.

 

Alteryx support has replied with "this is expected behavior" and "It is giving that error because Alteryx is doing a soft push for the metadata but unfortunately it is as designed."


Could you please support Redshift as well?

 

https://help.alteryx.com/2018.2/PredictiveAnalytics.htm

 

When I use the PCA tool, I run it with 2 PCs, look at the results output to choose my principal components, and then re-run it with the actual number of PCs that I need. I use the loadings and variance data quite a lot - it would be great to be able to output the loadings, variance, and also the scaled variables as data to work with in Alteryx, rather than just browsing them in the report.
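In base R terms, everything requested is readily available from a fitted PCA - a sketch using prcomp on the built-in iris measurements:

pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
loadings <- as.data.frame(pca$rotation)  # variable loadings
variance <- pca$sdev^2 / sum(pca$sdev^2) # proportion of variance per component
scores   <- as.data.frame(pca$x)         # the scaled, rotated data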

 

[Image: Gwilym_0-1588787962265.png]

 

I have modified the PCA tool to do this myself, but I find I need to redo this with each upgrade, just in case anything has changed in the tool. I'd love it if the report summary data and scaled data were available as an output!

 

For reference, my amended version is available here:

https://gallery.alteryx.com/#!app/Principal-Components-Analysis-Extra/5eb2f79d0462d70bc0b6c516


The Linear Regression tool errors out with my data set if I sample more than 1 in every 31 cases. The sample-size error-out is very consistent, despite the fact that different R error messages filter up on different runs. Support recommended using a small sample for the predictive tool and then submitting all data to the Score tool. That's backwards from my need, which is to submit detailed data to the predictive tool to create as precise a model as possible, then apply that model to predict a smaller set of future case outcomes.

 

I used version 2019.4.8.22007. My full data set had 15.46 million rows, and one string field (which is necessary) accounts for the majority of predictors submitted to the model. I ran it from the Desktop version. The PC had 64 GB RAM, and I even changed the default Virtual Memory settings in hopes that would help.

Hello,

 

As shown in the Alteryx Inspire demo, assisted modelling is going to work with a wizard and generate several tools as a result.

The data evaluation functions and the feature engineering assist, however, would be extremely useful tools in their own right. Is there any chance we can use them as separate tools in the upcoming version?

 

Thanks in advance!


R has a very large number of useful packages and examples. Often, we only need a few lines of R code. However, integrating that with the data flow in Alteryx can be complex. It would be ideal if there were a tool where you could drop in R code and have the tool create named inputs and outputs for each variable in the R code, and create blank text documents or YXDBs with the correct column names and variable types. This seems like it could be automated, and it would eliminate a lot of trial and error in using small pieces of R code for specialty tasks.
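For illustration, R can already report the variable names a snippet refers to, which is roughly the detection step such a tool would need (the snippet is hypothetical):

snippet <- "out_df <- transform(in_df, z = x + y)" # hypothetical user code
exprs <- parse(text = snippet) # parse without evaluating
all.vars(exprs) # "out_df" "in_df" "x" "y" - candidate inputs and outputs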

Add a new feature to develop your own customised decision tree with Insight. So instead of using a tree generated with the Decision Tree tool, a user can generate a tree with custom splits and save the splitting rules as a model to score a new dataset later. This will give users the ability to enhance a tree with business knowledge.

When I checked whether there is a way to force the intercept to 0 or some other constant value, I found it's not doable from the tool configuration.

[Image: Linear Regression.png]

 

So manually we can get a '0' intercept in linear regression this way (editing the macro's R tool):

 

cars.lm  <- lm(dist ~ speed, data = cars)     # ordinary fit with an intercept
cars.lm2 <- lm(dist ~ 0 + speed, data = cars) # the 0 term tells lm() to fit the line through the origin
summary(cars.lm)
summary(cars.lm2)

 

So a minor addition of a tickbox would solve that and make the Linear Regression tool more flexible, I guess...

 

Best...  
