Today, when we install custom tools that use DLLs, the DLLs must be placed in the Plugins folder inside the Alteryx installation directory. This requires a second step after the YXI installer runs. I would like to be able to package the DLL with the YXI installer and have Alteryx search for the DLL inside the tool's directory, just as happens with custom Python tools. This would allow custom tools that use DLLs to be installed with the same one-step process as Python tools.
For example, this does not work today, but I would like it to:
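As an illustration, here is roughly what such a tool's Config.xml could contain (a sketch following the documented EngineSettings pattern; the DLL name and entry point are hypothetical). The ask is for EngineDll to resolve relative to the tool's own install directory, the same way main.py is found for Python tools, instead of only against the global Plugins folder:

```xml
<AlteryxJavaScriptPlugin>
  <!-- Hypothetical custom tool packaged in a YXI. For a Python tool,
       EngineDllEntryPoint="main.py" is located inside the tool's directory;
       the request is for MyCustomTool.dll to be located the same way. -->
  <EngineSettings EngineDll="MyCustomTool.dll"
                  EngineDllEntryPoint="MyCustomToolEntry"
                  SDKVersion="10.1" />
</AlteryxJavaScriptPlugin>
```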
Please enhance the Dynamic Select tool to also allow dynamically changing the data type. The use case could be driven by a formula, or by an Update action in a macro. If you've ever wanted to mass-change data types or adjust precision in a macro, you're currently forced to use a Multi-Field Formula tool instead. This would be rather helpful and appreciated.
Can we have an option to save a workflow in a prior version for backward compatibility? I think Tableau offers this functionality.
If I have 2019.4.8 and a colleague has 2019.1.x, I cannot share my workflows, because my colleague will receive a notice that the workflow was built in a newer version. I want to be able to save my workflow as 2019.1.x and send it to my colleague.
This is predicated on the workflow not containing any tools/features that are absent from the older version. If it does contain them, give me a warning listing the specific tools/features that are not backward compatible. Thank you.
The Alteryx Python tool currently throws an error if the inbound record set has zero rows (screenshot 1).
To manage that, you need to wrap Alteryx.read in a try-except block that creates an empty data frame instead (screenshot 2). This is inefficient because every time you change the canvas upstream of the Python tool, you need to re-code a static field list into the try-except block (i.e. you can no longer deal with variable fields).
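For reference, a minimal sketch of that workaround (the column names are illustrative only; this is exactly the static list that has to be re-typed whenever the canvas changes):

```python
import pandas as pd
from ayx import Alteryx  # available inside the Python tool

try:
    df = Alteryx.read("#1")
except Exception:
    # Hard-coded schema: must be kept in sync with the upstream tools by hand.
    df = pd.DataFrame(columns=["CustomerID", "Region", "Sales"])
```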
Please could you change the Alteryx.read method to create a zero-record dataframe with the correct column names if the input is zero-length?
Environment variables act as a shortcut so that different computers can be configured in different ways, but a particular path will still point to the right place.
For example, if you open Explorer and go to %TEMP%\, you will open whichever folder is set up as Temp on that machine. This is super useful because you can refer to a logical folder without needing to know its actual location on every machine (for example, the Windows directory).
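As a minimal sketch of the expansion step the engine would need to perform (using Python's standard library to stand in for the underlying Windows API call):

```python
import os

raw_path = r"%TEMP%\staging"             # path as typed into the tool
resolved = os.path.expandvars(raw_path)  # e.g. C:\Users\me\AppData\Local\Temp\staging
print(resolved)
```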
This works partially in the Directory and Input tools: when you put in the environment variable, the configuration is able to search possible subdirectories (screenshot 1), but it does not work once you run the workflow (screenshot 2).
It seems as if the Designer hits the Windows API directly, but the expansion does not happen within the engine.
Please could you alter the engine to make full use of the environment variables on the machine in question, in the Directory tool path or Input tool path?
With an increasing number of projects involving different machine learning models, it is becoming difficult to manage package versions across workflows. Currently, the Python tool has a single virtual environment, so we need to develop models in every project using the same Python and package versions as the Python tool venv. While this doesn't bother the code itself too much, it becomes a problem as soon as we store and load pickled models, which are sensitive to even minor changes in packages.
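To make the pickle sensitivity concrete, here is the kind of defensive pattern this forces on us, sketched with scikit-learn (the artifact layout is our own convention, not anything Alteryx provides):

```python
import pickle
import sys

import sklearn
from sklearn.linear_model import LinearRegression

# Train a toy model and store the exact environment alongside the pickle,
# so a version mismatch can be detected at load time.
model = LinearRegression().fit([[0], [1], [2]], [0, 1, 2])
artifact = {
    "model": model,
    "python_version": sys.version,
    "sklearn_version": sklearn.__version__,
}
with open("model.pkl", "wb") as f:
    pickle.dump(artifact, f)

# Load side: refuse to unpickle into a different package version.
with open("model.pkl", "rb") as f:
    loaded = pickle.load(f)
if loaded["sklearn_version"] != sklearn.__version__:
    raise RuntimeError(
        f"Model trained under scikit-learn {loaded['sklearn_version']}, "
        f"but this venv runs {sklearn.__version__}"
    )
```

With per-workflow virtual environments, none of this bookkeeping would be needed.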
This is even more of a problem when we are working on Alteryx Server, where different teams might use different packages. Currently, only the server admin can install packages on the server, and there can be only one version per package.
So, a more robust venv management in the Python tool would be much appreciated!
Hello! I use Alteryx for lots of spatial data blending, and 99% of the time that works perfectly. However, when I try to analyze data in Puerto Rico or some other US territories, Alteryx cannot read the data, and I have to reproject it in another tool first. Can support for all common projections be added to Alteryx so the data can be read in natively? The GDAL data file "gcs.csv" contains all of the projections I think would be needed.
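For reference, this is the kind of reprojection we currently have to do outside Alteryx; a sketch using pyproj, with EPSG:32161 (NAD83 / Puerto Rico & Virgin Is.) as an example source projection and illustrative coordinates:

```python
from pyproj import Transformer

# Reproject from a Puerto Rico projection to WGS 84 lat/lon before
# handing the data to Alteryx.
transformer = Transformer.from_crs("EPSG:32161", "EPSG:4326", always_xy=True)
lon, lat = transformer.transform(200000, 250000)  # easting/northing in metres
print(lon, lat)
```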
It appears that the Workflow Dependencies window does not report dependencies from all tools. In the example image, you can see that the file input from the Amazon S3 Download tool is not listed. Some tools may have dependencies that do not easily fit the current field structure of the window, but maybe the input/download tools could be listed with an asterisk or partial reference.
I know cache-related ideas have already been posted (cache macros; cache tools), but I would like it if cache were simply built into every tool, similar to the way it is on the Input Tool.
During workflow development, I'll run the workflow repeatedly, and especially if there is sizeable data or an R tool involved, it can get really time-consuming.
I can see where managing cache could be tricky: in a large workflow processing a lot of data, nobody would want to maintain dozens of copies of that data. But there may be ways of simply monitoring changes to the workflow to know whether something needs to be rebuilt: e.g. suppose I cache a Predictive tool, and then make no changes to any tool preceding it in the workflow... the next time I run, the engine should be able to look at "cache flags" and/or "modified tool flags" to determine where it should start: basically, start at the furthest-along cache that has no modified tools preceding it.
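A rough sketch of that invalidation rule (all names hypothetical; I have no knowledge of the engine's internals):

```python
def upstream_unmodified(tool, parents, modified):
    """True if no tool upstream of `tool` was modified since it was cached."""
    stack, seen = list(parents.get(tool, [])), set()
    while stack:
        t = stack.pop()
        if t in seen:
            continue
        seen.add(t)
        if t in modified:
            return False
        stack.extend(parents.get(t, []))
    return True

def valid_restart_points(cached, parents, modified):
    """Caches the engine could resume from; pick the furthest downstream."""
    return [t for t in cached if upstream_unmodified(t, parents, modified)]

# Example: Input -> Formula -> Predictive, with Predictive cached.
parents = {"Formula": ["Input"], "Predictive": ["Formula"]}
print(valid_restart_points(["Predictive"], parents, modified={"Formula"}))  # []
print(valid_restart_points(["Predictive"], parents, modified=set()))  # ['Predictive']
```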
I would like to see Global Variables made available in Alteryx. I have seen the Global Constants available under the Workflow "User" configuration, but those are constants and need to be defined at design time.
How about a Process ID that is auto-generated and made available across the Formula tools used within the workflow?
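Purely to illustrate what "auto-generated" could mean, a hypothetical sketch of the value the engine might mint once per run:

```python
import datetime
import uuid

# Minted once when the workflow starts, then readable from every Formula tool.
process_id = uuid.uuid4().hex
run_started = datetime.datetime.now().isoformat(timespec="seconds")
print(process_id, run_started)
```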
In order to perform audit-trail logging, it would be valuable to have two new capabilities:
a) Environment variables which expose the workflow name, file path, version, run start date and time, etc. For any workflows we build, we need to have a solid audit trail to be SOX compliant, so having this detail available as a data field to write and manipulate is essential.
b) A logging component. What would be great is a component that you can drop onto a workflow, not connected to anything, which is able to capture the start, end, runtime, version, etc. of a workflow and commit this to any output data format (CSV, ODBC, etc.). This logging tool would need to capture the full runtime, so it would need to be the last thing that runs (which means it may need to exist in parallel to the main workflow in some way). This is not currently possible with a complex workflow with outputs, because it is not possible to identify when the entire workflow ended, or its runtime (output tools don't have an onward connector to pass flow-of-control for catching the final end time). A sketch of what such a component could capture appears below.
Again, both of these are necessary to meet audit requirements for workflows and production-quality ETLs for BI data warehouses.
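To make (b) concrete, here is a rough sketch of what such a component could record, assuming the engine hands it the workflow metadata and fires it only after all outputs finish (the function and field names are hypothetical):

```python
import csv
import datetime
import os
import socket

def log_run(workflow, path, version, started, log_path="run_audit.csv"):
    """Append one audit row per workflow run to a CSV log."""
    finished = datetime.datetime.now()
    new_file = not os.path.exists(log_path)
    with open(log_path, "a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["workflow", "path", "version", "host",
                             "start", "end", "runtime_seconds"])
        writer.writerow([workflow, path, version, socket.gethostname(),
                         started.isoformat(), finished.isoformat(),
                         round((finished - started).total_seconds(), 3)])
```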
A lot of popular machine learning systems use a computer's GPU to speed up some of the math to a huge degree. The header of this article on Medium shows a 15x difference between a high-end CPU and a high-end GPU. GPU support could also improve the spatial tools. Perhaps Alteryx should add this functionality to speed up these tools, which I imagine are currently some of the slowest.
Transferring records through the Python SDK's RecordRef seems to be slow when sending large amounts of data to the Alteryx Engine (e.g. the discussion here). Although the exact specifics are unclear, it seems that there is a copy-and-convert process in play.
Apache Arrow appears to address exactly this kind of issue, and the roadmap and specs are impressive! It seems (again, I have no insight into the Alteryx Engine specifics) that something like this would be excellent for expanding SDK use cases, as well as for other connectors such as the Apache Spark connector.
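For illustration, a short pyarrow sketch of the columnar hand-off Arrow enables (whether and how this would map onto the engine's record model is, again, beyond my knowledge):

```python
import pandas as pd
import pyarrow as pa

# Column-at-a-time conversion instead of per-record copy-and-convert.
df = pd.DataFrame({"id": range(1_000_000), "value": [1.0] * 1_000_000})
table = pa.Table.from_pandas(df)

# Serialize to Arrow's IPC stream format, which another process could read
# with little or no deserialization cost.
sink = pa.BufferOutputStream()
with pa.ipc.new_stream(sink, table.schema) as writer:
    writer.write_table(table)
print(f"{sink.getvalue().size / 1e6:.1f} MB serialized")
```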
And it looks like it'd be fun to build into Alteryx! 🙂
When working through a question with our team on how Excel and MS SQL represent dates, we did a quick test and confirmed that SQL and Excel both store dates and date-times as a number (technically, the offset from a fixed date). This really helps for things like BI applications, where a fact table may store a very large number of dates on each record (entered date/time, updated date/time, transaction date/time, etc.).
However, when we look at the same data in Alteryx, it seems to store these dates as plain text (see the screenshot below): instead of an 8-byte field for every date and date-time, which could be compressed using offset logic as in Parquet, a date-time appears to be represented as a 19-byte field.
Would it make sense to change the internal representation to a number, to make date offsetting and processing easier? All date logic would then become simple addition/subtraction instead of string manipulation.
Note: you can see this in the screenshot below. The date field takes 10 bytes and the date-time field 19 bytes, where both are stored and represented in MSSQL in 8 bytes in total.
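To illustrate the suggestion, a minimal sketch of offset-based date storage (the epoch choice here is hypothetical):

```python
import datetime

EPOCH = datetime.date(1900, 1, 1)  # hypothetical fixed reference date

def to_offset(d: datetime.date) -> int:
    """Store a date as days since EPOCH: a small integer, not a 10-byte string."""
    return (d - EPOCH).days

def add_days(offset: int, n: int) -> int:
    """Date logic becomes plain integer arithmetic."""
    return offset + n

d = to_offset(datetime.date(2020, 3, 1))
print(add_days(d, 30))  # 30 days later, still just an int
```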