It would be helpful to be able to embed a macro within my workflows so in the end I have one single file.
Similar to how Excel becomes a macro enabled file, it would be great if the actual macro could be contained in the workflow. As it stands now, the macro that I insert into a workflow is similar to a linked cell in MS Excel that points to another file. If the macro is moved the workflow becomes broken. I often work on a larger workflow that I save locally while developing. Once it's complete, I then save the workflow to a network drive and have to delete the macros and reinsert these. It also makes it challenging if I were to send a workflow to someone else... I will have to give them instructions on which macros to insert and where. Similar to a container, they could be minimized so to speak to their normal icon, and then expanded/opened if any edits were needed....then collapsed when done.
It would be useful to be able to select a single container (containing a data input) or multiple containers using Shift, and run those and only those.
When building a new element to a larger workflow, I often enter a new Input in a new container, the ability to run just that container without having to turn off all my other containers would be really useful in speeding up the start of joining things together.
After using the Text to Columns tool, I generally find myself using a Select tool to get rid of the original field that I split up. Could an option be added in the config to automatically delete this field once it is split to columns?
With the 2019.3 release the summarize tool now includes prefixes for grouped fields. While a nice addition, in application it makes using this data downstream (like joining to other tables) more involved because of needing to remove this prefix.
It would be nice to have this as an option (a checkbox to add/remove prefixes maybe) or just revert back to pre-2019.3 behavior...thanks!
I often encounter situations where I need to apply the same formula to several columns. Doing this requires copy/pasting the formula several times and then updating the variable names in the formula for each output column. I wish there was a built in "Current Output Column" variable so that I could build one formula and use that for each column.
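To illustrate the request, here is a minimal sketch in plain Python (not Alteryx formula syntax). The helper name `apply_to_columns` and the sample rows are hypothetical; the point is only that one formula is written once against a "current column" value and reused for every selected column.

```python
from typing import Any, Callable


def apply_to_columns(rows: list, columns: list, formula: Callable[[Any], Any]) -> list:
    """Return copies of `rows` with `formula` applied to each listed column."""
    result = []
    for row in rows:
        updated = dict(row)  # copy so the input rows are left untouched
        for column in columns:
            # row[column] plays the role of the requested
            # "Current Output Column" variable.
            updated[column] = formula(row[column])
        result.append(updated)
    return result


# Hypothetical sample data: double q1 and q2, leave id alone.
rows = [{"id": 7, "q1": 1, "q2": 2}, {"id": 8, "q1": 10, "q2": 20}]
doubled = apply_to_columns(rows, ["q1", "q2"], lambda value: value * 2)
```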
When training people on the use of action tools, something that I always have to hit on is that when you are telling the tool which piece of the XML that you are adjusting, it's sort of difficult to tell what you have selected, and super easy to accidentally select something else.
When you initially select the action to take it's this nice Blue Color. However, it still doesn't feel exactly like you have actually selected anything or told the Action Tool what to do, since it's so easy to just select any other one of these actions.
A slightly different problem is that if you are selecting an action that has been previously configured, it is just this light grey color. So it can be easy to accidentally change your settings because you may not realize it's actually set up.
Here is a recent community post that sort of outlines a few of these problems.
At present, Alteryx allows users to run 2 versions of Alteryx at once: one installed using the "Admin Installer" and one via the "non-admin installer".
However, in corporate environments, only the Admin Installer can be used (all installers are repackaged for corporate environment / endpoint management).
This leads to a situation where we cannot run two or more different versions of Alteryx on one machine (like you can with Visual Studio or other platforms). This also prevents us from participating in the BETA program because the BETA version would overwrite the user's current version. Finally - this also makes version upgrades more risky since we cannot run the new version in parallel for a period to evaluate and identify any issues.
Request: Please can you change the installer for Alteryx to default to parallel install per version - so that a user can run 2019.1; 2019.2; and 2019.2 BETA on one machine in a way that is fully isolated (i.e. no shared components - have to be able to uninstall one instance cleanly and leave the others in a fully functional state).
I think it would be nice to be able to more easily reorder fields that you're joining by in the Join tool.
For example, I have already joined by CASS_Address and CASS_City. After I did this, I realized I wanted to go back and join on Name, too, and I want that to be first. How the tool is configured now, if I want Name to be first, I must redo all of the drop downs. I would like to be able to add Name to the next set of open drop downs then use some arrow buttons to be able to move them up in the order (similar to the Summarize tool).
The drop-down interface tool currently lets the user select field names from a connected tool.
However, a very common use case is not currently supported - selecting VALUES from a connected tool (i.e. the values in a specific column).
There are several workarounds (including chaining the app and using an alteryx DB or transposing values into fields) - however given how common this need is, it seems to be valuable to support this directly.
It would be great to make workflow execution results visible to other users in the same subscription.
As of now, schedules are visible to all users in a subscription, but the results of a workflow executed by one user are not visible to other users in the same subscription.
This would avoid duplicate execution of the same workflow by multiple users in a team, as they could check whether another user has already run it, and see the results, before running it again.
When writing a good amount of code, it is easy to get lost in a sea of parentheses. Just when you think you're all done, you get an error that can force you to scour through your code to find the missing, extra, or misplaced parenthesis.
A common feature today is to highlight a parenthesis when its partner is clicked on. This instantly lets you know if you have the wrong number of them and where.
I didn't think this was that important early on in Alteryx, at least for me. Formulas were meant to be short and easily readable at a glance. Now as I dig deeper, there's R, Python, SQL and other text-heavy inputs.
I don't need a full-fledged text editor in Alteryx, but I would love some quality of life features like parentheses matching.
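For reference, the matching itself is a small stack-based scan. The sketch below is only an illustration of that technique in Python, not how any editor actually implements it; the function name `match_parens` is made up, and it deliberately ignores parentheses inside string literals or comments.

```python
def match_parens(text: str):
    """Pair up parentheses in `text`.

    Returns (pairs, unmatched):
      pairs     - dict mapping each matched index to its partner's index
      unmatched - sorted list of indices of parentheses with no partner
    """
    pairs = {}
    unmatched = []
    open_stack = []  # indices of "(" still waiting for a ")"
    for index, char in enumerate(text):
        if char == "(":
            open_stack.append(index)
        elif char == ")":
            if open_stack:
                partner = open_stack.pop()
                pairs[partner] = index
                pairs[index] = partner
            else:
                unmatched.append(index)  # ")" with nothing to close
    unmatched.extend(open_stack)  # "(" that were never closed
    return pairs, sorted(unmatched)


# The outer "(" at index 1 is never closed; the inner pair is 3 <-> 5.
pairs, unmatched = match_parens("f(a(b)c")
```

An editor would look up the clicked position in `pairs` to highlight its partner, and could flag everything in `unmatched` as an error.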
When the Python Tool operates, it seems to always ingest all the data before processing any of it (i.e. no batch processing). Python can handle this type of functionality with generators. Can we update the tool so that it does some preprocessing once (like imports and data prep), then repeatedly calls a defined generator function on data from a separate input handle and provides batch data frames on output, for more parallel-like processing of data?
The Python Tool could be updated as such:
Multi-Input - Same functionality as now, and also allow this data to be used for preprocessing and setting up the Python functions and a single batch function.
Data Input - Ingests data in batches (as most other tools operate) where each batch passes in a dataframe (in this case, a subset of processed entries) into an existing Python function (with a name that is in globals()), and returns another dataframe with that desired output. This can give the option of adding/removing rows as necessary to a subset of the data.
Data Output - Partial set of data after data processing to allow tools further in the chain to process in parallel.
"On Complete" Multi-Outputs - Same functionality as now, to pass process-complete data to the next tool once all data ingested has been processed. Perhaps give the option to pass the complete set from Data Output.
A simple use-case, if a user wanted to use only the Python Tool:
Let's say a user wants to get all URLs from every post in a thread (containing millions of posts) that are in blacklisted domains.
Data prep sends the list of blacklisted domains into the Python Tool's Multi-Input handle, and that data is transformed and stored in a set within the Python Tool once.
A series of posts (strings) are sent in batches (let's say ~10000) to the Data Input of the Python Tool. The tool calls a defined Python function that extracts all the URLs, and filters those in the blacklist.
That data is then transformed into a DataFrame which is then sent to the Data Output of the Python Tool, and only contains results corresponding to the small batch of posts that were ingested. Alteryx can also use this to track progress during execution.
Once all posts have been processed, one of the Python Tool's Multi-Outputs can return a total count of URLs found that were NOT in the blacklist (sure this can be a part of the Data Output, but just for the sake of this example). Could also be used to trigger "on-complete events."
I know I used the term "generators" above, and the design could probably be simplified to instead call an Alteryx Python function that yields from a function to await input from the next batch to use actual Python generators. However, I feel my initial approach could be thought of as a simpler process since generators are more of an intermediate functionality.
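A rough sketch of the use case above in plain Python, to make the flow concrete. Nothing here is an Alteryx API: `prepare_blacklist`, `process_batch` and `batches` are hypothetical names, plain lists stand in for the DataFrames, the URL regex is deliberately simplistic, and hosts are matched exactly (no subdomain handling).

```python
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://\S+")  # simplistic, for illustration only


def prepare_blacklist(domains):
    """One-time setup (the "Multi-Input" step): normalise domains into a set."""
    return {d.strip().lower() for d in domains if d.strip()}


def process_batch(posts, blacklist):
    """Per-batch step (the "Data Input" step).

    Returns (blacklisted_urls, clean_count) for this batch only.
    """
    blacklisted = []
    clean_count = 0
    for post in posts:
        for url in URL_PATTERN.findall(post):
            host = (urlparse(url).hostname or "").lower()
            if host in blacklist:
                blacklisted.append(url)
            else:
                clean_count += 1
    return blacklisted, clean_count


def batches(items, size):
    """Generator yielding consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical data; a real run would use batches of ~10000 posts.
posts = [
    "see http://bad.example/a and https://good.example/b",
    "nothing here",
    "https://bad.example/c",
]
blacklist = prepare_blacklist(["bad.example"])

all_hits = []
total_clean = 0  # the "On Complete" total of URLs NOT in the blacklist
for batch in batches(posts, 2):
    hits, clean = process_batch(batch, blacklist)
    all_hits.extend(hits)  # each `hits` would go to Data Output as it is produced
    total_clean += clean
```

The generator here only drives the batching; `process_batch` stays an ordinary function, which matches the simpler design described above.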
I hope this makes sense and is elaborate enough to pursue. Thanks for the consideration!
I've been dealing with JSON since day one, and to be honest it isn't the best experience I've had.
Converting a hierarchical schema into a tabular one isn't a straightforward process, and doing it every day the old way costs a lot of time and processing.
What I'm proposing is a tool that can read JSON as input and then display a structural skeleton to the user, or accept such a skeleton from the user. Let's say we have the following input:
now to parse this into a table of menuitems we need to use:
JSON Parse: convert JSON into one long key:value table
TextToColumns: split key into multiple columns
Filter: make sure we only get one level from the tree
CrossTab: Convert it back into a column based key values.
All of this will give us the most primitive table we can have as:
and now if we want to have the parent menu id along side with the menuitems, we will do that again as:
Filter: for parent values only
CrossTab: for parent values into a table
Join: to join Parents with Sub items and add the Parent.Id
Now all of this is done with Concatenating of child items, as cross tab will allow us to only do Concat/First/Last for items with the same grouping values.
And now if we want to process children, count them, or extract their data into another table, we have to add more Filters, more CrossTab and more Joining to get parent IDs for future linking.
So what am I proposing?
I'm thinking of a tool with an interface that gives me the ability to choose:
Target Branch: which is the main table to be extracted from the branches, in this case it would be menu->popup->menuitem.
Parent Values: what values to be appended from parents of the previous table, just like menu->Id and others if exist.
Children Data types: selecting the proper and expected data type for children instead of using strings or the existing different columns way.
Children Arrays Process: what to do with children branches? Either stop processing them and return them as is (Stringify), exclude them, or apply another process such as a count.
the tool may extract the structure or let the user input such config as the following:
Or Input the Structure as a YAML formatted config or any other way.
This will give users a quick native tool that does what they want, and they can use it as often as they like for children and nested values: just Stringify and repeat, parsing only what you need each time.
I hope you will consider this, so that I can replace tens of macros and tools with a single tool.
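To show the options working together, here is a small Python sketch of the idea. It is not an existing Alteryx tool: the function `extract_table`, its parameters, and the sample document are all assumptions made up for illustration (the sample just follows the menu->popup->menuitem shape mentioned above).

```python
import json

# Hypothetical input shaped like menu -> popup -> menuitem.
SAMPLE = """
{"menu": {"id": "file", "value": "File",
          "popup": {"menuitem": [
              {"value": "New", "onclick": "CreateDoc()", "tags": ["a", "b"]},
              {"value": "Open", "onclick": "OpenDoc()", "tags": []}
          ]}}}
"""


def walk(node, path):
    """Follow a list of keys down into nested dicts."""
    for key in path:
        node = node[key]
    return node


def extract_table(doc, target_branch, parent_values, child_arrays="stringify"):
    """Turn the array at `target_branch` into a list of flat row dicts.

    parent_values - key paths whose values are appended to every row
    child_arrays  - "stringify", "exclude" or "count" for nested children
    """
    if child_arrays not in ("stringify", "exclude", "count"):
        raise ValueError(f"unknown child_arrays option: {child_arrays}")
    parents = {".".join(path): walk(doc, path) for path in parent_values}
    rows = []
    for record in walk(doc, target_branch):
        row = dict(parents)
        for key, value in record.items():
            if isinstance(value, (list, dict)):
                if child_arrays == "exclude":
                    continue
                # Either count the children or keep them as a JSON string
                # that can be parsed again later.
                value = len(value) if child_arrays == "count" else json.dumps(value)
            row[key] = value
        rows.append(row)
    return rows


doc = json.loads(SAMPLE)
rows = extract_table(
    doc,
    target_branch=["menu", "popup", "menuitem"],
    parent_values=[["menu", "id"]],
    child_arrays="count",
)
```

With `child_arrays="stringify"` the `tags` column would instead hold the JSON text of each array, ready to be fed back through the same function.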
As well as using keyboard shortcuts, many of us are using a mouse / keyboard with program specific assignable shortcut buttons. It is a serious boost to productivity. The ability to instantly enable / disable would be a great tool for large complex workflows. In general, it would be great to expand the keyboard shortcuts to offer more Alteryx specific advanced functions.
Alteryx does not currently have an email tool that can be configured to use SMTP authentication for Microsoft Office 365 or any other server requiring authentication. Our office printer can authenticate over SMTP with TLS enabled, so why not my Alteryx mail tool - 'mic drop!'.
Further explained, Alteryx is a tool that needs to live within and abide by the policies and security standards of the organization, not vice versa. Therefore, it shouldn't be a big surprise, or a big ask for that matter, that a mail client should have the ability to authenticate prior to sending email over SMTP. I'm very surprised this tool is so arcane. Please implement quickly. Thank you
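For comparison, this is roughly what authenticated SMTP with TLS looks like using Python's standard library. It is only a sketch of the behaviour being requested, not anything Alteryx ships; the host, port, addresses and credentials are placeholders, and a given mail server may require other settings or a different authentication method.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_authenticated(message, host, port, username, password):
    """Upgrade the connection to TLS, log in, then send."""
    context = ssl.create_default_context()
    with smtplib.SMTP(host, port) as server:
        server.starttls(context=context)  # encrypt before sending credentials
        server.login(username, password)  # the authentication step
        server.send_message(message)


# Placeholder values; nothing is sent until send_authenticated() is called.
message = build_message(
    "reports@example.com", "team@example.com", "Month end report", "Report attached."
)
# send_authenticated(message, "smtp.example.com", 587, "reports@example.com", "secret")
```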
At the moment, I have a lovely formatted XLS with corporate branding, logos, filled cells, borders etc. The data from the Alteryx output needs to start in cell B6. I have tried the output tools to this named range, but Alteryx destroys all the Excel formatted cells in the data block.
As a workaround on the forums, many Alteryx users pump out to a hidden "Output" tab, and then code =Output!A1 in the formatted sheet. This looks messy to the users who then go hunting for the hidden tab. Personally I end up pumping the workflow out to a temporary CSV file, then opening that in Excel, selecting all, and pasting values into the pretty Excel file.
This is fine for one file, but I need to split the output report block by a country field and do this 100s of times for each month end.
Please can we have an output tool that does the same as my workaround: one that outputs directly from a workflow to a range in Excel and doesn't destroy the workbook's formatting.
Many software & hardware companies take a very quantitative approach to driving their product innovation so that they can show an improvement over time on a standard baseline of how the product is used today; and then compare this to the way it can solve the problem in the new version and measure the improvement.
- Database vendors have been doing this for years using TPC benchmarks (http://www.tpc.org/) where a FIXED set of tasks is agreed as a benchmark and the database vendors then iterate year over year to improve performance based on these benchmarks
- Graphics card companies or GPU companies have used benchmarks for years (e.g. TimeSpy; Cinebench etc).
How could this translate for Alteryx?
- Every year at Inspire - we hear the stats that say that 90-95% of the time taken is data preparation
- We also know that the reason for buying Alteryx is to reduce the time & skill level required to achieve these outcomes - again, as reinforced by the message that we're driving towards self-service analytics & Citizen-data-analytics.
Wouldn't it be great if Alteryx could say: "In the 2019.3 release - we have taken 10% off the benchmark of common tasks as measured by time taken to complete" - and show a 25% reduction year over year in the time to complete this battery of data preparation tasks?
One proposed method:
Take an agreed benchmark set of tasks / data / problems / outcomes, based on a standard data set - these should include all of the common data preparation problems that people face like date normalization; joining; filtering; table sync (incremental sync as well as dump-and-load); etc.
Measure the time it takes users to complete these data-prep/ data movement/ data cleanup tasks on the benchmark data & problem set using the latest innovations and tools
This time then becomes the measure - if it takes an average user 20 mins to complete these data prep tasks today; and in the 2019.3 release it takes 18 mins, then we've taken 10% off the cost of the largest piece of the data analytics pipeline.
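The arithmetic behind that measure is simple enough to sketch. The function name and the per-user timings below are hypothetical, chosen only so the averages come out to the 20 and 18 minutes used in the example above.

```python
from statistics import mean


def percent_reduction(baseline_minutes, new_minutes):
    """Percentage of time saved relative to the baseline."""
    if baseline_minutes <= 0:
        raise ValueError("baseline must be positive")
    return 100 * (baseline_minutes - new_minutes) / baseline_minutes


# Hypothetical timings (minutes) for the same users on the same benchmark tasks.
baseline_times = [22, 18, 20]  # current release, average 20
new_times = [19, 17, 18]       # new release, average 18

improvement = percent_reduction(mean(baseline_times), mean(new_times))
```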
What would this give Alteryx?
This could be very simple to administer; and if done well it could give Alteryx:
- A clear and unambiguous marketing message that they are super-focussed on solving for the 90-95% of your time that is NOT being spent on analytics, but rather on data prep
- It would also provide focus to drive the platform in the direction of the biggest pain points - all the teams across the platform can then rally around a really deep focus on the user and accelerating their "time from raw data to analytics".
- A competitive differentiation - invite your competitors to take part too just like TPC.org or any of the other benchmarks
What this is / is NOT:
This is not a run-time measure - i.e. this is not measuring transactions or rows per second
This should be focussed on "Given this problem; and raw data - what is the time it takes you, and the number of clicks and mouse moves etc - to get to the point where you can take raw data, and get it prepped and clean enough to do the analysis".
This should NOT be a test of "Once you've got clean data - how quickly can you do machine learning; or decision trees; or predictive analytics" - as we have said above, that is not the big problem - the big problem is the 90-95% of the time which is spent on data prep / transport / and cleanup.
Loads of ways that this could be administered - starting point is to agree to drive this quantitatively on a fixed benchmark of tasks and data