This is a QoL-request, and I love me some QoL-updates!
While I'm developing, I often need the output of one workflow as input for the next phase of my development. For example: an API run returns job location, status, and authentication ids. I want to use these in a new workflow to start experimenting with what works best. Because of the experimenting part, I always do this in a new workflow rather than caching and continuing in my main flow.
Writing a temporary output file always feels like an unnecessary step, and to be honest I don't want to write a file for a step that'll be gone before it reaches production, especially if there is sensitive information in it.
Surprisingly, I couldn't find this anywhere else, even though I know it's been discussed in person on many occasions.
Basically the Formula tool needs to be smarter in many ways, but this particular post focuses on the Data Type component.
The Formula tool should not always default to V_String as the data type when data or a formula is entered; it should look at the data and estimate the most likely type.
I know there are times where the logical type might not be consistent in all fields, but the Data Preview and the Function of the formula should be used to determine the most likely option.
E.g. if I type a number or a date directly into the Formula tool, then Alteryx should be smart enough to change the data type from the standard V_String to Int, Double or Date.
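To make the request concrete, here is a minimal Python sketch of the kind of guess I mean. It is not how Alteryx works internally; the function name `infer_type` and the single accepted date format (`YYYY-MM-DD`) are assumptions for illustration, and the returned labels simply reuse the type names mentioned above.

```python
import re
from datetime import datetime


def infer_type(text: str) -> str:
    """Guess the most likely data type for a literal typed into a formula box."""
    value = text.strip()
    # Optional sign followed by digits only: treat as an integer.
    if re.fullmatch(r"[+-]?\d+", value):
        return "Int"
    # Digits with a decimal point (optional exponent): treat as a double.
    if re.fullmatch(r"[+-]?(\d+\.\d*|\.\d+)([eE][+-]?\d+)?", value):
        return "Double"
    # Assumed date format for this sketch: YYYY-MM-DD.
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return "Date"
    except ValueError:
        pass
    # Anything else falls back to the current default.
    return "V_String"
```

A real implementation would also need to look at the functions used in the expression and at the data preview, as described above; this only covers literals typed directly.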
This is an extension to the ideas posted here:
I often need to create a record ID that automatically increments but is grouped by a specific field. I currently do it using the Multi-Row Formula tool with [Row-1:ID]+1, because there is no group by option in the Record ID tool.
Also, sometimes I need to start at 0 but the Multi-Row Formula tool doesn't allow this so I have to use a Formula tool right after to subtract 1.
So adding a group by option to the Record ID tool would let users skip the Multi-Row Formula tool for this and start at any value they want.
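For anyone unsure what behaviour I'm asking for, it can be sketched in a few lines of Python. The function name `grouped_record_ids`, the output column `RecordID`, and the sample field `Region` are made up for this example; this is only an illustration of the logic, not the tool's implementation.

```python
from collections import defaultdict


def grouped_record_ids(rows, group_key, start=1):
    """Add a RecordID that restarts at `start` for every distinct group value."""
    next_id = defaultdict(lambda: start)  # one independent counter per group
    result = []
    for row in rows:
        group = row[group_key]
        result.append({**row, "RecordID": next_id[group]})
        next_id[group] += 1
    return result


rows = [{"Region": "A"}, {"Region": "B"}, {"Region": "A"}]
# Starting at 0 needs no extra subtraction step afterwards.
numbered = grouped_record_ids(rows, "Region", start=0)
```

The rows do not need to be sorted by group first, because each group keeps its own counter.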
We have 'CountDistinct' and 'Concatenate' options within the Summarize tool.
But 'Concatenate' returns every instance of a value for a grouped field, which can include a lot of duplicates.
It would be great to have an option like 'ConcatDistinct'.
For example -
Group by 'Branch' and 'ConcatDistinct' on Customer should result in Figure 1 instead of Figure 2 -
This is currently achievable in different ways with a set of tools, but it gets tedious when distinct values have to be captured from a large number of fields.
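As the figures may not be visible to everyone, here is a small Python sketch of the intended result. The function name `concat_distinct` and the sample Branch/Customer values are invented for illustration; the separator is a parameter because the Summarize tool lets you choose one.

```python
def concat_distinct(rows, group_key, value_key, sep=","):
    """Concatenate each distinct value once per group, in first-seen order."""
    grouped = {}
    for row in rows:
        # Dict keys keep insertion order, so this deduplicates without sorting.
        grouped.setdefault(row[group_key], {})[str(row[value_key])] = None
    return {group: sep.join(values) for group, values in grouped.items()}


rows = [
    {"Branch": "B1", "Customer": "Ann"},
    {"Branch": "B1", "Customer": "Bob"},
    {"Branch": "B1", "Customer": "Ann"},
    {"Branch": "B2", "Customer": "Cid"},
]
result = concat_distinct(rows, "Branch", "Customer")
```

With plain 'Concatenate', branch B1 would show Ann twice; here each customer appears once per branch.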
As each version of Alteryx is rolled out, it would be much easier for our users and admin team to validate the new version, if Alteryx allowed parallel installs of many different versions of the software.
So - our team is currently on 11.3 - if we could roll out 11.5 in parallel then we could very easily allow users to revert to 11.3 if there are issues, or else remove 11.3 after 2-3 weeks if no issues.
The same goes for versions which are in BETA.
This would be a huge help!
When creating a workflow I generally open a "TEMPLATE" first and then immediately save it to the "NEW WORKFLOW NAME". My template includes all my preferences that aren't set naturally within the user settings and won't get RESET by them either. It has a comment box and containers as well as logos and copyrights. It would be nice to have ready access to this feature. Maybe others have standards that they want applied to all users and their workflows too.
The standard In-DB connection configuration for PostgreSQL / Greenplum makes the "Datastream-In" In-DB tool load data line by line instead of using Bulk mode.
As a result, loading data into an In-DB stream is very slow.
100 000 lines are sent to Greenplum using a "Datastream-in" In-DB tool.
This is a demo workflow; the In-DB stream could be more complex and not replaceable by an in-memory Output Data tool.
Load time: 11 minutes.
It's slow and spams the database with one insert per line.
However, there is a workaround.
We can configure an In-Memory connection using the bulk mode:
And paste the connection string into the "write" tab of our In-DB Connection:
Load time: 24 seconds.
It's fast as it uses the Bulk mode.
This workaround has been validated by Greenplum team but not by Alteryx support team.
Could you please support this workaround?
Tested on version 2021.3.3.63061
In the help, we can read that:
Update/Delete is currently only supported for SQL Server ODBC connections.
I don't know about you, but while SQL Server is widely used for transactional workloads, for analytics... well... I have only come across it once in several dozen contexts!
Maybe it would be cool to make it work on many more databases?
Can we get the R tools/models to work in-database for Snowflake?
I understand that Snowflake currently doesn't support R through their UDFs yet; therefore, you might be waiting for them to add it.
I hear Python is coming soon, which is good, and Java is already available.
However, what about the ‘DPLYR’ package? https://db.rstudio.com/r-packages/dplyr/
My understanding is that this can translate the R code into SQL, so it can run in-DB?
Could this R code package be appended to the Alteryx R models? (Maybe this isn’t possible, but I wanted to ask.)
Currently Alteryx does not support writing to SharePoint document libraries.
However, writes sometimes succeed and sometimes fail.
Please see attachment where we ran into an issue.
See this link for additional information.
We need official support for reading and writing to SharePoint document libraries.
It's an important output target, and will become more so as Alteryx enhances its reporting capabilities.
According to Wikipedia:
CROSS JOIN returns the Cartesian product of rows from tables in the join. In other words, it will produce rows which combine each row from the first table with each row from the second table.
Example of an explicit cross join: SELECT * FROM employee CROSS JOIN department;
Example of an implicit cross join: SELECT * FROM employee, department;
The cross join can be replaced with an inner join with an always-true condition: SELECT * FROM employee INNER JOIN department ON 1=1;
For us Alteryx users, it would be very similar to Append Fields, but for In-DB.
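To show the equivalence from the Wikipedia quote in something runnable, here is a small Python sketch using the built-in sqlite3 module as a stand-in database. The table contents are invented and SQLite is only an assumption for the demo; the same statements should behave alike on most SQL engines.

```python
import sqlite3


def cross_join_demo():
    """Run an explicit CROSS JOIN and the always-true INNER JOIN side by side."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (name TEXT);
        CREATE TABLE department (dept TEXT);
        INSERT INTO employee VALUES ('Ann'), ('Bob');
        INSERT INTO department VALUES ('Sales'), ('IT'), ('HR');
    """)
    explicit = conn.execute(
        "SELECT * FROM employee CROSS JOIN department"
    ).fetchall()
    always_true = conn.execute(
        "SELECT * FROM employee INNER JOIN department ON 1=1"
    ).fetchall()
    conn.close()
    return explicit, always_true
```

Both queries return 2 x 3 = 6 rows, one for every employee/department pair, which is what Append Fields does in memory.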
It would be great to have the below functionality in Alteryx.
A workflow is built in Alteryx, and a button click in Alteryx generates SQL code that can be run on a specific database platform, such as SQL Server, from external editors such as SQL Server Management Studio. Thanks.
Currently, when one uses the Google BigQuery Output tool, the only options are to create a table, or append data to an existing table. It would be more useful if there was a process to replace all data in the table rather than appending. Having the option to overwrite an existing table in Google BigQuery would be optimal.
Despite a few limitations, Alteryx is great when you work with full tables (i.e. when you rewrite the table entirely). But in real life, very few workflows work like that:
Here are some real-life use cases that should be easy to deal with in Alteryx:
-delta on a key
-delta on a key + last record based on a date
-start_date and end_date for a value
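To illustrate the first two cases, here is a rough Python sketch of how I read "delta on a key + last record based on a date". This is my interpretation, not an existing Alteryx feature: the function names, the field names `id` and `updated`, and the use of ISO date strings (which compare correctly as text) are all assumptions. The start_date/end_date case is not covered here.

```python
def latest_per_key(rows, key, date_field):
    """Keep only the most recent row for each key."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[date_field] > latest[k][date_field]:
            latest[k] = row
    return latest


def apply_delta(target, delta, key, date_field):
    """Insert new keys and replace existing keys when the delta row is not older."""
    merged = {row[key]: row for row in target}
    for k, row in latest_per_key(delta, key, date_field).items():
        if k not in merged or row[date_field] >= merged[k][date_field]:
            merged[k] = row
    return list(merged.values())


target = [
    {"id": 1, "value": "a", "updated": "2024-01-01"},
    {"id": 2, "value": "b", "updated": "2024-01-01"},
]
delta = [
    {"id": 2, "value": "c", "updated": "2024-02-01"},
    {"id": 2, "value": "d", "updated": "2024-03-01"},
    {"id": 3, "value": "e", "updated": "2024-01-15"},
]
merged = apply_delta(target, delta, "id", "updated")
```

Only the changed and new keys are touched; key 1 is left alone instead of the whole table being rewritten.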
Dear Users, Fans, Compatriots, and Fellow Alteryx Nerds:
One of my favourite parts of using Alteryx is that in all the in-memory tools, there is a quick-and-dirty count in each of your tools' output nodes. You know, you use these all the time and when you switch back into SQL, you get frustrated with having to run the query two or three times just to see the count in each of your join outputs.
One thing I'm missing as an INDB user is that I have to employ a manual workaround to see what is happening. INDB tools are a bit black-box in that we don't see the counts.
I've been using this workaround for a little over a year now and I haven't found it to be incredibly taxing on my resources, so I'm wondering if Alteryx may be able to look into doing this on the back end to make the INDB experience that much closer to the in-memory experience. I just want those numbers above; I don't need to know the byte count, just the record count.
Now, I imagine this is not implemented already for a Very Good Reason. But, enough is enough! Let's shoot for the moon and make this tool all that much better!! Anyone with me?
Currently, loading large files (over 100 MB) to PostgreSQL takes an extremely long time. For example, writing a 1 GB file to PostgreSQL takes 27 minutes! This is seriously impacting our ability to use Alteryx as an ETL tool for loading our target Postgres data warehouse. We would really like to see bulk loading to Postgres supported by Alteryx to help alleviate the performance issues.
We use the pre-SQL statement of the Input tool to set some connection parameters. Sadly, we cannot do that in an In-DB workflow. This would be a total game-changer for us.
From Wikipedia:
In a database, a view is the result set of a stored query on the data, which the database users can query just as they would in a persistent database collection object. This pre-established query command is kept in the database dictionary. Unlike ordinary base tables in a relational database, a view does not form part of the physical schema: as a result set, it is a virtual table computed or collated dynamically from data in the database when access to that view is requested. Changes applied to the data in a relevant underlying table are reflected in the data shown in subsequent invocations of the view. In some NoSQL databases, views are the only way to query data.
Views can provide advantages over tables:
-Views can represent a subset of the data contained in a table. Consequently, a view can limit the degree of exposure of the underlying tables to the outer world: a given user may have permission to query the view, while denied access to the rest of the base table.
-Views can join and simplify multiple tables into a single virtual table.
-Views can act as aggregated tables, where the database engine aggregates data (sum, average, etc.) and presents the calculated results as part of the data.
-Views can hide the complexity of data. For example, a view could appear as Sales2000 or Sales2001, transparently partitioning the actual underlying table.
-Views take very little space to store; the database contains only the definition of a view, not a copy of all the data that it presents.
-Depending on the SQL engine used, views can provide extra security.
I would like to create a view instead of a table.
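For readers less familiar with views, here is a short Python sketch using the built-in sqlite3 module as a stand-in database. The table `sales` and view `sales_by_region` are invented names; the point is only that the view stores the query, so it reflects later changes to the underlying table.

```python
import sqlite3


def view_demo():
    """Create a view, change the base table, and query the view again."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (region TEXT, amount REAL);
        INSERT INTO sales VALUES ('North', 10), ('North', 5), ('South', 7);
        CREATE VIEW sales_by_region AS
            SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
    """)
    query = "SELECT region, total FROM sales_by_region ORDER BY region"
    before = conn.execute(query).fetchall()
    # The view is not refreshed or rewritten; it is recomputed on each query.
    conn.execute("INSERT INTO sales VALUES ('South', 3)")
    after = conn.execute(query).fetchall()
    conn.close()
    return before, after
```

What I'm asking for is the In-DB write step issuing the equivalent of the CREATE VIEW statement instead of materializing a table.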
The Tableau Hyper API supports regular SQL queries, see https://help.tableau.com/current/api/hyper_api/en-us/reference/sql/index.html and https://help.tableau.com/current/api/hyper_api/en-us/docs/hyper_api_reference.html for more information. Being able to use the In-database tools for querying Hyper would let us take advantage of Hyper's internal optimizations just like other databases.
TIBCO Data Virtualization is a Data Virtualization product focused on creating a virtual data store consolidating data from throughout the enterprise. It can be accessed via a SQL query engine, and has a variety of supported connectors, including an ODBC driver.
This data source can be connected to via ODBC in Alteryx today, but error messaging is unclear/unhelpful, and attempting to use the Visual Query Builder causes Alteryx to crash.
Adding TIBCO Data Virtualization as a supported ODBC connection would empower business users to leverage this product and easily utilize this enterprise data store, enhancing the value of the Alteryx platform as a consumer of this data.
Please could you enhance the Alteryx download tool to support SFTP connections with Private Key authentication as well. This is not currently supported and all of our SFTP use cases use PK.
Alteryx has the ability to connect to data sources using fat clients and ODBC but not JDBC. If the ability to use JDBC could be added to the product it could remove the need to install fat clients.
Enable Gallery Server Connections as Input for In-DB Tools. Currently, we can only create file connections, and we'd like to centralize all connections to our Gallery Connections.
It would be awesome if there was a cross tab in DB option because right now I have to stream out millions of records to build a cross tab.
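One common way to pivot inside the database is conditional aggregation, and an In-DB cross tab could generate something like it. The sketch below uses Python's built-in sqlite3 module as a stand-in database; the table and column names are invented, and the header values ('apples', 'pears') are hard-coded here, whereas a real tool would have to discover them first.

```python
import sqlite3


def crosstab_demo():
    """Pivot product rows into columns with SUM(CASE ...) per header value."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (store TEXT, product TEXT, qty INTEGER);
        INSERT INTO sales VALUES
            ('S1', 'apples', 3), ('S1', 'pears', 2),
            ('S2', 'apples', 5), ('S1', 'apples', 1);
    """)
    rows = conn.execute("""
        SELECT store,
               SUM(CASE WHEN product = 'apples' THEN qty ELSE 0 END) AS apples,
               SUM(CASE WHEN product = 'pears'  THEN qty ELSE 0 END) AS pears
        FROM sales
        GROUP BY store
        ORDER BY store
    """).fetchall()
    conn.close()
    return rows
```

Only one row per store comes back from the database, instead of millions of detail records being streamed out first.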
The idea is to store credentials (login/password) in a "credential alias".
Then, those credential aliases can be used in:
-in database aliases/connection
-on user aliases for connected controllers/gallery
The idea is that I only have to change the credentials once for all the connection types. On Hive, I have the In-DB alias, the traditional alias, and even an HDFS alias, all using exactly the same credentials, and I have to change all of them manually.
As it stands now, only a file input tool can be used to pull data from Google BigQuery tables. The issue here is that the data is streamed and processed locally, meaning the power of BigQuery processing isn't actually being leveraged.
Adding BigQuery In-Database as a connection option would appeal to a wide audience. BigQuery is also compliant with the SQL 2011 standard, so this may make for an even easier integration.