Alteryx Designer Desktop Ideas

Kevin_VANCAPPEL · 09-08-2021

Hello,

After using the new "Image Recognition Tool" for a few days, I think it could be improved:

> by listing the dimensional constraints next to each of the pre-trained models,

> by adding a proper tool to split the training data correctly (so that each label has an equivalent number of images),

> at the very least, by allowing the tool to use black & white images (I wanted to test it on MNIST, but the tool tells me that it requires RGB images).
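To make the second and third points concrete, here is a minimal sketch in plain Python of what such preprocessing could look like outside the tool. The function names `balanced_split` and `gray_to_rgb` and the file names are hypothetical, and nothing here reflects how the Image Recognition Tool works internally.

```python
import random
from collections import defaultdict


def balanced_split(samples, train_fraction=0.8, seed=0):
    """Split (path, label) pairs so every label contributes the same number of images."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    # Downsample every label to the size of the rarest one.
    per_label = min(len(paths) for paths in by_label.values())
    n_train = int(per_label * train_fraction)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        paths = list(by_label[label])
        rng.shuffle(paths)
        kept = paths[:per_label]
        train += [(p, label) for p in kept[:n_train]]
        test += [(p, label) for p in kept[n_train:]]
    return train, test


def gray_to_rgb(image):
    """Turn a grayscale image (rows of intensities) into RGB by repeating each value three times."""
    return [[(v, v, v) for v in row] for row in image]


# Hypothetical file names: 10 cats but only 5 dogs.
samples = [(f"cat_{i}.png", "cat") for i in range(10)]
samples += [(f"dog_{i}.png", "dog") for i in range(5)]
train, test = balanced_split(samples)
```

With this input, both labels are cut down to 5 images each, giving 4 training and 1 test image per label. Repeating the gray channel three times is a common way to feed MNIST-style images to a model that expects RGB input.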

Question: will you allow the user to choose between CPU and GPU usage in the future?

In any case, thank you again for this new tool. It can certainly be improved, but it is very simple to use, and I sincerely think that it will help more people understand the many use cases made possible by image recognition.

Thank you again

Kévin VANCAPPEL (France ;-))


simonaubert_bd · ‎10-20-2020

Hello all,

Some databases, including Hive, natively support scheduled queries (yes, the scheduling configuration lives inside the database, not in the ETL/data prep system). I think this would be an interesting feature for in-db workflow output: you run the workflow once and then only need to rerun it when it changes, while the database does the scheduling.

https://cwiki.apache.org/confluence/display/Hive/Scheduled+Queries

Intro

Executing statements periodically can be useful for

Pulling information from external systems
Periodically updating column statistics
Rebuilding materialized views
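As an illustration of what an in-db output could hand over to the database, here is a small sketch that builds a Hive `CREATE SCHEDULED QUERY` statement. The `CRON ... AS ...` form follows my reading of the linked Hive wiki page and should be checked against it; the helper `scheduled_query_ddl`, the query name, and the materialized view name are hypothetical.

```python
def scheduled_query_ddl(name, cron, statement):
    """Build a Hive CREATE SCHEDULED QUERY statement from its three parts."""
    if "'" in cron:
        raise ValueError("cron expression must not contain single quotes")
    return f"CREATE SCHEDULED QUERY {name} CRON '{cron}' AS {statement}"


ddl = scheduled_query_ddl(
    "rebuild_sales_mv",                          # hypothetical scheduled query name
    "0 0 * * * ? *",                             # Quartz-style cron: every hour, on the hour
    "ALTER MATERIALIZED VIEW sales_mv REBUILD",  # hypothetical materialized view
)
print(ddl)
```

In this picture, Alteryx would only need to generate and submit such a statement once; Hive would then run the inner statement on the given schedule.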

Best regards,

Simon

dharamyiyer29590 · ‎03-02-2023

There should be an option for Alteryx to intelligently convert an existing SQL query or piece of complex logic into a high-level Alteryx workflow with suggested tools, which developers can then modify.

For example, Salesforce Einstein Analytics has an option to convert an existing dataflow (the traditional way of performing data prep) into a recipe (a premium version of a dataflow with advanced features) with a single click. The user can then make additional modifications or enhancements on top of it.

thedr9wningman · ‎09-14-2020

Dear Users, Fans, Compatriots, and Fellow Alteryx Nerds:

One of my favourite parts of using Alteryx is that in all the in-memory tools, there is a quick-and-dirty count in each of your tools' output nodes. You know, you use these all the time and when you switch back into SQL, you get frustrated with having to run the query two or three times just to see the count in each of your join outputs.

One thing I'm missing as an INDB user is that I have to employ a manual workaround to see what is happening. INDB tools are a bit black-box in that we don't see the counts.

All I want...

I've been using this workaround for a little over a year now and I haven't found it to be incredibly taxing on my resources, so I'm wondering if Alteryx may be able to look into doing this on the back end to make the INDB experience that much closer to the in-memory experience. I just want those numbers above; I don't need to know the byte count, just the record count.

What I need to do to get it
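One common way to get such a count without streaming the data out is to let the database count the rows of the query built so far. The sketch below uses an in-memory SQLite database as a stand-in for the real one; the helper `record_count` and the table names are hypothetical, and this is not necessarily the exact workaround the author uses.

```python
import sqlite3


def record_count(conn, query):
    """Return the number of rows a query would produce, counted by the database itself."""
    wrapped = f"SELECT COUNT(*) FROM ({query}) AS q"
    return conn.execute(wrapped).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    CREATE TABLE customers (id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 10), (2, 10), (3, 99);
    INSERT INTO customers VALUES (10, 'Ada'), (11, 'Bob');
""")

# The query an In-DB join would represent at this point in the workflow.
join_query = "SELECT o.id FROM orders o JOIN customers c ON o.customer_id = c.id"
print(record_count(conn, join_query))
```

Only a single number travels back to the client, which is why this kind of count tends to be cheap compared with a Data Stream Out.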

Now, I imagine this is not implemented already for a Very Good Reason. But, enough is enough! Let's shoot for the moon and make this tool all that much better!! Anyone with me?

-Cedric Justice

Cambia Healthcare

Samuel_To · ‎04-07-2021

Hi Alteryx Team,

Currently, the Connect In-DB tool cannot use the data connections stored in the Gallery.

Users need to enter the database details as well as the login and password.

I suggest enhancing the Connect In-DB tool so that it can select and use a Gallery data connection.

From an enterprise point of view:

1. No database credentials or connection properties are shared with Designer users, which reduces the risk of abnormal access.

2. The Alteryx admin can easily manage access control in the Gallery by assigning data connections to different groups of users. This is also more convenient for audits.

3. The Alteryx admin can easily maintain data connections in the Gallery, for example to reset a database password or update connection properties.

In addition, it would be better to be able to set up In-DB data connections in the Gallery.

Best regards,

Samuel

saubert · ‎03-23-2018

Bulk Load for Vertica (especially with the gzip compressed format) is very powerful: I can upload several tens of millions of rows in a few minutes. Can we have it, please?

https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/AdministratorsGuide/BulkLoadCOPY/UsingCOP...

paul_houghton · ‎10-29-2015

There are a number of requests for bulk loaders to DBs, and I'm adding MySQL to the list.

Really, every DB connection (on-prem and cloud) needs bulk loader capabilities to be added (if it doesn't have them already).

sraynal · ‎11-09-2021

Hi,

The standard In-DB connection configuration for PostgreSQL / Greenplum makes the "Datastream-In" In-DB tool load data row by row instead of using bulk mode.

As a result, loading data into an In-DB stream is very slow.

Example

Connection configuration

Workflow

100 000 lines are sent to Greenplum using a "Datastream-in" In-DB tool.

This is a demo workflow, the In-DB stream could be more complex and not replaceable by an Output Data In-Memory.

Load time : 11 minutes.

It's slow and spams the database with one insert per row.

However, there is a workaround.

We can configure an In-Memory connection using bulk mode:

And paste the connection string to the "write" tab of our In-DB Connection :

Load time : 24 seconds.

It's fast as it uses the Bulk mode.
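The gap between the two load times comes largely from the number of statements sent to the database. The sketch below shows only that shape of the problem, using an in-memory SQLite database as a stand-in; it is not how Alteryx or Greenplum implement bulk mode, and the table and function names are hypothetical.

```python
import sqlite3


def make_conn():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER, label TEXT)")
    return conn


def load_row_by_row(conn, rows):
    """Send one INSERT per row; return the number of statements issued."""
    for row in rows:
        conn.execute("INSERT INTO target VALUES (?, ?)", row)
    return len(rows)


def load_in_batches(conn, rows, batch_size=250):
    """Send one multi-row INSERT per batch; return the number of statements issued."""
    statements = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        placeholders = ", ".join(["(?, ?)"] * len(batch))
        flat = [value for row in batch for value in row]
        conn.execute(f"INSERT INTO target VALUES {placeholders}", flat)
        statements += 1
    return statements


rows = [(i, f"row {i}") for i in range(1000)]
slow_statements = load_row_by_row(make_conn(), rows)
fast_conn = make_conn()
fast_statements = load_in_batches(fast_conn, rows)
print(slow_statements, fast_statements)
```

Here 1,000 rows need 1,000 statements in the first case and 4 in the second. Against a remote database, each statement also pays a network round trip, so the difference grows with latency.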

This workaround has been validated by Greenplum team but not by Alteryx support team.

Could you please support this workaround ?

Tested on version 2021.3.3.63061

adrianloong · ‎10-12-2015

In-database processing brings large performance benefits on big datasets. It would be great to incorporate multi-row and multi-field formulas into the in-database functions for Redshift.
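In SQL terms, a multi-row formula usually maps to a window function such as `LAG`. Here is a minimal sketch, run against an in-memory SQLite database (version 3.25 or later, which supports window functions) as a stand-in for Redshift; the `sales` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (day INTEGER, amount INTEGER);
    INSERT INTO sales VALUES (1, 100), (2, 130), (3, 90);
""")

# "Row-1" style logic: compare each row with the previous one, ordered by day.
rows = conn.execute("""
    SELECT day,
           amount,
           amount - LAG(amount) OVER (ORDER BY day) AS change
    FROM sales
    ORDER BY day
""").fetchall()

for row in rows:
    print(row)
```

The first row has no predecessor, so its `change` is NULL (`None` in Python). An In-DB multi-row tool would presumably have to expose a similar choice for rows that do not exist.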

simonaubert_bd · ‎06-30-2021

Hello all,

Despite a few limitations, Alteryx is great when you work with full tables (i.e. when you rewrite the table entirely). But in real life, very few workflows work like that.

Here are some real-life use cases that should be easy to handle in Alteryx:

-delta on a key

-delta on a key + last record based on a date

-update records

-start_date and end_date for a value

etc
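To make the first two cases concrete, here is a minimal sketch in plain Python of "delta on a key + last record based on a date". The field names `id`, `updated`, and `value` and the helper names are hypothetical; an In-DB implementation would express the same logic in SQL.

```python
def latest_per_key(records):
    """Keep only the most recent record for each id (ISO dates compare correctly as strings)."""
    latest = {}
    for rec in records:
        key = rec["id"]
        if key not in latest or rec["updated"] > latest[key]["updated"]:
            latest[key] = rec
    return latest


def delta(existing, incoming):
    """Split incoming records into inserts (new keys) and updates (newer than what is stored)."""
    newest = latest_per_key(incoming)
    inserts = [r for k, r in newest.items() if k not in existing]
    updates = [
        r for k, r in newest.items()
        if k in existing and r["updated"] > existing[k]["updated"]
    ]
    return inserts, updates


existing = latest_per_key([
    {"id": 1, "updated": "2021-06-01", "value": "a"},
    {"id": 2, "updated": "2021-06-01", "value": "b"},
])
incoming = [
    {"id": 2, "updated": "2021-06-15", "value": "b2"},
    {"id": 2, "updated": "2021-06-10", "value": "b1"},
    {"id": 3, "updated": "2021-06-20", "value": "c"},
    {"id": 1, "updated": "2021-05-01", "value": "stale"},
]
inserts, updates = delta(existing, incoming)
```

In this example, id 3 is inserted, id 2 is updated to its latest version (`b2`), and the stale record for id 1 is ignored.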

Best regards,

Simon

Csand · ‎05-08-2020

Enable Gallery Server Connections as Input for In-DB Tools. Currently, we can only create file connections, and we'd like to centralize all connections to our Gallery Connections.

dataprep · ‎04-18-2018

The design interface is very slow when we build an in-db workflow.

The reason is that Alteryx opens a new connection every time it needs to refresh the data. Example on Hive:

Mar 20 15:28:49.453 DEBUG 6048 HardyConnection::Connect: Default branding specific auth mech: 2
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveClientFactory::CreateClient: Create HS2 client.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveClientFactory::GetBackendCxnPool: Create session manager.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveClientFactory::GetBackendCxnPool: Create backend connection pool.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveCxnPool::GetHS2Cxn: Create HS2 connection.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveCxnPool::GetCxnFactory: Create backend connection factory.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveCxnFactory::CreateHS2Cxn: Create HS2 HTTP transport.
Mar 20 15:28:49.453 DEBUG 6048 HardySessionManager::GetSession: Getting new session handle.
Mar 20 15:28:50.399 DEBUG 6048 HardyTCLIServiceThreadSafeClient::OpenSession: TOpenSessionReq
    client_protocol = HIVE_CLI_SERVICE_PROTOCOL_V1

Maybe we could have an option in the In-DB connection configuration to stay connected while designing (perhaps with a time limit).

(PS: we also tried the option to Disable Auto Configure; it's clearly not the solution.)
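The "stay connected, with a time limit" idea can be sketched as a small wrapper that reuses one connection until it has been idle for too long. The class `KeepAliveConnection` and its parameters are hypothetical and only illustrate the behaviour being requested, not Alteryx internals.

```python
import time


class KeepAliveConnection:
    """Reuse one connection until it has been idle longer than max_idle_seconds."""

    def __init__(self, connect, max_idle_seconds=300, clock=time.monotonic):
        self._connect = connect          # callable that opens a new connection
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._conn = None
        self._last_used = 0.0
        self.opens = 0                   # how many times we really connected

    def get(self):
        now = self._clock()
        expired = self._conn is not None and now - self._last_used > self._max_idle
        if expired:
            close = getattr(self._conn, "close", None)
            if close is not None:
                close()
            self._conn = None
        if self._conn is None:
            self._conn = self._connect()
            self.opens += 1
        self._last_used = now
        return self._conn


# Simulated clock so the example does not have to sleep.
fake_time = [0.0]
pool = KeepAliveConnection(connect=object, max_idle_seconds=60, clock=lambda: fake_time[0])
first = pool.get()
fake_time[0] = 30.0
second = pool.get()      # still within the idle limit: same connection
fake_time[0] = 200.0
third = pool.get()       # idle for 170 s: a new connection is opened
```

With such a wrapper, a metadata refresh during design would pay the session setup cost seen in the log above only when the previous session has expired.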

dataprep · ‎04-18-2018

As you may know, querying Hive for metadata is currently very slow in Alteryx.

A first step of improvement (at least in the Visual Query Builder) has been proposed here

Smartest VQB

But the real issue for Hive is the way Alteryx queries the metadata: it issues "Show table" queries for every database. On our cluster, that means more than 400 queries that each last about 0.5 seconds. The user has to wait about 4 minutes.

A solution: use a Java API to query the Hive metastore, if one is available (it could be another tab in the In-Database configuration). Our cluster admin has an example of a Thrift API in Java that we can give you.

Result: 2 seconds for 38,700 tables in more than 500 databases!

zdavis · ‎02-10-2016

There is a need when visualizing in-Database workflows to be able to visualize sorted data. This sorting could be done 1 of 2 ways: In a browse tool, or as a stand-alone Sort tool. Either would address the need. Without such a tool being present, the only way to sort the data is to "Data Stream Out" and then visualize the data in Alteryx. However, this process violates the premise of the usefulness of the in-DB toolkit, which is to keep your data in-DB and process using the DB engine. Streaming out big data in order to add a sort is not efficient.

Granted, the in-DB processing doesn't care whether data is sorted or not. However, when attempting to find extreme values after an aggregation, or when trying to identify something as simple as whether null values are present in a field, then a sort becomes extremely useful, and a necessary tool for human consumption of data (regardless of the database's processing needs).

Thanks very much for hearing my idea!

charlep_dup_424 · ‎07-21-2019

Currently we can't use any PaaS MongoDB products (MongoDB Atlas / CosmosDB) as Alteryx Gallery doesn't support SSL for connecting to the MongoDB back end.

SSL is good security practice when splitting the MongoDB onto a different machine too.

SeanAdams · ‎05-14-2017

Not sure if any of you have a similar issue - but we often end up bringing in some data (either from a website or a table) to profile it - and then an hour in, you realise that the data will probably take 6 weeks to completely ingest, but it's taken in enough rows already to give us a useful sense.

Right now, the only option is to stop (in which case all the profiling tools at the end of the flow will all give you nothing) and then restart with a row-limiter - or let it run to completion. The tragedy of the first option is that you've already invested an hour or 2 in the data extract, but you cannot make use of this.

It feels like there's a third option: an option to "Stop bringing in new data, but just finish the data that you currently have", which terminates any input or download tools in their current state and lets the remainder of the data flush through the full workflow.
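The requested behaviour resembles a cooperative stop flag on the input side: the source stops producing, but everything already read still flows through the downstream steps. Here is a minimal sketch with hypothetical names (`read_until_stopped`, `profile`); it says nothing about how the Alteryx engine is built.

```python
import threading


def read_until_stopped(source, stop_requested):
    """Pass records through until a stop is requested, then end the stream cleanly."""
    for record in source:
        if stop_requested.is_set():
            break
        yield record


def profile(records):
    """Stand-in for the profiling tools at the end of the workflow."""
    values = list(records)
    return {"count": len(values), "min": min(values), "max": max(values)}


stop = threading.Event()


def slow_source():
    # Simulates a huge extract; the "user" asks to stop once record 1000 is reached.
    for i in range(1_000_000):
        if i == 1000:
            stop.set()
        yield i


summary = profile(read_until_stopped(slow_source(), stop))
print(summary)
```

Because the stream ends normally rather than being cancelled, the profiling step still receives the 1,000 records read before the stop and produces a result.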

Hopefully I'm not alone in this need 🙂

Treyson · ‎01-14-2016

When converting data types while In-DB, it would be really helpful if I could change the data type with the "Select In-DB" tool in a similar manner to the "Select" tool. Currently, we are having to use the "Formula In-DB" tool in order to create a "Cast" Statement.
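For readers who have not used the workaround, the kind of expression typed into the "Formula In-DB" tool is a plain SQL `CAST`. The sketch below runs one against an in-memory SQLite database as a stand-in; the table and column names are hypothetical, and the exact type names depend on the target database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw (amount_text TEXT);
    INSERT INTO raw VALUES ('12'), ('7');
""")

# The expression a user would otherwise hand-write in a formula tool.
cast_expression = "CAST(amount_text AS INTEGER)"

total = conn.execute(f"SELECT SUM({cast_expression}) FROM raw").fetchone()[0]
types = conn.execute(
    f"SELECT typeof(amount_text), typeof({cast_expression}) FROM raw LIMIT 1"
).fetchone()
print(total, types)
```

A "Select In-DB" tool with a type dropdown could generate this expression on the user's behalf, which is essentially what the idea asks for.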

andrewdatakim · ‎03-15-2019

I would like to see In-DB batch macros. Currently we are joining tables with 30 million+ records, and we have to run them through standard tools because we are unable to process them In-DB, which shows a 20% improvement in processing speed based on the performance profiling.

brandonculver · ‎03-28-2017

While I strongly support the S3 upload and download connectors, the development of AWS Athena has changed the game for us. Please consider adding official support for Athena compute on S3, like the support already shown for Teradata, Hadoop Hive, MS SQL, and other database types.

PeterGoldey · ‎06-15-2017

Not sure what detail needs to be added. This is obviously a widely used RDBMS.

simonaubert_bd · ‎12-21-2019

Hello,

As of today, if you want to add a PostgreSQL in-database connection, you may be puzzled:

However, the help states that PostgreSQL is supported by in-database.

https://help.alteryx.com/current/In-DatabaseOverview.htm

Whaaaaaaaaat?

Oh, I forgot to mention: with a little luck, you can find this help page: https://help.alteryx.com/current/DataSources/PostgreSQL.htm

Yep, you have to configure a "greenplum" connection if you want to use PSQL.

I think this is not user-friendly and can lead to mistakes, errors, frustration and even lost sales for Alteryx:

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Do-you-use-generic-in-database-connect...

Also, Greenplum and PSQL will have separate features, so I think having two separate entries in the menu is appropriate.

Best regards,

Simon


Featured Ideas

Image Recognition Tools

Support for scheduled queries

Alteryx option to convert SQL to an Alteryx workflow

INDB record counts popup

Use gallery data connection in Connect In-DB tool

Bulk Load Vertica

MySQL bulk loader

Support this Workaround to use Bulk in PostgreSQL/Greenplum "Datastream-In" In-DB tool

In-database Multirow and multifield formula

Better Delta /historization Management for IN-DB

Enable Gallery Server Connections as Input for In-DB Tools

In-db : Stay connected while designing

Hive : how to get faster the metadata

In-Database Sort Tool

Allow SSL connections for the MongoDB Alteryx Database

Just finish what you've eaten so-far

Converting Data Types w/ Select In-DB

In-DB Batch Macro

AWS Athena Compute Support

In Database for MySQL

Separate entry in in-db configuration for Postgresql and Greenplum

Connect Azure SQL Database with Azure AD AAD inter...

Tableau Server Publish Data Source Name / Project

Alteryx tool / macro to open Hyper files directly ...

Custom C++ Functions in AMP

Amazon S3 Upload - Virtual Hosted URL management