community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Designer Ideas

Share your Designer product ideas - we're listening!

1 Review

Our submission guidelines & status definitions before getting started

2 Search

The community for a solution or existing idea before posting

3 Vote

By clicking the star in the top left corner of an idea you support

4 Submit

A new idea to suggest a product enhancement or new feature


Suggest an idea

As simple as the title :

 

Just a Multi-Field Formula in-db. It's a nightmare to write sometime 50 or 100 times the same SQL formula and then maintain it.

 

Please.

 

Here is a téléchargement.jpg

From Wikipedia :

In a database, a view is the result set of a stored query on the data, which the database users can query just as they would in a persistent database collection object. This pre-established query command is kept in the database dictionary. Unlike ordinary base tables in a relational database, a view does not form part of the physical schema: as a result set, it is a virtual table computed or collated dynamically from data in the database when access to that view is requested. Changes applied to the data in a relevant underlying table are reflected in the data shown in subsequent invocations of the view. In some NoSQL databases, views are the only way to query data.

Views can provide advantages over tables:

    Views can represent a subset of the data contained in a table. Consequently, a view can limit the degree of exposure of the underlying tables to the outer world: a given user may have permission to query the view, while denied access to the rest of the base table.
    Views can join and simplify multiple tables into a single virtual table.
    Views can act as aggregated tables, where the database engine aggregates data (sum, average, etc.) and presents the calculated results as part of the data.
    Views can hide the complexity of data. For example, a view could appear as Sales2000 or Sales2001, transparently partitioning the actual underlying table.
    Views take very little space to store; the database contains only the definition of a view, not a copy of all the data that it presents.
    Depending on the SQL engine used, views can provide extra security.

I would like to create a view instead of a table.

  • Category In Database

Alteryx creates a Livy Session when connecting to Spark Direct livy_session_1.png

 

I just want to identify easily the session.

  • Category In Database

The idea is to store credentials, login/pw in a "credential alias".

 

Then, those credential aliases can be used in :

-traditional aliases/connection

-in database aliases/connection

-hdfs aliases/connection

-API

-on user aliases for connected controllers/gallery

...etc.

 

The idea is that I only have to change the credentials once for all the connection type (on Hive, I have the in db alias, the traditional alias and even an HDFS alias using exactly the same credentials !! and I have to change all that manually).

 

Statistics are tools used by a lot of DB to improve speed of queries (Hive, Vertica, etc...). It may be interesting to have an option on the write in db or data stream in to calculate the statistics. (something like a check box for )

 

Example on Hive : analyse {table} comute statistics; analyse {table} compute statistics for columns;

As simple as the title : an In-Database Block Until Done would be a pretty nice feature to control the execution of a workflow.

  • Category In Database

As of today, you must use a data stream out and then a hdfs tool to write a table in the hdfs in csv. Giving that the credentials are the same and that the adress in the DSN is the adress of the hdfs, it seems possible to keep the data in Hadoop and just putting it from the base to the HDFS.

  • Category In Database

I noticed through the ODBC driver log that Alteryx doesn't care about the kind of base I precise. It tests every single kind of base to find the good one and THEN applies the queries to get the metadata info.

 

Here an example. I have chosen an Hive in db connection. If  I read the simba logs, i can find those lines :

Mar 01 11:37:21.318 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select USER(), APPLICATION_ID() from system.iota

Mar 01 11:37:22.863 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select USER as USER_NAME from SYSIBM.SYSDUMMY1

Mar 01 11:37:23.454 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select * from rdb$relations

Mar 01 11:37:23.546 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select first 1 dbinfo('version', 'full') from systables

Mar 01 11:37:23.707 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select #01/01/01# as AccessDate

Mar 01 11:37:23.868 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: exec sp_server_info 1

Mar 01 11:37:24.093 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select top (0) * from INFORMATION_SCHEMA.INDEXES

Mar 01 11:37:24.219 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: SELECT  SERVERPROPERTY('edition')

Mar 01 11:37:24.423 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select DATABASE() as `database`, VERSION() as `version`

Mar 01 11:37:24.635 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select * from sys.V_$VERSION at where RowNum<2

Mar 01 11:37:25.230 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select cast(version() as char(10)), (select 1 from pg_catalog.pg_class) as t

Mar 01 11:37:25.415 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select NAME from sqlite_master

Mar 01 11:37:25.756 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select xp_msver('CompanyName')

Mar 01 11:37:26.156 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select @@version

Mar 01 11:37:26.376 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select * from dbc.dbcinfo

Mar 01 11:37:26.522 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: SELECT @@VERSION;

 

 

 

 

I can understand that when Alteryx doesn't know the kind of base he tries everything.. (eg : in memory visual query builder) but here, I have selected the Hive database and I have to loose more than 5 seconds for nothing.

Hello,

 

Here a use case :

I work on the projects A and B with Alteryx inj IN DB mode.

 

My coworker works only on project B and have no rights to the data of project A.

 

When using temporary table in Alteryx, we both create the temporary tables in the default database. The issue is my coworker can see my temporary data of project A, which is not safe.

Solution : allow me to specify the database/schema when I create my temporary table.

  • Category In Database

We face a big issue for our performances since we cannot as of today create tables in orc.


Connexion parameter for write :

param_ehivedb.png

Without option text file (default parameter in Simba) :

 without_option.png

without_option-2.pngwithout_option-3.png

 

With the option, the WF doesn't fail but :

 with_option-1.pngwith_option-2.pngwith_option-3.png


We want :
-to use the hdfs to write the data with data stream in
-to write the new tables with the write-indb in ORC

Hello,

 

We use the pre-sql statement of the input to set some parameters of connections. Sadly, we cannot do that in a in-db workflow. This would be a total game-changing feature for us.

 

Best Regards,

 

Simon

Hello.

 

The category word  is a SQL Keyword (at least on hive). However it is put in quote ( this quote ` ) and the workflow will work without a single issue to the end. The blue color may be misleading to some users.

Bud%20Category.png

  • Category In Database

The designing interface is very slow when we design an in-db workflow.

 

 

The reason of that is that Alteryx connects everytime he needs to refresh the data. Example on Hive :

Mar 20 15:28:49.453 DEBUG 6048 HardyConnection::Connect: Default branding specific auth mech: 2
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveClientFactory::CreateClient: Create HS2 client.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveClientFactory::GetBackendCxnPool: Create session manager.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveClientFactory::GetBackendCxnPool: Create backend connection pool.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveCxnPool::GetHS2Cxn: Create HS2 connection.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveCxnPool::GetCxnFactory: Create backend connection factory.
Mar 20 15:28:49.453 DEBUG 6048 HardyHiveCxnFactory::CreateHS2Cxn: Create HS2 HTTP transport.
Mar 20 15:28:49.453 DEBUG 6048 HardySessionManager::GetSession: Getting new session handle.
Mar 20 15:28:50.399 DEBUG 6048 HardyTCLIServiceThreadSafeClient::OpenSession: TOpenSessionReq
    client_protocol = HIVE_CLI_SERVICE_PROTOCOL_V1

Maybe we could have an option on the IN DB Connection configuration to stay connected while designing (maybe with a limit time).


(PS : we also tried the option to Disable Auto Configure, it's clearly not he solution)

As you may know, the interrogation of Hive to get the Metadata is actually very slow on Alteryx

 

A first step of improvement (at least in the Visual Query Builder) has been proposed here

Smartest VQB

 

But the real issue for Hive is that the way Alteryx queries the Metadata : it passes "Show table" queries for all the databases. On our cluster, it means more than 400 queries that last each avout 0.5 seconds. The user has to to wait about 4 minutes.

A solution : using an API in java to ask the Hive metastore if it exists (it may be an other tab in the In database configuration). Our cluster admin has an example of a Thrift API in java that we can give you.

Result : 2 seconds for a 38700 tables in more than 500 databases !!

use case : much of our users copy paste a formatted query to the Alteryx tools such as Connect in Db or Input Data (especially to reduce the data).

 

However, some of the formatting such a Carrier Return does not work

 

select 
 * from 
`re7_kerberos_za_e_*******_ee01pub_edd`.`v_*******_flux_20170511` 

 

 bbbbb.pngaaaaaa.png

Not sure what detail needs to be added.  This is obviously a widely used RDBMS.

  • Category In Database

Alteryx has the ability to connect to data sources using fat clients and ODBC but not JDBC.  If the ability to use JDBC could be added to the product it could remove the need to install fat clients.

Where it stands now, only a file input tool can be used to pull data from Google BigQuery tables. The issue here is that the data is streamed and processed locally, meaning the power of BigQuery processing isn't actually being leveraged.

Adding BigQuery In-Database as a connection option would appeal to a wide audience. BigQuery is also standard SQL compliant with the SQL 2011 standard, so this may make for an even easier integration.

While I strongly support the S3 upload and download connectors, the development of AWS Athena has changed the game for us. Please consider opening up an official support of Athena compute on S3 like support already show for Teradata, Hadoop Hive, MS SQL, and other database types.

The In-Database component don't allow to process a simple UPDATE statement. It require to create a new table etc etc....

 

And as i'm dealing with 110 000 000 records it's a waste of resource to duplicate in another table... and/or seems stupid to download and manipulate that much of data on the Alteryx server while launching a simple UPDATE statement do the job ...

 

Anything on the roadmap for this simple need ?

  • Category In Database
Top Starred Authors