Alteryx Designer Ideas

Share your Designer product ideas - we're listening!

If you are connecting to an Oracle database, you can get faster results by adding a short hint to your SQL query. Take a query like this:

 

 

SELECT customers.cust_first_name, customers.cust_last_name, 
MAX(QUANTITY_SOLD), AVG(QUANTITY_SOLD)
FROM sales, customers
WHERE sales.cust_id=customers.cust_id
GROUP BY customers.cust_first_name, customers.cust_last_name;

 

 

add /*+ PARALLEL(4) */ (or PARALLEL(8), depending on your CPU count) and that SQL becomes this:

 

 

SELECT /*+ PARALLEL(4) */ customers.cust_first_name, customers.cust_last_name, 
  MAX(QUANTITY_SOLD), AVG(QUANTITY_SOLD)
FROM sales, customers
WHERE sales.cust_id=customers.cust_id
GROUP BY customers.cust_first_name, customers.cust_last_name;

 

 

There is a similar capability for SQL Server too...

 

SELECT *
FROM Sales.SalesOrderDetail
OPTION (MAXDOP 4)

 

 

I suggest adding this capability to the In-DB tools so that no one needs to alter the SQL query itself...

 

You can check out the inner workings for Oracle here: https://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel002.htm

 

So it takes a while to download data from an in-DB source (say, an Oracle table) and load it into another...

https://community.alteryx.com/t5/Alteryx-Knowledge-Base/How-To-Migrate-Entire-Databases-Using-Altery...

 

Why not build an additional in-DB tool to migrate data without downloading it locally...

Oracle has this capability, and here is the syntax:

 

COPY Command Syntax

 

COPY {FROM database | TO database | FROM database TO database} {APPEND|CREATE|INSERT|REPLACE} destination_table [(column, column, column, ...)]
USING query

 

where database has the following syntax:

username[/password]@connect_identifier

 

Copies data from a query to a table in a local or remote database. COPY supports the following datatypes:

CHAR DATE LONG NUMBER VARCHAR2

Example:

COPY FROM HR/your_password@BOSTONDB -
TO TODD/your_password@CHICAGODB -
CREATE NEWDEPT (DEPARTMENT_ID, DEPARTMENT_NAME, CITY) -
USING SELECT * FROM EMP_DETAILS_VIEW
  • Category In Database

It would make life a bit easier and provide a more seamless experience if Gallery admins could create and share In-Database connections from "Data Connections" the same way in-memory connections can be shared.

 

I'm aware of workarounds (creating a System DSN on the Server machine, using a connection file, etc.), but those approaches require additional privileges and/or tech savvy that line-of-business users might not have.

 

Thanks!

From Wikipedia :

In a database, a view is the result set of a stored query on the data, which the database users can query just as they would in a persistent database collection object. This pre-established query command is kept in the database dictionary. Unlike ordinary base tables in a relational database, a view does not form part of the physical schema: as a result set, it is a virtual table computed or collated dynamically from data in the database when access to that view is requested. Changes applied to the data in a relevant underlying table are reflected in the data shown in subsequent invocations of the view. In some NoSQL databases, views are the only way to query data.

Views can provide advantages over tables:

    Views can represent a subset of the data contained in a table. Consequently, a view can limit the degree of exposure of the underlying tables to the outer world: a given user may have permission to query the view, while denied access to the rest of the base table.
    Views can join and simplify multiple tables into a single virtual table.
    Views can act as aggregated tables, where the database engine aggregates data (sum, average, etc.) and presents the calculated results as part of the data.
    Views can hide the complexity of data. For example, a view could appear as Sales2000 or Sales2001, transparently partitioning the actual underlying table.
    Views take very little space to store; the database contains only the definition of a view, not a copy of all the data that it presents.
    Depending on the SQL engine used, views can provide extra security.

I would like to create a view instead of a table.
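
For example (a minimal sketch; the table, column, and view names are hypothetical), instead of a CREATE TABLE, the Write Data In-DB tool could emit:

CREATE VIEW sales_2001 AS
SELECT cust_id, SUM(amount_sold) AS total_sold
FROM sales
WHERE order_year = 2001
GROUP BY cust_id;

The database then stores only this definition, not a copy of the data.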

  • Category In Database

The idea is to store credentials (login/password) in a "credential alias".

 

Then, those credential aliases can be used in :

-traditional aliases/connection

-in database aliases/connection

-hdfs aliases/connection

-API

-on user aliases for connected controllers/gallery

...etc.

 

The idea is that I would only have to change the credentials once for all the connection types (on Hive, I have the in-DB alias, the traditional alias, and even an HDFS alias all using exactly the same credentials!! And I have to change all of that manually).

 

As simple as the title :

 

Just a Multi-Field Formula In-DB. It's a nightmare to write the same SQL formula, sometimes 50 or 100 times, and then maintain it.
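
To illustrate (with hypothetical column names), applying one cleanup rule to many fields today means repeating the same expression by hand:

SELECT
TRIM(UPPER(first_name)) AS first_name,
TRIM(UPPER(last_name)) AS last_name,
TRIM(UPPER(city)) AS city
-- ...and so on for 50 or 100 more columns...
FROM customers;

A Multi-Field Formula In-DB tool could generate all of that from a single expression.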

 

Please.

 


Hello,

 

We use the pre-SQL statement of the Input tool to set some connection parameters. Sadly, we cannot do that in an in-DB workflow. This would be a total game-changing feature for us.
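
For example (a hedged sketch; the exact statement depends on the database), a pre-SQL statement on an Oracle connection might set a session parameter like this:

ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD';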

 

Best Regards,

 

Simon

Statistics are used by a lot of databases (Hive, Vertica, etc.) to improve query speed. It may be interesting to have an option on the Write Data In-DB or Data Stream In tools to compute the statistics (something like a check box).

 

Example on Hive:

ANALYZE TABLE {table} COMPUTE STATISTICS;
ANALYZE TABLE {table} COMPUTE STATISTICS FOR COLUMNS;

When I create a new table in an in-DB workflow, I want to specify some constraints, especially the primary key/foreign key (PK/FK).

 

For the PK/FK, the UX could be either a selection of fields from the flow or a free field (to let the user choose a constant).

From wikipedia :

In the relational model of databases, a primary key is a specific choice of a minimal set of attributes (columns) that uniquely specify a tuple (row) in a relation (table).[a] Informally, a primary key is "which attributes identify a record", and in simple cases are simply a single attribute: a unique id. 

So, basically, a PK/FK helps in two ways (see the sketch below):
1/ Checking for duplicates, i.e. whether an inserted value is legitimate

2/ Improving the query plan, especially for joins
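
For example (a generic sketch; the table and column names are hypothetical), the Write Data In-DB tool could generate DDL such as:

CREATE TABLE orders (
order_id INTEGER NOT NULL,
customer_id INTEGER NOT NULL,
order_date DATE,
PRIMARY KEY (order_id),
FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);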

As of today, you must use a Data Stream Out and then an HDFS tool to write a table to HDFS as CSV. Given that the credentials are the same and that the address in the DSN is the address of the HDFS, it seems possible to keep the data in Hadoop and just push it from the database to HDFS.
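
For example (a hedged sketch using Hive syntax; the export path is hypothetical, and ROW FORMAT in this position requires a recent-enough Hive), the export could then stay entirely on the cluster:

INSERT OVERWRITE DIRECTORY '/data/exports/sales_csv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM sales;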

  • Category In Database

Hello,

 

Here is a use case:

I work on projects A and B with Alteryx in in-DB mode.

 

My coworker works only on project B and has no rights to the data of project A.

 

When using temporary tables in Alteryx, we both create them in the default database. The issue is that my coworker can see my temporary data from project A, which is not safe.

Solution: allow me to specify the database/schema when I create my temporary table.
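
For example (the schema names are hypothetical), each project could get its own working schema for temporary tables:

CREATE TABLE project_a_work.ayx_temp_001 AS
SELECT * FROM project_a.sales;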

  • Category In Database

Hello,

 

My issue is very easy to solve. I want to use the generic ODBC in-DB connection for a specific database (MonetDB here, but that isn't important).

 

(screenshot: alteryx_monetdb1_.png)

 

The connection works just fine. However, I cannot create tables, because the data types get changed and do not even exist. Here is my data, with some Date-type fields:

 

(screenshot: alteryx_monetdb2_.png)

 

 

And here the Data Stream In gives me this very interesting error message:

 

Error: Data Stream In (2): Error while creating the table "formation.temp1": [MonetDB][ODBC Driver 11.31.11]Type (datetime) unknown in: "create table "formation"."temp1" ("AccountNumber" varchar(255),"BillToAddressID"
syntax error, unexpected IDENT in: ""Freight""

CREATE  TABLE "formation"."temp1" ("AccountNumber" varchar(255),"BillToAddressID" float,"BusinessEntityID" float,"Comment
" float,"CreditCardApprovalCode" varchar(255),"CreditCardID" float,"CurrencyRateID" float,"CustomerID" float,"DueDate" datetime,"Freight" real,"OnlineOrderFlag" float,"OrderDate" datetime,"OrderDate_Month" varchar(255),"OrderDate_Year" float,"PurchaseOrderNumber" varchar(255),"RevisionNumber" float,"SalesOrderID" float,"SalesOrderNumber" varchar(255),"ShipDate" datetime,"ShipMethodID" float,"ShipToAddressID" float,"Status" float,"SubTotal" float,"TaxAmt" float,"TotalDue" float)

 

 

1/ My field is a Date; why do you want to convert it to Datetime??

2/ Datetime is not even a usual field type in SQL databases (at least, it is not supported by MonetDB, Vertica, PostgreSQL, Oracle, etc.)... it should obviously be timestamp (see the corrected sketch below).
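
For reference (an abridged sketch of the generated DDL above, showing only the date columns), the CREATE TABLE should come out as:

CREATE TABLE "formation"."temp1" (
"AccountNumber" varchar(255),
"DueDate" timestamp,
"OrderDate" timestamp,
"ShipDate" timestamp
-- ...remaining columns unchanged...
);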

 

Currently, this generic in-database ODBC connection cannot be used at all!

 

I noticed through the ODBC driver log that Alteryx doesn't care about the kind of database I specify. It probes every single kind of database to find the right one and THEN runs the queries to get the metadata.

 

Here is an example. I have chosen a Hive in-DB connection. If I read the Simba logs, I can find these lines:

Mar 01 11:37:21.318 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select USER(), APPLICATION_ID() from system.iota

Mar 01 11:37:22.863 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select USER as USER_NAME from SYSIBM.SYSDUMMY1

Mar 01 11:37:23.454 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select * from rdb$relations

Mar 01 11:37:23.546 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select first 1 dbinfo('version', 'full') from systables

Mar 01 11:37:23.707 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select #01/01/01# as AccessDate

Mar 01 11:37:23.868 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: exec sp_server_info 1

Mar 01 11:37:24.093 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select top (0) * from INFORMATION_SCHEMA.INDEXES

Mar 01 11:37:24.219 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: SELECT  SERVERPROPERTY('edition')

Mar 01 11:37:24.423 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select DATABASE() as `database`, VERSION() as `version`

Mar 01 11:37:24.635 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select * from sys.V_$VERSION at where RowNum<2

Mar 01 11:37:25.230 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select cast(version() as char(10)), (select 1 from pg_catalog.pg_class) as t

Mar 01 11:37:25.415 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select NAME from sqlite_master

Mar 01 11:37:25.756 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select xp_msver('CompanyName')

Mar 01 11:37:26.156 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select @@version

Mar 01 11:37:26.376 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: select * from dbc.dbcinfo

Mar 01 11:37:26.522 INFO  5264 HardyDataEngine::Prepare: Incoming SQL: SELECT @@VERSION;

 

 

 

 

I can understand Alteryx trying everything when it doesn't know the kind of database (e.g., for the in-memory Visual Query Builder), but here I have selected the Hive database, and I have to lose more than 5 seconds for nothing.

As it stands now, only a file input tool can be used to pull data from Google BigQuery tables. The issue here is that the data is streamed and processed locally, meaning the power of BigQuery's processing isn't actually being leveraged.

Adding BigQuery in-database as a connection option would appeal to a wide audience. BigQuery's standard SQL dialect is also compliant with the SQL 2011 standard, so this may make for an even easier integration.

As you may know, querying Hive for metadata is currently very slow in Alteryx.

 

A first step of improvement (at least in the Visual Query Builder) has been proposed here

Smartest VQB

 

But the real issue for Hive is the way Alteryx queries the metadata: it issues "SHOW TABLES" queries for all the databases. On our cluster, that means more than 400 queries, each lasting about 0.5 seconds. The user has to wait about 4 minutes.

A solution: use a Java API to ask the Hive metastore directly (it could be another tab in the in-database configuration). Our cluster admin has an example of a Thrift API in Java that we can give you.

Result: 2 seconds for 38,700 tables in more than 500 databases!!

We face a big performance issue since, as of today, we cannot create tables in ORC.


Connection parameters for write:

(screenshot: param_ehivedb.png)

Without the text file option (the default parameter in Simba):

(screenshot: without_option.png)

(screenshots: without_option-2.png, without_option-3.png)

With the option, the workflow doesn't fail, but:

(screenshots: with_option-1.png, with_option-2.png, with_option-3.png)


We want:
- to use HDFS to write the data with Data Stream In
- to write the new tables in ORC with Write Data In-DB (see the sketch below)
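
For reference (a minimal sketch with hypothetical table and column names), the Hive DDL we would like the Write Data In-DB tool to generate is:

CREATE TABLE sales_orc (
cust_id INT,
amount_sold DOUBLE
)
STORED AS ORC;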

Alteryx has the ability to connect to data sources using fat clients and ODBC, but not JDBC. If the ability to use JDBC were added to the product, it could remove the need to install fat clients.

Not sure what detail needs to be added.  This is obviously a widely used RDBMS.

  • Category In Database

Alteryx creates a Livy session when connecting to Spark Direct.

(screenshot: livy_session_1.png)

 

I just want to be able to identify the session easily.

  • Category In Database

Hello.

 

The word "category" is a SQL keyword (at least on Hive). However, it is quoted (with backticks `), so the workflow runs to the end without a single issue. The blue keyword coloring may nonetheless be misleading to some users.

(screenshot: Bud%20Category.png)

  • Category In Database