community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Designer Ideas

Share your Designer product ideas - we're listening!
Upgrade Alteryx Designer in 10 Steps

Debating whether or not to upgrade to the latest version of Alteryx Designer?

LEARN MORE

1 Review

Our submission guidelines & status definitions before getting started

2 Search

The community for a solution or existing idea before posting

3 Vote

By clicking the star in the top left corner of an idea you support

4 Submit

A new idea to suggest a product enhancement or new feature


Suggest an idea

Hello,

As I mentioned in this previous idea : https://community.alteryx.com/t5/Alteryx-Designer-Ideas/Generic-In-database-connection-please-stop-i...

 

field mapping in generic in-db connection is based on Microsoft Sql Server. Given the specificity of MSQL Server field types, I would like to change that in order to at least be able to use another database. Without that, this feature has no sense at all.

Best regards,

Simon

  • In Database

In-DB_idbx2_325x127

 

 

 

If you are connecting to an Oracle database you can get faster results with adding a short hint to your SQL query. For a query like this

 

 

SELECT customers.cust_first_name, customers.cust_last_name, 
MAX(QUANTITY_SOLD), AVG(QUANTITY_SOLD)
FROM sales, customers
WHERE sales.cust_id=customers.cust_id
GROUP BY customers.cust_first_name, customers.cust_last_name;

 

 

add /*+ PARALLEL(4) */  (or depending on your CPU size 8) and that SQL becomes this;

 

 

SELECT /*+ PARALLEL(4) */ customers.cust_first_name, customers.cust_last_name, 
  MAX(QUANTITY_SOLD), AVG(QUANTITY_SOLD)
FROM sales, customers
WHERE sales.cust_id=customers.cust_id
GROUP BY customers.cust_first_name, customers.cust_last_name;

 

 

There is a similar capability for SQL Server too...

 

SELECT *
FROM Sales.SalesOrderDetail
OPTION (MAXDOP)

 

 

I suggest adding this feature in in-db tools so that no one needs to alter the SQL query itself...

 

You can check out the inner workings for oracle here; https://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel002.htm

Hello,

As of today, the in db connexion window is divided into :
-write tab

-read tab

However, writing means two different thing : inserting and in-db writing. Alteryx has already 2 different tools (Data Stream In and Write Data).

Si what I propose is to divide the window into :
-read
-write
-insert

Best regards,

Simon

 

 

While In-db tools are very helpful and cut down the time needed to write complex SQL , there are some steps that are faster by directly writing SQL like window functions- OVER (PARTITION BY .....). In Alteryx, we need to create multiple joins and summaries to perform a window function. It would be immensely helpful if there was a SQL editor tool for in-db workflows where we can edit the SQL code at any point in the workflow, or even better, if they can add an "edit" function to every in-db tool where we can customize the SQL code generated and then send to the next tool.

 

This will cut down the time immensely and streamline the workflow to make Alteryx a true contender for the ETL solution space.

It would make life a bit easier and provide a more seamless experience if Gallery admins could create and share In-Database connections from "Data Connections" the same way in-memory connections can be shared.

 

I'm aware of workarounds (create System DSN on server machine, use a connection file, etc.), but those approaches require additional privileges and/or tech savvy that line-of-business users might not have.

 

Thanks!

As of today, for a full refresh, I can :

-create a new table

-overwrite a table. (will drop and then create the new table)

But sometimes, the workflow fails and the old table is dropped while the new one is not created. I have to modify the tool (setting "create a new table")to launch it again, which may be a complex process in companies. After that, I have to modify it again back to "overwrite".

What I want :

-create a new table-error if table already exists

-overwrite a table-error if table doesn't exist

-overwrite a table-no error if table doesn't exist (easy in sql : drop if exists...)

 

Thanks!

 

I reported this to the support team but was told it was by design and to post here.

 

In-DB Inefficient SQL

I would like to report that the In-DB tools are generating horribly inefficient SQL code for simple operations.  It seems no matter what tools you use every statement is starting with a nested 'Select * From'.

 

Example Simple workflow:

  Support1.jpgSupport2.jpg

 

This is a simple Select and Group by but the SQL Generated is:

 

SELECT "ShipTo", "ShipTo_Name", SUM("ECM_3PL_OVERHEADS_Unit") AS "Sum_ECM_3PL_OVERHEADS_Unit"

FROM (SELECT * FROM "_SYS_BIC"."shell.app.gsap.FL000_LSC.FL002_CTS.INT.RPT/CA_CTS_RPT_MAIN_001") AS "a"

GROUP BY "ShipTo", "ShipTo_Name"

 

This is taking a very long time to execute:

 

Statement 'SELECT "ShipTo", "ShipTo_Name", SUM("ECM_3PL_OVERHEADS_Unit") AS "Sum_ECM_3PL_OVERHEADS_Unit" FROM ...'

successfully executed in 15.752 seconds  (server processing time: 15.699 seconds)

 

Whereas if I take the same query and remove the nested Select *:

 

SELECT "ShipTo", "ShipTo_Name", SUM("ECM_3PL_OVERHEADS_Unit") AS "Sum_ECM_3PL_OVERHEADS_Unit"

FROM "_SYS_BIC"."shell.app.gsap.FL000_LSC.FL002_CTS.INT.RPT/CA_CTS_RPT_MAIN_001" AS "a"

GROUP BY "ShipTo", "ShipTo_Name"

 

It is very quick:

 

Statement 'SELECT "ShipTo", "ShipTo_Name", SUM("ECM_3PL_OVERHEADS_Unit") AS "Sum_ECM_3PL_OVERHEADS_Unit" FROM ...'

successfully executed in 1.211 seconds  (server processing time: 1.157 seconds)

 

So Alteryx is generating queries up to x13 slower than they should be thereby defeating the point of using In-DB.  As you can imagine in a workflow where we have multiple Connect In-DB tools this is a really substantial amount of time.  Example used above is from SAP HANA DB has 1.9m rows and ~90 columns but we have much bigger tables/views than this.

 

If you look you will see its same behaviour for all In-DB tools where each tool creates another nested Select with its particular operator.

 

MY SUGGESTION:

So my suggestion is that Alteryx should combine the SQL of the first few tools and avoid using SELECT * completely unless no Select tools have been used.  So it should combine:

- Connect In-DB + Select

- Connect In-DB + Filter

- Connect In-DB + Summarise

 

Preferably it should combine/flatten everything up until the first join or union.  But Select + Filter are a must!

 

Note it seems some DB's can cope OK with un-nesting these big nested queries in the query plans for some Tables but normally not for Views.  But some cannot cope at all and so the In-DB tools cannot even be used to Browse 100 records (due to select *).

  • In Database

I would like the ability to take custom geographies and write them to a table in Exasol. We visualize our data with Tableau and rely on live connections to Exasol tables rather than Tableau extracts. One shortcoming with spatial is that we have to output our custom geographies as a .shp file then make a Tableau Extract.  This would save us a few steps in sharing this data with our users. 

 

Thanks!

Hi,

 

we use a lot the in-db tools to join our database and filter before extracting (seems logic), but to do it dynamically we have to use the dynamic input in db, which allows to input a kind of parameter for the dates, calculated locally and easily or even based on a parameter table in excel or whatever, it would be great to be able to dynamically plug a not in db tools to be able to have some parameters for filters or for the connect in-db. The thing is when yu use dynamic input in-db, you loose the code-free part and it can be harder to maintain for non sql users who are just used to do simple queries.

You could say that an analytic application could do the trick or even developp a macro to do so, but it would be complicated to do so with hundreds of tables.

 

Hope it will be interesting for others!

As simple as the title :

 

Just a Multi-Field Formula in-db. It's a nightmare to write sometime 50 or 100 times the same SQL formula and then maintain it.

 

Please.

 

Here is a téléchargement.jpg

Unlike In-Memory Filter tool, the IN-DB Filter Tool does not process the NULL unless by default specified.

This results in the loss of data without warning.

I suggest you either fix it to match the in-memory filter, or display a warning somewhere, this creates many data quality issues...

Step 04.PNGNULL entry is NOT included!Step 03.PNGNULL entry is included

When I create a new table in a in-Db workflow, I want to specify some contraints, especially the Primary Key/Foreign Key

 

For PK/FK, the UX could be either the selection of some fields of the flow or a free field (to let the user choose a constant).

From wikipedia :

In the relational model of databases, a primary key is a specific choice of a minimal set of attributes (columns) that uniquely specify a tuple (row) in a relation (table).[a] Informally, a primary key is "which attributes identify a record", and in simple cases are simply a single attribute: a unique id. 

So, basically, PK/FK helps in two ways :
1/ Check for duplicate, check if the value inserted is legit

2/ Improve query plan, especially for join

Hello,

 

My issue is very easy to solve. I want to use the generic ODBC In database for a specific base (monetdb here but it isn't important).

 

alteryx_monetdb1_.png

 

The connexion works just fine. However, I cannot create table because the data types are changed and does not even exist. Here is my data with some Date type :

 

alteryx_monetdb2_.png

 

 

And here the error in my data stream in give me this very interesting message  :

 

Error: Entrée du flux de données (2): Erreur lors de la création de la table "formation.temp1" : [MonetDB][ODBC Driver 11.31.11]Type (datetime) unknown in: "create  table "formation"."temp1" ("AccountNumber" varchar(255),"BillToAddressID"
syntax error, unexpected IDENT in: ""Freight""

CREATE  TABLE "formation"."temp1" ("AccountNumber" varchar(255),"BillToAddressID" float,"BusinessEntityID" float,"Comment
" float,"CreditCardApprovalCode" varchar(255),"CreditCardID" float,"CurrencyRateID" float,"CustomerID" float,"DueDate" datetime,"Freight" real,"OnlineOrderFlag" float,"OrderDate" datetime,"OrderDate_Month" varchar(255),"OrderDate_Year" float,"PurchaseOrderNumber" varchar(255),"RevisionNumber" float,"SalesOrderID" float,"SalesOrderNumber" varchar(255),"ShipDate" datetime,"ShipMethodID" float,"ShipToAddressID" float,"Status" float,"SubTotal" float,"TaxAmt" float,"TotalDue" float)

 

 

1/ My field is a date, why do you want to convert it in Datetime??

2/ Datetime is not even a usual field type in sql database (at least not supported by monetdb, vertica, postgresql, oracle, etc, etc...)... it should obviously be timestamp

 

Currently, this non-specific in database ODBC connexion cannot be used at all!

 

I would like to see In-DB batch macros, currently we are joining tables with 30 million+ records and we are having to run it through standards tools because we are unable to process via In-DB, which has a 20% improvement in processing speed based on the peformance profiling.  

From Wikipedia

Druid is a column-oriented, open-source, distributed data store written in Java. Druid is designed to quickly ingest massive quantities of event data, and provide low-latency queries on top of the data.[1] The name Druid comes from the shapeshifting Druid class in many role-playing games, to reflect the fact that the architecture of the system can shift to solve different types of data problems.

Druid is commonly used in business intelligence/OLAP applications to analyze high volumes of real-time and historical data.[2] Druid is used in production by technology companies such as Alibaba,[2] Airbnb,[2] Cisco,[3] eBay,[4] Netflix,[5] Paypal,[2], Yahoo.[6] and Wikimedia Foundation [7] 

 

More and more companies are going from Hive to Druid for Dataviz needs, maybe it's time to look for Druid Integration with Alteryx?

  • In Database

DELETE from Source_Data Where ID in

SELECT ID from My_Temp_Table where FLAG = 'Y'

 

.... 

 

Essentially, I want to update a DB table with either an update or with the deletion of rows.  I can't delete all of the data.  My work around will be to create/insert into a table the keys that i want to delete and try to use a input/output tool with SQL that performs the delete.  Any other suggestions are welcome, but a tool is best.

 

Thanks,

Mark

The Transpose In-db stands in the "Laboraty" for years now. I understand Alteryx invested some time and money to develop that but sadly we still can't use that tool for sensitive workflows. Did you get some bugs on it? Can you please correct it and make this tool an "official" tool?

 

Thanks

  • In Database

I would like to suggest creating a fix to allow In-DB Connect tool's custom SQL to read Common Table Expressions. As of 2018.2, the SQL fails due to the fact that In-DB tools wrap everything in a select * statement. Since CTE's need to start with With, this causes the SQL to error out. This would be a huge help instead of having to write nested sub selects in a long, complex SQL code!

  • In Database

Spark ODBC is really faster than Hive, at least for GUI.

However, two things are missing :
1/ Append existing for the write date (exists his way on Hive)

 

2/ability to put "overwrite" even if the table does not exist (it works this way on Hive)

 

test_spark_sql.png

 

These two drawbacks limit severly th

  • In Database

I am pulling DB2 data into a workflow and I have a lot of joins that are slowing it down a lot.  For the SQL stuff I am doing, I can use in-db to fix this issue, but with DB2, in-db is not yet supported so the joins take a lot of time to all be pulled together.  I also think that just focusing on the in-DB tools in general would be a good thing.  These tools really can be game changers when it comes to large datasets.  

Top Starred Authors