
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Faster pgAdmin Bulk Loader Update?

Alex_Intelling
6 - Meteoroid

Hi Alteryx Community, 

 

Scenario - We receive a customer extract file every day from the client as an .xlsx file, with 80K records and 39 fields, containing a flag (P, F, or C). This data is written to a Postgres table with a primary key field.

 

Objective - To insert new records, or update the existing flag for previously loaded records, in the Postgres table based on the new file received.

 

Example: 

 

Day 1: 80K records are written into the Postgres table using the PostgreSQL Bulk Loader file format in the Output Data tool. (This runs in 37 secs.)

Day 2: A new extract file is provided and needs to update the flag for records loaded on Day 1, or insert new records into the table. For this I used the ODBC Database file format, since the bulk loader file format has no insert/update option.

 

Issue - The Day 2 run takes around half an hour to insert/update the Postgres table because it uses ODBC, and performance degrades further as the table grows.

Writing the Day 2 file with the insert/update output option took around half an hour.

 

Question: Is there a way to combine the fast loading performance of the Postgres bulk loader with the insert/update capability of ODBC, so the extract file can be processed faster every day? The objective is to always have the latest flag status in the Postgres table.

 

Will it perform faster if I upgrade to the latest version?

 

 

Alteryx version used:

Alteryx Admin Designer
Version: 2021.1.3.22649

pgAdmin 4 version used:

Version: 4.30
Python Version: 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)]
Flask Version: 1.0.2
Application Mode: Desktop

BrandonB
Alteryx

What about an approach where you use a bulk loader to write the data into a staging table, and then run an INSERT INTO command in the Post SQL to update the other table?

 

[screenshot]
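As a minimal sketch of that idea, the Post SQL statement of the Output Data tool could look something like the following (the table names customer_staging and customer_main are hypothetical placeholders):

```sql
-- Post SQL sketch: after the bulk loader fills the staging table,
-- append its rows to the main table, then clear the staging table
-- so it is empty for the next day's load.
INSERT INTO customer_main
SELECT * FROM customer_staging;

TRUNCATE TABLE customer_staging;
```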

 

 

 

Alex_Intelling
6 - Meteoroid

Hi Brandon,

 

The INSERT INTO command in the 'Post Create SQL Statement' will insert the Day 2 data from the staging table into the main table after the workflow runs.

 

e.g. Day 1 had 6 records with flag 'P' for order number XXXXX, which were written to the main table.
Day 2 may have the same 6 records with the same order number XXXXX but with a different flag, 'F'.

 

We are looking for a solution that updates the flag field in the main table rather than just inserting. Using the plain INSERT INTO command in the Post SQL, the main table ends up with 12 records in total: 6 with flag 'P' and 6 with flag 'F'.

 

The main table should have only 6 records with the latest flag, i.e. order number XXXXX with 'F'.

 

 

Only the ODBC data source file format has the update/insert-if-new option, but processing 8000 records that way takes a massive amount of time. I was wondering if there is an equivalent option with the bulk loader, as we receive around 8000 records every day.

[screenshot]

 

Thanks,

Alex 

 

 

  

BrandonB
Alteryx

@Alex_Intelling that's because the query needs the ON CONFLICT clause when using upsert functionality in Postgres. You can see a tutorial here: https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-upsert/
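Following that tutorial, the Post SQL could look something like this sketch, assuming order_number is the primary key (the table and column names are illustrative, not taken from the actual workflow):

```sql
-- Upsert sketch: insert rows from the staging table; where the primary
-- key already exists, overwrite the flag with the newly loaded value.
INSERT INTO customer_main (order_number, flag)
SELECT order_number, flag
FROM customer_staging
ON CONFLICT (order_number)
DO UPDATE SET flag = EXCLUDED.flag;
```

EXCLUDED refers to the row that was proposed for insertion, so DO UPDATE keeps the existing row but replaces its flag with the Day 2 value.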

Alex_Intelling
6 - Meteoroid

Hi Brandon,

I looked at the link you provided. I was able to run an insert statement in Alteryx when I hardcoded the values for @primaryKey and @wf_boundary_state in the query below. But I read in other forums that the Post Create SQL statement in Alteryx cannot be parametrized to pull the values for @primaryKey and @wf_boundary_state from the data stream (i.e. the new file, which has the updated flags for each order number).

[screenshot: the insert query with hardcoded values]

 

Instead, I figured we could use a dynamic query to get the values from the data stream by replacing a specific string in the query. I know this works for SELECT statements, but when I try substituting values from the data stream into an INSERT statement, the tool treats it as a SELECT and errors out.

 

[screenshot]

 

 

[screenshot]

 

The initial approach I attempted was to compare the first and second files on the order number and, where they differ, write only the latest record to a .yxdb file. But that took a long time when checking against many days' worth of files.

 

regards,

Alex

 

 

BrandonB
Alteryx

If the Post SQL needs to be dynamic, you can wrap the Output Data tool in a macro and update the Post SQL value with an Action tool, as described here: https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Post-SQL-Create-amp-Changing-variable/...
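As a rough sketch of that idea, the Post SQL inside the macro could carry a placeholder token that the Action tool swaps out at run time. The token name, filter column, and table names below are hypothetical, not taken from the linked post:

```sql
-- Post SQL template inside the macro; the Action tool replaces the
-- __LOAD_DATE__ token with a value from the data stream before the
-- Output Data tool runs, so the upsert only touches the new batch.
INSERT INTO customer_main (order_number, flag)
SELECT order_number, flag
FROM customer_staging
WHERE load_date = '__LOAD_DATE__'
ON CONFLICT (order_number)
DO UPDATE SET flag = EXCLUDED.flag;
```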

 

You are correct that the Input Data tool expects a SELECT statement, because the purpose of that query is to feed data into the workflow.
