
Alteryx Designer Desktop Ideas


Bulk Loader Capability for Postgres SQL

Hi,

 

Currently, loading large files (over 100 MB) to PostgreSQL takes an extremely long time. For example, writing a 1 GB file to PostgreSQL takes 27 minutes! This is seriously impacting our ability to use Alteryx as an ETL tool for loading our target Postgres data warehouse. We would really like to see bulk load capability for Postgres supported by Alteryx to help alleviate these performance issues.
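
To illustrate the gap, here is a minimal sketch (the connection string and the staging_demo table are placeholders, not our real setup) contrasting per-row INSERTs, which is roughly what a generic ODBC write does, with PostgreSQL's COPY, which is the mechanism a bulk loader would use:

import io
import psycopg2

# Placeholder connection - adjust for your environment.
conn = psycopg2.connect("host=localhost dbname=dw user=etl password=secret")
cur = conn.cursor()

# Hypothetical staging table for the demo.
cur.execute("CREATE TABLE IF NOT EXISTS staging_demo (id int, name text)")

# Slow path: one INSERT per row, paying a round trip and parse cost each time.
# cur.executemany("INSERT INTO staging_demo VALUES (%s, %s)",
#                 [("1", "alpha"), ("2", "beta")])

# Fast path: a single COPY FROM STDIN streams the whole input in one operation.
data = io.StringIO("1;alpha\n2;beta\n")
cur.copy_from(data, "staging_demo", sep=";")
conn.commit()

cur.close()
conn.close()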

 

Thanks,

Vijaya

60 Comments
Yuges81
5 - Atom

Hi,

 

If Alteryx doesn't provide this, we can use Python to improve load times:

import psycopg2
import time

connection = None  # ensure the finally block can test it even if connect fails

try:
    pg_user = "user"
    pg_pass = "pass"
    pg_host = "pg_address"
    pg_port = "pg_port"
    pg_dbname = "database"
    pg_schema = "schema"
    pg_table = '"table"'

    file_csv = 'path_to_csv_file'

    pg_connect = "user='%s' password='%s' host='%s' port='%s' dbname='%s'" % (pg_user, pg_pass, pg_host, pg_port, pg_dbname)

    # Log start
    start = time.perf_counter()

    # Database connection
    connection = psycopg2.connect(pg_connect)
    cursor = connection.cursor()

    # Print PostgreSQL version
    cursor.execute("SELECT version();")
    record = cursor.fetchone()
    print("You are connected to: ", record, "\n")

    # Set schema
    cursor.execute("SET search_path TO %s" % pg_schema)

    # Truncate the target table before the load
    pg_SQL = 'TRUNCATE %s;' % pg_table
    cursor.execute(pg_SQL)
    connection.commit()
    print("Exec OK: %s" % pg_SQL)

    # Open the CSV and skip the first line (header)
    print("Opening CSV file: %s" % file_csv)
    with open(file_csv, 'r') as csvfile:
        next(csvfile)

        # Bulk import via COPY, then commit
        cursor.copy_from(csvfile, pg_table, sep=';')
        connection.commit()

    # Log end
    print("Job done in {:.3f} sec.".format(time.perf_counter() - start))

except (Exception, psycopg2.Error) as error:
    print("Error while connecting to PostgreSQL", error)
finally:
    # Closing database connection
    if connection:
        cursor.close()
        connection.close()
        print("PostgreSQL connection is closed")
alexisjensen
8 - Asteroid

Can we please get an update on where this bulk loader is on the roadmap?

 

Regards,

Alexis

bora_perusic
Alteryx Alumni (Retired)
Status changed to: Accepted
 
bora_perusic
Alteryx Alumni (Retired)

@alexisjensen We cannot commit to timelines, but I hope we can start development on this (PG bulk loader) in early 2020; it is literally the next item on the roadmap. Please stay tuned, we should have some good news relatively soon.

NicolasFabre81
8 - Asteroid

Good news!

While waiting for it, the Python workaround given by @Yuges81 works very well.

davidhenington
10 - Fireball

@bora_perusic Great to hear, Bora, thank you!

MILKEYSUFC
5 - Atom

Excellent news!

 

The Python workaround is good, but having this bulk load capability built in will be so much better for our maintenance.

nickmiranda
5 - Atom

+1

ChristianLaws
5 - Atom

+1