GDB files take over a day to open

jjbates88
5 - Atom

Hello,

 

I purchase tax parcel information, organized and delivered by U.S. state in Esri geodatabase format (.gdb). Each state file is about 6 GB.

 

I convert the files into .yxdb format as my first step. I understand these files contain every parcel in each state (hundreds of thousands), but they each take 24-36 hours to load, which makes the process unwieldy. I update these files every quarter, and updating 15 or so states takes weeks. The waiting is ruining my timeline for getting work done. Since I am using Batch Macros, I can never tell whether the machine is hung, running abnormally slowly, or simply crunching away as expected.

 

Questions:

Is there a better way to import or convert GDB files for use in Alteryx? Reading the old discussions, it seems the GDB integration may have been a workaround to satisfy those asking for the functionality.

 

Should I be running the loading procedure in parallel and not in series with the Batch Macro?  Is the bottleneck not related to machine resources?  Is there a better way to see work progress than looking at the macro completion % (stays at 50%)?

 

My machine has a Xeon Processor with 10 real/20 virtual cores, 128 GB of RAM, and uses SSD drives.

 

Thanks for the help!  

 

 

 

 

3 REPLIES
apathetichell
19 - Altair

Can't tell without seeing your workflow. Do you include the downloading (or downstream uploading) in your workflow? The main places I would look are:

1) Network issues (i.e., downloading/uploading).

2) Data Cleanse. Don't use it.

 

That seems longer than I'd expect. And just an FYI: I tend to use Batch Macros here. I believe they may slow down processing on large data, but they do provide partial success and better tracking/testing.

jjbates88
5 - Atom

Thanks for the reply.

 

The files are all local on my SSD. There is a Data Cleanse step in there; I'll take it out.

 

I just started the macro and will see if taking Data Cleanse out helps. I did notice that it always showed 96% until completion many hours later. What a strange bug in what should be a straightforward function. I may try smaller batches running in parallel to see if the system can handle more than one macro running at the same time.
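
For anyone who wants to try the "smaller batches in parallel" idea from outside Designer, here is a minimal sketch of a driver script, not a tested Alteryx recipe. It assumes each state's conversion can be launched as a separate command-line process; the names `run_job` and `run_all`, the state codes, and the placeholder command are all hypothetical and would need to be swapped for whatever runner you actually have. It prints one line per state as soon as that state finishes, which gives a clearer progress signal than a single macro completion percentage.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_job(name, command):
    """Run one conversion command; return (name, exit code, seconds elapsed)."""
    start = time.monotonic()
    result = subprocess.run(command, capture_output=True, text=True)
    return name, result.returncode, time.monotonic() - start


def run_all(jobs, max_workers=3):
    """Run jobs (dict of name -> argv list) concurrently.

    Logs each job as it completes and returns a dict of name -> exit code.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_job, name, cmd) for name, cmd in jobs.items()]
        for done, future in enumerate(as_completed(futures), start=1):
            name, code, seconds = future.result()
            results[name] = code
            print(f"[{done}/{len(jobs)}] {name}: exit {code} after {seconds:.1f}s")
    return results


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the real per-state
    # conversion command. Here each "job" just starts a trivial Python process.
    jobs = {
        state: [sys.executable, "-c", "print('converted')"]
        for state in ("AL", "AK", "AZ")
    }
    run_all(jobs, max_workers=3)
```

Starting with a small `max_workers` value and raising it while watching memory and disk activity is probably the safest way to find out whether the machine can handle more than one conversion at a time.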

  

apathetichell
19 - Altair

Data Cleanse is a memory hog. Also, get rid of Browses displaying huge amounts of map points. I'm running some queries with/without Browses and with/without Data Cleanse (with a single spatial field) to document the difference.

 

Test data: 1,000,000 rows (one spatial point, 3 text fields).

 

- No Data Cleanse, no Browse: 0.5 seconds

- With Data Cleanse, no Browse: 2.8 seconds

- With Data Cleanse, with Browse: stopped at the 4-minute mark, 345,000 records in

- With Browse, no Data Cleanse, no spatial object: 0.4 seconds

- With Browse, with Data Cleanse, no spatial object: 0.6 seconds

 

 

This is a base workflow. Extra Browses create extra in-memory map objects. Drop them.
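
To put the timings above in perspective, here is a rough back-of-the-envelope calculation. It uses only the figures quoted in this post and assumes the Browse run would have continued at the same average rate had it not been stopped, which may not hold in practice. The function name `summarize` is just for illustration.

```python
def summarize(rows, baseline_s, cleanse_s, browse_rows_done, browse_elapsed_s):
    """Derive slowdown and throughput figures from the quoted timings."""
    # How many times slower the run is with Data Cleanse added (no Browse).
    cleanse_slowdown = cleanse_s / baseline_s
    # Average records per second while the spatial Browse run was going.
    browse_rate = browse_rows_done / browse_elapsed_s
    # Projected total time for all rows, ASSUMING the rate stays constant.
    projected_s = rows / browse_rate
    return cleanse_slowdown, browse_rate, projected_s


slowdown, rate, projected = summarize(
    rows=1_000_000,
    baseline_s=0.5,            # no Data Cleanse, no Browse
    cleanse_s=2.8,             # with Data Cleanse, no Browse
    browse_rows_done=345_000,  # records processed when the run was stopped
    browse_elapsed_s=4 * 60,   # stopped at the 4-minute mark
)
print(f"Data Cleanse slowdown: about {slowdown:.1f}x")
print(f"Browse run rate: about {rate:.0f} records/s")
print(f"Projected Browse run time: about {projected / 60:.1f} minutes")
```

Under that assumption, the Browse on a spatial field turns a sub-second job into one that takes minutes, which is a far larger penalty than Data Cleanse alone.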
