
Alteryx Designer Desktop Discussions


Custom Filtering yxdb file before importing

kangx322
5 - Atom

Hello, 


I have a very large .yxdb file that takes a long time to import.

Is there a way to apply a filter while importing this file? I know that when I connect an Input tool directly to a database (e.g., Teradata), I can apply custom SQL for filtering.

Is there something similar when I use a .yxdb file as input?

Or are there any tips on improving import performance with large files?

mbarone
16 - Nebula

No, the best you can do is select just the first X number of rows (but I'm guessing you knew that).  For optimization, you can create the YXDB in AMP mode (Alteryx Multithreaded Processing) and then read it into a module that's also set to AMP mode.  That can dramatically speed up the read.  Other than that, just the usual: use an Auto Field tool right before the YXDB is created to make the fields as compact as possible.

 

One other option that I've had good results with: instead of a YXDB, output the data as a SQLite table.  Even if you don't have SQLite installed on your machine, it doesn't matter; your machine won't recognize the file if you try to "open" it, but Alteryx can create it and read it, and it works like a traditional SQL database.
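To illustrate why the SQLite route helps: a SQLite file can be queried with a WHERE clause on read, so only the rows you need are loaded, much like the custom SQL you would use against Teradata. A minimal sketch using Python's standard sqlite3 module (the table and column names here are hypothetical, not anything Alteryx produces):

```python
import sqlite3
import os
import tempfile

# Hypothetical example: write a small "sales" table to a SQLite file,
# then read back only the filtered rows.
path = os.path.join(tempfile.mkdtemp(), "sales.sqlite")
con = sqlite3.connect(path)
con.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "East", 100.0), (2, "West", 250.0), (3, "East", 75.0)],
)
con.commit()

# The filter runs inside the database engine, so unneeded rows are
# never materialized -- unlike a .yxdb, which is read in full.
rows = con.execute(
    "SELECT id, amount FROM sales WHERE region = ?", ("East",)
).fetchall()
print(rows)  # [(1, 100.0), (3, 75.0)]
con.close()
```

In Alteryx itself you would put the equivalent SQL into the Input Data tool's query against the SQLite file rather than writing Python; the sketch just shows the mechanism.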

mst3k
11 - Bolide

Look into using a Calgary database (.cydb) instead of an Alteryx database file (.yxdb). See the Calgary toolset.

https://help.alteryx.com/designer/calgary

 

A Calgary database is a lot like a .yxdb, except you can put indexes on it. It will still be a flat file in a network folder, but there will also be a bunch of other files related to the indexes you build.

Put an index on whatever it is you're trying to filter, then use the Calgary Input tool to load it. It will use the index and only load the rows you need. If you're looking for a specific row, it's possible to look up a value in an instant out of a Calgary database with millions of records.

If the thing you're filtering on needs to be more dynamic than the hard-coded filters you would put into the "query" of the Calgary Input tool, then look into the Calgary Join tool, which acts like an input but streams the fields you connect from your data to join against the Calgary database. Again, it will be very fast if the indexes you built are correct.
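The payoff from indexing that mst3k describes is the same as in any standard database: an indexed lookup avoids scanning every row. Calgary is proprietary, so as an analogy only, here is the idea demonstrated with SQLite's EXPLAIN QUERY PLAN (table, column, and index names are made up for illustration):

```python
import sqlite3

# Analogy: index the column you filter on, then confirm the engine
# uses the index instead of a full-table scan.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE records (customer_id INTEGER, value TEXT)")
con.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(100_000)],
)
con.execute("CREATE INDEX idx_customer ON records (customer_id)")

# EXPLAIN QUERY PLAN reports how SQLite will resolve the query;
# with the index in place, the plan should mention idx_customer.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM records WHERE customer_id = ?",
    (42_000,),
).fetchall()
print(plan[0][3])  # plan detail should mention the index

result = con.execute(
    "SELECT value FROM records WHERE customer_id = ?", (42_000,)
).fetchone()
print(result)  # ('row-42000',)
con.close()
```

The Calgary Input and Calgary Join tools do this index-backed lookup for you inside Designer; the sketch just shows why building the right index on the filter column is what makes the lookup near-instant.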
