Custom filtering a yxdb file before importing
Hello,
I have a very large yxdb file that takes a long time to import.
Is there a way to apply a filter while importing this file? I know that when I connect an Input Data tool directly to a database (e.g., Teradata), I can apply custom SQL for filtering.
Is there something similar when I use a yxdb file as input?
Or any tips on improving import performance with a large file?
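For reference, this is the kind of filter I mean - a sketch of what I can type into the Input Data tool's custom SQL editor against Teradata (table and column names are made up for illustration):

```sql
-- Hypothetical example: the filter is pushed into the source query,
-- so only matching rows ever leave the database
SELECT *
FROM   sales_transactions
WHERE  transaction_date >= DATE '2023-01-01'
  AND  region = 'EMEA';
```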
No, the best you can do is limit the read to the first N records (but I'm guessing you knew that). For optimization, you can create the YXDB in AMP mode (Alteryx Multi-threaded Processing) and then read it into a workflow that's also set to AMP mode. That can dramatically reduce the read time. Other than that, just the usual - use an Auto Field tool right before the YXDB is created to make the fields as compressed as possible.
One other option that I've had good results with is to output a SQLite table instead of a YXDB. Even if you don't have SQLite installed on your machine, it doesn't matter; your machine won't recognize the file if you try to "open" it, but Alteryx can create it and read it, and it works like a traditional SQL database.
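That means you get what you asked for: a query box in the Input Data tool where you can filter before the rows are read. A minimal sketch, with placeholder table and column names:

```sql
-- Minimal sketch (names are placeholders):
-- only the matching rows are read from the SQLite file,
-- instead of streaming the whole table into the workflow
SELECT *
FROM   my_table
WHERE  customer_id = 12345;
```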
Look into using a Calgary database (.cydb) instead of an Alteryx database file (.yxdb). See the Calgary toolset.
https://help.alteryx.com/designer/calgary
A Calgary database is a lot like a .yxdb, except you can put indexes on it. It will still be a flat file in a network folder, but there will also be a bunch of other files related to the indexes you build.
Put an index on whatever it is you're trying to filter, then use the Calgary Input tool to load it. It will use the index and only load the rows you need. If you're looking for a specific row, it's possible to look up a value in an instant out of a Calgary database with millions of records.
If the thing you're filtering on needs to be more dynamic than the hard-coded filters you would put into the query of the Calgary Input tool, then look into the Calgary Join tool, which acts like an input: it streams the fields you connect on from your data and joins them against the Calgary database. Again, it will be very fast if the indexes you built are correct.
