
Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Fuzzy Match temp file size

Asteroid

Hi,

 

Is there any way to calculate the disk space required for a Fuzzy Match temp file? I have a Fuzzy Match running with 12,000 records in one dataset and 1,000,000 in the other, and at 78% complete the temp file has hit 100 GB. I only have 100 GB available for the temp file at the moment. Rather than guess for the next attempt, I wondered if there is a way to calculate it.

 

Regards,

Alexis

Alteryx

Hi @alexisjensen,

 

I'm not aware of a way to calculate the required disk space before executing the workflow. However, you do have some options to optimize and better understand your workflow:

 

- Convert your input files to .yxdb as this file type is highly indexed with the Alteryx Engine

- Ensure data is prepped beforehand and, if working with addresses, leverage the CASS functionality

- Always start with pre-configured match styles

- Join your data on exact matches before trying to run a Fuzzy Match

- Enable Performance Profiling: this option can be found by clicking a white area on the canvas, navigating to Runtime in the Configuration pane, and then selecting the check mark for 'Enable Performance Profiling.' This option will allow you to see a per-tool breakdown of your workflow, in milliseconds and as a percentage.

- There's also a great article on optimizing your workflow found here https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/How-can-I-make-my-module-run-faster...
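
To illustrate the "exact matches first" tip above outside of Designer, here is a minimal Python sketch. It is not how the Fuzzy Match tool works internally: `difflib.SequenceMatcher` is only a stand-in similarity measure, and the function name, sample data, and 0.85 threshold are hypothetical. The point it shows is that records removed by an exact join never enter the expensive pairwise comparison.

```python
from difflib import SequenceMatcher


def match_exact_then_fuzzy(small, large, threshold=0.85):
    """Join on exact equality first, then fuzzy-compare only the leftovers.

    `threshold` is an illustrative cut-off, not an Alteryx setting.
    """
    large_set = set(large)

    # Step 1: cheap exact join (set lookup, no pairwise comparison).
    exact = [(s, s) for s in small if s in large_set]
    matched = {s for s, _ in exact}

    # Step 2: only unmatched records from both sides go to the fuzzy stage.
    remaining_small = [s for s in small if s not in matched]
    remaining_large = [l for l in large if l not in matched]

    fuzzy = []
    for s in remaining_small:
        for l in remaining_large:
            # SequenceMatcher is a stand-in for a real fuzzy match style.
            if SequenceMatcher(None, s, l).ratio() >= threshold:
                fuzzy.append((s, l))
    return exact, fuzzy


small = ["acme corp", "globex inc"]
large = ["acme corp", "globex inc.", "initech"]
exact, fuzzy = match_exact_then_fuzzy(small, large)
print(exact)
print(fuzzy)
```

Every record that matches exactly shrinks both sides of the fuzzy comparison, which is where the pairwise work, and presumably the temp file, grows.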

 

Something else to keep in mind: this is an excellent use case for Alteryx Server, which is likely your best bet. With Alteryx Server, you can offload some of that heavy lifting from the desktop to a server machine. You will also be able to schedule the process.
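
On the sizing question itself, the only numbers available are the ones from the failed run (100 GB used at 78% complete, 12,000 by 1,000,000 records). The sketch below is a back-of-the-envelope extrapolation, not an Alteryx formula. It assumes the temp file grows roughly in proportion to the reported progress, which may not hold, and the function name and the 25% safety margin are hypothetical choices.

```python
def estimate_total_temp_gb(used_gb, fraction_complete, safety_margin=0.25):
    """Linearly extrapolate temp-file size from a partial run.

    Assumption: disk usage scales with the reported completion fraction.
    `safety_margin` is an arbitrary cushion, not a documented value.
    """
    if not 0 < fraction_complete <= 1:
        raise ValueError("fraction_complete must be in (0, 1]")
    projected = used_gb / fraction_complete
    return projected * (1 + safety_margin)


# Figures from the run described in the question.
projected = estimate_total_temp_gb(100, 0.78, safety_margin=0.0)
with_margin = estimate_total_temp_gb(100, 0.78)
print(f"projected: {projected:.1f} GB, with margin: {with_margin:.1f} GB")

# Upper bound on record pairs if nothing is filtered out beforehand.
worst_case_pairs = 12_000 * 1_000_000
print(f"worst-case candidate pairs: {worst_case_pairs:,}")
```

Under that linear assumption the run would need somewhat under 130 GB, so budgeting noticeably more than that for the next attempt seems prudent, or reducing the inputs first with the tips above.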

 

Thanks,

Mike
