Dear users, thanks again to all of you who attended our 2016 Tips & Tricks sessions. Several of you shared your tips & tricks with me during the conference and I was able to present some of them during our T&T sessions and guess what? They were a HIT ! .... Thank you!
If you have a tip or trick you want to share with us, no tip or trick is too small or obvious, please share it with us by replying to this post. We will give you full credit if included on the Tips & Tricks booklet and inspire session.
If you see a tip and trick you like, please star it.
And as always, thank you for being the best users we could've possibly asked for.
Looking forward to seeing you at Inspire !!!
When working with interface tools (drop down, listbox, textbox etc.), you can use the %Question.UniqueToolName% syntax to call the user input in any tool. This allows for easier debugging (you can now run an app like a normal workflow and give it test data in the workflow config panel) and a cleaner look (no need for action tools).
I like to Right-Click > Paste to control where copied tools get placed on the canvas (rather than Ctrl-V).
We can use an Action tool without another Interface tool feeding into it, for example: https://community.alteryx.com/t5/Data-Preparation-Blending/Directory-Read-From-a-Different-Directory...
The new workflow navigation features
- Double click the scroll wheel on your mouse zooms to whole workflow.
- Right click drag selects an area of your workflow to zoom in to.
- Right click menu to zoom to specific tool containers.
This one should be a tip for folk - the Field Summary tool is fantastic! Saved me fortunes of time rather than the way I would have done this in SQL by doing count-distinct on fields etc.
Bonus tip - the "Tour": When you put a browse after it, the 3rd output (labelled "I") even has a Tour function so that you can learn how to use this info and this tool without having to revert to a different blog post or help-text. That's really thoughtful and useful and would be a great addition to other tools!
Disable and/or delete all browse tools before scheduling workflows to run on your server. This will help reduce memory consumption and overall server performance hindrances.
Here are some thoughts:
When working with large sets of data this is more apparent and the benefits can add up. Suppose you are looking at a National file and want to find all households with children that own their own home and live in a specific geography. This might look like:
Contains([CHILDREN: AGE 0-18 VERSION 3],"Y") AND
[HOMEOWNER: COMBINED HOMEOWNER-RENTER] == "H" AND
[FIPS ZIP CODE] == '49033'
You would actually be testing more conditions (in the US you would be testing 400+MM conditions) than if you first tested for Postal Code then filtered for Residency and/or Presence of Children separately. The Postal Code would net you a few thousand records. The subsequent filter/tests would get you down to the few records faster.
There are 169 households out of 139,709,868 records. With one filter the job runs in 3:09 minutes. With three filters the job runs in only 2:08 minutes.
I've seen this often:
IF
[LENGTH OF RESIDENCE] == 0 THEN 'Less than 1 year' ELSEIF
[LENGTH OF RESIDENCE] >= 1 AND
[LENGTH OF RESIDENCE] <= 5 THEN '1 to 5 years' ELSEIF
[LENGTH OF RESIDENCE] >= 6 AND
[LENGTH OF RESIDENCE] <= 10 THEN '6 to 10 years' ELSEIF
[LENGTH OF RESIDENCE] >= 11 AND
[LENGTH OF RESIDENCE] <= 15 THEN '11 to 15 years' ELSEIF
[LENGTH OF RESIDENCE] >= 16 AND
[LENGTH OF RESIDENCE] <= 20 THEN '16 to 20 years' ELSEIF
[LENGTH OF RESIDENCE] >= 21 AND
[LENGTH OF RESIDENCE] <= 25 THEN '21 to 25 years' ELSEIF
[LENGTH OF RESIDENCE] >= 26 AND
[LENGTH OF RESIDENCE] <= 30 THEN '26 to 30 years'
ELSE
'More than 30 years'
ENDIF
When processing the IF statement, you can shortcut the coding and run time with this logic:
IF
[LENGTH OF RESIDENCE] == 0 THEN 'Less than 1 year' ELSEIF
[LENGTH OF RESIDENCE] <= 5 THEN '1 to 5 years' ELSEIF
[LENGTH OF RESIDENCE] <= 10 THEN '6 to 10 years' ELSEIF
[LENGTH OF RESIDENCE] <= 15 THEN '11 to 15 years' ELSEIF
[LENGTH OF RESIDENCE] <= 20 THEN '16 to 20 years' ELSEIF
[LENGTH OF RESIDENCE] <= 25 THEN '21 to 25 years' ELSEIF
[LENGTH OF RESIDENCE] <= 30 THEN '26 to 30 years'
ELSE
'More than 30 years'
ENDIF
Once the first test is passed, you need not ask the lower range question again.
Timings for "Before" on 135,708,968 records are: 3:24 minutes versus 3:06 minutes. The code is easier to create this way, but you could actually improve the performance by knowing your data. If you know where your skew is in residency, you could code to asking that question first.
Join is AMAZING if Left and Right inputs have the same record count and order. Suppose that you've used a select tool and done some functions to create new columns of data that you want to join back to the original data. You can use the join and use the "Join by Record Position" configuration (radio button). You can't get any faster than that. Your data is NOT sorted.
If you are joining a Large set of data to a small set of data (Large is large and small is < 32,000 rows) the Find Replace offers you the ability to put the small data into memory and avoid the expensive sort. Imagine with those 138 million records using a TILE tool and then looking up the eight (8) length descriptors from a text input tool. The JOIN version runs in 3:40 minutes and the FIND REPLACE version runs in 3:09 minutes. Adding to that difference, your data is now in LOR order so you might have to resort the data later.
FIND REPLACE also allows for case insensitive and partial matching. If you're ever applying descriptions or labels to values, this is the way that I'd recommend you doing that substitution.
Cheers,
Mark
My nice trick that I use all the time - how to compare two or more records.
Method A: Highlight two records in your data set using 'CTRL'+Click/Drag. Once the desired fields are highlighted, go to the upper-right corner of the results and click the new window (selected records)
Method B: Highlight two records in your data set using 'CTRL'+Click/Drag. Once the desired fields are highlighted, go the up and down arrows on the top bar of the results window and use them to toggle between all your selected records.