Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Filtering a Calgary dataset to only check against records that match serveral input fields

MikeN72
7 - Meteor

Hello

 

First, apologies if I have put my question into the wrong section.

 

My question is on how to do filtering of a dataset to only pull records that match to 1 or more input fields (i.e. minimising the records pulled from a dataset to be checked against).

 

I have a macro that takes in a string.

This string is split up into serveral new fields (input_fname, input_lname, input_email).

The macro takes these new fields and needs to check against a dataset. This dataset holds about 50,000 records.

I want the workflow to only pull the records from the dataset where the dataset's:

data_fname Like the input_fname

OR

data_lname is like the input_lname

OR

data_email is like the input_email

 

And so only take these records from the dataset then match against.

Say that the input_lname = "Bob" and the input_lname = "Todd" and the input email = "bob@email.com" then I only want records matching these criteria pulled from the dataset, i.e. all records from the dataset where

data_fname LIKE "Bob%"

OR data_lname LIKE "TODD%"

OR data_email LIKE "bob@email.com%"

 

 

I know that this can be done using a Calgary Join - in the Additional Query Criteria - Calgary XML Query section on the tool , but am not sure of how to write the syntax.

I want something like this:

<Or>
  <Field name="data_fname" value=indata_fname type="begin" /> //data_fname field like the indata_fname field
  <Or>
    <Field name="data_lname" value=indata_lname type="begin" /> //data_lname field is like the indata_lname field
  </Or>

<Or>
    <Field name="data_email" value=indata_email type="begin" /> //data_email field is like the indata_email field
  </Or>
</Or>

 

My aim is to minimise the data rows from the data set that are being pulled out and checked against, so it might be only several rows that the workflow checks agains instead of 50,000 rows - making the macro of course much faster :)

 

I have checked the Calgary Join documentation and had a quick look in these forums with no luck.

 

The nearest help I have found (https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Querying-a-Calgary-DB-File-to-Select-and-Lim...) shows only how to limit the rows pulled to a fixed string, and not the value from a field in the workflow that feeds into the Calgary Join.

 

I am pretty certain that I saw it done a few years ago but I can't remember the exact syntax to use in the query.

 

Any help would be great

 

Thanks

 

3 REPLIES 3
JessicaS
Alteryx Alumni (Retired)

Hi @MikeN72,

 

Have you checked out this reference to calgary XML queries?  

 

Thanks,

Jess Silveri
Manager, Technical Account Management | Alteryx
MarqueeCrew
20 - Arcturus
20 - Arcturus

@MikeN72,

 

I'll throw you a bone here.  I wouldn't try it that way.  Instead, you might want to consider using a Calgary Join (or many).  I might suggest using a calgary join that matches ALL conditions (even though a begins... search on email doesn't seem logical to me).  By first looking for all, you'll avoid getting all BOB records.  Then I'd search for first and last names from the unmatched records.  Then you can get the individual field matches.

 

Capture.PNG

 

That's my take on the question at-hand.  Will this work for you?  By the way, I would also consider transposing the data search fields into NAME + VALUE Pairs (with a key) and running the search against a calgary set that has the lookup fields transposed as well.  Then you could lookup all values at once. by EXACT on NAME (fname, lname, email) and BEGINS on VALUE.

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Yongcan
8 - Asteroid

Calgary Join won't work here, as it required at least one field be choosen at "Query Criteria" and it's AND condition along with "Additonal Query Criteria"

 

 

data_fname LIKE "Bob%"

OR data_lname LIKE "TODD%"

OR data_email LIKE "bob@email.com%"

 

To get with All OR condition, Calgary Input tool provide the or condition at Query section. 

Calgary_Input_Or_Condition_2023-01-11_15-45-45.png

 

Then click the "Switch to Manual Editor" it will pop with "Calgary XML Query" code.

Calgary_Input_Or_Condition_XML_2023-01-11_15-45-45.png

 

 

Now you can create a Batch Macro and pass in the cydb file path and the concatenated  Calgary XML Query code like below:

 

<Or><Field name="data_fname" value="Bob" type="begin" /><Field name="data_lname" value="TODD" type="begin" /><Field name="data_email" value="bob@email.com" type="begin" /></Or>

 

Batch Macro:

Batch_Macro_Calgary_XML_query_update_2023-01-11_16-15-31.png

 

 

 

 

And overall test workflow like below. ( as the workflow contain some personal info i upload screenshot only)

Calgary_Input_Or_Condition_Demo_code_2023-01-11_15-45-45.png

Labels