Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Custom Join Conditions Driving Me Crazy! There a Solution for Large Datasets?

Yogesh1795
8 - Asteroid

I have two input in my workflow, In one Input I have Tech Name, Period Start Date, Period End Date. This is the fortnightly dates, and a rate column, that tells that during that period what was the hourly rate of the technician.

In another table I have Actual date of completion of job, and job number with the name of tech

What I want is, join the two tables, such that In a new column I can get the Rate from table one, in a new column.

When I join the two, without applying custom condition, due to size of data, Alteryx give me error.

In SQL we can simple write a where condition on date like [actual date] >= [period start date] AND [actual date] <= [period end date]

Can we do something like that In Alteryx?




6 REPLIES 6
Bren_Spill
12 - Quasar
12 - Quasar

@Yogesh1795 - you can join on tech name, so you have all dates in one row. Then create a formula similar to what you outlined above for a new field called "Rate Used" that identifies where the date of completion falls between the start and end dates.

 

From there you can filter out the records that don't return a result in your formula. I.e., where date of completion is not between start and end dates

Yogesh1795
8 - Asteroid

My dataset is too big so join field don't return any result.

Bren_Spill
12 - Quasar
12 - Quasar

How big is the dataset?

Qiu
21 - Polaris
21 - Polaris

I agree with @Bren_Spill , alteryx should be able to handle the dataset for reasonally large dataset.

Can you try to take some sample data from two streams and join?
maybe there is something not consistent in the Name field?

KGT
12 - Quasar

I've processed billions of rows of data in Alteryx many times and so I assume you mean that it takes too long rather than doesn't return a result.

 

The normal way I do this when joining on date ranges is to use a generate rows and create a record for each date, then join those datasets on date (and whatever else). It will multiply the records in your first table by 14.

 

If you have really large ranges, you can always create a month column then join on [Tech Name] && [Month] before then trimming further inside those subsets, that way the first join is roughly 1/30th of the records and you only create daily records within a month. But in this case, your ranges are already fortnightly.

Raj
16 - Nebula

@Yogesh1795 
can you share the snip of error message.

Labels
Top Solution Authors