Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Append IF [B_Date] Between DateTimeDiff([A_Date],-11,'month') and [A_Date]?

Matthew
11 - Bolide

i've built a workflow where i need to join two tables, but the join is unusual.

 

table A has a list of dates, and i need to join table B onto it anywhere the date on table B is somewhere within the past 11 months of the date on each row of table A

 

i solved it by using the Append tool, and then following it with a filter tool to remove rows where the B_Date falls outside of the range i'm looking for

 

after i finish the append, i can then summarise the data the way i need it

 

i can get away with this because my tables are fairly small.. but it's still waaay bigger than it should be.. the append tool warns you if you go over 16 rows.. i'm appending 623,000 rows onto 200 rows... (meaning i wind up with 124,000,000 rows).. that's a manageable number.. But what if i needed to append 623,000 rows onto 44 million rows?

This solution works this time, but there has to be a more computationally efficient way to do this... right?

 

NothingButThyme_0-1637708198462.png

 

 

10 REPLIES 10
Qiu
20 - Arcturus
20 - Arcturus

@Matthew 
Maybe we can narrow it down to Year first?
A sample set of data will be better.

JarekSkudrzyk
11 - Bolide

Hi,

I agree that posting some test data and workflow itself would be helpful.

 

Have you tried filtering the data before joining? eg. filter out the dates in B that are earlier than min date from A

regards

Matthew
11 - Bolide

@JarekSkudrzyk 

that's pretty much exactly what i'm struggling with. i can't think of a way to properly filter it.. i need to keep (and join) all rows in B that are within the past 11 months of the date in A, but every row in A has a different date

 

i have managed to reduce the size of the tables by grouping by the month instead of the date, but i think that's as far as i can reduce it. 

 

so now we're looking at 200 rows x 42,000 rows = 8,400,000 rows

Matthew
11 - Bolide

@Qiu @JarekSkudrzyk 

 

Thanks for the idea, ive grouped the inputs by month. That has brought this down to 8 million instead of 124 million, but i still think this is an inefficient method. it'll only work with small data sets

 

Ive attached a snippet of my workflow with scrubbed input data

 

Ultimately the goal is to sum two values from B onto A, but i only know which values on table B to sum by first joining the tables together..

JarekSkudrzyk
11 - Bolide

I was able to reduce the time of appending fields from 1972ms to around 324ms by applying preliminary filter to set B before appending to set A.

 

By preliminary I mean: filtering out the dates that are not within the following range:

minimum month of set A for the given market/series minus 11 months

maximum month of set A for the given market/series


I used batch macro to achieve that - see attached workflow.

 

Hope this is helpful for you.

JarekSkudrzyk
11 - Bolide

Btw when solving your case I have learned that large text input tools tend to crash my Alteryx (when opening configuration window for them).

You may want to consider using e.g. yxmd input files and exporting the workflow together with input files into yxzp package.

 

regards

Qiu
20 - Arcturus
20 - Arcturus

@Matthew 
I agree with @JarekSkudrzyk , the large text input tends to freeze, if not crashing.
I have replaced the Appending Tool with theJoin tool, which will not enlarge the data stream.
so the running time is reduced from 2.6s to 0.5s in my PC.
Hope this result is acceptable for you.

1125-NothingButThyme.PNG

Matthew
11 - Bolide

@Qiu I'm impressed, that is a much more logical solution. thanks for the help, i'll definitely go with this solution!

Alteryx is a very useful program, but sometimes it's difficult to rewire the way you think about specific problems.

 

@JarekSkudrzyk i agree, i shouldn't have used the text input in my example, that was my bad.. the actual inputs are coming from a database stream

JarekSkudrzyk
11 - Bolide

Qiu - Brilliant idea, congrats!

Labels