Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Advice on calculating 'open' records.....

danloz
8 - Asteroid

Hi,

 

This is a fun one.  I can think of a few ways to approach this problem, the trouble is I'm not sure if any are particularly efficient so I wanted to ask the pros.  

 

Here's the set up:  

 

I have a table that looks like this.  

 

[Record ID] [Open_Date_Time] [Close_Date_Time]

 

My objective is to create an output that looks like this:  

 

[Date_Time_By_Hour] [Open_Records]

 

I'm pretty sure I'm already on the right path in that I've already created a table of every hour between the min and max record using gen rows.  I assume regardless of the solution, I'd be using this as a frame to compute against.  

danloz_0-1648092593450.png

The way I see it:  

 

A)  Append the frame table to the fact data, creating a distinct record for every possible date/time in the frame and every unique record in the fact data.  Then simple formula (if frame_date_time >= [Open_Date_Time] and blah blah) and then I could summarize the results of the if then formula and I get my result.  This "works" but my 'hello world' sample for this problem is about 20k records, after appending  it's about 4 million records.  That seems like if I expand the fact data I'm creating quite a few records and I worry about performance and stability.  This might be unfounded, because I've never actually created a workflow that had issues with performance or stability - but it sure seems like creating a bajillion records on a uncontrolled append would do it.... 

 

B)  I guess also, I could append a field for each of the combinations in the frame table and just perform the same test maybe using the field name and the multi field formula.  But I think this would create the same problem as A only now just by adding an absurd number of columns.  I'm still not totally clear if Alteryx prefers to be abused in terms of record count or column count so I'm not sure if there's any advantage to this over A.  

 

C)  I think i could implement A with a batch macro and keep the record a little more tidy, but I think I still run the risk of creating something that could be unwieldy.  

 

 

Are there any other methods that I'm thinking of?  

3 REPLIES 3
clmc9601
13 - Pulsar
13 - Pulsar

Hi @danloz,

 

What an interesting problem!

Here are a few more ideas:

- instead of generating all the possible hours, just generate the list of hours each row fits into, then Summarize (group by, count)

- there's probably a way to represent open time spatially, then use the spatial match tool... turn unix seconds into lat/lon, maybe? then connect those to the universe anchor and put the list of hours (also represented spatially) in the T anchor... this one's a stretch :)

danilang
19 - Altair
19 - Altair

Hi @danloz 

 

Use a generate rows on your fact data to give one record for each hour between Start and End inclusively.   Summarize this by hour counting the number of records.  Join this summarized hourly fact table to your hourly frame on hour and union the joined records and any extra frame hours.  Replace the null counts with 0

 

This should avoid any row explosion issues since you're joining on hour as opposed to appending the two datasets.   

 

Edit:  After rereading @clmc9601 reply, I realized that I just expanded on her first point.  Sorry for the duplication. 

 

Dan

danloz
8 - Asteroid

Gang,

 

Thanks so much for the insight and advice.  I really appreciate it.  

Labels