Alteryx Designer Desktop Discussions

mercurial_maverick · ‎11-12-2020

I've a dataset output in Spark SQL which has multiple columns like defrostStatus, sensorTS, etc. What I'm trying to do here is whenever defrostStatus = 1, the following timestamps that fall within the range (sensorTS < following timestamps < sensorTS + 30 minutes) need to be highlighted in a new column and excluded later.

For example, in the image attached: row 14 has defrostStatus = 1 at sensorTS = 2020-11-08T02:10:00.000+0000. So rows 15,16,17 (reason why these rows is because they fall withhin the sensorTS < following stamps < sensorTS + 30 minute logic) need to be compared for the above logic and needs to be highlighted. This iteration needs to stop once we get to a row where the following timestamps > sensorTS+ 30 minutes.

In this case, row 18 has sensorTS = 2020-11-08T02:50:00.000+0000 which is > than 2020-11-08T02:10:00.000 + (30 minutes) and hence the above iteration shouldn't be applied to this row or any other rows than fall within this range. Please let me know how this needs to be done.

Qiu · ‎11-12-2020

@mercurial_maverick
It seems not so difficult, but having difficult to understand the point here.

1. by ' following timestamps' you mean the Next_TS column or the sensor_TS of next row?

2. "In this case, row 18 has sensorTS = 2020-11-08T02:50:00.000+0000 which is > than 2020-11-08T02:10:00.000 + (30 minutes) "

Which cell is this "2020-11-08T02:10:00.000" coming from?

mercurial_maverick · ‎11-12-2020

Hello @Qiu Thank you for your response. Let me clarify your questions.

1. Yes, I'm referring to Next_TS column. Let me explain the whole thing in a different way.

So whenever defrostStatus =1 (calling it "defrost row" for referencing purpose), that particular timestamp when it attained that status is called "defrost row timestamp" (for referencing purpose I'm calling it so). So the next 30mins from that particular TS needs to be highlighted. Which means every row that comes next after the "defrost row" needs to be validated for the criteria: (defrost row timestamp) < (every row timestamp after defrost row timestamp) < (defrost row timestamp) + 30 minutes. This validation needs to stop once we reach a row where the (row timestamp) > (defrost row timestamp) + 30 minutes.

2. Answering your 2nd question. The timestamp in question (2020-11-08T02:10:00.000) is coming from sensorTS in row 14 which is the "defrost row timestamp" as explained in previous point.

Please let me know if we need to schedule a zoom call if you need more clarifications.

Qiu · ‎11-12-2020

@mercurial_maverick
Thank you for the clarification. I understand a bit more.

So actuall "Next_TS column" is not in the play, right, it is all about the sensor_TS and its succeeding rows.

mercurial_maverick · ‎11-12-2020

@Qiu Yes this particular ask is around the sensorTS and succeeding rows whenever the mentioned defrostStatus condition is achieved. The Next_TS column is actually used for calculating the difference in timestamps but not relevant to this particular ask here.

@Qiu Can this be achieved in the query itself somehow? The datasource is Databricks which has millions of records and it errors out when run in alteryx

Qiu · ‎11-12-2020

@mercurial_maverick

Thanks.

Let me take a try.

mercurial_maverick · ‎11-12-2020

@Qiu Sure, thank you!

Qiu · ‎11-12-2020

@mercurial_maverick
First draft for your review.

mercurial_maverick · ‎11-13-2020

Hello @Qiu Thank you for your response. I'm unable to access the workflow package due to version error. My desktop alteryx version is 2019.4.8.22007. Could you please resend to comply with this version?

Also, looking at the image I've a point to make. The ask is "whenever defrostStatus =1 evaluate the following rows. There are 3 such occurrences (row 2,3,4 that have defrostStatus = 1). Since row 2 has value of 1, row 3,4,5 need to be highlighted since they fall within the 30min threshold from row 2 timestamp. Row 3 also has value of 1, so row 4,5,6 need to be highlighted since since they fall within the 30min threshold from row 3 timestamp. Same applied for row 4 that also has a value of 1. So subsequent rows following row 4 (until the row that falls within the 30min threshold from row 4 timestamp) needs to be highlighted. And the process starts again when the next row of defrostStatus value = 1 is achieved.

The highlights need to be in a new column. The ones that fall within threshold needs to be flagged as "Exclude" and the ones that do not fall within the threshold needs to be flagged as "Included" in this new column.

Apologies for the writeup, I just wanted to be sure to convey and clear out everything.

mercurial_maverick · ‎11-16-2020

Hello @Qiu , Any updates here?

Alteryx Designer Desktop Discussions

Highlight rows falling within a timestamp range meeting a condition