Hi,
I trust you are all well.
I have a large dataset; when transposed, it has 26,874,045 rows. Sample data is attached.
When we include the Append option, the workflow takes a very long time. It had not completed even after 15 hours.
The data has:
1. Brands
2. Selling Points
3. Date: from 01-10-2018 to 01-12-2020
4. Values: 1 = they ordered, 0 = they did not order
Condition to take into consideration:
To calculate the Lost Customer Rate per month: count of Lost flags in that particular month / count of active Sell Points in the previous month
Explanation
1. Group by Brand then Group by Selling Points.
2. After step 1, a run of 3 or more consecutive zeros counts as one Lost flag. The month in which the run of 3 or more consecutive zeros starts is the first Lost month, and only that month is used in the Lost calculation. So in the example below:
The zeros should be calculated for each Brand under each Sell Point separately.
For example, for Sell Point 20 under Brand A1, it should first look for the first one and then count the zeros. For Sell Point 21 under Brand A1, it should likewise look for the first one and then the zeros. So whenever the Brand or Sell Point changes, it should again look for the first one and then start counting zeros.
01/06/2020 is one Lost flag and 01/02/2019 is another Lost flag for the same Brand (A2) in Sell Point (8). So in this example there are only 2 Lost flags. We are not taking 01/07/2020, 01/07/2020, for the Lost calculation, but the Sell Point is still inactive in those months.
3. If there are fewer than 3 consecutive zeros, those months should be considered active sales months.
For example, here the two zero months should be considered active months for the Sell Point:
4. Calculate the count of active Sell Points in the previous month: if we are calculating for April, we should get the total count of ones in March (please refer to the first screenshot).
Once these flags are obtained, we can drop the Group By on Sell Point and bring in a Group By on Date, so that we can calculate the rate for a particular month.
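To make the final step concrete, here is a minimal Python sketch of the monthly rate, not an Alteryx workflow. It assumes each row already carries a Lost flag and an active flag (1 when the Sell Point ordered that month), and that months are represented by their first day. The names `previous_month` and `lost_rate_per_month` are made up for illustration.

```python
from collections import Counter
from datetime import date


def previous_month(d: date) -> date:
    """Return the first day of the month before d."""
    if d.month == 1:
        return date(d.year - 1, 12, 1)
    return date(d.year, d.month - 1, 1)


def lost_rate_per_month(rows):
    """rows: iterable of (month, lost_flag, active_flag) tuples,
    one per Brand / Sell Point / month, with month as the first day of the month.

    Returns {month: rate}, where
    rate = Lost flags in the month / active Sell Points in the previous month.
    Months whose previous month has no active Sell Points are skipped.
    """
    lost = Counter()
    active = Counter()
    for month, lost_flag, active_flag in rows:
        lost[month] += lost_flag
        active[month] += active_flag

    rates = {}
    for month in sorted(lost):
        denominator = active.get(previous_month(month), 0)
        if denominator:
            rates[month] = lost[month] / denominator
    return rates


# Illustrative data only: 4 active Sell Points in March, 1 Lost flag in April.
example_rows = [
    (date(2020, 3, 1), 0, 1),
    (date(2020, 3, 1), 0, 1),
    (date(2020, 3, 1), 0, 1),
    (date(2020, 3, 1), 0, 1),
    (date(2020, 4, 1), 1, 0),
    (date(2020, 4, 1), 0, 1),
]
print(lost_rate_per_month(example_rows))  # April rate is 1 / 4 = 0.25
```

March is left out of the result because the example has no February rows to divide by.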
Sample data is attached.
Is it possible to create a workflow without using the Append Function?
I highly appreciate your help!
Interesting use case. Here is how you can do it.
Workflow:
1. Use a Multi-Row Formula to count consecutive 0's.
2. Use a second Multi-Row Formula to check whether the count of consecutive 0's is >= 3 and flag it.
EDIT: @varun86vgopal I have added the summarize-by-date part. I had missed it.
3. Use the Summarize tool to group by date and sum the flag.
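For readers without the attached workflow, the three steps can be sketched roughly in Python. This is only an illustration of the idea, not the workflow itself. It assumes the rows are already sorted by Brand, Sell Point and month, that months are plain strings, and that the flag belongs on the month where a run of zeros starts. The function names are hypothetical. This sketch counts every run of zeros, including zeros before a Sell Point's first order.

```python
from collections import defaultdict


def flag_lost_months(rows, threshold=3):
    """rows: list of (brand, sell_point, month, value) tuples,
    sorted by brand, sell_point, month. value is 1 (ordered) or 0.

    Returns a list of 0/1 Lost flags aligned with rows.
    """
    flags = [0] * len(rows)
    zero_run = 0
    prev_key = None
    for i, (brand, sell_point, _month, value) in enumerate(rows):
        key = (brand, sell_point)
        if key != prev_key:
            # Step 1: the counter restarts for each Brand / Sell Point group.
            zero_run = 0
            prev_key = key
        zero_run = zero_run + 1 if value == 0 else 0
        if zero_run == threshold:
            # Step 2: flag the month in which the run of zeros started.
            flags[i - threshold + 1] = 1
    return flags


def lost_flags_by_month(rows, flags):
    """Step 3: group by month and sum the Lost flags."""
    totals = defaultdict(int)
    for (_brand, _sell_point, month, _value), flag in zip(rows, flags):
        totals[month] += flag
    return dict(totals)


# Illustrative data only, not taken from the attached sample.
example_rows = [
    ("A1", 20, "2019-01", 1),
    ("A1", 20, "2019-02", 0),
    ("A1", 20, "2019-03", 0),
    ("A1", 20, "2019-04", 0),
    ("A1", 21, "2019-01", 1),
    ("A1", 21, "2019-02", 0),
    ("A1", 21, "2019-03", 0),
    ("A1", 21, "2019-04", 1),
]
example_flags = flag_lost_months(example_rows)
print(lost_flags_by_month(example_rows, example_flags))
```

Because the flag is written exactly once, when the counter reaches the threshold, a longer run of zeros still produces a single Lost flag.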
Hope this helps 🙂
Yes, this is indeed possible with a simple Multi-Row Formula.
Please see the attached workflow.
With the multi-row formula, you look at the previous row and the current row. The logic is as follows:
- If the previous row value is 1 and the current is 0, then the flag will be set to 1
- If the previous row value is 0 and the current is 0, then the flag will increment (i.e. it counts how many consecutive 0 values there are)
- If neither of the above is true, the flag is reset to 0
Finally, using a formula tool, set the Active Flag to 1 if the counter is less than 3 (i.e. more than 2 consecutive rows will be flagged as inactive).
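In case it helps to see the logic outside Alteryx, here is a rough Python equivalent of the three rules and the Active Flag formula. It is only a sketch: it handles a single series of values, and it assumes that a missing previous row (for the very first row) behaves like a value of 0 with a counter of 0. The function names are made up.

```python
def zero_counter(values):
    """Apply the three rules row by row and return the counter for each row."""
    counters = []
    prev_value = 0    # assumption: a missing previous row is treated as 0
    prev_counter = 0
    for value in values:
        if prev_value == 1 and value == 0:
            counter = 1                  # first zero after a one
        elif prev_value == 0 and value == 0:
            counter = prev_counter + 1   # another consecutive zero
        else:
            counter = 0                  # reset
        counters.append(counter)
        prev_value, prev_counter = value, counter
    return counters


def active_flags(counters, threshold=3):
    """Active Flag is 1 while the counter is below the threshold."""
    return [1 if counter < threshold else 0 for counter in counters]


counters = zero_counter([1, 0, 0, 0, 1])
print(counters)                # [0, 1, 2, 3, 0]
print(active_flags(counters))  # [1, 1, 1, 0, 1]
```

Note that under these rules only the third and later zeros in a run are marked inactive; the first two zeros of the run keep an Active Flag of 1.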
Please let me know if this sorts the issue!
Ben
Thanks a ton @bensilv
This one was really helpful. The zeros should be calculated for each Brand under each Sell Point.
For example, for Sell Point 20 under Brand A1, it should first look for the first one and then count the zeros. For Sell Point 21 under Brand A1, it should likewise look for the first one and then the zeros. So whenever the Brand or Sell Point changes, it should again look for the first one and then start counting zeros.
Hope the problem statement is clear.
Thanks a ton @atcodedog05
This one was really helpful.
The zeros should be calculated for each Brand under each Sell Point separately.
For example, for Sell Point 20 under Brand A1, it should first look for the first one and then count the zeros. For Sell Point 21 under Brand A1, it should likewise look for the first one and then the zeros. So whenever the Brand or Sell Point changes, it should again look for the first one and then start counting zeros.
Hope the problem statement is clear.
No problem @varun86vgopal!
If I am understanding correctly, you're looking to essentially apply this logic but group it by Brands and Sell Point?
E.g. in the image below, the counter should restart when the Sell Point goes from 14 to 20, so Sell Point 20 should be active? Some clarification would be great: if A9 20 is 0 for 01/10/2018 and 01/11/2018, would those months be classed as active or not?
Group By has already been taken into consideration; please refer to the Multi-Row Formula tool.
Please check and let me know
If my understanding is correct, that has indeed been covered by @atcodedog05
If we have solved the problem, please remember to mark as solved 🙂
Hi @atcodedog05,
Only the zeros after the first one should be taken into consideration. So below, it shouldn't count the leading zeros for A1 21, since the first one hasn't happened yet. This is the reason I was using two different flows and appending the data, but the workflow takes more than 18 to 20 hours.
Thanks a lot for your help.
Hi @bensilv
Only the zeros after the first one should be taken into consideration. So below, it shouldn't count the leading zeros for A1 21, since the first one hasn't happened yet.
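To spell out the rule I mean, here is a small Python sketch (not Alteryx, and the function name is made up). It assumes the values for one Brand / Sell Point are passed in date order, so it would have to be applied to each group separately. Zeros before the first one are ignored, and a run of 3 or more zeros after that is flagged once, on the month the run starts.

```python
def lost_flags_after_first_order(values, threshold=3):
    """values: 0/1 order values for one Brand / Sell Point, in date order.

    Returns a list of 0/1 Lost flags aligned with values.
    """
    flags = [0] * len(values)
    seen_order = False   # becomes True once the first 1 has been seen
    zero_run = 0
    for i, value in enumerate(values):
        if value == 1:
            seen_order = True
            zero_run = 0
        elif seen_order:
            zero_run += 1
            if zero_run == threshold:
                # Flag the month in which this run of zeros started.
                flags[i - threshold + 1] = 1
        # Zeros before the first 1 fall through and are never counted.
    return flags


# Illustrative values only: four leading zeros, then a run of four zeros.
print(lost_flags_after_first_order([0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]))
# [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```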
Thanks a lot for your help.