Hello gurus!
I've got some time-series data of meter readings. The only problem is that the source system doesn't always give you the readings you might expect. It is 'designed' to give a daily value for each meter, but that doesn't mean you'll always get one row per day; instead you might get something that looks like the following:
Consumption Date | Reading |
2015-10-29 00:00:00.0 | 4491 |
2015-10-30 00:00:00.0 | 4516 |
2015-10-31 00:00:00.0 | 4537 |
2015-11-01 00:00:00.0 | 4543 |
2015-11-01 00:00:00.0 | 4551 |
2015-11-02 00:00:00.0 | 4557 |
2015-11-04 00:00:00.0 | 4561 |
2015-11-04 00:00:00.0 | 4568 |
2015-11-05 00:00:00.0 | 4572 |
2015-11-07 00:00:00.0 | 4578 |
2015-11-08 00:00:00.0 | 4595 |
2015-11-08 00:00:00.0 | 4597 |
2015-11-10 00:00:00.0 | 4604 |
2015-11-11 00:00:00.0 | 4608 |
2015-11-12 00:00:00.0 | 4612 |
2015-11-13 00:00:00.0 | 4616 |
2015-11-14 00:00:00.0 | 4620 |
So, everything is golden on 10.29, 10.30, and 10.31, but then I get two reading values on 11.1, no reading on 11.3, no reading on 11.6, and again double readings on 11.8 and no reading on 11.9. Calculating a sum of consumption across a large timeframe would still give me a real value, but my ultimate BI solution needs to tie this data together at a *daily level* to things like weather and/or customer events. For that to work, I need one row for 11.1 and 11.8 instead of two, and one row each for 11.3, 11.6, and 11.9 instead of zero. Ideally I'd *also* find a way to split up my readings so that I can give an average consumption for the doubled and missing days; just creating a row and giving it 0 consumption would make my output look awful.
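For concreteness, here's roughly the shape of what I'm after, sketched in pandas rather than in my actual tooling (just an illustration of the target: treat the reading as cumulative, collapse doubled days, and spread consumption evenly across the missing days):

```python
import pandas as pd

# Sample data typed in from the table above (doubled days and gaps included).
raw = [
    ("2015-10-29", 4491), ("2015-10-30", 4516), ("2015-10-31", 4537),
    ("2015-11-01", 4543), ("2015-11-01", 4551), ("2015-11-02", 4557),
    ("2015-11-04", 4561), ("2015-11-04", 4568), ("2015-11-05", 4572),
    ("2015-11-07", 4578), ("2015-11-08", 4595), ("2015-11-08", 4597),
    ("2015-11-10", 4604), ("2015-11-11", 4608), ("2015-11-12", 4612),
    ("2015-11-13", 4616), ("2015-11-14", 4620),
]
df = pd.DataFrame(raw, columns=["Consumption Date", "Reading"])
df["Consumption Date"] = pd.to_datetime(df["Consumption Date"])

daily = (
    df.groupby("Consumption Date")["Reading"]
      .mean()                                  # collapse doubled days to one value
      .reindex(pd.date_range(df["Consumption Date"].min(),
                             df["Consumption Date"].max(), freq="D"))
      .interpolate()                           # fill missing days between known readings
)
consumption = daily.diff()                     # daily consumption from the cumulative reading
print(consumption)
```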
I've been fighting with the multi-row tool a lot trying to figure this out and have solved one of the permutations, but was hoping that someone out there had a better way to slog through it.
Thanks!
brian
Hi Brian,
You are heading in the right direction.
I'm not sure what challenge you ran into, but I have attached a sample workflow using the Multi-Row tool.
I hope the final output you are expecting looks something like this :-)
Consumption Date | Reading |
10/29/2015 | 4491 |
10/30/2015 | 4516 |
10/31/2015 | 4537 |
11/1/2015 | 4543 |
11/2/2015 | 4551 |
11/3/2015 | 4557 |
11/4/2015 | 4561 |
11/5/2015 | 4568 |
11/6/2015 | 4572 |
11/7/2015 | 4578 |
11/8/2015 | 4595 |
11/9/2015 | 4597 |
11/10/2015 | 4604 |
11/11/2015 | 4608 |
11/12/2015 | 4612 |
11/13/2015 | 4616 |
11/14/2015 | 4620 |
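For reference, the table above amounts to keeping the distinct readings in their original order and assigning one calendar day per row. A minimal pandas sketch that reproduces it (which only lines up because this sample happens to have exactly as many rows as calendar days) would be:

```python
import pandas as pd

# Readings from the original post, kept in order; one calendar day is
# assigned per row, which reproduces the table above for this sample.
readings = [4491, 4516, 4537, 4543, 4551, 4557, 4561, 4568, 4572,
            4578, 4595, 4597, 4604, 4608, 4612, 4616, 4620]
dates = pd.date_range("2015-10-29", periods=len(readings), freq="D")
out = pd.DataFrame({"Consumption Date": dates, "Reading": readings})
print(out)
```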
Hey Brian!
I had to do this for a project myself. Attached is the workflow you are looking for. It compares the data, looks for missing dates in your series, and inserts those dates with a reading of 0. It also keeps all of the "dup" records so you can get an average using the Summarize tool in another data stream, if you want to.
email me if you have questions!
jack.morgan@schwab.com
Oh, these would be the final outputs:
All data points:
Consumption Date Reading
2015-10-29 4491
2015-10-30 4516
2015-10-31 4537
2015-11-01 4551
2015-11-01 4543
2015-11-02 4557
2015-11-03 0
2015-11-04 4568
2015-11-04 4561
2015-11-05 4572
2015-11-06 0
2015-11-07 4578
2015-11-08 4597
2015-11-08 4595
2015-11-09 0
2015-11-10 4604
2015-11-11 4608
2015-11-12 4612
2015-11-13 4616
2015-11-14 4620
Daily Averages:
Consumption Date Avg_Reading
2015-10-29 4491
2015-10-30 4516
2015-10-31 4537
2015-11-01 4547
2015-11-02 4557
2015-11-03 0
2015-11-04 4564.5
2015-11-05 4572
2015-11-06 0
2015-11-07 4578
2015-11-08 4596
2015-11-09 0
2015-11-10 4604
2015-11-11 4608
2015-11-12 4612
2015-11-13 4616
2015-11-14 4620
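For anyone who wants the same logic outside of the attached workflow, here is a minimal pandas sketch of the approach described above (insert the missing dates with a reading of 0, keep the duplicate rows, then average per day the way the Summarize tool would); the sample data is typed in from the first post:

```python
import pandas as pd

# Raw input from the first post (doubled days and gaps included).
raw = [
    ("2015-10-29", 4491), ("2015-10-30", 4516), ("2015-10-31", 4537),
    ("2015-11-01", 4543), ("2015-11-01", 4551), ("2015-11-02", 4557),
    ("2015-11-04", 4561), ("2015-11-04", 4568), ("2015-11-05", 4572),
    ("2015-11-07", 4578), ("2015-11-08", 4595), ("2015-11-08", 4597),
    ("2015-11-10", 4604), ("2015-11-11", 4608), ("2015-11-12", 4612),
    ("2015-11-13", 4616), ("2015-11-14", 4620),
]
df = pd.DataFrame(raw, columns=["Consumption Date", "Reading"])
df["Consumption Date"] = pd.to_datetime(df["Consumption Date"])

# Find calendar days with no reading and add them with Reading = 0,
# keeping the duplicate rows untouched.
full_days = pd.date_range(df["Consumption Date"].min(),
                          df["Consumption Date"].max(), freq="D")
missing = full_days.difference(df["Consumption Date"])
all_points = pd.concat(
    [df, pd.DataFrame({"Consumption Date": missing, "Reading": 0})]
).sort_values("Consumption Date", ignore_index=True)

# Daily averages, the equivalent of a Summarize tool grouped by date.
daily_avg = (all_points.groupby("Consumption Date", as_index=False)["Reading"]
             .mean()
             .rename(columns={"Reading": "Avg_Reading"}))
print(all_points)
print(daily_avg)
```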
Hi Jack -
Thank you for the rapid response and sample workflow. I'm poking around with it now and it is pretty impressive. +100 karma awarded.
Hi S_pichaipillai -
Thank you for the rapid response and detailed workflow. Great stuff.
brian