Hi,
I have a specific condition that I'm struggling with, and I'm wondering if anyone is able to help.
I have a set of input data as below:
(The Date field contains a bunch of dates that repeat multiple times, and the Project ID field also repeats multiple times.)
Project ID | Date | Status |
:00027408 | 11/7/2019 | |
:00027408 | 11/14/2019 | |
:00027408 | 11/21/2019 | |
:00027408 | 11/28/2019 | |
:00027408 | 12/5/2019 | Not Rated |
:00027408 | 12/12/2019 | Not Rated |
:00027408 | 12/19/2019 | Not Rated |
:00027408 | 1/9/2020 | Not Rated |
:00035276 | 11/7/2019 | Not Rated |
:00035276 | 11/14/2019 | Not Rated |
:00035276 | 11/21/2019 | Not Rated |
:00035276 | 11/28/2019 | Not Rated |
:00035276 | 12/5/2019 | Not Rated |
:00035276 | 12/12/2019 | Not Rated |
:00035276 | 12/19/2019 | Not Rated |
:00035276 | 1/9/2020 | Not Rated |
:00037253 | 11/7/2019 | Not Rated |
:00037253 | 11/14/2019 | Not Rated |
:00037253 | 11/21/2019 | Not Rated |
:00037253 | 11/28/2019 | Not Rated |
:00037253 | 12/5/2019 | Not Rated |
:00037253 | 12/12/2019 | Not Rated |
:00037253 | 12/19/2019 | Not Rated |
:00037253 | 1/9/2020 | Not Rated |
:00037263 | 11/7/2019 | Not Rated |
:00037263 | 11/14/2019 | Not Rated |
:00037263 | 11/21/2019 | Not Rated |
:00037263 | 11/28/2019 | Not Rated |
:00037263 | 12/5/2019 | Not Rated |
:00037263 | 12/12/2019 | Not Rated |
:00037263 | 12/19/2019 | Not Rated |
:00037263 | 1/9/2020 | Not Rated |
:00038804 | 11/7/2019 | Yellow |
:00038804 | 11/14/2019 | Yellow |
:00038804 | 11/21/2019 | |
:00038804 | 11/28/2019 | Yellow |
:00038804 | 12/5/2019 | Yellow |
:00038804 | 12/12/2019 | Yellow |
:00038804 | 12/19/2019 | Yellow |
:00038804 | 1/9/2020 | Yellow |
:00039836 | 11/7/2019 | |
:00039836 | 11/14/2019 | Not Rated |
:00039836 | 11/21/2019 | Not Rated |
:00039836 | 11/28/2019 | Not Rated |
:00039836 | 12/5/2019 | Not Rated |
:00039836 | 12/12/2019 | Not Rated |
:00039836 | 12/19/2019 | Not Rated |
:00039836 | 1/9/2020 | Not Rated |
I initially had to use Transpose to turn the Date field values into their own columns, with the Status field as their records (which I'm able to do successfully).
Then I was asked to also count the number of occurrences of "Yellow" in that Status field, but only when the Yellows appear in consecutive rows with no other value in between. If another value does appear (as in my example below), the count for Yellow starts over.
I also need to figure out how many times Yellow was repeated for a particular ID across the dates.
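I'm not on Alteryx myself, but the Transpose step described above (dates become columns, Status values become the records) can be sketched in plain Python. The sample rows below are just a slice of the posted data, used for illustration:

```python
from collections import defaultdict

# A slice of the input data, in the same (Project ID, Date, Status) shape
rows = [
    (":00038804", "11/7/2019",  "Yellow"),
    (":00038804", "11/14/2019", "Yellow"),
    (":00038804", "11/21/2019", ""),
    (":00038804", "11/28/2019", "Yellow"),
]

# Pivot: one dict per Project ID, mapping each Date to its Status.
# In Alteryx terms, this is what the Transpose/Cross Tab step produces:
# each unique date becomes its own "column" (dict key) per ID.
pivot = defaultdict(dict)
for pid, date, status in rows:
    pivot[pid][date] = status
```

After this, `pivot[":00038804"]["11/21/2019"]` holds the blank status for that date, which matters for the consecutive-Yellow count described next.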
Below is the expected results:
Project ID | W | 11/7/2019 | 11/14/2019 | 11/21/2019 | 11/28/2019 | 12/5/2019 | 12/12/2019 | 12/19/2019 | 1/9/2020 |
:00027408 | 0 | | | | | Not Rated | Not Rated | Not Rated | Not Rated |
:00035276 | 0 | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated |
:00037253 | 0 | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated |
:00037263 | 0 | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated |
:00038804 | 5 | Yellow | Yellow | | Yellow | Yellow | Yellow | Yellow | Yellow |
:00039836 | 0 | | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated | Not Rated |
You will see in the example that for ID :00038804, Yellow counts as 5 rather than 7, because there was a blank (it could be any other value) in between.
Please let me know if my explanation is confusing.
Thanks in advance!
I'm not very familiar with the Generate Rows tool. Would I just use the Summarize tool to find the min and max? (Would these be in one Summarize or two?) And how would I apply these to the Generate Rows tool?
Sorry for asking so many questions!
And thank you so much for all the help
Hi @hal_dal, looking further at your data I realized Generate Rows won't be a good fit (since you don't have continuous dates). Instead, I think what you're looking for is to bring in empty rows for IDs with missing dates, correct? E.g., :00033559 only has one date row, and you're looking to introduce additional dummy rows for the dates it is missing. If yes, in the attached approach I compute the unique dates and IDs and append them to each other to produce a universe that has every possible combination of ID + Date. This is then joined back to the original dataset to fill in the missing gaps in the data. The rest of the workflow is the same as in the previous proposals.
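For anyone following along outside Alteryx, the "universe of every ID + Date combination, joined back to the original data" idea above can be sketched in Python. The sample rows here are a hypothetical subset with a missing date, just to show the fill-in behavior:

```python
from itertools import product

# Hypothetical subset: :00027408 is missing its 11/14/2019 row
rows = [
    (":00027408", "11/7/2019",  ""),
    (":00035276", "11/7/2019",  "Not Rated"),
    (":00035276", "11/14/2019", "Not Rated"),
]

# Unique IDs and unique dates (the two Summarize outputs in Alteryx terms).
# Note: sorting date strings lexically is not chronological; real data
# would need the dates parsed first.
ids = sorted({pid for pid, _, _ in rows})
dates = sorted({d for _, d, _ in rows})

# Join the full ID x Date universe back to the original data;
# combinations with no original row get a None status (a dummy row).
lookup = {(pid, d): status for pid, d, status in rows}
universe = [(pid, d, lookup.get((pid, d))) for pid, d in product(ids, dates)]
```

Every ID now has one row per date, so the downstream Transpose and consecutive-count steps see a complete grid.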
Does this get you that much closer to your destination state? Please feel comfortable asking as many questions as you need.