Hello, hoping someone could help with the following I've been messing around with this for a little while and cant seem to figure out a solution...
I have a list of accounts that I need to fill in the annual income based on the most recent date (if applicable). there a few rows of examples I have on my sheet I added a column (MSTR_ANN_INCOME) to try to get a single value for the asked & not answered and Not asked as those would need to be replaced as well.
For Examples Row 1 is the header column…
The 1st thing I would want to look for is if the group (of MSTR_NUM) all have no then the annual income would be filled in with the number
Example Rows 2-4 – because 2 & 4 are NO it would fill in 2-4 with 50k – 100k
The 2nd thing I would want to look for is if there are 2 different MSTR_ANN_INCOME (besides the NO) that it picks the most recent date and fills in all with the most recent date Annual income.
Example Rows 5-7 because 6&7 have different annual incomes but row 6 has a more recent date rows 5-7 would be filled in with 25k – 50k
The 3rd thing I would want to look for is if there are no dates (NULL) then it would pick the highest amount and fill in with the highest.
Example Rows 19 – 21 because 20&21 are missing dates I would want it to take the 250k – 500k and fill in 19-20 with 250k – 500k
Rows 8-18 are additional data
Below is the starting chart
MSTR_NUM | ANNUAL_INC_DESC | ACCT_OPEN_DT | NAF_or_BROK | MSTR_ANN_INCOME |
178439 | 1894 | NO | ||
178439 | 50K - 100K | 10/30/2017 | JV737 | 50K - 100K |
178439 | ASKED & REFUSED | 93857 | NO | |
21958 | 2/18/2014 | 3312 | NO | |
21958 | 25K - 50K | 6/19/2019 | H70198 | 25K - 50K |
21958 | 50K - 100K | 4/9/2002 | 2092 | 50K - 100K |
84756 | 7/12/2018 | 4754 | NO | |
84756 | 5/20/2020 | 5816 | NO | |
84756 | 25K - 50K | 7/23/2007 | 9631 | 25K - 50K |
84756 | ASKED & REFUSED | 7/26/2017 | 4090 | NO |
84756 | NOT ASKED | 10/6/2015 | F54818 | NO |
37372 | 16830 | NO | ||
37372 | 250K - 500K | 5/31/2019 | HV60705 | 250K - 500K |
37372 | 500K+ | 6/10/2019 | 5183387 | 500K+ |
68673 | 445608 | NO | ||
68673 | 100K - 250K | 1/24/2020 | JA87640 | 100K - 250K |
68673 | ASKED & REFUSED | 42376 | NO | |
94837 | 5/19/2008 | 156868 | NO | |
94837 | 100K - 250K | 49464 | 100K - 250K | |
94837 | 250K - 500K | 1768791 | 250K - 500K |
Below the last row (IDEAL COLUMN) is what i would want the results to be (the MSTR_ANN_INCOME row could be deleted)
MSTR_NUM | ANNUAL_INC_DESC | ACCT_OPEN_DT | NAF_or_BROK | MSTR_ANN_INCOME | IDEAL COLUMN |
178439 | 1894 | NO | 50K - 100K | ||
178439 | 50K - 100K | 10/30/2017 | JV737 | 50K - 100K | 50K - 100K |
178439 | ASKED & REFUSED | 93857 | NO | 50K - 100K | |
21958 | 2/18/2014 | 3312 | NO | 25K - 50K | |
21958 | 25K - 50K | 6/19/2019 | H70198 | 25K - 50K | 25K - 50K |
21958 | 50K - 100K | 4/9/2002 | 2092 | 50K - 100K | 25K - 50K |
84756 | 7/12/2018 | 4754 | NO | 25K - 50K | |
84756 | 5/20/2020 | 5816 | NO | 25K - 50K | |
84756 | 25K - 50K | 7/23/2007 | 9631 | 25K - 50K | 25K - 50K |
84756 | ASKED & REFUSED | 7/26/2017 | 4090 | NO | 25K - 50K |
84756 | NOT ASKED | 10/6/2015 | F54818 | NO | 25K - 50K |
37372 | 16830 | NO | 500K+ | ||
37372 | 250K - 500K | 5/31/2019 | HV60705 | 250K - 500K | 500K+ |
37372 | 500K+ | 6/10/2019 | 5183387 | 500K+ | 500K+ |
68673 | 445608 | NO | 100K - 250K | ||
68673 | 100K - 250K | 1/24/2020 | JA87640 | 100K - 250K | 100K - 250K |
68673 | ASKED & REFUSED | 42376 | NO | 100K - 250K | |
94837 | 5/19/2008 | 156868 | NO | 250K - 500K | |
94837 | 100K - 250K | 49464 | 100K - 250K | 250K - 500K | |
94837 | 250K - 500K | 1768791 | 250K - 500K | 250K - 500K |
Any help would be greatly appreciated.
thank you.
Solved! Go to Solution.
Hi @Nmassarone ,
This is somewhat convoluted but I think I've built the required logic.
I've attached the workflow and I get the correct results.
I hope this helps,
M.
@mceleavey Thank you so much I've been spinning my head over this for hours.... just 1 quick thing the results for MSTR_NUM 21958 looks like its pulling the lowest day (furthest back date) where we would want to pull the most recent date. is there a way to just flip flop instead of pulling the furthest back date the most recent date then i think this would be 100 percent perfect for what i need.
For example below the IDEAL_COLUMN should have the ANNUAL_INC_DESC of the most recent date (6/19/2019) so the 25k - 50k would be in the ideal column for this.
Again thank you so much for this i really appreciate your effort on this. if there is just something i can adjust in the workflow to pull most recent date instead of furthest back just let me know
thanks again
@mceleavey @Qiu thank you both for the replies i put both into test to see and it seems like I'm definitely getting close to the final result I think there's just a few tweaks...
The date if there are multiple of the same MSTR_NUM with different ANN_INCOME_DESC if it could pull the most recent date i think @Qiu thank you i ran yours i think its doing the same thing where it might not always pull in the most recent date (the most recent date might not always have the highest annual income) it could be lower.
i attached a new table with additional information and rows to test if i could get in the most recent date if there are multiples. i think if this had that 1 tweak this would be perfect.
thank you both for the response and your time on it its really appreciated.
example starting table
MSTR_NUM | ANNUAL_INC_DESC | ACCT_OPEN_DT | NAF_or_BROK | MSTR_ANN_INCOME |
178439 | 1894 | NO | ||
178439 | 50K - 100K | 10/30/2017 | JV737 | 50K - 100K |
178439 | ASKED & REFUSED | 93857 | NO | |
21958 | 2/18/2014 | 3312 | NO | |
21958 | 25K - 50K | 6/19/2019 | H70198 | 25K - 50K |
21958 | 50K - 100K | 4/9/2002 | 2092 | 50K - 100K |
84756 | 7/12/2018 | 4754 | NO | |
84756 | 5/20/2020 | 5816 | NO | |
84756 | 25K - 50K | 7/23/2007 | 9631 | 25K - 50K |
84756 | ASKED & REFUSED | 7/26/2017 | 4090 | NO |
84756 | NOT ASKED | 10/6/2015 | F54818 | NO |
37372 | 16830 | NO | ||
37372 | 250K - 500K | 5/31/2019 | HV60705 | 250K - 500K |
37372 | 500K+ | 6/10/2019 | 5183387 | 500K+ |
68673 | 445608 | NO | ||
68673 | 100K - 250K | 1/24/2020 | JA87640 | 100K - 250K |
68673 | ASKED & REFUSED | 42376 | NO | |
94837 | 5/19/2008 | 156868 | NO | |
94837 | 100K - 250K | 49464 | 100K - 250K | |
94837 | 250K - 500K | 1768791 | 250K - 500K | |
444978 | 12/27/2013 | 354298 | NO | |
444978 | 10/18/2016 | 4123807 | NO | |
444978 | 0 - 25K | 8/1/2014 | AC960067 | 0 - 25K |
444978 | 100K - 250K | 5/7/1998 | 205487 | 100K - 250K |
444978 | 50K - 100K | 4/29/2020 | V089780 | 50K - 100K |
1148939 | 9/11/1997 | 856359 | NO | |
1148939 | 50K - 100K | 1/17/2019 | JV793208 | 50K - 100K |
1148939 | 25K - 50K | 7/23/2021 | 1708665 | 25K - 50K |
Desired outcome table (as you can see MSTR_NUM 444978 the most recent date 4/29/2020 has an annual income of 50K - 100K which would be in the ideal column but isnt always the highest or lowest number then MSTR_NUM 1148939 i added as additional data as well.
MSTR_NUM | ANNUAL_INC_DESC | ACCT_OPEN_DT | NAF_or_BROK | MSTR_ANN_INCOME | IDEAL_COLUMN |
178439 | 1894 | NO | 50K - 100K | ||
178439 | 50K - 100K | 10/30/2017 | JV737 | 50K - 100K | 50K - 100K |
178439 | ASKED & REFUSED | 93857 | NO | 50K - 100K | |
21958 | 2/18/2014 | 3312 | NO | 25K - 50K | |
21958 | 25K - 50K | 6/19/2019 | H70198 | 25K - 50K | 25K - 50K |
21958 | 50K - 100K | 4/9/2002 | 2092 | 50K - 100K | 25K - 50K |
84756 | 7/12/2018 | 4754 | NO | 25K - 50K | |
84756 | 5/20/2020 | 5816 | NO | 25K - 50K | |
84756 | 25K - 50K | 7/23/2007 | 9631 | 25K - 50K | 25K - 50K |
84756 | ASKED & REFUSED | 7/26/2017 | 4090 | NO | 25K - 50K |
84756 | NOT ASKED | 10/6/2015 | F54818 | NO | 25K - 50K |
37372 | 16830 | NO | 500K+ | ||
37372 | 250K - 500K | 5/31/2019 | HV60705 | 250K - 500K | 500K+ |
37372 | 500K+ | 6/10/2019 | 5183387 | 500K+ | 500K+ |
68673 | 445608 | NO | 100K - 250K | ||
68673 | 100K - 250K | 1/24/2020 | JA87640 | 100K - 250K | 100K - 250K |
68673 | ASKED & REFUSED | 42376 | NO | 100K - 250K | |
94837 | 5/19/2008 | 156868 | NO | 250K - 500K | |
94837 | 100K - 250K | 49464 | 100K - 250K | 250K - 500K | |
94837 | 250K - 500K | 1768791 | 250K - 500K | 250K - 500K | |
444978 | 12/27/2013 | 354298 | NO | 50K - 100K | |
444978 | 10/18/2016 | 4123807 | NO | 50K - 100K | |
444978 | 0 - 25K | 8/1/2014 | AC960067 | 0 - 25K | 50K - 100K |
444978 | 100K - 250K | 5/7/1998 | 205487 | 100K - 250K | 50K - 100K |
444978 | 50K - 100K | 4/29/2020 | V089780 | 50K - 100K | 50K - 100K |
1148939 | 9/11/1997 | 856359 | NO | 25K - 50K | |
1148939 | 50K - 100K | 1/17/2019 | JV793208 | 50K - 100K | 25K - 50K |
1148939 | 25K - 50K | 7/23/2021 | 1708665 | 25K - 50K | 25K - 50K |
thank you again for both your help appreciate it.
Hi @Nmassarone ,
are you sure you were looking at mine?
I just ran it again and I get this:
This is the most recent value.
M.
Hi @mceleavey that's so weird so when i run it with the original table it seems to kick out with the 25k-50k but when i load in the other table that i had in the follow up (posted below) it kicks out with 50k-100k .... i guess another thing to note as well too when i load the workflow it says it was created with a newer version of Alteryx i seem to have an older version and i do get these warnings when running it not sure if that has something to do with it.
are you able to use the below table to see if you get the same results you have been getting?
this is the warning i get
When i run the original i do get 25k-50k
but when i run my actual sheet and this secondary sheet it comes out with the 50k-100k
i tried switching around the date on my secondary sheet because i noticed my dates started out as YYYY-MM-DD so i converted them to MM/DD/YYYY but seem to still be getting the same results im not sure what is throwing this off.
are you able to use this secondary sheet and see if you still get the same results on your workflow maybe im just messing something up on mine
Secondary Sheet
MSTR_NUM | ANNUAL_INC_DESC | ACCT_OPEN_DT | NAF_or_BROK | MSTR_ANN_INCOME |
178439 | 1894 | NO | ||
178439 | 50K - 100K | 10/30/2017 | JV737 | 50K - 100K |
178439 | ASKED & REFUSED | 93857 | NO | |
21958 | 2/18/2014 | 3312 | NO | |
21958 | 25K - 50K | 6/19/2019 | H70198 | 25K - 50K |
21958 | 50K - 100K | 4/9/2002 | 2092 | 50K - 100K |
84756 | 7/12/2018 | 4754 | NO | |
84756 | 5/20/2020 | 5816 | NO | |
84756 | 25K - 50K | 7/23/2007 | 9631 | 25K - 50K |
84756 | ASKED & REFUSED | 7/26/2017 | 4090 | NO |
84756 | NOT ASKED | 10/6/2015 | F54818 | NO |
37372 | 16830 | NO | ||
37372 | 250K - 500K | 5/31/2019 | HV60705 | 250K - 500K |
37372 | 500K+ | 6/10/2019 | 5183387 | 500K+ |
68673 | 445608 | NO | ||
68673 | 100K - 250K | 1/24/2020 | JA87640 | 100K - 250K |
68673 | ASKED & REFUSED | 42376 | NO | |
94837 | 5/19/2008 | 156868 | NO | |
94837 | 100K - 250K | 49464 | 100K - 250K | |
94837 | 250K - 500K | 1768791 | 250K - 500K | |
444978 | 12/27/2013 | 354298 | NO | |
444978 | 10/18/2016 | 4123807 | NO | |
444978 | 0 - 25K | 8/1/2014 | AC960067 | 0 - 25K |
444978 | 100K - 250K | 5/7/1998 | 205487 | 100K - 250K |
444978 | 50K - 100K | 4/29/2020 | V089780 | 50K - 100K |
1148939 | 9/11/1997 | 856359 | NO | |
1148939 | 50K - 100K | 1/17/2019 | JV793208 | 50K - 100K |
1148939 | 25K - 50K | 7/23/2021 | 1708665 | 25K - 50K |
When i load the secondary sheet into Alteryx it loads in with YYYY-MM-DD
this is the Input tool sheet
Thank you
Hi @Nmassarone ,
I get the same as before:
No problems at my end, it seems you might need to upgrade your version of Alteryx.
M.
ok yeah i think that might be my issue then because i get these warnings that it cannot convert 50k-100k to date/time which i think are the rows that are screwing up
unfortunately I'm on a company computer and cannot update to the latest version until the company gets the software so im stuck on 2020.4 for now.
i will say the majority of my full list converted correctly its only a handful and i should be able to identify them so i really really appreciate you taking the time to go through this thank so much.
actually @mceleavey looking at your 2nd one see the very last group 1148939 ... the most recent date 7/23/2021 is annual income of 25k-50k but in the ideal column its showing 50k-100k
thats what im a little confused on.
also the group 444978 the most recent date is 4/29/2020 which is 50k-100k but displays 100k - 250k in the ideal column