I am working on a workflow which requires me to take the median value of data field. This is possible with the Transform>Summarize tool but not avaliable in the In-Database>Summarize In-DB tool. Due to the size of data I am working with, In-Database calculations will be needed. Does anyone know a workaround to allow In-DB median summarization calculations? For note, we are using a Teradata database.
Solved! Go to Solution.
I dont know teradata particularly but could you use a in-db formula to add a row number partitioned and sorted in the same way as the summarize you want.
You can then use the summarize in-db to get the maximum row number and then find the median by selecting joining back to half this value.
Thanks for the suggestion. I was able to get it working by using the using the formula tool and ROW_NUMBER function to implemente the row number partition and sort. The summarized tool then found the max row number followed by a formulat tool to divid this by two. I forced this new field to be an integer to ensure I only had whole numbers. I then joined back with an inner join and it worked. Thanks for the quick answer.